Released on May 24, Lovelock is another great step toward a better user experience. I installed it today and was shocked by the revolutionary changes to the GNOME desktop: it's now GNOME Shell that I'm interacting with.
It seemed unfamiliar and even irritating at first, when I found no minimize button, nothing in my Desktop folder showed up on the desktop, and I had to find applications in another place. But, but, but after I read the release notes, which are well prepared for exactly the situation a user like me runs into, all my questions were answered, and within an hour I started to love it! Fedora named this 15th release Lovelock; does that mean something?
The Super key is really handy: it immediately calls up all open windows for you to pick one. It also brings the launch bar to the front, so launching an application can be done instantly, with the advantage that the bar takes no screen space while you are working in an application.
Also, Firefox 4 is just great! I love it, especially Sync.
As a programmer, I get gcc 4.6 in this new release, I get vim 7.3, I get scons 2.0.1 ...
All in all, I love Lovelock; it's better than ever.
Saturday, June 11, 2011
Introduction to Algorithms - The Reading Notes: Part 4, Maximum Sum Sub-Array (BF)
Maximum Sum Sub-Array is an interesting problem. It goes like this: I'm given a series of values and I'm asked to find a contiguous sub-series that has the maximum sum.
For example, if I'm given the series of values -3, 2, 5, 6, -1, 2, -5, 5, I find that the sub-series 2, 5, 6 gives 13, which is quite large, but 2, 5, 6, -1, 2 sums to even more: 14. If I look further, I find that the sub-series 2, 5, 6, -1, 2, -5, 5 gives the same maximum of 14. So there may be more than one maximum, and our goal is to find any one of them.
ITA mentions the brute-force algorithm with Ω(n^2) running time; it says the algorithm is to calculate the sums of all possible sub-series (there are O(n^2) of them), and if we can evaluate each sub-series in constant time then we have an Ω(n^2) algorithm, but it does not say how. ITA also introduces a much faster O(n lg n) algorithm based on DaC (Divide and Conquer).
I understand that this ITA chapter is about DaC, so there was no need to detail this brute-force algorithm, and it is left as an exercise. So let's finish the Ω(n^2) brute-force algorithm first; then we will see how to implement the DaC one.
So you already know the number of sub-series is O(n^2) for an n-length series, but why? The answer is that this number equals the number of ways to select any two positions in the series (the two positions define the start and end of a sub-series), that is n(n-1)/2, plus the n single-element sub-series.
Now comes the question of how we calculate the sum of each sub-series. If we simply calculate the sums of all sub-series from scratch, we have to sum n sub-series of length 1, n-1 sub-series of length 2, n-2 sub-series of length 3, ..., and 1 sub-series of length n. In total we perform n + 2(n-1) + 3(n-2) + ... + n·1 additions, which is sum(for i = 1 to n) (i*(n-i+1)), rendering our algorithm O(n^3) in running time.
Now we need to do more to get a faster algorithm. One more glance at our O(n^3) algorithm shows that the sub-series share many common parts, and we add those parts repeatedly, once per sub-series. That's the key to reducing the running time.
I will now use an n*n matrix to save the sum of each sub-series, say, [0,1], [0,2], ..., [1,2], [1,3], ... Each single element also represents a sub-series of length 1; these are stored in the matrix at [0,0], [1,1], ... So only the upper triangle of the matrix is used. The advantage of saving the sums in the matrix is that it is very easy to find the sub-series that builds the next sum, because we know that:
sum(sub[i, j]) = sum(sub[i, j-1]) + v[j] = sum(sub[i, j-1]) + sum(sub[j, j])
OK, that's enough theory; let's see the code. But wait a moment: before implementing the algorithm, let's put the unit test in place!
I'll use a separate class for the MaxSubArray unit test, apart from the one for the sort algorithms. With CppUnit, I simply create a .h and a .cpp for the new tests; no other configuration is needed.
MiscTest.h
#ifndef MISCTEST_H
#define MISCTEST_H

#include <cppunit/extensions/HelperMacros.h>

class MiscTest : public CppUnit::TestFixture {

    CPPUNIT_TEST_SUITE( MiscTest );
    CPPUNIT_TEST( testMaxSubArray_0 );
    CPPUNIT_TEST_SUITE_END();

public:
    void setUp();
    void tearDown();

    // Test cases.
    void testMaxSubArray_0();
};

#endif
MiscTest.cpp
#include <vector>

#include "MiscTest.h"
#include "MaxSubArray.h"

using namespace Common::Algorithm::MaxSubArray;

// Registers the fixture into the 'registry'.
CPPUNIT_TEST_SUITE_REGISTRATION( MiscTest );

void MiscTest::setUp()
{
}

void MiscTest::tearDown()
{
}

// Test the normal case.
void MiscTest::testMaxSubArray_0()
{
    int iArray[] = {-10, 5, 7, -2, 19, 50, -30, 32, -24, 18};
    std::vector<int> v(iArray, iArray + sizeof(iArray) / sizeof(int));

    int rStart = 0;
    int rEnd = 0;
    int rSum = 0;
    MaxSubArray::CalcBF(v, rStart, rEnd, rSum);

    CPPUNIT_ASSERT_EQUAL( 1, rStart);
    CPPUNIT_ASSERT_EQUAL( 7, rEnd);
    CPPUNIT_ASSERT_EQUAL(81, rSum);
}
Finally, I can start implementing the algorithm. I run the unit test each time I think the algorithm should work, then fix bugs and retry. One subtlety the test taught me: single-element sub-series must be considered too, and the running maximum must start from v[0] rather than 0, otherwise an all-negative series gives a wrong answer. Here is the final version that passes the test above:
void MaxSubArray::CalcBF(std::vector<int> &v, int &oStart, int &oEnd, int &oSum)
{
    // Trivial case: empty input.
    if (v.size() == 0)
    {
        oStart = 0;
        oEnd = 0;
        oSum = 0;
        return;
    }

    const size_t n = v.size();
    oStart = 0;
    oEnd = 0;
    oSum = v[0];

    // We use an n*n matrix (only its upper triangle) to store the sum
    // of every interval:
    //   00 01 02 03 04
    //      11 12 13 14
    //         22 23 24
    //            33 34
    //               44
    // M[i,i] = v[i]
    // M[i,j] = sum(i,j) for 0 <= i < j < n, built incrementally as
    // M[i,j] = M[i,j-1] + M[j,j]. Filling it takes O(n^2) time.
    std::vector<int> vResult(n * n, 0);

    // The diagonal: each single element is itself a candidate sub-series.
    for (size_t i = 0; i < n; i++)
    {
        vResult[i * n + i] = v[i];
        if (oSum < v[i])
        {
            oSum = v[i];
            oStart = i;
            oEnd = i;
        }
    }

    // The upper triangle, row by row.
    for (size_t i = 0; i < n; i++)
    {
        for (size_t j = i + 1; j < n; j++)
        {
            vResult[i * n + j] = vResult[i * n + j - 1] + vResult[j * n + j];
            if (oSum < vResult[i * n + j])
            {
                oSum = vResult[i * n + j];
                oStart = i;
                oEnd = j;
            }
        }
    }
}
Let's discuss the faster version, which takes only O(n lg n) time, in the next post. Thank you, and I'd love to hear your feedback.
Thursday, June 2, 2011
Introduction to Algorithms - The Reading Notes: Part 3, Merge Sort
Merge sort is a better sorting algorithm than insertion sort in terms of worst-case running time. It's also the algorithm I ran into in the interview for my current job.
The ITA book uses it as an example of the Divide and Conquer philosophy, which divides a big problem into several smaller ones and solves each of them first; once the smaller problems are solved, the big problem becomes easy to solve. This usually results in a recursive algorithm, as is the case for merge sort. I'm not sure the beginning of the book is a good place to introduce this concept; the learning curve is quite steep.
Merge sort contains two parts: the divide (or recursive) part and the merge part. The idea is to divide the whole list into two lists, each half the original size, sort the two lists separately, then merge the two sorted smaller lists into one sorted list to solve the problem. The magic is to repeat this procedure for the two smaller lists too, so they become four even smaller lists, and so on.
The merge part is quite straightforward: compare the elements at the head of the two sorted lists and move the smaller (or larger, depending on the order) into the result list until one list has no elements left; then we can safely append the rest of the other list. Only one thing to remark: the vector constructor that takes two iterators s and e constructs a vector containing elements from s up to e-1 of the original vector.
void MergeSort::Merge(std::vector<int> &v, int p, int q, int r, bool bNonDecreasing)
{
    std::cout << "Merging sub array: " << p << ":" << q-1 << " and " << q << ":" << r-1 << std::endl;
    std::vector<int> vSub1(v.begin() + p, v.begin() + q);
    std::vector<int> vSub2(v.begin() + q, v.begin() + r);
    std::cout << "vSub1.size: " << vSub1.size() << ", vSub2.size: " << vSub2.size() << std::endl;

    // Sentinel value: never smaller (or larger) than any real element.
    int iLastValue = (bNonDecreasing ? INT_MAX : INT_MIN);
    vSub1.push_back(iLastValue);
    vSub2.push_back(iLastValue);

    int i = 0;
    int j = 0;
    for (int k = p; k < r; k++)
    {
        if ((bNonDecreasing && (vSub1[i] < vSub2[j])) ||
            (!bNonDecreasing && (vSub1[i] > vSub2[j])))
        {
            v[k] = vSub1[i];
            i++;
        }
        else
        {
            v[k] = vSub2[j];
            j++;
        }
    }
}
The recursive part is a bit tricky. Personally, I prefer to give a single interface to the users of my sorting library, so I'd like to call it as MergeSort::Sort(input_vector_to_sort, order). Then I found that I needed a helper method that can be called recursively, which I named DoSort(). DoSort() takes two additional parameters, the start and end positions that define the sub-list to be sorted. It starts by dividing the input vector into two smaller ones, calls itself recursively on each, and finally merges the two smaller (now sorted) vectors.
void MergeSort::DoSort(std::vector<int> &v, int p, int r, bool bNonDecreasing)
{
    std::cout << "Sorting sub array: " << p << ":" << r << std::endl;
    if (p < r)
    {
        int q = (p + r) / 2;
        std::cout << "Dividing sub array: " << p << ":" << q << ", " << q+1 << ":" << r << std::endl;
        DoSort(v, p, q, bNonDecreasing);
        DoSort(v, q+1, r, bNonDecreasing);
        // Merge takes half-open bounds, hence q+1 and r+1.
        Merge(v, p, q+1, r+1, bNonDecreasing);
    }
}
Finally, the interface method is fairly simple:
void MergeSort::Sort(std::vector<int> &v, bool bNonDecreasing)
{
    if (v.size() < 1)
    {
        return;
    }

    DoSort(v, 0, v.size() - 1, bNonDecreasing);
}