Thursday, May 5, 2016

An Updated Reading List


Mathematics, Data Structures & Algorithms

Discrete Mathematics and Its Applications (Fifth Edition), Kenneth H.Rosen

Concrete Mathematics : A Foundation for Computer Science (Second Edition), Ronald L. Graham / Donald E. Knuth / Oren Patashnik


* Introduction to Algorithms (Second Edition), Thomas H. Cormen / Charles E. Leiserson / Ronald L. Rivest / Clifford Stein

Data Structures, Algorithms, and Applications in C++, Sartaj Sahni
Data Structures & Program Design In C (Second Edition), Robert Kruse / C.L. Tondo / Bruce Leung


The Art of Computer Programming
Donald E.Knuth

Compiler

Compilers: Principles, Techniques, and Tools (2nd Edition)
[Alfred V. Aho / Ravi Sethi / Jeffrey D. Ullman]

Advanced Compiler Design and Implementation
[Steven S. Muchnick]

Operating System

Operating System Concepts (Now we have the 7th Edition)
[Abraham Silberschatz / Peter Baer Galvin / Greg Gagne]

Operating Systems : Design and Implementation (Now we have the 3rd Edition)
[Andrew S. Tanenbaum / Albert S. Woodhull]


C++

C++ Primer
Lippman / Josee Lajoie]

*The C++ Programming Language (Special Edition)
[Bjarne Stroustrup]


*Effective C++ (Now we have the 3rd Edition)
*More Effective C++
*Effective Modern C++: 42 Specific Ways to Improve Your Use of C++11 and C++14 (New)
Scott Meyers



Modern C++ Design: Generic Programming and Design Patterns Applied
Andrei Alexandrescu

Go
The Go Programming Language
Alan A. A. Donovan

ASM

The Art of Assembly Language
Randall Hyde  (x86)

Introduction to 80x86 Assembly Language and Computer Architecture
Richard C. Detmer

UNIX Programming

Advanced Programming in the UNIX Environment
[W. Richard Stevens]

The UNIX Programming Environment
[Brianw. Kernighan]

UNIX Network Programming, Volume 1 : The Sockets Networking API (Third Edition)
[W. Richard Stevens / Bill Fenner / Andrew M. Rudoff]

UNIX Network Programming Volume 2 : Interprocess Communications (Second Edition)
[W. Richard Stevens]

Computer Networks 

Computer Networks (Fourth Edition)
[Andrew S. Tanenbaum]

TCP/IP Illustrated, Volume 1 : The Protocols
TCP/IP Illustrated, Volume 2 : The Implementation
TCP/IP Illstrated, Volume 3 : TCP for Transactions, HTTP, NNTP, and the UNIX Domain Protocols
[W. Richard Stevens]


Internetworking with TCP/IP Vol I : Principles, Protocols, and Architecture (Third Edition)
Internetworking with TCP/IP Vol II : Design, Implementation, and Internals (Second Edition)
Internetworking with TCP/IP Vol III : Client-Server Programming and Applications, BSD Socket Version (Second Edition)
Internetworking with TCP/IP Vol III : Client-Server Programming and Applications, Windows Sockets Version
[Douglas E. Comer / David L. Stevens]

Database

An Introduction to Database Systems (Seventh Edition)
[C. J.Date]

Database System Concepts (Fourth Edition)
[Abraham Silberschat / Henry F.Korth / S.Sudarshan]

Software Engineering / Craftsmanship

Code Complete: A Practical Handbook of Software Construction, Second Edition, Steven C. McConnell

**Pragmatic programmer: from journeyman to masterAndrew Hunt and David Thomas
Peopleware:Productive Projects and Teams, Tom DeMarco, Timothy Lister 

**The Software Craftsman: Professionalism, Pragmatism, Pride (Robert C. Martin Series)

Clean Code: A Handbook of Agile Software Craftsmanship

**Soft Skills: The software developer's life manual

**The Lean Startup: How Today's Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses


Data Science
***Data Science from Scratch: First Principles with Python, Joel Grus



Saturday, April 11, 2015

Fedora 21 Installed!

Haven't written anything since three years and now I come back, with my beloved Fedora.

Just installed the latest Fedora 21 and now I'm enjoying typing with it.

Fedora 21 provides a Scientific spin, very cool, C/C++, Python, LaTex and a lot of other new and old friends!

Wednesday, September 5, 2012

A Real-life Story about exception specification

Today, just before I thought I'd finish my day and go home, I got a urgent problem. One of our component kept coring, making services unavailable to clients.

Investigation started right away and we found the cause after a while. The cause is quite simple but unexpected.

The reason is we have a function specified with a exception specification. For those who are not familiar with the exception specification, it's the famous throw(A) that you can put after a function declaration or definition. It's used to tell compiler what exceptions should be allow to be thrown in the function, particularly, if nothing is in the (), that means the function is not allowed to throw anything.

Well we have one such function that with a throw() however, it DO throw an exception. So what happens? The exception is caught and treated as unexpected, then terminate() is called, then terminate() calls abort(), program aborts with a core (SIGABRT 6).

It's NOT hard to understand all of this, but we had a painful time to find out that it's actually this exception specification had caused the problem, it's in the same line as the function definition, far away from when the line where actually throws something.

It's a shame that after years of debate on this topic, we still fail on that. Here are some links:
http://www.gotw.ca/publications/mill22.htm
http://www.gotw.ca/gotw/082.htm
http://www.boost.org/development/requirements.html#Exception-specification
All those gurus suggest NOT USING EXCEPTION SPECIFICATION !

After beaten senselessly, now I remember, never write a exception specification.

Also I have to put another point on that, a function with exception specification is just a trap for other programmers who is working on it, it may not notice the very existence of the throw() and throw something in the function body, or even harder to prevent, he may just call a function that throws inside.  I'm such a victim, modified a function with throw() but throw something.






Saturday, June 30, 2012

Fedora 17 Installed!

Another step towards a better OS, Fedora 17 has been released quite some time ago but only I have a chance to upgrade to it for my PC. So now I'm posting on my new Fedora 17!

It has a new kernel version and most recent development tools. The only little thing is the reboot panic but solutions are out there.


Friday, April 27, 2012

A Real-life Story about Floating Point Value Comparison

I got a problem today in our software. In general, when we have two input double values like v1= 800 and v2= 750, it's working correctly while if v1=1000 and v2 = 900, it's not working as expected.

The logic is based on the comparison result of the two values. While for both two cases above we have v1 > v2, the results are not the same, why?

After tracking down to where the comparison is made, I found out that it's essentially written like this:
if (FormatToString(v1) > FormatToString(v2)) DoSomething();

I'm pretty sure, at the first look to this line, it's not doing what whoever wrote it wanted to do. The above line is actually doing a lexical comparison of the two strings converted from two double values.

Take a look at how the operator > is overloaded for string type in the basic_string.h :

  /**
   *  @brief  Test if string follows string.
   *  @param lhs  First string.
   *  @param rhs  Second string.
   *  @return  True if @a lhs follows @a rhs.  False otherwise.
   */
  template
    inline bool
    operator>(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
          const basic_string<_CharT, _Traits, _Alloc>& __rhs)
    { return __lhs.compare(__rhs) > 0; }


Notice that in the comments, it's stated as "Test if string follows string"  and the implementation is using the basic_string::compare() method which by default will do a lexical comparison like this : it returns 0 if all the characters in the compared contents compare equal, a negative value if the first character that does not match compares to less (the character's ASCII value) in the object than in the comparing string, and a positive value in the opposite case.

So come to our example:
"800" > "750" : Ture, but
"1000" > "900" : False because '1' < '9'.

Fix it is simple, but the question is why the guy wrote this ? A simple mistake (and has been there since several years without being detected and corrected) ? I don't know exactly but tracking up in the code I found some clue that might explain.

Some lines before this buggy line, I found a similar comparison but it's only testing the equality of two double values and there, the double values are formatted into strings then compared for equality:

if (FormatToString(v1) != FormatToString(v2)) DoSomething();


This works fine and hides all the double value maths complexity behind the FormatToString method who knows exactly how many digits should be considered.

I guess later on, there comes another need to test not only for equality but also to know which value is larger then the guy took a simple copy-paste-modifiy way which results in this bug.


The the question comes back to how to compare two double values ? This is a hard question to answer, read this paper and you will know why. However, in our case, a really simple solution does exist. Guess what?

Just patch the shorter one of the two strings with leading '0's, then the lexical comparison will be an equivalent to the value comparison!

"1000" > "900"  : False, but
"1000" > "0900": True !


Friday, November 25, 2011

Introduction to Algorithms - The Reading Notes: Part 7, The Quick Sort

Now comes our famous QuickSort. It's famous because it's quick, so how quick it is? The answer from the theoretical analysis is Θ(n ln n) in average. OK, you may argue that the MergeSort and HeapSort we talked before also run in O(n ln n) , why this one is quicker? The answer is hidden in the big O notation, we usually don't write the constant factor in the function. In fact, QuickSort has a smaller hidden factor than others which distinguishes in practice QuickSort from the others. However, QuickSort can run, in the worst case, in O(n2) time. Now let's see how to write some code to do QuickSort.

As a starting point, we write our unit tests. As usual, we only need to add test cases in SortingTest.h and implement them in SortingTest.cpp.

    CPPUNIT_TEST( testQuickSort_0 );
    CPPUNIT_TEST( testQuickSort_1 );
    CPPUNIT_TEST( testQuickSort_Dec_0 );
    CPPUNIT_TEST( testQuickSort_Dec_1 );
...
    void testQuickSort_0();
    void testQuickSort_1();
    void testQuickSort_Dec_0();
    void testQuickSort_Dec_1();

The implementation of the tests which is quite straightforward, actually, same
as tests for other sorting algorithms except the method used to sort, here we 
use QuickSort::Sort(v): 



    // Setup Array.
...
    QuickSort::Sort(v);
...
    // Test Assertions.

Now if we try to compile, we will get some compiling errors, of course. We do
not have our QuickSort class created. So now it's the time to create it. 



    class QuickSort 
    {
        public:
            static void Sort(std::vector &v, bool bNonDecreasing = true);
        protected:
            static int  Partition(std::vector &v, int p, int r, bool bNonDecreasing);
            static void DoSort(std::vector &v, int p, int r, bool bNonDecreasing);
    };

The QuickSort provides exactly the same interface as other sorting algorithms which is the Sort method. As usual, we allow to sort in both non-decreasing and decreasing orders.

The DoSort method will recursively call itself to sort two sub-arrays partitioned by the method Partition.

Write the QuickSort class, compile it with scons, then run the unit tests. Failed!! Double check the code, then fix all the bugs until all unit tests pass. Below is the resulting code.

void QuickSort::Sort(std::vector &v, bool bNonDecreasing) 
{
    if (v.size() <= 1 )
    {
        return;
    }

    DoSort(v, 0, v.size()-1, bNonDecreasing);
}

int QuickSort::Partition(std::vector &v, int p, int r, bool bNonDecreasing)
{
    int pivot = v[r];
    int i = p-1; // i marks the end of the left partition.

    // j markes the end of the right partition
    for (int j = p; j <= r-1;  j++)
    {
        if ((v[j] < pivot &&  bNonDecreasing) ||
            (v[j] > pivot && !bNonDecreasing))
        {
            i++;
            int t = v[i];
            v[i]  = v[j];
            v[j]  = t;
        }
    }
    v[r] = v[i+1];
    v[i+1] = pivot;

    return i+1;
}

void QuickSort::DoSort(std::vector &v, int p, int r, bool bNonDecreasing)
{
    std::cout << "Sorting sub array: " << p << ":" << r << std::endl;
    if (p < r)
    {
        int q = Partition(v, p, r, bNonDecreasing);
        DoSort(v, p, q-1, bNonDecreasing);
        DoSort(v, q+1, r, bNonDecreasing);
    }
}


About this document ...

ITA Reading Notes This document was generated using the LaTeX2HTML translator Version 2008 (1.71)
Copyright © 1993, 1994, 1995, 1996, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
Copyright © 1997, 1998, 1999, Ross Moore, Mathematics Department, Macquarie University, Sydney.
The command line arguments were:
latex2html -html_version 4.0,math -no_navigation -scalable_fonts -split 0 -no_auto_link ITAReadingNotes-Main
The translation was initiated by Bing Han on 2011-11-25

Sunday, November 20, 2011

Introduction to Algorithms - The Reading Notes: Part 6, The Heap Sort

Heap Sort

The basic idea of Heap Sort is to sort with help of a special data structure called heap. A heap is a binary tree where each note in the tree contains a value. If the value of each node is no less than it's children, we call it a Max-Heap. If the value of each node is no more than it's children, we call it a Min-Heap.

Don't be confused by name heap, which is also used commonly as a notion of a region of memory space where program can dynamically allocate.

Heap is very interesting, on one hand, it's a binary tree, so it benefits binary tree properties such as it's height grows logarithmically with the number of nodes it has; on the other hand, it has one important order relationship between parent and child nodes.

As usual, we won't discuss in detail the theoretical aspect of the algorithm, instead, we focus on the implementation of the algorithm.

Before implementing the algorithm, let's first setup the unit test. As we already has our test infrastructure for previous sorting algorithms we discussed, we simply need to add some test cases inside.

For Heap Sort, I'm going to create a class with name HeapSort which provides a Sort method as other sort algorithms. So in the CppUnitTest/SortingTest.h, I add some test cases and their declarations:

CPPUNIT_TEST( testHeapSort_0 );
    CPPUNIT_TEST( testHeapSort_1 );
...
    void testHeapSort_0();
    void testHeapSort_1();

Then I define those tests (methods) in the CppUnitTest/SortingTest.cpp:

void SortingTest::testHeapSort_0()
{
    int iArray[] = {10, 5, 7, -2, 19, 50, 300, 320, 160, 1000};
    std::vector v(iArray, iArray + sizeof(iArray) / sizeof(int));
    HeapSort::Sort(v);
    CPPUNIT_ASSERT_EQUAL(v[0], -2);
    CPPUNIT_ASSERT_EQUAL(v[1], 5);
    CPPUNIT_ASSERT_EQUAL(v[2], 7);
    CPPUNIT_ASSERT_EQUAL(v[3], 10);
    CPPUNIT_ASSERT_EQUAL(v[4], 19);
    CPPUNIT_ASSERT_EQUAL(v[5], 50);
    CPPUNIT_ASSERT_EQUAL(v[6], 160);
    CPPUNIT_ASSERT_EQUAL(v[7], 300);
    CPPUNIT_ASSERT_EQUAL(v[8], 320);
    CPPUNIT_ASSERT_EQUAL(v[9], 1000);
}


void SortingTest::testHeapSort_1()
{
...
}

Now I can start coding the HeapSort class. First, I declare the main interface method, which is Sort for the HeapSort class, then I declare some helper methods Sort will use.

Same as described in the ITA, we will construct a binary tree directly on the input array, by using the index of an item in the array. So item 0 is always the root of the heap tree. Then, item 1 and 2 are two children of item 0, item 3 and 4 are two children of item 1, item 5 and 6 are children of item 2, and so on. In short, given a node i , we know that it's left child is 2*i + 1 and it's right child is (i + 1)*2 , if they exist, where i starts from 0. As for it's parent, it depends. If i = 0 , then it's the root node, so it does not really have a parent, otherwise, the parent node's index can be computed as (i - 1)/2 .

We declare Parent, Left and Right methods to do the computation described above.
Also, the Heap Sort is based on first building a MaxHeap ( in case we need to sort non-decreasingly, otherwise, we build a MinHeap), then each time takes the root (which is the largest item in the heap), then reconstruct the heap. So we will also declare two auxiliary methods BuildMaxHeap and MaxHeapify to do the job. Methods for building and reconstructing MinHeap are also provided. The whole HeapSort class declaration looks like follows:


class HeapSort 
{
    public:
    static void Sort(std::vector &v, bool bNonDecreasing = true);

    protected:
    static int  Parent(const int i) { return ((i > 0) ? (i - 1) >> 1: 0); }
    static int  Left(const int i)   { return (i << 1) + 1; }
    static int  Right(const int i)  { return (i + 1) << 1; }

    static void BuildMaxHeap(std::vector &v);
    static void BuildMinHeap(std::vector &v);
    static void MaxHeapify(std::vector &v, const int i, const int iHeapSize);
    static void MinHeapify(std::vector &v, const int i, const int iHeapSize);
};
Now comes the time to implement those methods. We will implement them in  a bottom-up way. So, first look at the MaxHeapify method, it takes  three parameters, a vector representing the heap (or a binary tree to be  built to a heap) and a integer i  which is an index in the vector, representing a sub-tree rooted at i . Also the heap size is provided as the third parameter. MaxHeapify assumes that two sub-trees rooted at both of i 's children are already MaxHeaps, only i  is possibly violating the MaxHeap property. So the goal is just to make a MaxHeap rooted at i . To this end, it simply recursively floats down the content of node i  until a proper place is found. "Float down" is just an exchange of i  with the maximum of i , Left(i ) and Right(i ). Of cause, in case that the maximum is i  itself, nothing needs to be done. The code reads as follows:


void HeapSort::MaxHeapify(std::vector &v, const int i, const int iHeapSize)
{
    int l = Left(i);
    int r = Right(i);

    int largest = i;

    if ( l < iHeapSize && v[l] > v[i])
    {
        largest = l;
    }

    if ( r < iHeapSize && v[r] > v[largest])
    {
        largest = r;
    }

    if (largest != i)
    {
        int temp = v[largest];
        v[largest] = v[i];
        v[i] = temp;

        MaxHeapify(v, largest, iHeapSize);
    }
}
Next, we will implement BuildMaxHeap. With the help of MaxHeapify, this  is fairly straightforward.  We simply need to first MaxHeapify the sub-trees at the lowest level  (actually the next to lowest, as the lowest are leaf nodes that does not  have any child so trivially they are MaxHeap). As a result, we can  start MaxHeapify the sub-trees from the middle of the vector.  The trick is, given a vector of size n , if we count from the end of the vector, the first item that may have a child always appears no early than the n/2 th item (think why?).  As a result, we have the following simply implementation of BuildMaxHeap:   

void HeapSort::BuildMaxHeap(std::vector &v)
{
    for (int i = (v.size() >> 1); i >= 0; i--)
    {
        MaxHeapify(v, i, v.size());
    }
}
We can implement similar methods for MinHeap too.

Now that we have everything in hand, let's implement the final method Sort. We start by building a MaxHeap (or MinHeap), then each time, we exchange the root of the heap (item 0) with the last item in the heap. This exchange does two things: put the current largest at the end of the vector and put a smaller one at the heap root, breaking heap property. Then we use MaxHeapify to reconstruct the heap.
Once the heap is reconstructed, we have again the largest item at root, then we simply repeat the exchange-reconstruct. The code looks like the following, sorting by non-decreasing or non-increasing order are supported with parameter bNonDecreasing, as other sort algorithms:


void HeapSort::Sort(std::vector &v, bool bNonDecreasing) 
{
    if (v.size() <= 1 )
    {
        return;
    }
    
    if (bNonDecreasing)
    {
        BuildMaxHeap(v);
    }
    else
    {
        BuildMinHeap(v);
    }

    int iHeapSize = v.size();
    
    for (int i = v.size() - 1; i >=1; i--)
    {
        int temp = v[0];
        v[0] = v[i];
        v[i] = temp;

        iHeapSize--;
        if (bNonDecreasing)
        {
            MaxHeapify(v, 0, iHeapSize);
        }
        else
        {
            MinHeapify(v, 0, iHeapSize);
        }
    }
}


About this document ...

ITA Reading Notes This document was generated using the LaTeX2HTML translator Version 2008 (1.71)
Copyright © 1993, 1994, 1995, 1996, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
Copyright © 1997, 1998, 1999, Ross Moore, Mathematics Department, Macquarie University, Sydney.
The command line arguments were:
latex2html -html_version 4.0,math -no_navigation -scalable_fonts -split 0 -no_auto_link ITAReadingNotes-Main
The translation was initiated by Bing Han on 2011-11-20