Thursday, August 16, 2012

Calculate the sum of all the primes below two million


The sum of the primes below 10 is 2 + 3 + 5 + 7 = 17.
Find the sum of all the primes below two million.
Question is from Project Euler Q10 , which is a very interesting website.

My Solution is as below, which has performance:
time a.out
sum of Prime = 142913828922
0.090u 0.005s 0:00.09 100.0%    0+0k 0+0io 0pf+0w 
        int l_iCount = 2;
        const int l_iMax = 2000000;
        const int IS_PRIME = 0;
        const int NOT_PRIME = 1;
        int l_arIntegers[l_iMax];
        //initialize and assume all to be prime
        memset(l_arIntegers, 0, sizeof(l_arIntegers));
        l_arIntegers[0] = NOT_PRIME;    //for 1
        l_arIntegers[1] = IS_PRIME;     //for 2
        long long l_iSumOfPrimes = 0;
        while(l_iCount < l_iMax)
        {
                if(l_arIntegers[l_iCount - 1] == IS_PRIME)
                {
                        for(int i = 2 * l_iCount; i < l_iMax; )
                        {
                                if(l_arIntegers[i - 1] != NOT_PRIME)
                                        l_arIntegers[i - 1] = NOT_PRIME;
                                i += l_iCount;
                        }
                        //it is a prime
                        l_iSumOfPrimes += l_iCount;
                        std::cout << "found a Prime = " << l_iCount << std::endl;
                }
                l_iCount++;
        }
        std::cout << "sum of Prime = " << l_iSumOfPrimes << std::endl;
When I started to code the solution, I tried the normal check for a prime number, it was too slow to be acceptable. then I thought maybe it is faster to create an array of all integers to be checked as above, just having simple comparison and assignment to 1, without tons of mod (%) operations. And the execution result proved to be working.
 
I have checked the recommended solutions from Project Euler after solving it, my solution is very close to the optimal answer.
 
Here is my Project Euler Label:
 

Wednesday, July 25, 2012

Find the greatest product of five consecutive digits

the problem is as below, got from ProjectEuler.net Problem 8:
Find the greatest product of five consecutive digits in the 1000-digit number.
7316717653133062491922511967442657474235534919493496983520312774506326239578318016984801869478851843858615607891129494954595017379583319528532088055111254069874715852386305071569329096329522744304355766896648950445244523161731856403098711121722383113622298934233803081353362766142828064444866452387493035890729629049156044077239071381051585930796086670172427121883998797908792274921901699720888093776657273330010533678812202354218097512545405947522435258490771167055601360483958644670632441572215539753697817977846174064955149290862569321978468622482839722413756570560574902614079729686524145351004748216637048440319989000889524345065854122758866688116427171479924442928230863465674813919123162824586178664583591245665294765456828489128831426076900422421902267105562632111110937054421750694165896040807198403850962455444362981230987879927244284909188845801561660979191338754992005240636899125607176060588611646710940507754100225698315520005593572972571636269561882670428252483600823257530420752963450
To do brute force calculation, logic is easy but kind of having bad performance. To practice and brainstorming some good algorithms to boost the checking speed can be ultimate fun in solving problems we faced.

My strategy is to check whether we can safely eliminate any 5 digits with 0 inside (which is obvious), then check for 1, 2, 3, till 9. comparing integer values must be much faster than multiply operations, not mention in thousands.

Below is my code, not polished yet, just to illustrate my ideas. My intention is not to show off my code but to ignite better ideas and faster solutions. 

int main()
{
        char l_arIntegers[1001] = {     "73167176531330624919225119674426574742355349194934\
96983520312774506326239578318016984801869478851843\
85861560789112949495459501737958331952853208805511\
12540698747158523863050715693290963295227443043557\
66896648950445244523161731856403098711121722383113\
62229893423380308135336276614282806444486645238749\
30358907296290491560440772390713810515859307960866\
70172427121883998797908792274921901699720888093776\
65727333001053367881220235421809751254540594752243\
52584907711670556013604839586446706324415722155397\
53697817977846174064955149290862569321978468622482\
83972241375657056057490261407972968652414535100474\
82166370484403199890008895243450658541227588666881\
16427171479924442928230863465674813919123162824586\
17866458359124566529476545682848912883142607690042\
24219022671055626321111109370544217506941658960408\
07198403850962455444362981230987879927244284909188\
84580156166097919133875499200524063689912560717606\
05886116467109405077541002256983155200055935729725\
71636269561882670428252483600823257530420752963450"};

        char* pIntegers = (char *)l_arIntegers;
        int l_iarMaxWithIdx[10] = {0, 9*9*9*9*1, 9*9*9*9*2, 9*9*9*9*3, 9*9*9*9*4, 9*9*9*9*5, 9*9*9*9*6, 9*9*9*9*7, 9*9*9*9*8, 9*9*9*9*9};


        int l_iMaxProduct = 1;
        int l_iEliminate = -1;

        for(int m = 0; m < 10; m++)
        {
                std::cout << "now check for m = " << m << ", Eliminate = " << l_iEliminate << " and max product " << l_iarMaxWithIdx[m] << " and current max product as " << l_iMaxProduct << std::endl;
                //check whether we can eliminate 1s
                if(l_iMaxProduct > l_iarMaxWithIdx[m])
                {
                        //we can safely eliminate ms already.
                        l_iEliminate += 1;
                        continue;
                }
                bool l_bReachEnd = false;
                for(int i = 4; i < 20 * 51;)
                {
                        if(pIntegers[i] - '0' <= l_iEliminate)
                        {
                                i += 5;
                                if(i >= 1001)
                                        l_bReachEnd = true;
                        }
                        else if(pIntegers[i - 1] - '0' <= l_iEliminate)
                        {
                                i += 4;
                                if(i >= 1001)
                                        l_bReachEnd = true;

                        }
                        else if(pIntegers[i - 2] - '0' <= l_iEliminate)
                        {
                                i += 3;
                                if(i >= 1001)
                                        l_bReachEnd = true;
                        }
                        else if(pIntegers[i - 3] - '0' <= l_iEliminate)
                        {
                                i += 2;
                                if(i >= 1001)
                                        l_bReachEnd = true;
                        }
                        else if(pIntegers[i - 4] - '0' <= l_iEliminate)
                        {
                                i += 1;
                                if(i >= 1001)
                                        l_bReachEnd = true;
                        }
                        else
                        {
                                if(i >= 1000)
                                        l_bReachEnd = true;

                                //none of previous 5 is 0
                                //check for any sum is greater than 9*9*9*9*1
                                int l_iProduct = (pIntegers[i] - '0') * (pIntegers[i-1] - '0') * (pIntegers[i-2] - '0') * (pIntegers[i-3] - '0') * (pIntegers[i-4] - '0');
                                std::cout << "now check for m = " << m << " and current product = " << l_iProduct << " with current max product as " << l_iMaxProduct << std::endl;
                                if(l_iProduct > l_iMaxProduct)
                                        l_iMaxProduct = l_iProduct;

                                if(l_iProduct > l_iarMaxWithIdx[m])
                                {
                                        //we can safely eliminate 1s now
                                        std::cout << "current continuous values are " << pIntegers[i-4] << pIntegers[i-3] << pIntegers[i-2] << pIntegers[i-1] << pIntegers[i] << std::endl;
                                        l_iEliminate += 1;
                                        i = 20 * 51;
                                }

                                i += 1;
                        }
                }
                if(l_bReachEnd)
                {
                        //no need to search again, we find the maximum
                        std::cout << "reach end before checking for " << (m+1) << ", max product  = " << l_iMaxProduct << std::endl;
                        break;
                }
        }

        std::cout << "max product  = " << l_iMaxProduct << std::endl;

        return 0
}



$ time a.out
now check for m = 0, Eliminate = -1 and max product 0 and current max product as 1
now check for m = 1, Eliminate = 0 and max product 6561 and current max product as 1
now check for m = 1 and current product = 882 with current max product as 1
now check for m = 1 and current product = 126 with current max product as 882
now check for m = 1 and current product = 294 with current max product as 882
now check for m = 1 and current product = 1764 with current max product as 882
now check for m = 1 and current product = 1470 with current max product as 1764
now check for m = 1 and current product = 630 with current max product as 1764
now check for m = 1 and current product = 630 with current max product as 1764
now check for m = 1 and current product = 270 with current max product as 1764
now check for m = 1 and current product = 135 with current max product as 1764
now check for m = 1 and current product = 432 with current max product as 1764
now check for m = 1 and current product = 648 with current max product as 1764
now check for m = 1 and current product = 648 with current max product as 1764
now check for m = 1 and current product = 324 with current max product as 1764
now check for m = 1 and current product = 180 with current max product as 1764
now check for m = 1 and current product = 180 with current max product as 1764
now check for m = 1 and current product = 20 with current max product as 1764
now check for m = 1 and current product = 90 with current max product as 1764
now check for m = 1 and current product = 270 with current max product as 1764
now check for m = 1 and current product = 378 with current max product as 1764
now check for m = 1 and current product = 1512 with current max product as 1764
now check for m = 1 and current product = 6048 with current max product as 1764
now check for m = 1 and current product = 1344 with current max product as 6048
now check for m = 1 and current product = 1344 with current max product as 6048
now check for m = 1 and current product = 960 with current max product as 6048
now check for m = 1 and current product = 1680 with current max product as 6048
now check for m = 1 and current product = 1680 with current max product as 6048
now check for m = 1 and current product = 5880 with current max product as 6048
now check for m = 1 and current product = 3920 with current max product as 6048
now check for m = 1 and current product = 1568 with current max product as 6048
now check for m = 1 and current product = 672 with current max product as 6048
now check for m = 1 and current product = 840 with current max product as 6048
now check for m = 1 and current product = 600 with current max product as 6048
now check for m = 1 and current product = 450 with current max product as 6048
now check for m = 1 and current product = 900 with current max product as 6048
now check for m = 1 and current product = 2700 with current max product as 6048
now check for m = 1 and current product = 540 with current max product as 6048
now check for m = 1 and current product = 972 with current max product as 6048
now check for m = 1 and current product = 1296 with current max product as 6048
now check for m = 1 and current product = 2916 with current max product as 6048
now check for m = 1 and current product = 972 with current max product as 6048
now check for m = 1 and current product = 3888 with current max product as 6048
now check for m = 1 and current product = 3888 with current max product as 6048
now check for m = 1 and current product = 5832 with current max product as 6048
now check for m = 1 and current product = 5832 with current max product as 6048
now check for m = 1 and current product = 15552 with current max product as 6048
current continuous values are 49698
now check for m = 2, Eliminate = 1 and max product 13122 and current max product as 15552
now check for m = 3, Eliminate = 2 and max product 19683 and current max product as 15552
now check for m = 3 and current product = 6048 with current max product as 15552
now check for m = 3 and current product = 5880 with current max product as 15552
now check for m = 3 and current product = 3920 with current max product as 15552
now check for m = 3 and current product = 900 with current max product as 15552
now check for m = 3 and current product = 2700 with current max product as 15552
now check for m = 3 and current product = 3888 with current max product as 15552
now check for m = 3 and current product = 3888 with current max product as 15552
now check for m = 3 and current product = 5832 with current max product as 15552
now check for m = 3 and current product = 5832 with current max product as 15552
now check for m = 3 and current product = 15552 with current max product as 15552
now check for m = 3 and current product = 11664 with current max product as 15552
now check for m = 3 and current product = 6480 with current max product as 15552
now check for m = 3 and current product = 7560 with current max product as 15552
now check for m = 3 and current product = 7560 with current max product as 15552
now check for m = 3 and current product = 13824 with current max product as 15552
now check for m = 3 and current product = 12096 with current max product as 15552
now check for m = 3 and current product = 12096 with current max product as 15552
now check for m = 3 and current product = 16128 with current max product as 15552
now check for m = 3 and current product = 8960 with current max product as 16128
now check for m = 3 and current product = 3840 with current max product as 16128
now check for m = 3 and current product = 3840 with current max product as 16128
now check for m = 3 and current product = 5760 with current max product as 16128
now check for m = 3 and current product = 11664 with current max product as 16128
now check for m = 3 and current product = 6480 with current max product as 16128
now check for m = 3 and current product = 6480 with current max product as 16128
now check for m = 3 and current product = 3600 with current max product as 16128
now check for m = 3 and current product = 8100 with current max product as 16128
now check for m = 3 and current product = 4500 with current max product as 16128
now check for m = 3 and current product = 6615 with current max product as 16128
now check for m = 3 and current product = 7560 with current max product as 16128
now check for m = 3 and current product = 7560 with current max product as 16128
now check for m = 3 and current product = 3240 with current max product as 16128
now check for m = 3 and current product = 12096 with current max product as 16128
now check for m = 3 and current product = 14112 with current max product as 16128
now check for m = 3 and current product = 2100 with current max product as 16128
now check for m = 3 and current product = 3150 with current max product as 16128
now check for m = 3 and current product = 6300 with current max product as 16128
now check for m = 3 and current product = 10080 with current max product as 16128
now check for m = 3 and current product = 18144 with current max product as 16128
now check for m = 3 and current product = 15552 with current max product as 18144
now check for m = 3 and current product = 15552 with current max product as 18144
now check for m = 3 and current product = 10368 with current max product as 18144
now check for m = 3 and current product = 10368 with current max product as 18144
now check for m = 3 and current product = 10368 with current max product as 18144
now check for m = 3 and current product = 8640 with current max product as 18144
now check for m = 3 and current product = 7776 with current max product as 18144
now check for m = 3 and current product = 810 with current max product as 18144
now check for m = 3 and current product = 1536 with current max product as 18144
now check for m = 3 and current product = 2048 with current max product as 18144
now check for m = 3 and current product = 3072 with current max product as 18144
now check for m = 3 and current product = 4608 with current max product as 18144
now check for m = 3 and current product = 4608 with current max product as 18144
now check for m = 3 and current product = 5760 with current max product as 18144
now check for m = 3 and current product = 6048 with current max product as 18144
now check for m = 3 and current product = 6048 with current max product as 18144
now check for m = 3 and current product = 5400 with current max product as 18144
now check for m = 3 and current product = 15552 with current max product as 18144
now check for m = 3 and current product = 15552 with current max product as 18144
now check for m = 3 and current product = 13608 with current max product as 18144
now check for m = 3 and current product = 40824 with current max product as 18144
current continuous values are 99879
now check for m = 4, Eliminate = 3 and max product 26244 and current max product as 40824
now check for m = 5, Eliminate = 4 and max product 32805 and current max product as 40824
now check for m = 6, Eliminate = 5 and max product 39366 and current max product as 40824
now check for m = 7, Eliminate = 6 and max product 45927 and current max product as 40824
now check for m = 7 and current product = 40824 with current max product as 40824
now check for m = 7 and current product = 31752 with current max product as 40824
now check for m = 7 and current product = 31752 with current max product as 40824
now check for m = 7 and current product = 24696 with current max product as 40824
now check for m = 7 and current product = 28224 with current max product as 40824
now check for m = 7 and current product = 28224 with current max product as 40824
now check for m = 7 and current product = 31752 with current max product as 40824
reach end before checking for 8, max product  = 40824
max product  = 40824
0.000u 0.001s 0:00.00 0.0%      0+0k 0+0io 0pf+0w

Monday, June 25, 2012

The future of institutional trading - slower trading through cloud based services?

"A venue that questions the need for speed is a new ATV called PDQ run by ex GETCO CEO Keith Ross. PDQ say they have developed new ways to allow the less fleet-footed to compete on a level playing field. They are aiming to persuade traders and institutions that their new trading systems offers a leg up by actually decelerating the execution of trades."




Cloud for Algo Trading?

It is kind of interesting topic for brainstorming. We are aiming to achieve faster speed with hardware acceleration. however some are already thinking to achieve better performance by slowing down the executions. However will it really make sense? Both strategies have same logic, a slower one can get better executions? 

Wednesday, June 20, 2012

Open Source Project Planner - OpenProj


Nowadays open source OpenProj becomes more and more popular for project management. How to use it efficiently and utilize its features to the maximum to help a better plan which is still adapt to changes along the path. (http://en.wikipedia.org/wiki/OpenProj)

OpenProj


here is a very good starting point to go through an online training material like http://www.scribd.com/doc/37439251/110/Understanding-the-Critical-Path  

Wednesday, May 16, 2012

基于业务设计技术架构(兼谈12306性能问题)by 陈勇

敏捷开发产品管理系列之八:基于业务设计技术架构(兼谈12306性能问题), 陈勇从业务员的角度分析怎么解决春节火车订票系统12306崩溃的问题。很有启发性。比方他提到的排队设计,确实是解决问题的好办法之一。

一个项目不是只需要技术人员,从另外一个角度,往往可以起到事半功倍的效果。以下是他的文章的链接。
 http://blog.csdn.net/cheny_com/article/details/7569724

Monday, May 7, 2012

Why Scalar Optimization Is Important by Richard Friedman

In this interesting article, Richard mentioned a very simple yet important Amdahl's Law. (http://en.wikipedia.org/wiki/Amdahl's_law#Parallelization)

Below is the formula.

S = T/((1-f)T+fT/N) = 1/(1-f+f/N)

He gave an example that If we can get half the executable code to run in parallel, and the other half running in one processor, what speedup can we expect on a 64–processor system?

Well, that gives a speedup S = 1.97 only!

The basic point here is that we should take the scalar latency optimization seriously when tuning the whole system. However we should not be misled by author to under estimate the importance of parallel as well for it is also important factor for another axis of the system performance measurement - throughput. Without high throughput support, low latency can be screwed while system load is heavy.

Original post of Richard is at http://blogs.sun.com/rchrd/entry/why_scalar_optimization_is_important, however it cannot be accessed or found in WWW any more. It is now collected in book "The Developer's Edge", which is an interesting book for reading.


The Developer's Edge

It is very generous that authors have published this book freely online at http://www.oracle.com/technetwork/server-storage/archive/r11-005-sys-edge-archive-495362.pdf 

One of the author Darryl's blog address is  http://www.darrylgove.com/ . You can always find the correct download link there.

Sunday, April 29, 2012

The Cost of Mutexes By Darryl Gove

sometimes we wonder what will be the cost of a mutex lock and unlock while we are concerned about the system performance at micro-seconds level.

Darryl Gove has done some interesting experiments in sun Solaris systems. The same idea can be applied to other OS like Linux as well.

His blog is used to be http://blogs.sun.com/d/entry/the_cost_of_mutexes and now it is moved to https://blogs.oracle.com/d/entry/the_cost_of_mutexes

His conclusion is that Mutex is 3 times slower than atomic operations.

And there is another point to note that he tested the cost of a normal function call comparing to inline functions, which shows 10ns required for a normal function call. This cost must be less and less with new generations of CPU, RAM and compilers.

Here I just quote what is really interested to me, you can check his full post by above links.

Excl.     Incl.      Name  
User CPU  User CPU         
 sec.      sec.       
3.973     3.973      
1.341     3.973      count
1.331     1.331      mutex_unlock
0.781     0.781      mutex_lock_impl
0.490     0.490      atomic_add_32
0.030     0.030      mutex_lock

It shows clearly that using of mutex, time is spent on an atomic_add_32, mutex_lock_impl and mutex_unlock. As mutex_lock_impl and mutex_unlock assembly code is very similar to atomic_add_32, it is no surprise that mutex will take around 3x time than atomic_add.

Below are the assembly codes:

time spent for the atomic_add_32 call: 
   0.010     0.010              [?]    2ecb8:  ld          [%o0], %o2
   0.040     0.040              [?]    2ecbc:  add         %o2, %o1, %o3
   0.010     0.010              [?]    2ecc0:  cas         [%o0] , %o2, %o3
## 0.370     0.370              [?]    2ecc4:  cmp         %o2, %o3
   0.        0.                 [?]    2ecc8:  bne,a,pn    %icc,0x2ecbc
   0.        0.                 [?]    2eccc:  mov         %o3, %o2
   0.050     0.050              [?]    2ecd0:  retl        
   0.010     0.010              [?]    2ecd4:  add         %o2, %o1, %o0

time spent for mutex_unlock and mutext_lock_impl is similar:
   0.        0.                 [?]    beff8:  mov         %o1, %o3
   0.020     0.020              [?]    beffc:  cas         [%o0] , %o2, %o3
## 0.560     0.560              [?]    bf000:  cmp         %o2, %o3
   0.        0.                 [?]    bf004:  bne,a       0xbeff8
   0.        0.                 [?]    bf008:  mov         %o3, %o2

Tuesday, March 27, 2012

Error - boost::unordered_map is not derived from type A

When we use STL or boost template classes like boost::unordered_map together with our own template class, as below:

#include "boost/unordered_map.hpp"#include <string> 
template < typename T >
class A
{
typedef boost::unordered_map< std:;string, T* > FtObjsTypeMap;

When we compile above class definition with gcc, following compilation error will be reported: 

A.h:7: error: type âboost::unordered_map<std::basic_string<char, std::char_traits<char>, std::allocator<char> >, T*, boost::hash<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<const std::basic_string<char, std::char_traits<char>, std::allocator<char> >, T*> > >â is not derived from type âA<T>â

The reason is that This because boost::unordered_map< std:;string, T* > is dependent on the template parameter T of the class template A To enable correct parsing of the template without having to make any assumptions about possible specializations of any other templates, the language rules require you to indicate which dependent names denote types by using the typename keyword.

Following class A definition will pass the compilation: 
#include "boost/unordered_map.hpp"#include <string> 
template < typename T >
class A
{
typedef typename boost::unordered_map< std:;string, T* > FtObjsTypeMap;

The reason of the error is found at http://stackoverflow.com/questions/2841757/c-template-is-not-derived-from-type for my own recording and its mix usage with typedef, I posted my version of code here for future reference.