Tuesday, 24 July 2012

RVCT vs GCC - performance comparison on mergesort

I have heard about the RVCT compiler produced by ARM many times. Some people say that it produces code 30% faster than the GCC ARM compiler. Our real-time mobile application is compiled with GCC, so I decided to give RVCT a try.

I downloaded the 30-day evaluation of the RVDS Toolchain 4.1, which includes the RVCT compiler. The GCC I use is a somewhat dated version, 4.4.1 (July 22, 2009).

The project we are working on is quite big and makes heavy use of templates, so I would need to spend some time making it compile with RVCT ("And then you discover that there are no two compilers that implement templates the same way"). For now I have decided to write a small test app to compare the two compilers at first glance.

I am going to test the speed of code that implements two versions of the mergesort algorithm. The first is an iterative mergesort from here (replacing line 34 with: l_max = (r - 1 < size) ? r - 1 : size - 1;), and the second is a recursive one from there. I test both versions on the same input file of 100000 unsorted integers, which I generated with heavy use of the rand() function.
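The replaced line clamps the right bound of a run so that the last merge of a pass never reads past the end of the array. Here is a minimal sketch of a bottom-up (iterative) mergesort with the same kind of clamping. It is not the linked implementation: the names mergeRuns and mergeSortIterative are hypothetical, and it uses half-open ranges rather than the inclusive l_max bound of the original.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Merge the sorted runs src[lo, mid) and src[mid, hi) into dst[lo, hi).
static void mergeRuns(const int* src, int* dst,
                      std::size_t lo, std::size_t mid, std::size_t hi)
{
    std::size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        dst[k++] = (src[j] < src[i]) ? src[j++] : src[i++];
    while (i < mid) dst[k++] = src[i++];   // leftover of the left run
    while (j < hi)  dst[k++] = src[j++];   // leftover of the right run
}

// Bottom-up mergesort (hypothetical name). 'data' holds the input and
// receives the sorted result; 'scratch' is a buffer of the same size.
void mergeSortIterative(int* data, int* scratch, std::size_t size)
{
    assert(size == 0 || (data != nullptr && scratch != nullptr));
    int* src = data;
    int* dst = scratch;
    for (std::size_t width = 1; width < size; width *= 2) {
        for (std::size_t lo = 0; lo < size; lo += 2 * width) {
            // Clamp both bounds to 'size': the last run of a pass may be
            // short or missing when size is not a multiple of 2 * width.
            std::size_t mid = std::min(lo + width, size);
            std::size_t hi  = std::min(lo + 2 * width, size);
            mergeRuns(src, dst, lo, mid, hi);
        }
        std::swap(src, dst);               // output of this pass feeds the next
    }
    if (src != data)
        std::copy(src, src + size, data);  // make sure the result ends up in 'data'
}
```

Without the clamp, any size that is not a power of two would make the last merge of a pass run past the end of the buffer.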

The timing code looks like this, so I measure only the time it takes mergeSort to complete the sort:
DWORD t1 = GetTickCount();
mergeSort(arrayOne, arrayTwo, 100000);
DWORD t2 = GetTickCount() - t1;
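GetTickCount() is limited to the resolution of the system timer, typically 10 to 16 ms on Windows, which is coarse next to the 9-12 ms desktop timings. Here is a minimal sketch of the same measurement using std::chrono::steady_clock instead. It assumes a C++11 toolchain, which the compilers in this post may not fully provide, and a sort function with the same (array, scratch, size) shape as mergeSort above. The name timeSortMs is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Time one call of 'sortFn' on a fresh copy of 'input' and return the
// elapsed time in milliseconds. Copying and checking the result are
// kept outside the timed region, so only the sort itself is measured.
template <typename SortFn>
double timeSortMs(const std::vector<int>& input, SortFn sortFn)
{
    std::vector<int> work(input);            // the sort must not see pre-sorted data
    std::vector<int> scratch(input.size());  // second buffer for the merge

    auto start = std::chrono::steady_clock::now();
    sortFn(work.data(), scratch.data(), work.size());
    auto stop = std::chrono::steady_clock::now();

    assert(std::is_sorted(work.begin(), work.end()));
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling timeSortMs(input, mergeSort) five times in a row would give the same kind of five-run series as reported below, with each run starting from the same unsorted input.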

On my desktop machine (Core i5-2400) the code compiled with the Microsoft CL compiler took about 9-12 ms with /O2 optimization and 17-20 ms with optimization turned off. But that is only for comparison; let's move on to GCC ARM and RVCT. I set both GCC and RVCT to the -O3 optimization level. Here are the timings for 5 consecutive runs on an LG Optimus One P500 mobile phone with a 600 MHz ARM11 processor.

Iterative mergesort, milliseconds
RVCT - 133 135 133 135 140
GCC - 149 149 143 146 145

Recursive mergesort, milliseconds
RVCT - 92 89 88 88 91
GCC - 93 98 92 101 92
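To put a number on the gap, the five runs can be averaged and compared. The sketch below uses the timings from the tables above; the helper names meanMs and percentLessTime are hypothetical. On these figures RVCT takes roughly 8% less time than GCC on the iterative version and roughly 6% less on the recursive one.

```cpp
#include <cassert>
#include <cstddef>

// Timings in milliseconds, copied from the tables above.
const int kRvctIterative[] = {133, 135, 133, 135, 140};
const int kGccIterative[]  = {149, 149, 143, 146, 145};
const int kRvctRecursive[] = {92, 89, 88, 88, 91};
const int kGccRecursive[]  = {93, 98, 92, 101, 92};

// Arithmetic mean of 'count' timings.
double meanMs(const int* runs, std::size_t count)
{
    assert(count > 0);
    double sum = 0.0;
    for (std::size_t i = 0; i < count; ++i)
        sum += runs[i];
    return sum / static_cast<double>(count);
}

// By how many percent 'fasterMs' is shorter than 'slowerMs'.
double percentLessTime(double fasterMs, double slowerMs)
{
    assert(slowerMs > 0.0);
    return (1.0 - fasterMs / slowerMs) * 100.0;
}
```

Five runs is a small sample, so these percentages are only a rough indication.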

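The results speak for themselves. RVCT is ahead of GCC by only about 6-8% on average, far from the rumoured 30%, while switching from the iterative to the recursive implementation cuts the time by about a third with either compiler. Clearly, in this case the implementation details of the same algorithm far outweigh the choice of compiler. Anyway, I still have 20-something days of the RVCT evaluation left, and I am going to try it on a larger project. Upgrading GCC is also an option.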

1 comment :

  1. How about code size? They say RVCT excels at code size too.