HomePage

Welcome! This is where I write stuff...

2013-04-29 CPlusPlus11

List of all articles about C++0x or C++11.

2012-12-24 Rim

A new rhyme after a longer break from rhyming. I have had some trouble finding the time to sit down and think. Anyhow, here is JulRimSD.

Many have asked when the continuation is coming, and we will simply have to wait and see!

2012-10-10 CPlusPlus11

Template-ized typedefs and template Aliases

With C++11 we can define a templated typedef using the keyword using. Suppose you have a type with a template type parameter T and an integer template parameter size. A typedef can't be a template, but with the new using syntax the type can be defined like this:

template<typename T, int size>
using TheType = array<T, size>;

This works like a typedef, except that the resulting name, TheType, is itself a template.

It does not end there. We can also partially bind a template using the same syntax. So if we would like to define a new type based on TheType but with the size bound to 10, we would write:

template <typename T>
using SizeTen = TheType<T, 10>;
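
To check that the two aliases really name the type I expect, here is a small sketch. Note the assumptions: I use std::size_t instead of int for the size parameter so that it matches std::array exactly, and size_ten_length is just a made-up helper for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>

// Alias template: TheType<T, size> is another name for std::array<T, size>.
// Assumption: std::size_t is used for size to match std::array.
template <typename T, std::size_t size>
using TheType = std::array<T, size>;

// Partial binding: the size is fixed to 10, T is still open.
template <typename T>
using SizeTen = TheType<T, 10>;

// An alias does not create a new type, it only names an existing one.
static_assert(std::is_same<SizeTen<int>, std::array<int, 10>>::value,
              "SizeTen<int> names std::array<int, 10>");

// Hypothetical helper: returns the number of elements in a SizeTen<double>.
std::size_t size_ten_length()
{
   SizeTen<double> values{};
   assert(values.size() == 10);
   return values.size();
}
```

Calling size_ten_length() from main should give 10, since the alias is just std::array<double, 10> under another name.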

2012-08-24 XEmacs

Setting font in XEmacs under Windows

This is an example of how to set the fonts in XEmacs under Windows.

(when (memq (device-type) '(mswindows))
   (set-face-font 'default "Courier New:Regular:8::")
   (set-face-font 'buffers-tab "Courier New:Bold:8::")
   (set-face-font 'bold "Courier New:Bold:8::")
   (set-face-font 'italic "Courier New:Italic:8::"))

2012-08-21 XEmacs

New build slave added

For a few days now we have had a new build slave for the XEmacs buildbot. It is an iMac: "Mac OSX 10.6.8, iMac 3.06 GHz Intel Core i3". Thanks to Raymond Toy for providing this resource.

2012-07-13 XEmacs

Problems with openssl missing under Cygwin

This is not really an XEmacs thing, but it hit me when I tried to get Gnus working with XEmacs under Cygwin. It seems you need to install the openssl development package under Cygwin to get the openssl command line tool. That took me some time to figure out.

Connecting to an IMAP server using the same setup as under Linux produced no error message about the openssl tool missing. I just got an unhelpful "connection denied" from Gnus. That error message sent me chasing a bunch of other possible causes before I understood that the openssl tool was simply missing.

2012-06-15 CPlusPlus11

Raw String Literals

A raw string is a character sequence in which no character has a special meaning. What you see is what you get. This makes it much easier to define a string containing a sequence that would be interpreted as a special character in a normal string. Sequences like \n, \t, \b etc. need to be escaped in a normal string if they are not to be interpreted as special characters.

The syntax in its most straightforward form is this:

R"(<char-sequence>)"

R stands for raw, and "( and )" are the starting and ending sequences. The characters between them are taken exactly as they are. Here are some examples of raw strings:

R"(hello "world")"               -> hello "world"
R"(no newline \n or tab \t)"     -> no newline \n or tab \t

R"(
Multiline "string"
          is just as it is typed
)"                               -> Multiline "string"
                                              is just as it is typed

As seen in the last example, you can easily define large multi-line text this way.
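
One detail worth checking in the multi-line case: the line break directly after "( and the one directly before )" are part of the string too. Here is a small sketch of that. The function names are my own and only there for illustration.

```cpp
#include <cassert>
#include <string>

// The multi-line example from above as a function.
std::string multiline()
{
   return R"(
Multiline "string"
          is just as it is typed
)";
}

// Hypothetical helper: true if the string begins and ends with a newline.
bool starts_and_ends_with_newline()
{
   const std::string s = multiline();
   assert(!s.empty());
   return s.front() == '\n' && s.back() == '\n';
}
```

If the extra newlines are unwanted, start the text directly after "( and put )" directly after the last character.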

Another situation where raw strings pay off is when there are lots of characters that normally need to be escaped. This is the situation with regular expressions, where the backslash character, \, is used to either give or remove special meaning to characters. So in regular expressions defined by normal strings the backslash character itself needs to be escaped. We don't need all this extra escaping when using raw strings.
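
Even without a regexp library the difference in escaping can be shown with plain strings. The sketch below writes the same pattern text twice, once as a normal string and once as a raw string. The pattern itself (digits, a dot, digits) and the function names are just something I made up for the illustration.

```cpp
#include <cassert>
#include <string>

// Normal string: every backslash in the pattern must be doubled.
std::string escaped_pattern() { return "\\d+\\.\\d+"; }

// Raw string: the pattern is written exactly as the regexp engine sees it.
std::string raw_pattern() { return R"(\d+\.\d+)"; }

// Both spellings produce the same eight characters.
void check_patterns_match()
{
   assert(escaped_pattern() == raw_pattern());
}
```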

Regexp support isn't in gcc yet so I'll come back with examples later.

To complete the definition of raw strings: what if you need the terminating sequence )" in your raw string? The general definition of a raw string, which allows you to specify a delimiter sequence, solves this:

R"<delimiter>(<char-sequence>)<delimiter>"

Here delimiter is a sequence of up to 16 characters with no whitespace, parentheses or backslashes. So, using # as delimiter, you can write:

R"#(')"')#" -> ')"'
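
The same example as a small sketch that can be compiled, compared against the equivalent normal string where the double quote has to be escaped. The function name is hypothetical.

```cpp
#include <cassert>
#include <string>

// Raw string with # as delimiter, so )" may appear inside the text.
std::string with_delimiter() { return R"#(')"')#"; }

// The same four characters written as a normal string.
void check_delimited_string()
{
   assert(with_delimiter() == "')\"'");
}
```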

2012-05-29 Power failure

Power failure during the afternoon. Now power is back and we are online again.

Below is the report from Fortum.

Outages resolved in the last 24 hours
2012-05-29 15:23
Power outage in Skärholmen.
End time: 2012-05-29 18:18.
At most 349 customer installations affected.

2012-05-25 Computers

Automatic vectorization

Just the other day I saw a video that presented a new feature in Visual Studio 2011 called auto vectorization. This made me curious about the feature and whether it was available in the tools I have access to, e.g. g++.

The short answer is yes: with optimization level 3, compilation flag -O3, you get this automatically. The situation seems to be the same for Visual Studio 2011. So that is fine and the end of the story if you like. If you want some more details, please read on.

How does this optimization work?

The idea is to use the CPU registers introduced with technologies such as MMX or SSE. These vector registers can hold multiple scalar values, and one instruction operates on all of them at once. A loop going over an array is exactly the situation where this can be used.

int a[SIZE], b[SIZE], c[SIZE];
...
for (int i=0; i<SIZE; i++)
{
   a[i] = b[i] + c[i];
}

In the code fragment above the arithmetic in the loop can be vectorized meaning that the operations for more than one index can be performed in parallel using the vector registers. It is up to the compiler to analyse the code to figure out that the optimization can be applied.
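
To make the idea a bit more concrete, here is a hand-written sketch of roughly what the vectorized loop does, using the SSE2 intrinsics from emmintrin.h. This is only my illustration, not what the compiler actually generates. It assumes a 32-bit int, and add_arrays is a made-up name. On machines without SSE2 the sketch falls back to the plain loop.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__SSE2__)
#include <emmintrin.h>   // SSE2 intrinsics (x86 only)
#endif

// Hypothetical helper: a[i] = b[i] + c[i] for every i in [0, n).
void add_arrays(int* a, const int* b, const int* c, std::size_t n)
{
   assert(a != nullptr && b != nullptr && c != nullptr);
   std::size_t i = 0;
#if defined(__SSE2__)
   // Four 32-bit ints fit in one 128-bit register, so step by four.
   for (; i + 4 <= n; i += 4)
   {
      __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b + i));
      __m128i vc = _mm_loadu_si128(reinterpret_cast<const __m128i*>(c + i));
      // One instruction adds all four pairs.
      _mm_storeu_si128(reinterpret_cast<__m128i*>(a + i),
                       _mm_add_epi32(vb, vc));
   }
#endif
   // Scalar loop: handles the tail, or everything when SSE2 is missing.
   for (; i < n; i++)
   {
      a[i] = b[i] + c[i];
   }
}
```

The point of auto vectorization is that you never have to write this by hand. The compiler does the equivalent transformation on the plain loop.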

How to use it

With gcc this optimization is on by default from optimization level 3, compilation flag -O3. You can also turn it on by using the flag -ftree-vectorize. Note however that you also need to turn on the use of the vector registers. On my x86 machine I need to add the compiler flag -msse2.

There is however another useful flag, -ftree-vectorizer-verbose. This causes the compiler to tell you if it has found a loop that it is able to optimize. This is good since compiler optimization is like a black box. You either have to run the program and measure it or analyse the assembly code to understand if the optimization took place. A verbose message is a great help in that situation so you know you are on the right track.

Will my program run faster?

This must be tested of course. So I came up with this test program.

// Vectorization

#include <chrono>
#include <iostream>

using namespace std;

const int SIZE=2048;
int a[SIZE], b[SIZE], c[SIZE];

void foo ()
{
   for (int j=0; j<1000000; j++)
   {
      for (int i=0; i<SIZE; i++)
      {
         a[i] = 0;
         b[i] = 10;
         c[i] = 100;
      }

      for (int i=0; i<SIZE; i++)
      {
         a[i] = b[i] + c[i];
      }
   }
}

int main()
{
   auto start = chrono::steady_clock::now();

   foo();

   auto diff = chrono::steady_clock::now() - start;
   auto ms = chrono::duration_cast<chrono::milliseconds>(diff);
   cout << "It took " << ms.count() << " ms" << endl;
}

In order to get a measurable execution time I had to run the loops one million times. The optimized version ran in 1122 milliseconds while the unoptimized program ran in 4424 milliseconds. That is a speedup of roughly 4 times. This fits with the fact that on the current machine a vector register can hold four integers at once: a vector register is 128 bits and an int is 32 bits.
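
The arithmetic behind those numbers, written out as a small sketch. The timings are the ones measured above, the 32-bit int is an assumption about my machine, and the names are only for illustration.

```cpp
// Number of ints that fit in one vector register.
constexpr int register_bits = 128;
constexpr int int_bits = 32;       // assumption: int is 32 bits here
constexpr int lanes = register_bits / int_bits;

static_assert(lanes == 4, "four ints per 128-bit register");

// Measured speedup: unoptimized time divided by optimized time.
constexpr double measured_speedup()
{
   return 4424.0 / 1122.0;   // about 3.94
}
```

So the measured speedup of about 3.94 lands just below the four values handled per instruction.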

Summary

Vectorization is an optimization technique that can be applied automatically by a smart compiler. It is also easy to understand how it works. Programs that benefit from it use loops to manipulate data.

The optimization uses mechanisms that are on the CPU within a single core. It does illustrate though how you can speed up execution by using parallelism, and it helps in understanding the features in C++ AMP, Accelerated Massive Parallelism, which I hope to be able to show more about in the future.
