Stuff about computers...
Just the other day I saw a video that presented a new feature in Visual Studio 2011 called auto vectorization. This made me curious about the feature and whether it was available in the tools I have access to, e.g. g++.
The short answer is yes: if you use optimization level 3, compilation flag -O3, you get this automatically. The situation seems to be the same for Visual Studio 2011. So that is fine and end of story if you like. If you want some more details please read on.
The idea is to use the CPU registers introduced with technologies such as MMX or SSE. These vector registers can hold multiple scalar values and perform operations on them. That is a situation that occurs in a loop going over a vector.
int a[SIZE], b[SIZE], c[SIZE];
...
for (i=0; i<SIZE; i++) {
    a[i] = b[i] + c[i];
}
In the code fragment above the arithmetic in the loop can be vectorized, meaning that the operations for more than one index can be performed in parallel using the vector registers. It is up to the compiler to analyse the code and figure out whether the optimization can be applied.
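To make the idea concrete, here is a minimal sketch of roughly what a vectorizing compiler could turn the loop into, written by hand with SSE2 intrinsics. The function name add_arrays is made up for this illustration, the intrinsics path assumes a 32-bit int, and the code gcc actually generates will differ in its details (alignment handling, unrolling and so on). Without SSE2 the sketch simply falls back to the plain loop.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics
#endif

// Hypothetical helper: a[i] = b[i] + c[i] for every i in [0, n).
void add_arrays(int* a, const int* b, const int* c, std::size_t n)
{
    assert(a != nullptr && b != nullptr && c != nullptr);
    std::size_t i = 0;
#if defined(__SSE2__)
    static_assert(sizeof(int) == 4, "this sketch assumes a 32-bit int");
    // Main loop: one 128-bit register holds four ints, so each pass
    // handles four indices with a single add instruction.
    for (; i + 4 <= n; i += 4) {
        __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b + i));
        __m128i vc = _mm_loadu_si128(reinterpret_cast<const __m128i*>(c + i));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(a + i),
                         _mm_add_epi32(vb, vc));
    }
#endif
    // Scalar tail for the leftover elements
    // (and the whole loop when SSE2 is not available).
    for (; i < n; ++i) {
        a[i] = b[i] + c[i];
    }
}
```

The unaligned load and store variants are used so that the sketch works for any array; a compiler that can prove alignment may pick the aligned forms instead.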
With gcc this optimization is on by default from optimization level 3, compilation flag -O3. You can also turn it on explicitly with the flag -ftree-vectorize. Note however that you also need to turn on the use of the vector registers. On my x86 machine I need to add the compiler flag -msse2.
There is however another useful flag, -ftree-vectorizer-verbose. It takes a level of detail, for example -ftree-vectorizer-verbose=2, and causes the compiler to tell you whether it has found a loop that it is able to optimize. This is good since compiler optimization is like a black box. Otherwise you either have to run the program and measure it or analyse the assembly code to understand whether the optimization took place. A verbose message is a great help in that situation so you know you are on the right track.
This must be tested of course. So I came up with this test program.
// Vectorization
#include <chrono>
#include <iostream>

using namespace std;

const int SIZE=2048;
int a[SIZE], b[SIZE], c[SIZE];

void foo ()
{
    for (int j=0; j<1000000; j++) {
        for (int i=0; i<SIZE; i++) {
            a[i] = 0; b[i] = 10; c[i] = 100;
        }
        for (int i=0; i<SIZE; i++) {
            a[i] = b[i] + c[i];
        }
    }
}

int main()
{
    auto start = chrono::steady_clock::now();
    foo();
    auto diff = chrono::steady_clock::now() - start;
    auto ms = chrono::duration_cast<chrono::milliseconds>(diff);
    cout << "It took " << ms.count() << endl;
}
In order to get some measurable execution time I had to loop over the loop one million times. The optimized version ran in 1122 milliseconds while the unoptimized program ran in 4424 milliseconds. That is a speed up of roughly 4 times. The reason is that on the current machine the vector registers can store four integers at once: a vector register is 128 bits and an int is 32 bits.
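The arithmetic behind that factor of four can be written down in a few lines. This is only a sketch: the names kRegisterBits, lanes and speedup are invented here, and the 128-bit width is the SSE register size quoted above.

```cpp
#include <cassert>
#include <climits>

// Width of one SSE vector register, as stated in the text.
constexpr int kRegisterBits = 128;

// Size of an int in bits on the machine this is compiled for.
constexpr int kIntBits = static_cast<int>(sizeof(int) * CHAR_BIT);

// Number of ints that fit side by side in one vector register.
constexpr int lanes()
{
    return kRegisterBits / kIntBits;
}

// Measured speed up: time of the slow run divided by time of the fast run.
double speedup(double slow_ms, double fast_ms)
{
    assert(fast_ms > 0.0);
    return slow_ms / fast_ms;
}
```

With a 32-bit int, lanes() is 4, and speedup(4424, 1122) comes out a little above 3.9, so the measurement sits just under the theoretical limit set by the register width.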
Vectorization is an optimization technique that can be applied automatically by a smart compiler. It is also easy to understand how it works. Programs that benefit from it use loops to manipulate data.
The optimization uses mechanisms that are on the CPU, within a single core. It does illustrate though how you can speed up execution by using parallelism, and it helps in understanding the features in C++ AMP, Accelerated Massive Parallelism, which I hope to be able to show more about in the future.
As a C++ programmer you are probably used to, and have accepted, the verbose code you have to write in order to set up a test case with CppUnit. C++ does not have the flexibility of other languages that can discover test cases by naming conventions and the like and then set up almost everything automatically. In C++ you need to do most of that yourself by coding it.
So once upon a time you set up the boilerplate code from the CppUnit Cookbook to get the framework for your unit tests installed. Your main test program could very well look like this.
#include <cppunit/extensions/TestFactoryRegistry.h>
#include <cppunit/ui/text/TestRunner.h>

int main(int argc, char **argv)
{
    CppUnit::TextUi::TestRunner runner;
    CppUnit::TestFactoryRegistry &registry = CppUnit::TestFactoryRegistry::getRegistry();
    runner.addTest(registry.makeTest());
    bool wasSuccessful = runner.run("", false);
    // Exit status 0 signals success to the shell.
    return wasSuccessful ? 0 : 1;
}
This is what it takes to run your test cases using the TextUi::TestRunner. At that time you had a small set of tests but, as the project continued, more tests were added. And now after some time, if you did your homework right, your test suite is anything but small. In fact it is starting to take some time to execute. Too long. When you are developing new test cases it has become a bottleneck in the test, code and refactor loop.
So you would like to limit the number of test cases you run in order to get up to speed. Can it be done? Yes!
The solution lies in that old code you picked from the Cookbook so long ago you hardly remember it. Maybe you, like me, have lived with it long enough to assume that this is simply how it must be in C++, being the verbose and non-dynamic language that it is. Maybe you, like me, have used comments or preprocessor constructs to hide tests. Was I wrong! Single tests can be run. It has been there all the time at my fingertips. If you look carefully at the Cookbook code you'll see that the first parameter of the run method is an empty string. That string parameter is actually the name of the test or the test suite to run! The empty string is only the special case of running all tests!
Now equipped with this knowledge it is easy to change the code to allow us to specify what test to run. The most straightforward way is to let the first argument to the unit test program, if present, be the test case to run. Like this:
...
bool wasSuccessful = runner.run((argc > 1) ? argv[1] : "", false);
...
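The argument handling can be pulled out into a small function, which also makes it possible to check it without CppUnit. This is just a sketch: selectTestPath is a name I made up, and the only thing taken from CppUnit is the convention described above that an empty string means "run everything".

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns the test path to hand to runner.run().
// An empty string means "run all tests".
std::string selectTestPath(int argc, const char* const* argv)
{
    assert(argc >= 0);
    if (argc > 1 && argv[1] != nullptr) {
        return std::string(argv[1]);  // first command line argument
    }
    return std::string();             // no argument given: run all tests
}
```

In main the call would then read runner.run(selectTestPath(argc, argv), false), and you would start the program with the name of a test or suite as its only argument.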
Getting the predefined macros from a compiler might be the first thing you should do to learn your tools, but in my case it seems to be the last thing I do. Getting to the info has in my case often required reading through long and boring manuals. However, in the case of gcc there is a simple way to get the info directly from the compiler itself.
It goes like this.
echo | gcc -E -dM -
The option -E tells gcc just to run the preprocessor. -dM tells the preprocessor to output all defines.
The final twist is to use stdin to enter the code, which in this case is just whitespace. There are other ways to do that but I like this pipe version.
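Once you know which macros exist you can of course use them from code. Here is a small sketch that reports a few of them. The function name describeCompiler is invented for this note, and every compiler-specific macro is guarded with #if defined since I cannot assume which ones a given compiler provides. Only __cplusplus is guaranteed by the C++ standard.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: collects a few predefined macros as text.
std::string describeCompiler()
{
    std::ostringstream out;
    out << "__cplusplus=" << __cplusplus << '\n';
#if defined(__GNUC__)
    out << "__GNUC__=" << __GNUC__ << '\n';
    out << "__GNUC_MINOR__=" << __GNUC_MINOR__ << '\n';
#endif
#if defined(__VERSION__)
    out << "__VERSION__=" << __VERSION__ << '\n';
#endif
#if defined(__SSE2__)
    out << "__SSE2__ is defined\n";
#endif
    return out.str();
}
```

Printing the result, for instance with std::cout << describeCompiler(), shows a handful of the values that the gcc command above lists in full.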
When I first learned about Unix many moons ago sed(1), the stream editor, was one of the tools to explore. I didn't find it particularly useful until now. How many times have I not wanted to filter out only a part of a file or a stream? Without knowing better I have used tools like awk(1), but more often I have written scripts in perl or python to do the job.
And now, years later, yes even decades later, I find that it has been there all the time in sed: the range concept! Look here:
sed -n '/start-regexp/,/end-regexp/p'
does the trick. All that is needed is there! Here are the details.
The -n option turns off sed's default of printing every line. /start-regexp/,/end-regexp/ then defines a region in the stream and 'p' defines that it should be printed.
Voila! That is what we wanted. A simple way to select portions of a stream. Let's look at an example. This is how to get the output during one specific minute in the warn log (plus the first line logged after it, since the line matching the end pattern is included):
sed -n '/Oct 20 15:59:/,/Oct 20 16:/p' < /var/log/warn
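For comparison, this is roughly what the range selection does, spelled out as a C++ sketch. It is my own approximation and not how sed is implemented: the names selectRange and selectRangeFromText are made up, and std::regex uses ECMAScript syntax by default while sed uses basic regular expressions, so only simple patterns like the ones above carry over unchanged.

```cpp
#include <istream>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the lines sed -n '/start/,/end/p' would print.
std::vector<std::string> selectRange(std::istream& in,
                                     const std::regex& start,
                                     const std::regex& end)
{
    std::vector<std::string> selected;
    bool inside = false;
    std::string line;
    while (std::getline(in, line)) {
        if (!inside) {
            // Outside a range: only a line matching start opens one.
            if (std::regex_search(line, start)) {
                inside = true;
                selected.push_back(line);
            }
        } else {
            // Inside a range: keep the line, and close the range
            // after a line matching end (that line is included).
            selected.push_back(line);
            if (std::regex_search(line, end)) {
                inside = false;
            }
        }
    }
    return selected;
}

// Convenience wrapper for text that is already in memory.
std::vector<std::string> selectRangeFromText(const std::string& text,
                                             const std::regex& start,
                                             const std::regex& end)
{
    std::istringstream in(text);
    return selectRange(in, start, end);
}
```

Note that the end pattern is only tried from the line after the one that opened the range, and that a range which never sees its end pattern runs to the end of the input.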
Really quite simple, don't you think?
I have had some problems for a while with the mouse interaction on my vmware virtual machine running XP. The problem was that the mouse lost its binding to the virtual machine, making it impossible to use the mouse. This is not the way to use Windows I must say! It is possible to crawl along for a while, but working under these conditions is not really efficient, to say the least.
Googling yet again came up this time with a workaround that really works:
GDK_NATIVE_WINDOWS=true vmware
The alternative using VMWARE_USE_SHIPPED_GTK="force", which I think I have used at times before, didn't cut it this time. vmware dumps core using that.
Got bitten by this problem once again, but it was so long ago that I had forgotten the cure! Had to Google it again. This is how to reset your keyboard under X.
setxkbmap
Strangely hidden is the property to use some, although small, visual effects in metacity. Here is a command to activate it:
gconftool-2 -s --type bool /apps/metacity/general/compositing_manager true
Using the graphical tool gconf-editor works of course as well.
I don't do this every day so, guess what, I always forget how to do it and find myself spending lots of time searching for how to do it. This is a note so that I hopefully will know where to look when I forget it next time.
Hint one: Don't get lost in the massive amount of pages about setting the registry keys for activating logging. On XP tracing is used instead. You activate it by opening a shell and typing the following at the command prompt:
>netsh
netsh> ras
netsh ras>set tracing PPP enabled
netsh ras>exit
The log is found in %SYSTEMROOT%\tracing\PPP.LOG.
You disable it the other way around. So from a command prompt you type:
>netsh
netsh> ras
netsh ras>set tracing PPP disabled
netsh ras>exit
I don't remember how many times I have customized my Gnome system to use XEmacs with Gnus and message-mode as my mail writer application, just to lose those settings due to some upgrade or something. This is a note so that I won't forget it as easily next time.
The trick is to wrap the code in a script. This way it is easy to use the default from the gnome control panel as a pattern for the customization. The script looks like this:
#!/bin/sh
gnuclient -q -eval '(gnus-msg-mail (replace-regexp-in-string "^mailto:" "" "'"$1"'") "'"$2"'")'
Actually I have just picked this from the net with minor modifications.
The $2 arg is the subject line but mailto-links don't have subjects. It can be put to use though if you execute the script and supply the args manually, so I have kept the second arg.
When you are using a VPN connection to access some network in a secure way you normally lose your local network connection. Everything has to go through the secured network. This is all for good VPN reasons but is of course a practical problem. The VPN network might not allow as free access to the internet as you wish, so not all your services are available.
You might solve this by disconnecting and reconnecting the VPN connection in order to get to your local services. If you have to do this often during the work day it is of course a pain.
A simple alternative is to use a virtual machine for your VPN connection. With the use of a virtual machine you turn your physical machine into two computers. One with local access and the other with the VPN access.
This can also prove valuable in that it is almost like having two different sessions. The VPN session will allow you to organize your bookmarks, tools etc for the tasks you do on the VPN network without disturbing your normal setup.
This way it is simple to use your computer as you are used to while you work simultaneously on the VPN network.