A 64-bit horse that can count
1. Potential errors
2. How it all begun
4. Diagnosis of potential errors
The article concerns the peculiarities of Visual C++ compiler's behavior when generating 64-bit code and possible errors relating to it.
The phenomenon of "The Clever Hans", Mr. von Osten's horse, was described in 1911 . The Clever Hans was famous because of his ability to read and solve mathematical problems by tapping with his front hoof. Of course, there were a lot of skeptics. That's why a team of experts tested Hans' abilities and proved that the horse was showing them without any help of Mr. von Osten. But how could a common horse possess such an intellectual level - a human one?! The psychologist O. Pfungst carried out some very thorough experiments and discovered that Hans received very faint unintentional hints from those who were asking him questions. For example, when people asked Hans about anything they started to stare at his front hoof with the help of which the horse "answered". But as soon as Hans had tapped the right number, they raised their eyes or head just a little waiting for him to finish his answer. And the horse, that had been trained to note and use these very subtle motions considered them as signals to stop his action. From aside it looked as if the horse had given the right answer to the question.
Such a wonderful horse it was that counted and solved arithmetic problems although he was unable to do it. 64-bit programs turned out to be such digital horses of the beginning of the 21^st century, many of which cannot count either although are successful in pretending to do so. Let's consider this phenomenon in detail.
I am the author and co-author of some articles devoted to the problems of developing 64-bit applications. You can see the articles on our site: http://www.viva64.com/articles/64-bit-development/. In these articles, I try to use the term "a potential error" or "a hidden error" rather than just "an error" [2, 3, 4].
This is explained by that one and the same code can be viewed upon as both correct and incorrect depending on its purpose. A simple example - using a variable of int type for indexing an array's items. If we address an array of graphics windows with the help of this variable, everything is alright. We never need, and moreover it is impossible, to operate billions of windows. But when we use a variable of int type for indexing an array's items in 64-bit mathematical programs or data bases, it can well be a problem when the number of items excesses 0..INT_MAX range.
But there is one more much subtler reason to call errors "potential". The point is that it depends not only on the input data but on the mood of the compiler's optimizer if an error occurs or not. I have been avoiding this topic for a long time for most of such errors occur explicitly in the debug-version and only in release-versions they are "potential". But not every program built as debug can be debugged at large data sizes. There is a situation when the debug-version is tested only at very small sizes of data. And overload testing and testing by end users at actual data is performed only in release-versions where errors can be temporarily hidden. That's why I decided to tell you what I know about it. I hope that I will manage to persuade you that it is dangerous to rely only on the checks of the execution stage (unit-tests, dynamic analysis, manual testing) when porting a program on a different platform. You will say that all this is meant for promoting Viva64 tool. Yes, you are right, but still read the horror stories I'm going to tell you. I am fond of telling them.
- Why do you have two identical JMPs in a row in your code?
- What if the first one wouldn't work?
I faced the peculiarities of Visual C++ 2005 compiler's optimization for the first time when developing PortSample program. This is a project included into Viva64 distribution kit and is intended for demonstrating all the errors which Viva64 analyzer diagnoses. The examples included into this project must work correctly in 32-bit mode and cause errors in 64-bit one. Everything was alright in the debug-version but I faced difficulties in the release-version. The code which was to lead to a hang or crash in 64-bit mode worked successfully! The cause lay in optimization. The solution consisted in additional redundant complication of the examples' code and adding "volatile" key words which you can see in PortSample project in a great number.
The same relates to Visual C++ 2008. The code differs a bit but everything written in this article can be applied both to Visual C++ 2005 and Visual C++ 2008. We won't make any difference between them further.
If you think that it is good that some errors don't occur, refuse this thought. Code with such errors becomes very unstable and a smallest change of it not relating directly to an error can cause change of the code's behavior. To make sure, I would like to point out that this is not the fault of the compiler but of the hidden defects of the code. Further, we will show sample phantom errors which disappear and occur in release-versions when smallest alterations of the code are introduced and which you have to hunt for a long time.
The section will be long and boring, so I will begin with a funny story which is an abstract of the section:
Once Heracles was walking by a lake and there he saw Hydra. He ran up to her and cut her single head off. But instead of one head two more grew. Heracles cut them off too but 4 more appeared. He cut the 4 heads off - and there were 8 ones: So passed one hour, two hours, three hours: And then Heracles cut Hydra's 32768 heads off and Hydra died for she was 16-bit.
Like in this funny story errors lie in types' overflow which can occur or fail to occur depending on the code the compiler will generate when optimization is enabled. Let's consider the first example of the code which works in release mode although it shouldn't be so:
Let's consider another example of optimization and see how easy it is to spoil everything:
But the world is fragile. It is enough just to complicate the code a bit and it becomes incorrect:
Miraculously the array is filled correctly in the release-version. UnsafeCalcIndex function integrates inside the loop and 64-bit registers are used:
Let's modify (complicate) UnsafeCalcIndex function a bit. Pay attention that the function's logic has not been changed at all:
A program is a sequence of processing errors.
(c) An unknown author
I suppose that many already existing 64-bit applications or those which will be soon ported on 64-bit systems, can suddenly spring more and more unpleasant surprises. A lot of defects may be found in them when increasing the size of input data which was unavailable for processing in 32-bit systems. Hidden defects can suddenly occur during further modification of the program code or change of libraries or a compiler.
Like in the story about the horse, the first impression can be deceptive. It can only seem to you that your program processes large data sizes successfully. You need to perform a more thorough check to see exactly if your 64-bit horse can actually count.
To make sure that a 64-bit program is correct, the minimum thing you can do is to use not only the release-version but the debug-version as well at all stages of testing. Keep in mind that it is a necessary but far not sufficient condition. If your tests use data sets which, for example, don't cover a large main memory size, an error can fail to occur both in release- and debug-versions . It is necessary to extend unit-tests and data sets for overload and manual testing. It is necessary to make algorithms process new data combinations which are available only in 64-bit systems .
An alternative way of diagnosing 64-bit errors lies in using static analysis tools. It is much more radical and safe than guessing if you have added enough tests or not. It is convenient for it doesn't demand using the debug-version for crunching gigabytes of data.
The point of the method is to perform a full analysis of a project for a single time when porting the program and look through all the diagnostic messages on suspicious sections in the code. Many are frightened off by the list of thousands and tens of thousands of warnings. But the total time spent at once on analyzing them will be much less than the time spent on correcting various bug-reports appearing literally from nowhere for many years. It will be those very phantoms described above. Besides, when you start working with the list of warnings you will soon find out that most of them can be filtered and there will be much less work than you have expected. Further, you will have only to use static analysis for a new code and it doesn't take much time.
Of course, when speaking about a toolkit for searching 64-bit phantoms, I offer the tool that we develop - Viva64. By the way, this tool will soon be included into PVS-Studio which will unite all our static analysis tools.
To be more objective and avoid constantly being driven out from sites with this article as an advertising one, I will mention other tools as well. We should list Gimpel PC-Lint and Parasoft C++Test. Rules for testing 64-bit errors are implemented in them too, but they possess less diagnostic abilities than a highly tailored Viva64 . There is also Abraxas CodeCheck in the new version of which (14.5) functions of diagnosing 64-bit errors are also implemented but I don't possess more detailed information about it.
I will be glad if this article helps you master new platforms easier, for you will know what hidden problems can occur. Thank you for attention.
Re: A 64-bit horse that can count
Nomination this Article for Article of the month - May 2009
|All times are GMT +5.5. The time now is 22:54.|