Scope of Variables Declared in for()

The ANSI/ISO C++ standard specifies that a variable declared in the initializer of a for statement, as in for(int i=1; ...), has a scope local to that for statement. Unfortunately, older compilers (such as Visual C++ 5.0) still follow the pre-standard rule, under which the variable's scope is the enclosing block. Below are two problems that arise from this change, with a recommended solution for each.

Say you want to use the variable after the for() statement. You have to declare the variable outside of the for() statement.

Code:
    int i;
    for(i=1; i<5; i++)
    { /* do something */ }
    if (i==5) ...

Say you want several for() loops that use the same variable name. In this case, put each for statement in its own block, so that the code compiles under both scoping rules. You could also declare the variable once outside the loops, but that makes it slightly trickier for an optimizing compiler (and a human) to know what you intended.

Code:
    {
        for(int i=1; i<5; i++)
        { /* do something */ }
    }

inline vs. __forceinline

MS Visual C++, as well as several other compilers, now offers non-standard keywords that control the inline expansion of a function, in addition to the standard inline keyword. What are the uses of the non-standard keywords?

First, let's review the semantics of inline. The decision whether a function declared inline is actually inline-expanded is left to the sole discretion of the compiler. Thus, inline is only a recommendation. For example, the compiler may refuse to inline functions that have loops or functions that are simply too large, even if they are declared inline.

By contrast, the non-standard keyword __forceinline overrides the compiler's heuristics and forces it to inline a function that it would normally refuse to inline. I'm not sure I can think of a good reason to use __forceinline, as it may cause a bloated executable file and a reduced instruction-cache hit rate. Furthermore, under extreme conditions, the compiler may not respect the __forceinline request either.

So, in general, you should stick to good old inline. inline is portable and it enables the compiler to "do the right thing". __forceinline should be used only when all the following conditions hold true: inline is not respected by the compiler, your code is not to be ported to other platforms, and you are certain that inlining really boosts performance (and of course, you truly need that performance boost).
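
If you do decide to force inlining, one way to limit the portability cost is to hide the non-standard keyword behind a macro that falls back to plain inline. The following is only a sketch: the macro name FORCE_INLINE and the function square are hypothetical, and the GCC/Clang branch assumes their always_inline attribute.

```cpp
// Hypothetical portability wrapper; FORCE_INLINE is an illustrative name.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#elif defined(__GNUC__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline   // fall back to the standard hint
#endif

FORCE_INLINE int square(int x)
{
    return x * x;
}
```

On a compiler that matches neither branch, the code still builds and the inlining decision simply goes back to the compiler.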

Debugging the Memory Leaks in MS VC++ 6.0

The failure to deallocate previously allocated memory is known as a memory leak. A memory leak is one of those hard-to-detect bugs, and it may cause unpredictable behavior in your program. To allocate memory on the heap, you call new. To deallocate, you call delete. If a memory object has not been deallocated, a memory leak dump can be seen in the Output window at the end of a VC++ debug session. The dump looks like this:

    {N} normal block at 0x00421C90, 12 bytes long.
    Data: < > CD CD CD CD CD CD CD CD CD CD CD CD

Here N is a unique allocation request number: the sequence number of the allocation of the "leaked" object. This dump is not very helpful. To improve it, you can insert additional code that extends the memory leak dump with the file name and the line number at which the "leaked" allocation occurred. This capability began with MFC 4.0 with the addition of the Standard C++ library. The MFC and C run-time library use the same debug heap and memory allocator. Here's the additional code:

Code:
    #ifdef _DEBUG //for debug builds only
    #define new DEBUG_NEW
    #undef THIS_FILE
    static char THIS_FILE[] = __FILE__;
    #endif

__FILE__ is an ANSI C macro predefined by the compiler. The preprocessor expands it to a string literal that contains the name of the current source file. An improved memory leak dump looks like this:

    Path\Filename (LineNumber): {N} normal block at 0x00421C90, 12 bytes long.
    Data: < > CD CD CD CD CD CD CD CD CD CD CD CD

MS VC++ AppWizard and ClassWizard place the additional code, shown above, in the .CPP files that they create by default. The file name shown in the dump is that of the .CPP file, or of the .H file containing template class code, in which the N-th object was allocated.

Having the location where the "leaked" N-th object was allocated is not enough. You also need to know when the leaked allocation occurred, because the line with the allocation code may be executed hundreds of times. This is why setting a simple breakpoint on this line may not be adequate. The solution is to break the execution of the debugged program only at the moment when the "leaked" N-th object is allocated. To do this, stop the execution of your program in the debugger just after it begins. Type _crtBreakAlloc in the Name column of the Watch window and press the Enter key. If you use the /MDd compiler option, type {,,msvcrtd.dll}*__p__crtBreakAlloc() in the Name column instead. Replace the number in the Value column with the value of N. Continue to debug using the same execution path, and the debugger will stop the program when the "leaked" N-th object is allocated.

Negative Numbers Represented in C++

You probably know that integers are represented in binary, that is, in base 2. This is pretty straightforward for positive numbers, but it means you must choose an encoding for representing negatives. The encoding used in practice by C++ implementations (as well as by C and Java) is two's complement. In two's complement, the first bit of a negative number is always 1. Otherwise, the number is 0 or positive. To find the bitstring representing a negative number, you take the bitstring representing the corresponding positive number and flip all the bits. Next, add 1 to the result. In the following example, I've used 4-bit numbers for simplicity:

    -5d = -(0101b) = (1010b + 1b) = 1011b

Notice that -1d is represented as all 1's:

    -1d = -(0001b) = 1110b + 1b = 1111b

A nice property of this encoding is that you can subtract by negating and then adding (the carry out of the fourth bit is discarded):

    7d - 3d = 0111b - 0011b => 0111b + 1100b + 1b => 0111b + 1101b = 0100b = 4d

Yet another nice property is that, at the bit level, overflows and underflows wrap around and cancel one another out, like this:

    5d + 6d = 0101b + 0110b = 1011b = -(0100b + 1b) = -0101b = -5d

If you subtract 6 from this result (by adding its negation), you'll get 5 back. First, compute -6:

    -6d = -(0110b) = 1001b + 1b = 1010b

Then, add -6d to -5d to get the original value (again discarding the carry):

    1011b + 1010b = 0101b = 5d

Note that this describes the bit arithmetic only: in C++ source code, overflow of a signed integer is undefined behavior, whereas unsigned arithmetic is guaranteed to wrap around.

Matthew Johnson
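
The 4-bit arithmetic in the tip above can be reproduced with a few small functions. This is a sketch under stated assumptions: the names kMask, negate4, add4 and sub4 are hypothetical, and the code works on unsigned values, where wraparound is well defined, masking every result to four bits.

```cpp
// 4-bit two's complement arithmetic carried out on unsigned values.
const unsigned kMask = 0xFu;   // keep only the low four bits

// Negate: flip all bits, add 1, discard any carry out of bit 3.
unsigned negate4(unsigned v)
{
    return (~v + 1u) & kMask;
}

// Add two 4-bit values, discarding the carry.
unsigned add4(unsigned a, unsigned b)
{
    return (a + b) & kMask;
}

// Subtract by adding the negation.
unsigned sub4(unsigned a, unsigned b)
{
    return add4(a, negate4(b));
}
```

For instance, negate4(5) yields 1011b, and adding negate4(6) to the wrapped sum add4(5, 6) gives 5 back.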

Access a Class Member Function Without Creating a Class Object

In some cases, it is possible to call a class member function without creating a class object. In the following example, the program will typically print "Hello World" although no object of class A has ever been created. When the program enters the PrintMe function, the this pointer is zero. With most compilers, this happens to work as long as the function is not virtual and you don't access data members through the this pointer. Be aware, however, that the C++ standard classifies a call to a non-static member function through a null pointer as undefined behavior, so you should not rely on it in portable code.

Code:
    #include <stdio.h>

    class A
    {
    public:
        void PrintMe();
    };

    void A::PrintMe()
    {
        printf("Hello World\n");
    }

    int main()
    {
        A* p = 0;
        p->PrintMe();
        return 0;
    }
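
If the goal is simply to call a member function without an object, a static member function achieves that with well-defined behavior. The sketch below is a variation on the example above, not the original author's code; the only change is that PrintMe is static and returns the value of printf so that the result can be checked.

```cpp
#include <cstdio>

class A
{
public:
    // A static member function has no this pointer,
    // so it can be called without any object.
    static int PrintMe();
};

int A::PrintMe()
{
    // printf returns the number of characters written.
    return std::printf("Hello World\n");
}
```

The call is then written A::PrintMe(); with no pointer involved.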

Notes about the system() Function

The system() function (declared in <cstdlib>) launches another program from the current program. Contrary to what many users think, it isn't guaranteed to return the exit code of the launched process. Its return value is implementation-defined; in general, it is the exit code of the shell that launches the process in question. Consider the following example:

Code:
    #include <cstdlib>

    int main()
    {
        int stat = std::system("winword.exe"); //launch Word and wait
                                               //until the user closes it
    }

When the system() call returns, the value assigned to stat is the exit code of the shell process that in turn launches Word, not necessarily Word's own exit status. Thus, examining the return code of system() is of little use in most cases. To collect the exit code of a launched application reliably, you have to use a platform-specific process API or an interprocess communication mechanism such as signals or pipes.
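
Using the Volatile Keyword to Avoid Failures

When compiling a program, the compiler applies optimizations that may cause a multithreaded application to misbehave. For example, consider the following code:

Code:
    // To avoid threads waiting on the critical section in vain
    if (m_instance == NULL)
    {
        EnterCriticalSection(pcs);
        if (m_instance == NULL)
            m_instance = new MyInstance();
        LeaveCriticalSection(pcs);
    }

The compiler may cache the value read by the first test and reuse it for the second condition (m_instance == NULL), without noticing that m_instance has been changed by another thread in the meantime. The solution is to declare the pointer with the volatile keyword. This tells the compiler to read the content of m_instance every time it is used and not to cache it. The declaration is:

Code:
    MyInstance* volatile m_instance;

Note the position of the keyword: volatile MyInstance* m_instance would declare a pointer to a volatile object rather than a volatile pointer. Keep in mind, too, that volatile only stops the compiler from caching the value; it does not by itself guarantee correct synchronization between threads on every platform.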
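
With a C++11 or later compiler, the same double-checked pattern can be written with standard library facilities. The sketch below swaps in std::atomic and std::mutex for the volatile pointer and the Win32 critical section, so it is a different technique from the one in the tip; the names g_instance, g_lock and get_instance, as well as the body of MyInstance, are hypothetical.

```cpp
#include <atomic>
#include <mutex>

struct MyInstance
{
    int value;
    MyInstance() : value(42) {}
};

std::atomic<MyInstance*> g_instance(nullptr);   // shared, atomically accessed pointer
std::mutex g_lock;                              // guards the creation step

MyInstance* get_instance()
{
    // First check without the lock, to avoid waiting in vain.
    MyInstance* p = g_instance.load(std::memory_order_acquire);
    if (p == nullptr)
    {
        std::lock_guard<std::mutex> guard(g_lock);
        // Second check under the lock: another thread may have won the race.
        p = g_instance.load(std::memory_order_relaxed);
        if (p == nullptr)
        {
            p = new MyInstance();
            g_instance.store(p, std::memory_order_release);
        }
    }
    return p;
}
```

The lock is released automatically when guard goes out of scope, including when new throws.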

Testing the Copy Constructor and the Assignment Operator

You have written a test program for your class X, and everything worked fine. How did you test the copy constructor and the assignment operator?

Code:
    class X
    {
    public:
        X ();
        X (const X&);
        const X& operator= (const X&);
    };

Okay, you called both functions, and neither crashed. But did you test the logical independence of source and target? Consider this:

Code:
    X a1, a2;
    a1 = a2;

The meaning of operator= should be two-fold: a1 becomes a logical copy of a2, but afterwards, there should be no further connection between these two objects. No operation on a1 should affect a2, and vice versa. Except in rare, specific cases, deviating from this rule makes maintenance a nightmare. You should conform to this rule, and you should test it. (If you think that code review by eye suffices, then recall that constructors and the assignment operator are never inherited. Did you take all possible subtleties like this into account in your code review?)
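
An independence test of this kind might look like the following sketch. The class Buffer and the function copies_are_independent are hypothetical stand-ins for your own class X and its test; the point is only that each object is modified after the copy and the other one is checked for side effects.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical class that owns a heap buffer and copies it deeply.
class Buffer
{
public:
    explicit Buffer(const char* s)
        : len_(std::strlen(s)), data_(new char[len_ + 1])
    {
        std::memcpy(data_, s, len_ + 1);
    }
    Buffer(const Buffer& other)
        : len_(other.len_), data_(new char[other.len_ + 1])
    {
        std::memcpy(data_, other.data_, len_ + 1);
    }
    Buffer& operator=(const Buffer& other)
    {
        if (this != &other)
        {
            char* fresh = new char[other.len_ + 1];   // allocate before releasing
            std::memcpy(fresh, other.data_, other.len_ + 1);
            delete[] data_;
            data_ = fresh;
            len_ = other.len_;
        }
        return *this;
    }
    ~Buffer() { delete[] data_; }

    void set(std::size_t i, char c) { data_[i] = c; }
    char get(std::size_t i) const { return data_[i]; }

private:
    std::size_t len_;
    char* data_;
};

// Returns true if neither copy operation leaves the objects connected.
bool copies_are_independent()
{
    Buffer a("abc");
    Buffer b(a);                 // copy constructor
    b.set(0, 'x');               // change the copy...
    if (a.get(0) != 'a')         // ...the source must be unaffected
        return false;

    Buffer c("zzz");
    c = a;                       // assignment operator
    a.set(1, 'y');               // change the source...
    return c.get(1) == 'b';      // ...the target must be unaffected
}
```

A class whose copy operations merely copied the data_ pointer would fail both checks.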

Determine Object Size Without Using the sizeof() Operator

The sizeof() operator gives you the number of bytes required for storing an object. It can be applied to a data type or to an expression. Another way of determining the size of an object is to use pointer arithmetic, as in the following example:

Code:
    #include <stddef.h> /* NULL, size_t */

    struct point
    {
        long x, y;
    };

    int main()
    {
        struct point pt = {0}, *ppt = &pt;
        unsigned char *p1 = NULL, *p2 = NULL;
        size_t size = 0;
        p1 = (unsigned char*)(ppt);
        p2 = (unsigned char*)(++ppt);
        size = p2 - p1; // where long is 4 bytes, size is now 8 bytes (2 longs);
                        // same as sizeof(struct point) or sizeof(pt)
        return 0;
    }

Overloaded Operators May Not Have Default Parameters

Unlike ordinary functions, overloaded operators cannot declare a parameter with a default value (overloaded operator() is the only exception):

Code:
    class Date
    {
    private:
        int day, month, year;
    public:
        Date & operator += (const Date & d = Date() ); //error, default arguments are not allowed
    };

This rule may seem arbitrary. However, it captures the behavior of built-in operators, which never have default operands either.
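
The exception mentioned above can be illustrated with a small function object. This is only a sketch; the class Scaler and its parameter names are hypothetical.

```cpp
class Scaler
{
public:
    // operator() is the one overloaded operator that may take default arguments.
    int operator()(int value, int factor = 2) const
    {
        return value * factor;
    }
};
```

With Scaler s; the call s(3) uses the default factor and yields 6, while s(3, 3) yields 9.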

Constant pointers

There are cases when you need to define a constant pointer to a variable or object; for instance, when taking a function's address, or when you want to protect a pointer from unintended modifications such as assignment of a new address, pointer arithmetic, etc. In fact, an object's this is a const pointer. A constant pointer is declared like this:

Code:
    int j = 10;
    int *const cpi = &j; //cpi is a constant pointer to an int
    *cpi = 20; //OK, j is assigned a new value
    cpi++; //Error; cannot change cpi

If you're confused by the syntax, remember: *const defines a constant pointer, whereas a const variable is declared like this:

Code:
    const int k = 10; //k's value may not be changed

And a const pointer to a const variable:

Code:
    const int *const cpi = &k; //cpi is a constant pointer to a const int
    *cpi = 20; //Error; k's value cannot be modified
    cpi++; //Error; cannot modify a const pointer

Declaring Pointers to Data Members

Although the syntax of pointers to members may seem a bit confusing at first, it is consistent and resembles the form of ordinary pointers, with the addition of the class name followed by the operator :: before the asterisk. For example, if an ordinary pointer to int looks like this:

Code:
    int * pi;

you define a pointer to an int member of class A like this:

Code:
    class A{/**/};
    int A::*pmi; // pmi is a pointer to an int member of A

You can initialize a pointer to member:

Code:
    class A
    {
    public:
        int num;
        int x;
    };
    int A::*pmi = &A::num; // 1

The statement numbered 1 defines a pointer to an int member of class A and initializes it with the address of the member num. Now you can use the pointer pmi to examine and modify the value of num in any object of class A:

Code:
    A a1;
    A a2;
    int n = a1.*pmi; // copy the value of a1.num to n
    a1.*pmi = 5; // assign the value 5 to a1.num
    a2.*pmi = 6; // assign the value 6 to a2.num

Similarly, you can access a data member through a pointer to A:

Code:
    A * pa = new A;
    int n = pa->*pmi; // assign to n the value of pa->num
    pa->*pmi = 5; // assign the value 5 to pa->num

Or using a pointer to an object derived from A:

Code:
    class D : public A {};
    A* pd = new D;
    pd->*pmi = 5; // assign a value of 5 to pd->num
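
One practical use of a pointer to data member is to let the caller choose which member a function operates on. The sketch below reuses class A from the tip; the helper functions set_member and get_member are hypothetical.

```cpp
class A
{
public:
    int num;
    int x;
};

// Writes value into whichever int member of obj the pointer selects.
void set_member(A& obj, int A::*member, int value)
{
    obj.*member = value;
}

// Reads the selected member through a pointer to the object.
int get_member(const A* obj, int A::*member)
{
    return obj->*member;
}
```

For example, set_member(a, &A::num, 5) assigns to a.num, and the same function called with &A::x assigns to a.x.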

Declaring Pointers to Member Functions

A pointer to a member function consists of the member function's return type, the class name followed by ::, the pointer's name, and the function's parameter list. For example, a pointer to a member function of class A that returns int and takes no arguments is defined like this (note that both pairs of parentheses are mandatory):

Code:
    class A
    {
    public:
        int func ();
    };
    int (A::*pmf) (); /* pmf is a pointer to some member function
                         of class A that returns int and takes no arguments */

In fact, a pointer to a member function looks just like an ordinary pointer to function, except that it also contains the class's name immediately followed by the :: operator. You can invoke the member function to which pmf points like this:

Code:
    pmf = &A::func; //assign pmf
    A a;
    A *pa = &a;
    (a.*pmf)(); // invoke a.func()
    // call through a pointer to an object
    (pa->*pmf)(); // calls pa->func()

Pointers to member functions respect polymorphism. Thus, if you call a virtual member function through a pointer to member, the call will be resolved dynamically:

Code:
    class Base
    {
    public:
        virtual int f (int n);
    };
    class Derived : public Base
    {
    public:
        int f (int h); //override
    };
    Base *pb = new Derived;
    int (Base::*pmf)(int) = &Base::f;
    (pb->*pmf)(5); // call resolved as Derived::f(5)

Note that you cannot take the address of a class's constructor(s) and destructor.
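
The dynamic resolution described above can be observed with a small self-contained sketch. The function bodies and the helper call_through are hypothetical; they exist only so that the two versions of f return different values.

```cpp
class Base
{
public:
    virtual ~Base() {}
    virtual int f(int n) { return n; }      // base version
};

class Derived : public Base
{
public:
    int f(int n) { return n * 10; }         // override
};

// Calls whichever member function pmf designates on the object pb points to.
int call_through(Base* pb, int (Base::*pmf)(int), int arg)
{
    return (pb->*pmf)(arg);
}
```

Although pmf is initialized with &Base::f in both cases, passing a pointer to a Derived object invokes Derived::f.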

No Member-Function to Pointer-to-Member-Function Conversion

Does your compiler accept the following code?

Code:
    class A
    {
    public:
        void f(int);
    };
    void (A::*pm)(int) = A::f; // #1 no ampersand before A::f

A standard-compliant compiler should flag line #1 as an error. Let's examine it in further detail to see why. The expression A::f is called a qualified-id. A qualified-id denotes a member of a class. According to the C++ standard, there is no implicit conversion from a qualified-id denoting a member function to the type "pointer to member function". In other words, an ampersand must appear before the qualified-id if you want to take the address of the member function. For example:

Code:
    void (A::*pm)(int) = &A::f; // now OK

Programmers who are new to the concept of pointers to members are often confused by this subtlety. After all, when dealing with ordinary functions, implicit conversion from a function type to the type "pointer to function" does exist:

Code:
    void g();
    void (*pf) () = g; // OK, implicit conversion of g to &g

However, there are many differences between a qualified-id and a plain function's name. Enabling implicit conversion of a qualified-id to a member function's address could therefore cause confusion and ambiguities. To avoid this, C++ requires that a class member's address be taken explicitly by placing an ampersand before the qualified-id. Even if your compiler happens to accept the code in line #1, you should add an ampersand to avoid maintenance problems in the future, and to make your code more readable.

'Restrict' Pointers

One of the new features in the recently approved C standard, C99, is the restrict pointer qualifier. This qualifier can be applied to a data pointer to indicate that, during the scope of that pointer declaration, all data accessed through it will be accessed only through that pointer and not through any other pointer. The restrict keyword thus enables the compiler to perform certain optimizations based on the premise that a given object cannot be changed through another pointer. Now you're probably asking yourself, "doesn't const already guarantee that?" No, it doesn't. The qualifier const ensures that a variable cannot be changed through a particular pointer. However, it's still possible to change the variable through a different pointer. For example:

Code:
    void f (const int* pci, int *pi) // is *pci immutable?
    {
        (*pi)+=1; // not necessarily: n is incremented by 1
        *pi = (*pci) + 2; // n is incremented by 2
    }
    int n = 0;
    f( &n, &n);

In this example, both pci and pi point to the same variable, n. You can't change n's value through pci but you can change it using pi. Therefore, the compiler isn't allowed to optimize memory access for *pci by preloading n's value. In this example, the compiler indeed shouldn't preload n because its value changes twice during the execution of f(). However, there are situations in which a variable is accessed only through a single pointer. For example:

Code:
    FILE *fopen(const char * filename, const char * mode);

The name of the file and its open mode are accessed through unique pointers in fopen(). Therefore, it's possible to preload the values to which the pointers are bound. Indeed, the C99 standard revised the prototype of the function fopen() to the following:

Code:
    /* new declaration of fopen() in <stdio.h> */
    FILE *fopen(const char * restrict filename, const char * restrict mode);

Similar changes were applied to the entire standard C library: printf(), strcpy() and many other functions now take restrict pointers:

Code:
    int printf(const char * restrict format, ...);
    char *strcpy(char * restrict s1, const char * restrict s2);

C++ doesn't support restrict yet. However, since many C++ compilers are also C compilers, it's likely that this feature will be added to most C++ compilers too.
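
In the meantime, several C++ compilers (GCC, Clang and Visual C++ among them) accept the spelling __restrict as a non-standard extension. The sketch below assumes such a compiler; the macro name RESTRICT and the function add_into are hypothetical, and the macro expands to nothing elsewhere, so the code still compiles without the hint.

```cpp
#include <cstddef>

// Hypothetical wrapper; RESTRICT is an illustrative macro name.
#if defined(__GNUC__) || defined(_MSC_VER)
#  define RESTRICT __restrict
#else
#  define RESTRICT            // no known extension: compile without the hint
#endif

// The caller promises that dst and src never overlap.
void add_into(int* RESTRICT dst, const int* RESTRICT src, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        dst[i] += src[i];
}
```

The qualifier is a promise made by the caller, not something the compiler verifies: passing overlapping arrays to add_into would break that promise.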

Assigning a Specified Memory Address to a Pointer

In low-level programming and hardware interfaces, you often need to assign a pointer to a specific physical address. To do that, you have to cast the address value using the reinterpret_cast operator. Here's an example that shows how this is done:

Code:
    void *p;
    // assign address 0x5800FF to p
    p = reinterpret_cast< void* > (0x5800FF);

Avoiding Crashes Due to Multiple Deletes

Many times, a program crashes due to multiple deletes on the same pointer. In some cases, this is due to a programming error that can be removed. There are situations, however, where it may be unclear whether to delete a pointer or not. Let's consider the code given below:

Code:
    void function()
    {
        char *pcMemory;
        try
        {
            pcMemory = new char[25]; //allocate memory
            //perform some operations
            delete[] pcMemory; //delete memory
        }
        catch(...)
        {
        }
        //perform some operations
        //Delete pcMemory or Not??
    }

If everything works fine, pcMemory will be properly deleted at the end of the try block. But if one of the operations between allocation and deletion throws an exception, pcMemory will remain allocated. At the end of the function, it's not known whether to delete pcMemory or not. Deleting it may cause a double deletion followed by a crash. Similarly, if it's not deleted, it may result in a memory leak. The trick is to initialize the pointer to NULL and to always set it back to NULL after deletion:

Code:
    void function()
    {
        char *pcMemory = NULL; //NULL until memory is allocated
        try
        {
            pcMemory = new char[25]; //allocate memory
            //perform some operations
            delete[] pcMemory; //delete memory
            pcMemory = NULL;
        }
        catch(...)
        {
        }
        //perform some operations
        delete[] pcMemory; //Deleting NULL is legal and not an error or crash.
    }

Because deleting a NULL pointer is always safe, the delete statement can simply be placed at the end of the function, whether or not pcMemory has already been deleted.
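
Another way to sidestep the question altogether is to let an object own the memory, so that no explicit delete is needed on any path. The sketch below swaps in std::vector for the raw new[] and delete[] of the tip; the function name process_buffer and its simulateFailure parameter are hypothetical and exist only to exercise both paths.

```cpp
#include <stdexcept>
#include <vector>

// Returns true if the work completed, false if an exception was caught.
bool process_buffer(bool simulateFailure)
{
    try
    {
        std::vector<char> memory(25);      // allocation owned by the vector
        if (simulateFailure)
            throw std::runtime_error("simulated failure");
        memory[0] = 'x';                   // perform some operations
        return true;
    }                                      // the vector frees its storage on every exit path
    catch (...)
    {
        return false;
    }
}
```

Whether the block is left normally or through an exception, the vector's destructor releases the memory exactly once.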

When Are Pointers Equal?

Pointers to objects or functions of the same type are equal if and only if they are both NULL:

Code:
    int *p1 = NULL, *p2 = NULL;
    bool equal = (p1==p2); //true

Or if they point to the same object or function:

Code:
    char c;
    char * pc1 = &c;
    char * pc2 = &c;
    equal = (pc1 == pc2); // true

Additionally, pointers are equal if they both point one position past the end of the same array.
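
The one-past-the-end case can be checked with a short sketch. The function names past_the_end_pointers_equal and distinct_objects_differ are hypothetical.

```cpp
// Returns true when both ways of forming the one-past-the-end pointer agree.
bool past_the_end_pointers_equal()
{
    int arr[3] = {1, 2, 3};
    int* end1 = arr + 3;        // one past the last element
    int* end2 = &arr[2] + 1;    // same position, reached from the last element
    return end1 == end2;
}

// Returns true when pointers to two different objects compare unequal.
bool distinct_objects_differ()
{
    char c1 = 'a';
    char c2 = 'b';
    char* p1 = &c1;
    char* p2 = &c2;
    return p1 != p2;
}
```

Neither past-the-end pointer is dereferenced; only their values are compared.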

Never Use Incompatible Pointers to Member Functions

A reader posted a message on one of the C++ newsgroups recently. He had to port a huge project from Borland C++ 5.0 to Visual C++ 6.0. The project uses the callback mechanism through pointers to member functions. Unfortunately, these pointers are assigned member functions with incompatible signatures. For example:

Code:
    class Base;
    //a pointer to a member function that takes no arguments and returns bool
    typedef bool (Base::*meth1)();
    //a pointer to a member function that takes long and returns bool
    typedef bool (Base::*meth2)(long);
    class Base
    {
    public:
        bool perform(meth1 m, long param);
    };

Due to a historical design mistake, the pointer to member m (the first argument of perform) is sometimes called with an argument, and sometimes it is called with no arguments. Of course, a brute force cast is needed to enable that, as in the line numbered 1:

Code:
    bool Base::perform(meth1 m, long param)
    {
        meth2 *p = (meth2 *)(&m); // 1. cast pointer to meth1 to pointer to meth2
        meth2 q = *p; // force m to take an argument, although it's not supposed to take any
        return (this->*q)(param); //2
    }

For some reason, the application built under Borland's compiler tolerated this hack. However, under Visual C++, it crashes when the line numbered 2 executes. The reader wanted to know if there was a simple way to make this code execute under Visual C++ without encountering a runtime crash and without having to make code corrections.

Whatever the answer may be (perhaps there is a way to tamper with the underlying code generation of Visual C++ and make this code somehow work), there is no escape from fixing the code so that it avoids incompatible casts of pointers to member functions. There are several reasons for that. First, even if there is a magic trick that will make this code run under Visual C++ 6.0 without a crash, that is only a temporary band-aid. There's no guarantee that this trick will work with future releases of Visual C++. Furthermore, the C++ Standard clearly says that using incompatible pointers to member functions yields undefined behavior. Building an application on such shaky foundations, knowing full well that the code violates basic rules of the language, is an ominous sign regarding the quality and the reliability of the software in question.

Passing More Arguments to a Callback Function

A callback function has a fixed signature, so you cannot alter the number or type of the arguments it takes. For example, the standard qsort() function takes a pointer to a function that has the following signature:

Code:
    int (*cmp)(const void *, const void *) //user's comparison function

This signature allows you to pass exactly two arguments to your custom-made comparison function. However, suppose that the comparison function has to compare strings and you want to pass a third argument to indicate whether the comparison should be case-sensitive or not. You cannot simply add a third argument to the function call, because the compiler will complain about it (don't even think about casting the function pointer; see the tip on incompatible pointers to member functions). Instead, you can define a struct that contains two members:

Code:
    struct Comparison
    {
        char *name;
        bool caseSensitive;
    };

Now instead of passing a pointer to char as the second argument, you can pass a pointer to an instance of that struct. The comparison function will unpack the struct and compare the strings accordingly:

Code:
    int MyCompareFunc(const void *pfirst, const void * psecond)
    {
        const Comparison * pcmp = (const Comparison *) psecond;
        const char * pstr1 = (const char *) pfirst;
        const char * pstr2 = pcmp->name; //extract string from struct
        if (pcmp->caseSensitive == true) //the second field of the struct
        {
            return strcmp(pstr1, pstr2);
        }
        else
        {
            //perform case-insensitive comparison and return its result
        }
    }
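
With qsort() itself, both arguments handed to the callback are elements of the array being sorted, so one way to apply the idea is to make every array element a Comparison struct. The following is a sketch under that assumption, not a definitive implementation: the helpers compare_ignore_case and sort_entries are hypothetical, name is declared const char * so that string literals can be stored, and all elements are assumed to carry the same caseSensitive flag.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Each array element carries its string plus the comparison mode.
struct Comparison
{
    const char *name;
    bool caseSensitive;
};

// Hypothetical helper: portable case-insensitive string comparison.
static int compare_ignore_case(const char *a, const char *b)
{
    for (;; ++a, ++b)
    {
        int ca = std::tolower(static_cast<unsigned char>(*a));
        int cb = std::tolower(static_cast<unsigned char>(*b));
        if (ca != cb)
            return ca - cb;
        if (ca == 0)
            return 0;   // both strings ended together
    }
}

// Matches the signature qsort() expects; both arguments are array elements.
int MyCompareFunc(const void *pfirst, const void *psecond)
{
    const Comparison *lhs = static_cast<const Comparison *>(pfirst);
    const Comparison *rhs = static_cast<const Comparison *>(psecond);
    if (lhs->caseSensitive)
        return std::strcmp(lhs->name, rhs->name);
    return compare_ignore_case(lhs->name, rhs->name);
}

// Sorts the entries in place using the mode stored in the elements.
void sort_entries(Comparison *entries, std::size_t count)
{
    std::qsort(entries, count, sizeof(Comparison), MyCompareFunc);
}
```

Sorting "apple" and "Banana" this way puts "apple" first when the flag is false, and "Banana" first when it is true, because uppercase letters precede lowercase ones in ASCII.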