Say I have these two lines:

Code:
/* verify that the checksum matches */
if (l_iChecksum == p_lpszFileData[l_iDataSize])

'l_iChecksum' is of type unsigned int, which should be 4 bytes in length. Now I try to compare that with the 4 bytes at p_lpszFileData[l_iDataSize]. Of course it won't work; it's only comparing 1 byte from p_lpszFileData[l_iDataSize]. My question is: should I fix it this way:

Code:
/* verify that the checksum matches */
if (l_iChecksum == *(unsigned int*)&p_lpszFileData[l_iDataSize])

Or is there an easier way to get this done?
You don't say what the type of p_lpszFileData[l_iDataSize] is. lpsz suggests that lpszFileData is a pointer to char, dereferenced via an index using pointer arithmetic. I don't know what the p_ notation is supposed to suggest, other than emphasizing that it is indeed a pointer.

How many bytes of the checksum are actually significant? All 4? I suspect not, if the checksum is stored in a char. A checksum of that type is often an add without carry. I don't think I would cast the pointer to an unsigned pointer and dereference it. I would AND l_iChecksum with 0x000000ff, cast it to a char, then compare it.

I recommend that you think over your use of Hungarian notation. Some people think it improves the readability and understandability of code, but if the prefixes pile up on every variable, or no one knows what they mean, the opposite is achieved.
The checksum is stored in 4 bytes at the end of a file. The "p_" stands for "parameter", to indicate it's a parameter from the function's point of view. It is a pointer though; I indicated that with "lpsz" (long pointer to a zero-terminated string). Actually, I'm just trying out some naming conventions and other coding styles lately (I've been working with C and C++ for about a year now). I want to read 4 bytes from the 'p_lpszFileData' variable, which is of type 'const char *'.
Then you're going to have to go around the block the long way. If you know how long the array is, then make an unsigned int pointer, point it at &array[size-4], pull the value, and compare it. The potential problem is one of portability and endianness. If you know you'll always be working with the same system (same endianness), then you only need to know how the 4 bytes were stored in the first place, and follow that procedure when you pull them out. If that isn't amenable to pulling them out as an unsigned int, then you'll have to pull and pack them the same way they were stored. It sounds more like a CRC than a checksum. Whatever it is, it was no doubt calculated on the fly, and either it or its complement was stored. You need to know that, too.
I was calculating the checksum manually. I'm writing a Tetris clone and adding highscores; the highscores file needed some encryption to prevent users from easily hacking it. The checksum will be stored with the same endianness it is read with, because usually the highscores data file stays on one computer and isn't transferred to another. Have you got any suggestions for me? Oh, and about the [size-4] thing: I already subtracted 4 from 'l_iDataSize', but forgot to mention it in my first post.
C++. But please tell me in which way your solution (if you're going to give one) in C++ differs from C.
C++ is both more powerful and has easier-to-use constructs than C (because OO is built in, you don't have to labor for it). You can, of course, use OO in C, and you can write "C" programs in C++ (though all C code might not compile in C++ without adjustments).

If I were doing this in C, I would use lower-level file I/O such as fread/fwrite. Even though file I/O is byte oriented, I would use the model where you specify how many elements you want to read/write, along with the size of each element. I would then transfer the data, calculating the checksum as I went, and finish off the file with a transfer of the negative (or complement, depending on how you like it) of the checksum. I would then read it back in the same fashion, calculating the checksum as I went. The stored checksum would also be included in the calculation, so the final checksum value should be zero.

In C++, I would take a different tack. I would have a class or struct representing a score record for each person. This class or struct would have a method for computing its individual checksum, as well as a method (or friend) for outputting it to an output stream. I would also have a class representing the file, comprised of a vector of the individual score records. This class would have a method for accumulating the individual checksums, a method for calling the output of all the members of the vector, and it would finish up with an output of the inverse of the accumulated checksum. The input mechanism would just do all this in reverse and indicate success or failure based on the checksum results.

Please understand that a checksum/CRC is not a method of encryption. It merely offers a good probability (not 1.0 by any means) that the information was recovered without error or modification. Any reasonably proficient person can modify the file and defeat it.
Well, here's my idea and how I've implemented it now: I first read the highscores file (using fread()), then decrypt it (using a custom algorithm). Then the checksum is calculated (excluding the last 4 bytes of the file, so as not to include the checksum itself in the calculation), and the calculated checksum is verified against the one stored in the file. Then I use sscanf() to put the data into my data structures, with each score and each person's name.

About your C++ way: I'd like a small code example of a class explaining your explanation (get it?). I'm really interested in learning to do this in a more structured way than I'm doing now. You see, I was programming in assembly before and jumped to C and C++ about a year ago, but even in C and C++ I'm sometimes coding like it's assembly, thinking too much at a low level.
You're entirely correct. About 80% of my programming was in assembler. The purpose of higher-level languages is to introduce abstraction so that you can concentrate on solving the problem with a more human outlook, rather than accommodating the von Neumann paradigm. As machines get more powerful and have more resources, the abstraction becomes possible. It's less efficient in terms of the micro, but more efficient in terms of the programmer. Hardware costs are down and people costs are up, so the abstraction yields a better return on investment. The trick is to ask yourself how you would solve the problem; forget the demands of the machine and work only with the demands of the specific language. I'll work something up along C++ lines and post it.
Thanks for taking the time to make the C++ example. About the abstractions and the optimizations of the compiler: I was recently looking at the assembly output of a C++ program and noticed that a class constructor was emitted TWO times! Directly after the constructor, the same code appeared a second time. Why is this? Does it serve any purpose? I may not know enough about the C (or C++) language(s) yet to think abstractly enough to solve a problem, but I'm still learning. Can you recommend any good book on, for example, structured programming or data structures?