The phenomenon of stuffing more data into a block of allocated memory than it can hold is known as buffer overflow. In simple words, if we have, let's say, 'n' bytes of memory and we try to store more than 'n' bytes in it, then the values beyond the nth one will overflow the buffer and will be written to the bytes beyond the nth byte (which is like corrupting someone else's memory).

Example

Let's go through the following piece of code:

Code:
#include<stdio.h>
#include<string.h>

int main(void)
{
    int i = 0;
    char arr[5];

    /* writes 10 characters into a 5-byte array */
    for(i=0;i<10;i++)
        arr[i] = 'A';

    return 0;
}

In a programming language like C, it is left up to the programmer to manage memory, so it is possible to mess up memory now and then, as we have done here. We allocated a character array of 5 bytes, i.e. an array which can store 5 characters, but we tried to stuff 10 character values into it. The compiler is not required to report a warning or an error for this, but looking at what we have done, we now know that we have written past the allocated buffer of 5 bytes on the stack. This is known as buffer overflow.

Basic Exploit

Buffer overflows are still common in today's applications and are still exploited by crackers (treating 'hackers' as a positive term). Though crackers have been able to take control of a complete system by exploiting some basic buffer overflows, here I will describe a very basic kind of exploitation to get you familiar with the idea.
Let's try to understand the following piece of code:

Code:
#include<stdio.h>
#include<string.h>

int authenticate(char *passwd)
{
    int flag = 0;
    char pass[8];

    strcpy(pass,passwd);

    if(!strcmp(pass,"hell") || !strcmp(pass,"heaven"))
        flag = 1;

    return flag;
}

int main(int argc, char*argv[])
{
    if(argc < 2)
    {
        printf("\n USAGE : %s password\n",argv[0]);
        return 1;
    }

    if(authenticate(argv[1]))
    {
        printf("\n Access granted \n");
    }
    else
    {
        printf("\n No access granted \n");
    }

    return 0;
}

In this piece of code, the program expects a password as input from the user and compares it with 'hell' and 'heaven'. If the password matches, the code grants access (to something very secret, let's say A BIG BIG TREASURE). When I ran the program, I tried a few passwords. Some of them matched and some did not. In a nutshell, everything went as expected! Here is the console output:

Code:
~/practice $ ./bufferoverflow hell
 Access granted
~/practice $ ./bufferoverflow heaven
 Access granted
~/practice $ ./bufferoverflow world
 No access granted
~/practice $ ./bufferoverflow planet
 No access granted

Now, while playing with different passwords, something strange happened. Look at the console:

Code:
~/practice $ ./bufferoverflow AAAAAAAAAAAAA
 Access granted

WTH! What is this? This password was not in the list of allowed passwords, but the code still granted access to that BIG BIG TREASURE. So this is the time to debug the code and find out how it let this happen. Let's use GDB for this:

Code:
~/practice $ gdb ./bufferoverflow
GNU gdb (GDB) 7.1-ubuntu
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/himanshu/practice/bufferoverflow...done.
(gdb) b authenticate:8
No source file named authenticate.
Make breakpoint pending on future shared library load? (y or [n]) n
(gdb) break authenticate:8
No source file named authenticate.
Make breakpoint pending on future shared library load? (y or [n]) n
(gdb) break bufferoverflow:8
No source file named bufferoverflow.
Make breakpoint pending on future shared library load? (y or [n]) n
(gdb) break bufferoverflow.c:8
Breakpoint 1 at 0x400617: file bufferoverflow.c, line 8.
(gdb) break bufferoverflow.c:15
Breakpoint 2 at 0x40065b: file bufferoverflow.c, line 15.
(gdb) run AAAAAAAAAAAAA
Starting program: /home/himanshu/practice/bufferoverflow AAAAAAAAAAAAA

Breakpoint 1, authenticate (passwd=0x7fffffffeb21 'A' <repeats 13 times>) at bufferoverflow.c:9
9       strcpy(pass,passwd);
(gdb) x pass
0x7fffffffe750: 0xf7a69aa8
(gdb) next
11      if(!strcmp(pass,"hell") || !strcmp(pass,"heaven"))
(gdb) x &flag
0x7fffffffe75c: 0x00000041
(gdb) next

Breakpoint 2, authenticate (passwd=0x7fffffffeb21 'A' <repeats 13 times>) at bufferoverflow.c:15
15      return flag;
(gdb) print flag
$1 = 65
(gdb) c
Continuing.

 Access granted

Program exited normally.

I put two breakpoints in the function 'authenticate()'. When the first breakpoint was hit, I looked at the address of the local buffer 'pass'. It came out to be 0x7fffffffe750. I then stepped on and looked at the address of the variable 'flag'. It came out to be 0x7fffffffe75c. The difference between the two addresses is 12 (0x75c - 0x750 = 0xc), i.e. on the stack the variable 'flag' starts 12 bytes after the start of 'pass'. When I checked the value of 'flag', it came out to be 65. How could this be possible? In the code, 'flag' is only ever set to 0 or 1. So I analyzed the flow of the code: I passed a 13-byte password.
strcpy() copied all 13 bytes, plus the terminating null byte, into the array 'pass' (for which the programmer had allocated only 8 bytes). This means strcpy() wrote past the end of 'pass': the 5 extra 'A' characters (13 - 8) and the terminator landed outside the buffer. Since 'flag' starts 12 bytes after 'pass', the 13th 'A', which is the 5th byte written out of bounds, landed on the first byte of the integer 'flag'. Hence, due to this buffer overflow, the value of 'flag' changed from 0 to 65 (the ASCII value of 'A'). So the function 'authenticate()' returned 65 to main(), which is non-zero, and hence the output came out to be 'Access granted'. Huh! Tough time...

Conclusion

Buffer overflows occur due to poor programming practices such as using functions that do not perform bounds checks (e.g. strcpy()). The standard C library also offers functions like 'strncpy()', which require the programmer to pass the maximum number of bytes that may be copied into the destination buffer. Note that strncpy() does not null-terminate the destination when the source is that long or longer, so the terminator has to be added by hand. These functions have no doubt helped reduce buffer overflows, but in the end it all depends on the quality of the code written. Stay tuned for more!
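As a rough sketch of the bounded-copy idea from the conclusion, authenticate() could be written along the following lines. The name authenticate_bounded and the policy of simply rejecting over-long passwords are my assumptions for illustration. They are not part of the original program, and this is only one of several ways to close the hole.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical bounded variant of authenticate(); the name and the
   "reject over-long input" policy are assumptions for illustration. */
int authenticate_bounded(const char *passwd)
{
    char pass[8];

    /* Refuse anything that cannot fit together with its terminator. */
    if (strlen(passwd) >= sizeof(pass))
        return 0;

    /* Copy at most 7 characters and terminate explicitly, because
       strncpy() does not add '\0' when the source fills the limit. */
    strncpy(pass, passwd, sizeof(pass) - 1);
    pass[sizeof(pass) - 1] = '\0';

    /* Return the comparison result directly: 1 on match, 0 otherwise. */
    return strcmp(pass, "hell") == 0 || strcmp(pass, "heaven") == 0;
}
```

With this variant, the 13-character input 'AAAAAAAAAAAAA' from the session above fails the length check and is rejected before any copy takes place, while 'hell' and 'heaven' are still accepted.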
I have some tutorials on the same topic too!
http://www.go4expert.com/showthread.php?t=24798
http://www.go4expert.com/showthread.php?t=24917
Have a look at them! BTW, nice tutorial!
This is a great article. I never appreciated articles on C until I started learning Objective C. It teaches me some of the roots and helps me understand C based languages a little better. Great post!
This is really helpful. Thanks for sharing your knowledge. Do you know C#? If yes, please share some information.