Go4Expert (http://www.go4expert.com/)

waisty 3Dec2006 23:20

parsing an .exe file
Hi. I'm new on this forum, so please pardon me if I break any rules. I'm a final-year computer science student working on my project, and a small part of it requires that I parse an .exe file (using assembly or C) to find out which methods/procedures are used in the program. I've searched everywhere for this, to no avail. On a thread (not posted by me) on another site, the question was treated as a prank post, as it seemed impossible to the members. I assure you, this is no prank, and any replies would be highly appreciated. Thanks -- waisty

shabbir 4Dec2006 11:08

Re: parsing an .exe file
You have posted it as an Article under the Article / Source code section. I have moved it to the Queries and Discussion forum.

No, it will not be considered a prank, as we already have some code that parses an EXE. Refer to "Change Icon of EXE file through code extracting it from other EXE file".

waisty 16Dec2006 00:26

Re: parsing an .exe file
Thanks, shabbir, for your reply. Although the code works perfectly for changing the icon of an .exe file, it doesn't really help with actual parsing. What I want to do is parse the .exe file to check its internal structure. More precisely, I need to know where in the file each module (method) starts and stops, and where each segment starts and stops. Any help would be greatly appreciated.

DaWei 16Dec2006 01:13

Re: parsing an .exe file
Have you investigated the formats for the various kinds of .exe files? There are a few. Not all formats carry as extensive a set of information as others. None that I know of will give you the location of every procedure/method/function. Some will give you the location of imported modules/names. I would suggest that you review your assignment to make sure you have interpreted it correctly.
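As an illustration of what one format does record, here is a minimal sketch that lists where each section starts and stops in the file. It assumes the file is a Windows PE executable (other .exe formats, such as plain DOS MZ or NE, are laid out differently), and the function name `pe_sections` is made up for this example. It says nothing about where individual procedures live.

```python
import struct

def pe_sections(data: bytes):
    """Return (name, file_offset, raw_size, virtual_address) per PE section."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # The 32-bit field at 0x3C (e_lfanew) holds the offset of the PE signature.
    (pe_off,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("no PE signature (possibly a plain DOS or NE file)")
    # The 20-byte COFF header follows the 4-byte signature.
    (num_sections,) = struct.unpack_from("<H", data, pe_off + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe_off + 20)
    # The section table starts right after the optional header.
    table = pe_off + 24 + opt_size
    sections = []
    for i in range(num_sections):
        # Each entry is 40 bytes; only the first 24 are needed here.
        name, _vsize, vaddr, raw_size, raw_ptr = struct.unpack_from(
            "<8sIIII", data, table + 40 * i)
        label = name.rstrip(b"\0").decode("ascii", "replace")
        sections.append((label, raw_ptr, raw_size, vaddr))
    return sections
```

For a file on disk, something like `pe_sections(open("program.exe", "rb").read())` would apply, where `program.exe` is a placeholder name. Each section then occupies the file bytes from `file_offset` up to `file_offset + raw_size`.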

waisty 16Dec2006 02:37

Re: parsing an .exe file
I have done some investigation into the .exe file formats, and I know that none of them will tell you the location of every procedure. But I do know that there must be some line of machine code that signifies the beginning and end of a procedure. Also, it's not an assignment; it's a final-year project I chose for myself. Thanks for your continued help.

DaWei 16Dec2006 04:51

Re: parsing an .exe file
Actually, there is no line of machine code that signifies the beginning of a procedure. The code that signifies the return from a procedure might or might not be at the end, and it might occur more than once. A procedure is called by saving the current point of execution (wherever it may be) and setting the instruction pointer to the value representing the start of the procedure. The start of the procedure could be any set of values at all. If you could find all the calls, then you could interpret the address that follows each one and infer the location of the procedure. Unfortunately, that address is generally a relative value in an exe file.
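To make the "relative value" point concrete, here is a minimal sketch of how a relative call's operand turns into a target address. It assumes 32-bit x86 code, where a near relative call is encoded as the byte E8 followed by a signed 32-bit displacement measured from the end of the instruction. The function name and the `base` parameter (the address at which the code bytes are loaded) are invented for the example.

```python
import struct

def call_rel32_target(code: bytes, offset: int, base: int = 0) -> int:
    """Return the address targeted by the CALL rel32 at code[offset]."""
    if code[offset] != 0xE8:
        raise ValueError("no E8 (CALL rel32) opcode at this offset")
    # The operand is a signed little-endian 32-bit displacement.
    (rel,) = struct.unpack_from("<i", code, offset + 1)
    # It is relative to the address of the NEXT instruction (5 bytes on).
    return (base + offset + 5 + rel) & 0xFFFFFFFF
```

For example, the bytes `E8 0B 00 00 00` loaded at 0x401000 call 0x401010, because 0x401000 + 5 + 0x0B = 0x401010. The same five bytes placed elsewhere would call a different address, which is why the operand alone does not name the procedure.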

If you knew exactly which language, and which compiler of that language, produced the code, then you could presume some fairly standard overhead code, and look for that. Nothing in the world prevents blocks of data from containing those same values, however. The total effectiveness of such a process, for all exes, would probably border on crap.

Since I'm having to point all these things out, you are apparently somewhat of a novice (not a derogatory comment, just a presumption). It's possible that you've let your ambition overload your abilities, at this point.

waisty 16Dec2006 18:32

Re: parsing an .exe file
Thanks for your last reply. Actually, I am kind of a novice at assembly language programming, but I have quite a bit of experience in high-level programming. You stated in your last reply that a procedure is called by saving the current point of execution and setting the instruction pointer to the value representing the start of the procedure. For each type of machine, there must be a specific machine code (or set of codes) that does the calling, and another set of machine codes for returning. Once I can get these machine codes for the different machines, I can easily parse the .exe for the information I want. Since disassemblers do this all the time, it must be possible. Thanks again for your previous reply and continued help.

DaWei 16Dec2006 20:01

Re: parsing an .exe file
The problem is that even the best disassemblers screw up. The extent of this screwup is determined by the extent to which non-code (inline data) is allowed to exist in the same area as the code.

Suppose, also, that the call instruction is represented by C3 nn nn, where nn nn is the address of the procedure to be called. Suppose, further, that some instruction wants to load a register with C3. You find this C3 by scanning the code. The instruction that follows would be interpreted as nn nn, rather than an instruction. The best that you can hope for is that screwups would amount to less than xx%, where xx is some value of failure which you consider acceptable. I would question quite severely your evaluation of an acceptable xx. If for no other reason, just to see you sweat and attempt to explain your conclusions.
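The same effect can be shown with real encodings. This is a minimal sketch, assuming 32-bit x86, where the near relative call opcode is E8 (on x86 the byte C3 is the near return) and `B8 imm32` is `mov eax, imm32`. The function name and the sample bytes are made up for the illustration.

```python
def naive_call_scan(code: bytes):
    """List (offset, displacement) for every E8 byte, wherever it occurs."""
    hits = []
    for i in range(len(code) - 4):
        if code[i] == 0xE8:
            # Blindly treat the next four bytes as a signed rel32 operand.
            rel = int.from_bytes(code[i + 1:i + 5], "little", signed=True)
            hits.append((i, rel))
    return hits

# Hand-assembled sample:
#   0: B8 E8 00 00 00   mov eax, 0xE8
#   5: E8 01 00 00 00   call +1 (target is offset 11)
#  10: C3               ret
#  11: C3               ret
sample = bytes.fromhex("b8e8000000" "e801000000" "c3" "c3")
hits = naive_call_scan(sample)
```

Here `hits` has two entries. The one at offset 5 is the real call. The one at offset 1 is the immediate operand of the `mov`, and its "displacement" is assembled from the rest of that operand plus the opcode of the real call, so it points nowhere sensible.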

waisty 17Dec2006 01:16

Re: parsing an .exe file
Well, for every instruction code on a specific machine, there is a specific length in bytes that the instruction uses. With this information, it would be easy to know where each instruction stops and where the next one starts. So, for the case of C3 nn nn and xx C3, if the file is properly parsed, it would be easy to know whether the C3 in question was an instruction or an operand. So, if properly parsed, there should be no error. However, where there are errors, a seemingly high error rate would be tolerable, because I am trying to reconstruct virus-infected files, which would otherwise have to be deleted. Thanks for your help. I look forward to your next reply.

DaWei 17Dec2006 02:02

Re: parsing an .exe file
Again, you're presuming that there is no data embedded in the code. Since you are so up on the ways to do it, however, I don't understand why you are posting to find an answer. Just write your disassembler, parse for the requisite codes (be sure to distinguish between absolute and relative calls), and turn in your project.
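A minimal sketch of why embedded data matters, assuming 32-bit x86 and a toy length table that covers only four opcodes (a real decoder needs the full instruction set, prefixes, and ModRM handling). The names `LENGTHS` and `linear_sweep_calls` are invented for the example. The sweep steps from instruction to instruction by length, which works on clean code, but a single data byte in the stream throws it out of step.

```python
# Toy table: opcode -> total instruction length in bytes.
LENGTHS = {
    0xB8: 5,  # mov eax, imm32
    0xE8: 5,  # call rel32
    0xC3: 1,  # ret
    0x90: 1,  # nop
}

def linear_sweep_calls(code: bytes, start: int = 0):
    """Walk instruction by instruction and return offsets of CALL rel32."""
    calls = []
    i = start
    while i < len(code):
        op = code[i]
        if op not in LENGTHS:
            break  # the toy table cannot decode this byte, so stop
        if op == 0xE8 and i + 5 <= len(code):
            calls.append(i)
        i += LENGTHS[op]
    return calls

# Clean code: mov eax, 0xE8 / call +1 / ret / ret
clean = bytes.fromhex("b8e8000000" "e801000000" "c3" "c3")
# ret, then ONE byte of inline data (0xB8), then call +0, then ret
with_data = bytes.fromhex("c3" "b8" "e800000000" "c3")
```

On `clean` the sweep reports only the real call at offset 5, so the operand E8 is correctly skipped. On `with_data` the data byte B8 is decoded as the start of a `mov`, which swallows the call opcode at offset 2, and the sweep reports no calls at all.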

All times are GMT +5.5.