Determine operand size

Discussion in 'Assembly Language Programming (ALP) Forum' started by vahagn_iv, Mar 8, 2010.

  1. vahagn_iv

    vahagn_iv New Member

    Joined:
    Mar 8, 2010
    Messages:
    3
    Likes Received:
    0
    Trophy Points:
    0
    Hello all,

    I've just started learning assembly and am now trying to write a disassembler. My program already identifies prefixes and opcodes, but I run into a problem when I move on to decoding the operands. Take opcode 0x09 (OR) as an example. The list of all x86 opcodes in 64-bit mode is here: http://ref.x86asm.net/coder64.html. According to that list, opcode 0x09 takes r/m16/32/64 as its first operand. My question is: which cases select each operand size?

    I am not sure, but I think it depends on the presence of the 0x67 prefix. But then where does the third size come from? Or maybe I am mixing something up.

    Thanks in advance.
     
  2. vahagn_iv

    vahagn_iv New Member

    I mixed up the prefixes. I meant 0x66 instead of 0x67 in the original post.
     
  3. MazeGen

    MazeGen New Member

    Joined:
    Mar 9, 2010
    Messages:
    2
    Likes Received:
    1
    Trophy Points:
    0
    Hello vahagn, I'm the author of the opcode list.

    See also http://ref.x86asm.net/geek64.html#x09

    It says that the operand type is "vqp". If you look into the documentation, "vqp" means "Word or doubleword, depending on operand-size attribute, or quadword, promoted by REX.W in 64-bit mode."

    Since you pointed to coder64, I assume you have in mind a disassembler for 64-bit code. In this case, the size is 16 bits if prefix 66 is present (which switches "to the other operand size"), or 64 bits if REX.W is present (REX.W takes precedence over prefix 66). If neither is present, the size is 32 bits.
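The rule above can be sketched in C. This is only an illustration under simplifying assumptions: the function and helper names are made up for this example, the code looks only at the prefix bytes in front of the opcode, and it treats a REX byte as valid only when it sits directly before the opcode.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true for the legacy prefix bytes used in 64-bit mode. */
bool is_legacy_prefix(uint8_t b)
{
    switch (b) {
    case 0x66: case 0x67:                 /* operand-size, address-size */
    case 0xF0: case 0xF2: case 0xF3:      /* LOCK, REPNE, REP */
    case 0x26: case 0x2E: case 0x36:      /* segment overrides */
    case 0x3E: case 0x64: case 0x65:
        return true;
    default:
        return false;
    }
}

/* Hypothetical helper: operand size in bits (16, 32 or 64) of a "vqp"
 * operand in 64-bit mode, or 0 if the byte stream ends before the opcode. */
int vqp_operand_size_64(const uint8_t *code, size_t len)
{
    bool has_66 = false;
    size_t i = 0;

    /* Skip legacy prefixes, remembering whether 0x66 was among them. */
    while (i < len && is_legacy_prefix(code[i])) {
        if (code[i] == 0x66)
            has_66 = true;
        i++;
    }
    if (i >= len)
        return 0;

    /* A REX prefix is any byte 0x40..0x4F; bit 3 of it is REX.W. */
    if ((code[i] & 0xF0) == 0x40) {
        if (i + 1 >= len)
            return 0;                     /* REX with no opcode after it */
        if (code[i] & 0x08)
            return 64;                    /* REX.W wins over prefix 0x66 */
    }
    return has_66 ? 16 : 32;
}
```

For instance, the byte sequences 09 C8, 66 09 C8 and 48 09 C8 would give 32, 16 and 64 bits respectively with this sketch.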

    I would recommend starting with a disassembler for 32-bit code, because the encoding rules for 64-bit code are more complicated.
     
  4. vahagn_iv

    vahagn_iv New Member

    Hello MazeGen,

    that is an excellent table. You've done a great job, at least for students.
    Thanks for your answer.

    I want my function void ReadOneInstruction(byte * bytestream,enum mode, enum addressing_mode) to be able to read all types of instructions. Of course I will start with 32-bit code, but I want to write the program so that it is easy to adapt to other modes.

    I have two more questions.

    Do I understand correctly that:

    1. Depending on the value of the EFLAGS register (different modes), there are two possible operand sizes, and prefix 0x66 switches from the default size to the other one? And does the same hold for the addressing mode with prefix 0x67?

    2. While parsing, I should assume that the program being parsed does not dynamically modify the EFLAGS register, and that the default operand size and addressing mode therefore keep their values?
     
  5. MazeGen

    MazeGen New Member

    Hello vahagn,

    As for the first part of the question, it is not the EFLAGS register that sets the current mode. The mode depends on the values of specific bits in the CR0 system register (see bit PE: "Protection Enable (bit 0 of CR0) - Enables protected mode when set; enables real-address mode when clear.").

    In real mode, the default operand and address sizes are always 16 bits. In protected mode, they depend on the code segment descriptor settings. In 64-bit mode, the default operand size is 32 bits and the default address size is 64 bits.

    Anyway, your ReadOneInstruction function doesn't need to know this (the mode was set by the OS at boot time): you just pass it a pointer to the byte stream plus the default operand and address sizes.
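As a rough sketch, the defaults described above could be captured like this. The enum, struct and function names are assumptions made up for the example, and cs_d_flag stands for the D flag of the code segment descriptor, which selects 16-bit or 32-bit defaults in protected mode.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum cpu_mode { MODE_REAL, MODE_PROTECTED, MODE_64BIT };

struct default_sizes {
    int operand_bits;   /* default operand size in bits */
    int address_bits;   /* default address size in bits */
};

/* cs_d_flag is only consulted in protected mode. */
struct default_sizes default_sizes_for(enum cpu_mode mode, bool cs_d_flag)
{
    struct default_sizes s = { 16, 16 };      /* real mode: always 16/16 */

    switch (mode) {
    case MODE_REAL:
        break;
    case MODE_PROTECTED:
        /* D flag set: 32-bit defaults; clear: 16-bit defaults. */
        s.operand_bits = cs_d_flag ? 32 : 16;
        s.address_bits = cs_d_flag ? 32 : 16;
        break;
    case MODE_64BIT:
        s.operand_bits = 32;
        s.address_bits = 64;
        break;
    }
    return s;
}
```

A decoder entry point could then take the byte stream pointer together with a struct default_sizes value, so the decoding code itself never has to look at CR0 or at segment descriptors.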

    They are not easy reading, but you should get familiar with the Intel manuals:

    intel.com/products/processor/manuals/

    Specifically you should get Volume 1: Basic Architecture; Volume 2A: Instruction Set Reference, A-M; Volume 2B: Instruction Set Reference, N-Z

    Segment descriptors etc. are described in Volume 3A: System Programming Guide.

    I wrote a tour through the Intel manuals that focuses on 64-bit mode, but it can still help you get your general bearings (the chapter numbers are out of sync now):

    x86asm.net/articles/x86-64-tour-of-intel-manuals

    For the second part of the question: yes, those are prefixes 0x66 and 0x67. In 64-bit mode, REX.W overrides prefix 0x66 and selects a 64-bit operand size; and because the default address size there is 64 bits, prefix 0x67 changes it to 32 bits.
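Put together, applying the prefixes to the defaults might look like the sketch below. The names are hypothetical, and the function assumes the prefixes have already been collected into flags. It also assumes the operand follows the operand-size attribute; instructions with fixed or differently defaulted sizes need their own handling.

```c
#include <stdbool.h>

/* Hypothetical type: a pair of sizes in bits. */
struct size_pair {
    int operand_bits;
    int address_bits;
};

/* Applies prefixes 0x66 / 0x67 and REX.W to the default sizes.
 * The caller must pass rex_w = false outside 64-bit mode, where
 * bytes 0x40..0x4F are INC/DEC opcodes rather than REX prefixes. */
struct size_pair effective_sizes(struct size_pair defaults,
                                 bool has_66, bool has_67, bool rex_w)
{
    struct size_pair s = defaults;

    /* Operand size: REX.W forces 64 bits; otherwise 0x66 toggles 16 <-> 32. */
    if (rex_w)
        s.operand_bits = 64;
    else if (has_66)
        s.operand_bits = (defaults.operand_bits == 16) ? 32 : 16;

    /* Address size: 0x67 toggles 16 <-> 32, or turns 64 into 32. */
    if (has_67) {
        if (defaults.address_bits == 16)
            s.address_bits = 32;
        else if (defaults.address_bits == 32)
            s.address_bits = 16;
        else
            s.address_bits = 32;
    }
    return s;
}
```

Keeping the defaults as plain input values means the same function covers 16-bit, 32-bit and 64-bit code without knowing how the mode was established.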

    Yes, default sizes are set by the OS (CR0 register, segment descriptors) so they remain the same.

    This forum is so weird. Even if I use "hqqp" prefix to links, it says "Too many live links/images found in your post content. Please edit your post or contact the administrator." :shout:
     
