About Memory Alignment

Discussion in 'C' started by dharmaraj.guru, Oct 31, 2007.

  1. dharmaraj.guru

    dharmaraj.guru New Member

    Joined:
    Oct 23, 2007
    Messages:
    16
    Likes Received:
    0
    Trophy Points:
    0
    Many of us know that both C and C++ insert padding when laying out the members of a structure.
    But only a few know why the compiler does it. In short, padding is required so that the CPU can access memory efficiently.

    In more detail, memory is read and written one machine word at a time, and the word size depends on the hardware. On a typical 32-bit machine, one memory access reads or writes 4 bytes. So, for efficient access, the compiler adds unused padding bytes so that each member starts on a suitable boundary.

    Let's take an example of a structure with and without padding.
    Code:
    struct x
    {
    	char a;
    	int b;
    }x;
    
    Without padding:
    ********************
    If we add up the members of structure x, we get only 5 bytes (assuming a 4-byte int). If the structure were laid out exactly like that, there would be no issue in reading the member x.a: it can be read in a single access. But the second member x.b would straddle two machine words, so the hardware has to do two reads and then combine the pieces before returning the value. The same holds, and gets even harder, for writes. As more members are added without proper alignment, reading or writing even a single member takes longer. Consider the following structure.
    Code:
    struct y
    {
    	char a;
    	int b;
    	char c;
    	char d;
    	char e;
    	int f;
    } y;
    
    For the above structure the same problem appears: without padding, b again straddles two words, and which members end up misaligned depends on everything declared before them, so the CPU can spend extra time just accessing these members.

    With Padding:
    *****************
    Let's take the same structure x again. With padding, b starts at offset 4 and the structure occupies 8 bytes. Though we are allocating 3 more bytes than the members need, each member can be accessed in a single attempt. The same holds for the second structure: with padding, any member can be accessed in a single attempt.

    One more point to note: the order in which we list the members affects how much memory is used. If we keep members of similar size together, for example by sorting them by size, less padding is needed. For example,
    Code:
    struct A
    {
    	char x;
    	char y;
    	int z;
    }A;
    
    occupies only 8 bytes, whereas,
    Code:
    struct A
    {
    	char x;
    	int z;
    	char y;
    }A;
    
    occupies 12 bytes.

    Hence, it is always better to group the smaller members together, for example ahead of the bigger members, when writing structures.

    Please correct me if I am wrong anywhere and do provide more info about this topic.

    ||| Dharma |||
     
  3. Kailash

    Kailash New Member

    Joined:
    Jul 24, 2007
    Messages:
    14
    Likes Received:
    0
    Trophy Points:
    0
    I got the concept of "with padding",
    but I can't get the concept of "without padding".
    How is x.a read in one attempt and x.b in two attempts in structure x?
    Please clarify.
     
  4. dharmaraj.guru

    dharmaraj.guru New Member

    Joined:
    Oct 23, 2007
    Messages:
    16
    Likes Received:
    0
    Trophy Points:
    0
    Let's take the same structure x.

    Without padding, x needs 5 bytes (1 char + 1 int) of storage. Let's say x is stored at addresses 1000 to 1004. On a 32-bit machine, memory word addresses are multiples of 4 only, i.e. word addresses can be 1000, 1004, 1008, 1012, ..., 2000, ... and never 1001 or 1002. Hence, when we access x.a (a char), the hardware reads the word at address 1000. In fact, it reads 4 bytes (1000 to 1003) and extracts the first byte. Thus, the char is accessed in a single attempt. Now consider the int. It is stored at 1001 to 1004. The hardware first reads the word at address 1000 and extracts 3 bytes from it, then reads the word at address 1004 and extracts one byte from it, combines the pieces to form the integer, and finally returns it. Thus the int needs 2 memory accesses along with some overhead for assembling the value.

    Have I answered your question?

    ||| Dharma |||
     
  5. asadullah.ansari

    asadullah.ansari TechCake

    Joined:
    Jan 9, 2008
    Messages:
    356
    Likes Received:
    14
    Trophy Points:
    0
    Occupation:
    Developer
    Location:
    NOIDA
    It's simple!!!!!

    Code:
       struct stud
       {
               char ch;
               int    i;
               char ch1;
        }; 
    The total size of this structure will be 12 bytes on a 32-bit machine. The first char takes only 1 byte, but because of memory alignment (the layout rule that lets the CPU access memory quickly), 3 bytes are left as a hole in the structure, since the next member is an int of 4 bytes. Another 3 bytes of padding follow ch1, so that the total size stays a multiple of 4.

    To avoid this, the user has to order the structure members carefully.

    Code:
    struct stud
    {
       char ch;
       char ch1;
       int  i;
    } ;
    Now its size will be 8 bytes. The first two chars share one 4-byte word, although they actually need only 2 bytes, so 2 bytes are still padding. But it is better than the first structure.
    All of this happens only so that the CPU can access memory easily and quickly.
     
  6. msdnguide

    msdnguide New Member

    Joined:
    May 14, 2011
    Messages:
    13
    Likes Received:
    0
    Trophy Points:
    0
    Occupation:
    msdnguide
    Location:
    Delhi
    Memory alignment is very important when we port code from 32-bit to 64-bit systems. Code that works perfectly on a 32-bit system may cause bus errors on a 64-bit system due to misaligned memory accesses.
     
  7. gngrwzrd

    gngrwzrd New Member

    Joined:
    Feb 6, 2012
    Messages:
    1
    Likes Received:
    0
    Trophy Points:
    0
    Does using __attribute__((packed)) on a structure have the same effect on memory access? In other words, does it cause the CPU to do extra bit operations to get the right data?

    I'm also curious: under what conditions is using __attribute__((packed)) correct?
     
  8. dearvivekkumar

    dearvivekkumar New Member

    Joined:
    Feb 21, 2012
    Messages:
    29
    Likes Received:
    5
    Trophy Points:
    0
    Hi,

    I was looking for the answer to this question for quite a long time. Thanks, the explanation in the comments was really awesome.

    Please help me correct my understanding.

    The padding is done so that the CPU does not have to perform multiple read operations. (But it still has to do the remaining computation to separate out the unwanted bits that were read.)
    For example
    Code:
    struct A
    {
        char ch;       // let memory address be 1000, 1001
        char ch1;     // 1002, 1003
        int i;             // 1004, 1005, 1006, 1007
    };
    
    struct A a;
    char c1 = a.ch;     // (1)
    char c2 = a.ch1;   // (2)
    int ii = a.i;             // (3)
    
    To perform steps (1), (2) and (3), the CPU performs three read operations. For step (1) it reads 4 bytes and then extracts the one byte it needs (discarding the 3 extra bytes), and the same for step (2). For step (3) it does a single read and needs no additional computation to extract the value. And maybe the extraction is not that expensive, which is why ch and ch1 are assigned 2 bytes each instead of 4 bytes each.

    Please correct me if my understanding/observation lacks something.

    Thanks.
     
  9. dearvivekkumar

    dearvivekkumar New Member

    Joined:
    Feb 21, 2012
    Messages:
    29
    Likes Received:
    5
    Trophy Points:
    0
    Can anyone explain this please?

    "Multi-byte data must usually be aligned on a natural boundary."
     
