Sunrast's Implementation of Run length Encoding

Discussion in 'Python' started by lionaneesh, Mar 3, 2012.

  1. lionaneesh

    lionaneesh Active Member

    Joined:
    Mar 21, 2010
    Messages:
    848
    Likes Received:
    224
    Trophy Points:
    43
    Occupation:
    Student
    Location:
    India
    Sun Rasterfile also known as Sunrast is an image format developed by Sun Microsystems, It uses a version of Run Length Encoding to compress images.

    In this tutorial we'll be discussing how Sun Rasterfile's implementation of RLE is different from the basic vanila version and how is it used to reduce the size of images. The format implements a Run length encoding trigger, which is used to reduce the wastage of bytes in case of a non redudant data.
    0x80 (RLE Trigger) is a trigger to read a runlength for the next value, otherwise the bytes are written unchanged (as in original).

    Note: This is a continuation of my previous article on Run Length Encoding

    The Working



    The sunrast format uses RLE in the following format :-

    Code:
    [RLE TRIGGER] [RUN – 1] [BYTE]
    
    Where “RLE TRIGGER” is equal to 0x80, (which is predefined in the format's specification)
    RUN is the run length (i.e length of repetition)
    and Byte is the value to be repeated.

    Lets look at some examples and try to understand How Sunrast encodes them :-

    Original : [0x11 0x22 0x33]
    Encoded : [0x11 0x22 0x33]

    Here there is no difference between the original and the encoded data because the data is non redudant.

    Original : [0x11 0x11 0x11]
    Encoded : [0x80 0x02 0x11]

    • 0x80 (RLE Trigger)
    • 0x02 (Runlength – 1)
    • 0x11 (Value)

    Original : [0x80 0x11 0x11]
    Encoded : [0x80 0x00 0x02 0x11]

    This example is a bit tricky, this involves [0x80] to be encoded to [0x80 0x00].

    Q: Why is that?
    A: Basically its a way to indicate that the byte value is same as the RLE trigger and without RLE. Its just a way sunrast handles it and its predefined in the format's specifications.


    Original : [0x80 0x80 0x80]
    Encoded : [0x80 0x03 0x80]

    What so ever be the byte value, if its being repeated, then the format follows the normal RLE format i.e [RLE TRIGGER] [RUN – 1] [BYTE].

    Libav already has a working Sunrast Encoder (which is written by me :pleased:)

    Exercise :

    Now that we understand the Sunrast's version of RLE lets try to solve some examples to check our knowledge :-

    • 0x01 0x80 0x00 0x11 0x12 0x12 0x12
    • 0x01 0x80 0x80 0x80 0x80 0x80 0x80
    • 0x00 0x00 0x00 0x00 0x00 0x00 0x00
    • 0x01 0x02 0x03 0x04 0x05 0x06 0x07

    Here's the Python Code to Check your answers :-

    Code:
    #!/bin/py
    # Sunrast's Version of RLE
    
    def rle(data) :
        run = 1
        encoded = []
        for i in range(1, len(data)) :
            if data[i] == data[i - 1] :
                run = run + 1
            else :
                if run == 1 : # no RLE
                    if data[i - 1] == '0x80' :
                        encoded.append(('0x80', '0x00'))
                    else :
                        encoded.append((data[i-1]))
                else :
                    encoded.append(('0x80', run - 1 , data[i - 1]))
                    run = 1
            if i == len(data) - 1 :
                if run == 1 and data[i] != '0x80':
                    encoded.append((data[i]))
                elif run == 1 and data[i] == '0x80':
                    encoded.append(('0x80', '0x00'))
                else :
                    encoded.append(('0x80', run - 1 , data[i]))
        return encoded
    
    # Test :-
    print rle(['0x01', '0x80', '0x11', '0x12', '0x12', '0x12', '0x80'])
    

    Conclusion



    Sunrast's implementation of RLE reduces the wastage of bytes in case of non redudant data, but some times it can be a little expensive if a file contains a lot of 0x80 bytes at different places, encoding a 0x80 byte (non-redudant) would take 2 bytes [0x80 0x00] which will cost more bytes than the original data. The worst case i.e a file with alternate 0x80 bytes can cost double the bytes as used in the original data.
     

Share This Page

  1. This site uses cookies to help personalise content, tailor your experience and to keep you logged in if you register.
    By continuing to use this site, you are consenting to our use of cookies.
    Dismiss Notice