Run Length Encoding

Discussion in 'Python' started by lionaneesh, Mar 3, 2012.

  1. lionaneesh

    lionaneesh Active Member

    Joined:
    Mar 21, 2010
    Messages:
    848
    Likes Received:
    224
    Trophy Points:
    43
    Occupation:
    Student
    Location:
    India
    Run length Encoding also known as RLE is probably the most basic encoding method that's being used in different fields of computing like, Multimedia Formats etc. In this tutorial we'll be looking at how RLE works and a basic python example which implements RLE.

    How does it works



    Run length encoding works best on the files with lots redundancy, example an Image with lots of whitespace, like a scanned document etc. Run length encoding can be very effective in such files, and can reduce the size of such files to around half of the original size.

    So, to understand how it works lets take an example :-

    Original Data : 0x00 0x00 0x01 0x01 0x01 0x01 0x01
    Encoded Data : 0x02 0x00 0x06 0x01

    Original data has a sequence of bytes which have a lot of repetition, 0x00 has been repeated 2 times and 0x01 has been repeated 6 times, Now its quite simple to understand the encoded data.

    It follows the following format :-

    Code:
    [RUN] [BYTE] [RUN2] [BYTE2] ...
    
    So, in the example which we discussed, 0x00 has been repeated 2 times, in this case RUN = 2 and BYTE = 0x00, same is with 0x01. Now that we understand How this algorithm works, why not write a basic python script to do some RLE.

    Code:
    #!/bin/py
    
    def rle(data) :
        run = 1
        encoded = []
        for i in range(1, len(data)) :
            if data[i] == data[i - 1] :
                run = run + 1
            else :
                encoded.append((run , data[i - 1]))
                run = 1
            if i == len(data) - 1 :
                encoded.append((run , data[i]))
        return encoded
    
    # Test :-
    print rle([ 'A', 'A', 'B', 'B', 'C', 'C' ])
    
    Exercise :

    Now that we understand how the algoritm works lets test our knowledge by solving some basic exercises.

    • ABCDEFGHOOPPPP
    • Go4expert
    • Aneesh
    • Shabbir

    You can check your answers by using the python script above.

    Conclusion



    Run Length Encoding can be used to encode any type of data losslessly but its beneficial with only a specific type of data which involves a fair amount of redundancy, in some other casses it can even be expensive to use and could cost more bytes than the original data.

    Example: lets try to encode 'go4expert' using RLE.

    Here is the modified python script to do that :-

    Code:
    #!/bin/py
    
    def rle(data) :
        run = 1
        encoded = []
        for i in range(1, len(data)) :
            if data[i] == data[i - 1] :
                run = run + 1
            else :
                encoded.append((run , data[i - 1]))
                run = 1
            if i == len(data) - 1 :
                encoded.append((run , data[i]))
        return encoded
    
    # Test :-
    print rle([ 'g','o','4','e','p','e','r','t' ])
    
    Output :-
    Code:
    [(1, 'g'), (1, 'o'), (1, '4'), (1, 'e'), (1, 'p'), (1, 'e'), (1, 'r'), (1, 't')]
    
    As we can see the original data costs us 8 bytes but the encoded data actually costs double, i.e 16 bytes.
     
  2. lionaneesh

    lionaneesh Active Member

    Joined:
    Mar 21, 2010
    Messages:
    848
    Likes Received:
    224
    Trophy Points:
    43
    Occupation:
    Student
    Location:
    India
    Note: In the above Python code , '[' has been replaced by [ (I don't know the reason), Please replace that before using that code.
     
  3. shabbir

    shabbir Administrator Staff Member

    Joined:
    Jul 12, 2004
    Messages:
    15,375
    Likes Received:
    388
    Trophy Points:
    83
    Python Syntax highlighter had an issue. I have fixed it now
     
  4. Alan_Smith

    Alan_Smith Banned

    Joined:
    Jan 24, 2012
    Messages:
    6
    Likes Received:
    0
    Trophy Points:
    0
    Location:
    Los Angeles
    Home Page:
    http://www.spinxwebdesign.com/
    In programming language, Many times, it required to get all same types of data or any other details in arranged format, So that we can work perfectly and also user can easily understand it. The Data encoding is really very useful. The given detail with example is really appreciated.
     

Share This Page

  1. This site uses cookies to help personalise content, tailor your experience and to keep you logged in if you register.
    By continuing to use this site, you are consenting to our use of cookies.
    Dismiss Notice