Go4Expert

Go4Expert (http://www.go4expert.com/)
-   Python (http://www.go4expert.com/articles/python-tutorials/)
-   -   Run Length Encoding (http://www.go4expert.com/articles/run-length-encoding-t27938/)

lionaneesh 3Mar2012 11:15

Run Length Encoding
 
Run length Encoding also known as RLE is probably the most basic encoding method that's being used in different fields of computing like, Multimedia Formats etc. In this tutorial we'll be looking at how RLE works and a basic python example which implements RLE.

How does it works



Run length encoding works best on the files with lots redundancy, example an Image with lots of whitespace, like a scanned document etc. Run length encoding can be very effective in such files, and can reduce the size of such files to around half of the original size.

So, to understand how it works lets take an example :-

Original Data : 0x00 0x00 0x01 0x01 0x01 0x01 0x01
Encoded Data : 0x02 0x00 0x06 0x01

Original data has a sequence of bytes which have a lot of repetition, 0x00 has been repeated 2 times and 0x01 has been repeated 6 times, Now its quite simple to understand the encoded data.

It follows the following format :-

Code:

[RUN] [BYTE] [RUN2] [BYTE2] ...
So, in the example which we discussed, 0x00 has been repeated 2 times, in this case RUN = 2 and BYTE = 0x00, same is with 0x01. Now that we understand How this algorithm works, why not write a basic python script to do some RLE.

Code: python

#!/bin/py

def rle(data) :
    run = 1
    encoded = []
    for i in range(1, len(data)) :
        if data[i] == data[i - 1] :
            run = run + 1
        else :
            encoded.append((run , data[i - 1]))
            run = 1
        if i == len(data) - 1 :
            encoded.append((run , data[i]))
    return encoded

# Test :-
print rle([ 'A', 'A', 'B', 'B', 'C', 'C' ])


Exercise :

Now that we understand how the algoritm works lets test our knowledge by solving some basic exercises.
  • ABCDEFGHOOPPPP
  • Go4expert
  • Aneesh
  • Shabbir

You can check your answers by using the python script above.

Conclusion



Run Length Encoding can be used to encode any type of data losslessly but its beneficial with only a specific type of data which involves a fair amount of redundancy, in some other casses it can even be expensive to use and could cost more bytes than the original data.

Example: lets try to encode 'go4expert' using RLE.

Here is the modified python script to do that :-

Code: python

#!/bin/py

def rle(data) :
    run = 1
    encoded = []
    for i in range(1, len(data)) :
        if data[i] == data[i - 1] :
            run = run + 1
        else :
            encoded.append((run , data[i - 1]))
            run = 1
        if i == len(data) - 1 :
            encoded.append((run , data[i]))
    return encoded

# Test :-
print rle([ 'g','o','4','e','p','e','r','t' ])


Output :-
Code:

[(1, 'g'), (1, 'o'), (1, '4'), (1, 'e'), (1, 'p'), (1, 'e'), (1, 'r'), (1, 't')]
As we can see the original data costs us 8 bytes but the encoded data actually costs double, i.e 16 bytes.

lionaneesh 3Mar2012 12:35

Re: Run Length Encoding
 
Note: In the above Python code , '[' has been replaced by [ (I don't know the reason), Please replace that before using that code.

shabbir 3Mar2012 14:21

Re: Run Length Encoding
 
Quote:

Originally Posted by lionaneesh (Post 93017)
Note: In the above Python code , '[' has been replaced by [ (I don't know the reason), Please replace that before using that code.

Python Syntax highlighter had an issue. I have fixed it now

Alan_Smith 3Mar2012 17:46

Re: Run Length Encoding
 
In programming language, Many times, it required to get all same types of data or any other details in arranged format, So that we can work perfectly and also user can easily understand it. The Data encoding is really very useful. The given detail with example is really appreciated.


All times are GMT +5.5. The time now is 21:12.