Run length Encoding also known as RLE is probably the most basic encoding method that's being used in different fields of computing like, Multimedia Formats etc. In this tutorial we'll be looking at how RLE works and a basic python example which implements RLE.
Run length encoding works best on the files with lots redundancy, example an Image with lots of whitespace, like a scanned document etc. Run length encoding can be very effective in such files, and can reduce the size of such files to around half of the original size.
So, to understand how it works lets take an example :-
Original Data : 0x00 0x00 0x01 0x01 0x01 0x01 0x01
Encoded Data : 0x02 0x00 0x06 0x01
Original data has a sequence of bytes which have a lot of repetition, 0x00 has been repeated 2 times and 0x01 has been repeated 6 times, Now its quite simple to understand the encoded data.
It follows the following format :-
So, in the example which we discussed, 0x00 has been repeated 2 times, in this case RUN = 2 and BYTE = 0x00, same is with 0x01. Now that we understand How this algorithm works, why not write a basic python script to do some RLE.
Exercise :
Now that we understand how the algoritm works lets test our knowledge by solving some basic exercises.
You can check your answers by using the python script above.
Run Length Encoding can be used to encode any type of data losslessly but its beneficial with only a specific type of data which involves a fair amount of redundancy, in some other casses it can even be expensive to use and could cost more bytes than the original data.
Example: lets try to encode 'go4expert' using RLE.
Here is the modified python script to do that :-
Output :-
As we can see the original data costs us 8 bytes but the encoded data actually costs double, i.e 16 bytes.
How does it works
Run length encoding works best on the files with lots redundancy, example an Image with lots of whitespace, like a scanned document etc. Run length encoding can be very effective in such files, and can reduce the size of such files to around half of the original size.
So, to understand how it works lets take an example :-
Original Data : 0x00 0x00 0x01 0x01 0x01 0x01 0x01
Encoded Data : 0x02 0x00 0x06 0x01
Original data has a sequence of bytes which have a lot of repetition, 0x00 has been repeated 2 times and 0x01 has been repeated 6 times, Now its quite simple to understand the encoded data.
It follows the following format :-
Code:
[RUN] [BYTE] [RUN2] [BYTE2] ...
Code: python
#!/bin/py
def rle(data) :
run = 1
encoded = []
for i in range(1, len(data)) :
if data[i] == data[i - 1] :
run = run + 1
else :
encoded.append((run , data[i - 1]))
run = 1
if i == len(data) - 1 :
encoded.append((run , data[i]))
return encoded
# Test :-
print rle([ 'A', 'A', 'B', 'B', 'C', 'C' ])
Exercise :
Now that we understand how the algoritm works lets test our knowledge by solving some basic exercises.
- ABCDEFGHOOPPPP
- Go4expert
- Aneesh
- Shabbir
You can check your answers by using the python script above.
Conclusion
Run Length Encoding can be used to encode any type of data losslessly but its beneficial with only a specific type of data which involves a fair amount of redundancy, in some other casses it can even be expensive to use and could cost more bytes than the original data.
Example: lets try to encode 'go4expert' using RLE.
Here is the modified python script to do that :-
Code: python
#!/bin/py
def rle(data) :
run = 1
encoded = []
for i in range(1, len(data)) :
if data[i] == data[i - 1] :
run = run + 1
else :
encoded.append((run , data[i - 1]))
run = 1
if i == len(data) - 1 :
encoded.append((run , data[i]))
return encoded
# Test :-
print rle([ 'g','o','4','e','p','e','r','t' ])
Output :-
Code:
[(1, 'g'), (1, 'o'), (1, '4'), (1, 'e'), (1, 'p'), (1, 'e'), (1, 'r'), (1, 't')]



