Applications of Trees

ICS 211 Spring 2000

Applications of trees

More?

 

 

Encoding the alphabet using binary digits.

26 letters

2^4 = 16 (not enough)

2^5 = 32 (some to spare)
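The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the numbering of the letters (A = 0, B = 1, …, Z = 25) and the function names are assumptions made for the example, not something the notes prescribe.

```python
import math

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Smallest number of binary digits that can distinguish 26 letters.
BITS = math.ceil(math.log2(len(ALPHABET)))  # 2^4 = 16 is too few, 2^5 = 32 is enough


def encode_fixed(word):
    """Encode each letter as its position (A = 0, assumed) in BITS binary digits."""
    return "".join(format(ALPHABET.index(ch), f"0{BITS}b") for ch in word)


def decode_fixed(bits):
    """Cut the string into chunks of BITS digits and map each chunk back to a letter."""
    chunks = (bits[i:i + BITS] for i in range(0, len(bits), BITS))
    return "".join(ALPHABET[int(chunk, 2)] for chunk in chunks)
```

Because every chunk has the same length, decoding never has to guess where one letter ends and the next begins.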

 

Here, because every letter is represented by exactly 5 digits, you always know exactly what has been encoded. Great! Problem solved… BUT this has a disadvantage: some letters occur more often than others. If the representations of those high-frequency letters were shorter, we could save digits.

Let’s now take the frequency of the letters into consideration.

The vowels are more frequent than the letter "z", and yet we use the same number of digits to encode each of them…

One possibility is to use a variable-length code.

Example:

Since T is less frequent than E or A.

PROBLEM

If we have 0101 it can translate to either:

Too bad, so sad… :(
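To see the ambiguity concretely, here is a small sketch that lists every way a bit string can be split into code words. The code table used here (E = 0, A = 1, T = 01) is an assumption chosen for illustration; it is one variable-length code for which 0101 is ambiguous, and it may not be the exact table from the lecture.

```python
# Hypothetical variable-length code (assumed for this example).
AMBIGUOUS_CODE = {"E": "0", "A": "1", "T": "01"}


def all_decodings(bits, code):
    """Return every string of letters whose encoding is exactly `bits`."""
    if not bits:
        return [""]  # one way to decode nothing: the empty message
    results = []
    for letter, word in code.items():
        if bits.startswith(word):
            # Use this letter first, then decode the remainder in every possible way.
            for rest in all_decodings(bits[len(word):], code):
                results.append(letter + rest)
    return results
```

Under this assumed table, `all_decodings("0101", AMBIGUOUS_CODE)` finds four different readings (EAEA, EAT, TEA and TT), so the receiver cannot know which one was meant.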

But there are prefix codes! :)

What the $#%$ is a Prefix code?

Prefix Code: A code that has the property that the code of a character is never a prefix of the code of another character.
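The definition translates almost directly into a check. The sketch below is one simple way to test the property; the function name and the dictionary representation (letter mapped to a string of 0s and 1s) are assumptions made for the example.

```python
def is_prefix_code(code):
    """Return True if no code word is a prefix of a different letter's code word."""
    words = list(code.values())
    for i, first in enumerate(words):
        for j, second in enumerate(words):
            # Compare every pair of distinct letters; identical code words also fail.
            if i != j and second.startswith(first):
                return False
    return True
```

For example, `is_prefix_code({"E": "0", "A": "1", "T": "01"})` is False because 0 is a prefix of 01. Any code in which all letters have distinct code words of the same length passes the check automatically, since a word can only be a prefix of an equally long word if the two are identical.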

Here is a Prefix Code example:

10110 = ATE and it can’t correspond to anything else.

Why? Because 1 by itself is only a prefix and does not represent a character, while 10 does.
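Decoding a prefix code can be sketched as reading digits one at a time until they match a code word. The table below (E = 0, A = 10, T = 11) is an assumption: it is inferred from the statement that 10110 = ATE and that 1 alone is not a character, so treat it as a reconstruction rather than the original table.

```python
# Assumed prefix code, reconstructed from "10110 = ATE".
PREFIX_CODE = {"E": "0", "A": "10", "T": "11"}


def decode_prefix(bits, code):
    """Decode left to right; a match is always final because no word is a prefix of another."""
    lookup = {word: letter for letter, word in code.items()}
    letters = []
    current = ""
    for bit in bits:
        current += bit
        if current in lookup:
            letters.append(lookup[current])
            current = ""  # start collecting the next code word
    if current:
        raise ValueError("bit string ends in the middle of a code word")
    return "".join(letters)
```

With this table, `decode_prefix("10110", PREFIX_CODE)` reads 10, then 11, then 0, and returns ATE. There is never a choice to make along the way.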

Examples:

By sight, is this a prefix code? Please ignore that all letters are encoded using the same length.

Write: EATS, SEAT, TEA

What about… is this a prefix code?

Write: EATS, SEAT, TEA

What about… is this a prefix code?

Try it with: EATS, SEAT, TEA

How do we go from prefix codes to binary trees?

A prefix code can be represented using a tree, where:

Example:

Remember: left is 1, right is 0. However, this is just a convention that follows your book; other books use the opposite. See Standish, p. 301.
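A minimal sketch of the tree view, using the convention stated above (left is 1, right is 0). The `Node` class, the function names, and the code table E = 0, A = 10, T = 11 are all assumptions made for illustration; they are not taken from Standish.

```python
class Node:
    """A binary tree node; only leaves carry a letter."""

    def __init__(self):
        self.left = None    # branch followed on a 1
        self.right = None   # branch followed on a 0
        self.letter = None


def build_tree(code):
    """Insert each code word as a root-to-leaf path."""
    root = Node()
    for letter, word in code.items():
        node = root
        for bit in word:
            if bit == "1":
                if node.left is None:
                    node.left = Node()
                node = node.left
            else:
                if node.right is None:
                    node.right = Node()
                node = node.right
        node.letter = letter
    return root


def decode_with_tree(bits, root):
    """Walk down from the root; on reaching a leaf, emit its letter and restart."""
    letters = []
    node = root
    for bit in bits:
        node = node.left if bit == "1" else node.right
        if node is None:
            raise ValueError("no such path in the tree")
        if node.letter is not None:
            letters.append(node.letter)
            node = root
    return "".join(letters)
```

In a prefix code every letter sits at a leaf, which is why the walk can restart at the root as soon as a letter is found.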

Ok.

We have the tree and we use it to decode. Fine!

Now…

If we have letters and a given frequency for each letter, how do we build the tree?

See pages 298-300 of your book for an explanation of the Huffman code and an example. Try following your book’s example and build the real Huffman code prefix binary tree using the real letter frequencies.
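As a starting point for that exercise, here is a sketch of the usual Huffman construction: repeatedly merge the two least frequent subtrees until one tree remains. The book’s presentation may differ in details such as tie-breaking. The function name is an assumption, and the weights in the usage note are invented for illustration; they are not real letter frequencies.

```python
import heapq
from itertools import count


def huffman_code(freqs):
    """Build a prefix code from a {letter: weight} dictionary."""
    if not freqs:
        return {}
    order = count()  # unique tie-breaker so the heap never compares trees
    heap = [(weight, next(order), letter) for letter, weight in freqs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2]: "0"}  # a single letter still needs one digit
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)  # least frequent subtree
        w2, _, t2 = heapq.heappop(heap)  # second least frequent subtree
        heapq.heappush(heap, (w1 + w2, next(order), (t1, t2)))
    code = {}

    def walk(tree, prefix):
        if isinstance(tree, str):
            code[tree] = prefix          # leaf: record the path taken
        else:
            walk(tree[0], prefix + "1")  # left branch is 1 (convention of these notes)
            walk(tree[1], prefix + "0")  # right branch is 0

    walk(heap[0][2], "")
    return code
```

For instance, with the made-up weights `{"E": 12, "T": 9, "A": 8, "Z": 1}`, the most frequent letter E receives a 1-digit code while the rare Z receives a 3-digit code. Replace the weights with the real letter frequencies to do the exercise.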