Course: CIS451/651 Data
Compression in Multimedia
Professor: Paul D. Amer
Semester: Spring 2009
Title: Homework Chapter 3 - Statistical Coding Methods
Due Date:
Tasks
Read Chapter 3. Note: Hand-written Huffman trees are acceptable as long as they are neat and legible.
- (2 pts) Section 3.10; Projects and Problems 4
- (2 pts) Section 3.10; Projects and Problems 5
- (2 pts) Section 3.10; Projects and Problems 6
- in the problem, replace "... obtained in Problem 3" with "... obtained in Problem 4"
- (2 pts) Section 3.10; Projects and Problems 7
- (2 pts) Section 3.10; Projects and Problems 10
- in the problem, replace "as a 0 instead of a 1" with "incorrectly" so that part (b) will make sense.
- (2 pts) Section 3.10; Projects and Problems 13
- (3 pts) 2-Symbol Huffman Code
- Generate a 2 symbol
Huffman code for the class example alphabet {A,B,C,D}
which occur with probabilities {.49, .25, .25, .01}. Assume the
number of characters (odd or even) in the file is encoded with out-of-band
header information. That is, your solution should have exactly 16 codewords {AA,AB, AC, AD,
BA, ..., DD}.
- Compute the expected
number of bits/character used by your code, and argue whether or not the extended
2-character code is better than the 1-character code derived in class.
- (4 pts) More Huffman Codes
- Two information
sources with alphabet {a,b,c,d,e,f,g,h}
are characterized by the probabilities in the following table. For
each source, find a binary (bits/character), ternary (trits/character),
and quaternary (four code symbols - quits(?)/character)
Huffman code, and calculate its average length.
- Be sure to read about
m' in Section 3.3 when deciding how many characters to combine in the first step.
|
|
a
|
b
|
c
|
d
|
e
|
f
|
g
|
h
|
|
source 1
|
1/8
|
1/8
|
1/8
|
1/8
|
1/8
|
1/8
|
1/8
|
1/8
|
|
source 2
|
.05
|
.20
|
.10
|
.35
|
.05
|
.12
|
.03
|
.10
|
- (3pts) Required for 651; Extra credit for 451:
- Consider a source with
alphabet {A,B,C,D,E}. Roughly how many bits
would be needed to encode an input file containing 5005 characters: ABCDEAAA...ABBB...BCCC...CDDD...DEEE...E
where the sequences of As, Bs, Cs, Ds, Es are each 1000 long, using
- a 2-pass static huffman code
- a 1-pass adaptive huffman code (as described in class)
- (5 pts) Required for 651; Extra credit for 451 - Morse Code
- Solve this problem correctly, and find out what
Morse code has to do with the Fibonacci series. Assume (incorrectly!)
that the likelihood of each of the 45 symbols in the Morse code source alphabet
(i.e., 26 characters (for this question, ignore "ch"),
10 digits, 9 punctuation characters) are equally
probable. Given this invalid assumption, how many time units on average
will be required to transmit a symbol?
- Still assuming equal probabilities, design a
better code than Morse did using {dot, dash} and "silence" (i.e.,
design a code that requires fewer time units on average to transmit a random
message). Show mathematically how much better
your code is expected to be, i.e., how many time units on average will be
required to transmit a character with your code? For your code, you may not change the following basic
Morse code constraints: 1 time unit for a dot, 3 time units for a dash, 1
silence time unit between dots and dashes within a symbol; 3 silence time units
between symbols; and 6 silence time units between words.
- (15 pts) Extra credit for 451, 651 - 2008 ACM Programming Contest World Finals
Notes
- Students may work
individually or in groups of 2. Only one submission should be
turned in from each group.
- (repeated from
syllabus) Academic Honesty: Students in one group are
not permitted to access or compare homework answers with those of any
other student (past or present) or any web site prior to submitting the
assignment. Comparing answers before submitting one's work is
considered cheating. If you do not have time to complete an
assignment, it is better to submit partial solutions than to get answers
from someone else. Students who compare answers, or get answers from the Internet
prior to submission should
be keenly aware that if caught, they will be prosecuted according to
University guidelines. This applies both to the student who gets answers
and the student who gives answers.
- (repeated from
syllabus) Lateness: Unexcused late assignments will be
penalized up to 5% per day not including weekends up to a 10-day maximum penalty of 50%. Without prior discussion
with the professor, assignments will not be accepted more than two weeks
late.
- Answers should be submitted
in the order assigned. Illegible answers will lose credit.