Course: CIS451/651 Data
Compression in Multimedia
Professor: Paul D. Amer
Semester: Spring 2013
Title: Homework Chapter 3 - Statistical Coding Methods
Due Date:
Tasks
Read Chapter 3. Note: Hand-written Huffman trees are acceptable as long as they are neat and legible.
- (2 pts) Section 3.10; Projects and Problems 4
- (2 pts) Section 3.10; Projects and Problems 5
- (2 pts) Section 3.10; Projects and Problems 6
- in the problem, replace "... obtained in Problem 3" with "... obtained in Problem 4"
- (2 pts) Section 3.10; Projects and Problems 7
- (2 pts) Section 3.10; Projects and Problems 10
- in the problem, replace "as a 0 instead of a 1" with "incorrectly" so that part (b) will make sense.
- (2 pts) Section 3.10; Projects and Problems 13
- (3 pts) 2-Symbol Huffman Code
- Generate a 2 symbol Huffman code for the alphabet {x,y,z}
which occur with respective probabilities {.6, .3, .1}. Assume the
number of characters (odd or even) in the file is encoded with metadata.
Your solution will have exactly 9 codewords {xx, xy, xz, yx, yy, yz, zx, zy, zz}.
- Compute the expected number of bits/character used by your
2-symbol Huffman code and compare it with a 1-symbol Huffman code.
- How "good" are the 1 and 2 symbol codes? That is, how close are they to optimal?
- (4 pts) More Huffman Codes
- Two information
sources with alphabet {a,b,c,d,e,f,g,h}
are characterized by the probabilities in the following table. For
each source, find a binary (bits/character), ternary (trits/character),
and quaternary (four code symbols - quits(?)/character)
Huffman code, and calculate its average length.
- Be sure to read about
m' in Section 3.3 when deciding how many characters to combine in the first step.
|
a
|
b
|
c
|
d
|
e
|
f
|
g
|
h
|
source 1
|
1/8
|
1/8
|
1/8
|
1/8
|
1/8
|
1/8
|
1/8
|
1/8
|
source 2
|
.05
|
.20
|
.10
|
.35
|
.05
|
.12
|
.03
|
.10
|
- (3pts) Required for 651; Extra credit for 451:
- Consider a source with
alphabet {A,B,C,D,E}. Roughly how many bits
would be needed to encode an input file containing 5005 characters: ABCDEAAA...ABBB...BCCC...CDDD...DEEE...E
where the sequences of As, Bs, Cs, Ds, Es are each 1000 long, using
- a 2-pass static huffman code
- a 1-pass adaptive huffman code (as described in class)
- (5 pts) Required for 651; Extra credit for 451 - Morse Code
- Solve this problem correctly, and find out what
Morse code has to do with the Fibonacci series. Assume (incorrectly!)
that the likelihood of each of the 45 symbols in the Morse code source alphabet
(i.e., 26 characters (for this question, ignore "ch"),
10 digits, 9 punctuation characters) are equally
probable. Given this invalid assumption, how many time units on average
will be required to transmit a symbol?
- Still assuming equal probabilities, design a
better code than Morse did using {dot, dash} and "silence" (i.e.,
design a code that requires fewer time units on average to transmit a random
message). Show mathematically how much better
your code is expected to be, i.e., how many time units on average will be
required to transmit a character with your code? For your code, you may not change the following basic
Morse code constraints: 1 time unit for a dot, 3 time units for a dash, 1
silence time unit between dots and dashes within a symbol; 3 silence time units
between symbols; and 6 silence time units between words.
- (15 pts) Extra credit for 451, 651 - 2008 ACM Programming Contest World Finals
Notes
-
Graduate students must do all assignments individually. Undergraduate students may collaborate in groups of
2 for assignments. Only one submission with both names should be turned in from a group.
- Clearly label your answers,
and please submit answers in the order assigned.
- (repeated from course
syllabus) Academic Honesty:
Unless explicitly stated otherwise, students are not permitted to access or compare any homework, or
program-project answers with those of any other student or group past or present, alive or dead, or any Internet
web site prior to submitting the assignment. Comparing answers, or getting answers off the Internet before
submitting one's work is considered cheating. If you do not have time to complete an assignment, it is
better to submit partial solutions than to get answers from someone else. While it is obviously difficult to
enforce this policy, students who do not follow this policy should be keenly aware that in this class, they a
re cheating, and if caught, will be prosecuted according to University guidelines. This applies both to the
student (or group) who gets answers and the student (or group) who gives answers.
- (repeated from course syllabus) Lateness Policy:
Assignments are due at the beginning of class. Unexcused late assignments will be penalized up to 10% per
school day (weekends do not count) up to a 2-day maximum penalty of 20%. Without prior discussion with the
professor, assignments will not be accepted more than two school days late without a university approved
excuse.