CISC882 - Introduction to Natural Language Processing - Assignment 1
Due: Tuesday, September 18, 2007
Exercises
(These exercises are borrowed heavily from Johanna Moore, University of Edinburgh.)
- ELIZA (30 points total).
- Implement a small version of your own ELIZA in Perl. Include enough
rules to hold a conversation that is at least 10 exchanges long.
If you already have some Perl experience, do Jurafsky and Martin 2.2. Stick
with the Rogerian psychotherapy domain and implement your program in Perl.
- Provide rules such that for a given user
input, there is more than one possible response (as in the example on pages 32-33 of
J&M).
- When more than one rule can apply, select a
rule at random.
- The original ELIZA
had a "memory" mechanism. When no pattern matched the input,
it said "Tell me more about X", where X was some
topic that the user mentioned earlier in the dialogue; i.e., X
was something that appeared in an input that the user typed in
previously. Add such a history mechanism to your program.
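The three requirements above (several responses per rule, random choice among applicable rules, and a memory fallback) can be sketched as follows. This is only a minimal illustration under stated assumptions, written in Python rather than the Perl the assignment requires. The rule table `RULES`, the function `respond`, and the fallback phrase "Please go on." are hypothetical names and choices, not taken from J&M or from the original ELIZA.

```python
import random
import re

# Hypothetical rule table: (pattern, candidate responses, memory template).
# A memory template of None means the rule stores nothing for later.
RULES = [
    (re.compile(r"\bI am (.+)", re.IGNORECASE),
     ["Why do you say you are {0}?", "How long have you been {0}?"],
     None),
    (re.compile(r"\bmy (\w+)", re.IGNORECASE),
     ["Tell me more about your {0}.", "Why does your {0} concern you?"],
     "your {0}"),
]


def respond(user_input, memory, rng=random):
    """Return one reply; `memory` is a list of topics seen so far."""
    text = user_input.strip().rstrip(".!?")

    # Collect every rule whose pattern matches the input.
    applicable = []
    for pattern, responses, memory_template in RULES:
        match = pattern.search(text)
        if match:
            applicable.append((match, responses, memory_template))

    if applicable:
        # More than one rule may apply: pick one at random,
        # then pick one of its responses at random.
        match, responses, memory_template = rng.choice(applicable)
        topic = match.group(1)
        if memory_template is not None:
            memory.append(memory_template.format(topic))
        return rng.choice(responses).format(topic)

    # No pattern matched: fall back on a topic mentioned earlier.
    if memory:
        return "Tell me more about {0}.".format(rng.choice(memory))
    return "Please go on."


if __name__ == "__main__":
    history = []
    print(respond("My mother hates me.", history))
    print(respond("The weather is nice.", history))
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which helps when checking that a conversation reaches 10 exchanges.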
- Jurafsky & Martin 2.1 (15 points)
- Jurafsky & Martin 2.4 (15 points)
- Jurafsky & Martin 2.5 (15 points)
- Jurafsky & Martin 2.6 (15 points)
- Exercise 7: (70 points total)
Using the FSAs
you've just designed, write a program in Perl that puts XML-like tags around
time and date specifications. For example:
- INPUT: a text in English.
- OUTPUT: the same text with all date and time
expressions marked by <TIME> and </TIME> (for both dates and
times).
- SAMPLE INPUT: Christmas is celebrated on the 25th of December.
- SAMPLE OUTPUT: <TIME> Christmas </TIME> is celebrated on <TIME> the 25th of December </TIME>.
- SCOPE: At a minimum, your program should be
able to process all time and date expressions in the following files:
as well as all time and
date expressions listed in exercises 2.4-2.6 in J&M, page 54.
- SUBMIT: source code, the output of your program
on the files mentioned earlier, and a text file listing all time and date
expressions that your program can handle.
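As a rough illustration of the tagging step, the sketch below reproduces the sample output above. It is written in Python rather than the required Perl, and it uses a regular expression as a stand-in for the FSAs from exercises 2.4-2.6. The names `TIME_EXPRESSION` and `tag_times`, the short holiday list, and the particular date and clock formats covered are all assumptions made for illustration. A real solution would need far broader coverage.

```python
import re

# Building blocks (all hypothetical and deliberately incomplete).
MONTH = (r"(?:January|February|March|April|May|June|July|August|"
         r"September|October|November|December)")
DAY = r"\d{1,2}(?:st|nd|rd|th)?"
HOLIDAY = r"(?:Christmas|Easter|Thanksgiving)"
CLOCK = r"\d{1,2}:\d{2}(?:\s?(?:am|pm))?"

# One alternation over the supported expression types.
TIME_EXPRESSION = re.compile(
    r"\b(?:"
    + r"[Tt]he\s+" + DAY + r"\s+of\s+" + MONTH       # the 25th of December
    + "|" + MONTH + r"\s+" + DAY + r"(?:,\s*\d{4})?"  # March 3, 2007
    + "|" + HOLIDAY                                   # Christmas
    + "|" + CLOCK                                     # 10:30
    + r")\b"
)


def tag_times(text):
    """Wrap each recognized time or date expression in <TIME> tags."""
    return TIME_EXPRESSION.sub(
        lambda m: "<TIME> " + m.group(0) + " </TIME>", text
    )


if __name__ == "__main__":
    print(tag_times("Christmas is celebrated on the 25th of December."))
```

Keeping each expression type as its own named fragment makes it easier to list, in the submitted text file, exactly which expressions the program handles.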