Time: | T H 9:30-10:45 | Place: | 102A Smith Hall |
Professor: | Kathy McCoy | Office: | Room 108, Human Language Technologies Lab (aka "The Tea House"), 100 Elkton Road |
Office Hours: | T 8:00-9:15, H 11:00-12:30, by appointment | | |
Email: | mccoy@cis.udel.edu | Phone: | 302-831-1956 |
This course provides an introduction to the field of computational linguistics, also called natural language processing (NLP) - the creation of computer programs that can understand and generate natural languages (such as English). We will use natural language understanding as a vehicle to introduce the three major subfields of NLP: syntax (which concerns itself with determining the structure of an utterance), semantics (which concerns itself with determining the explicit truth-functional meaning of a single utterance), and pragmatics (which concerns itself with deriving the context-dependent meaning of an utterance when it is used in a specific discourse context). The course will introduce both linguistic (knowledge-based) and statistical approaches to NLP, illustrate the use of NLP techniques and tools in a variety of application areas, and provide insight into many open research problems.
Prerequisites: CISC681 - Introduction to Artificial Intelligence
Textbook: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Second Edition, by Jurafsky and Martin. Please check the online errata for each chapter of the text as you read it.
As the course goes on, I will post the slides and materials for each class lecture on the web. I will try to post them early so that you can print them out and take notes on them during class.
Please note that many of the materials/slides are borrowed from the book's website and from the NLP courses of Julia Hirschberg, Diane Litman, James Martin, Kathy McKeown, and Johanna Moore. Thanks also to Owen Rambow for the introduction to CFGs.
CALENDAR - Second Half WILL Change
Date | Topic | Reading | Assignments |
8/28 | Course Overview, Introduction (print version) | Chapter 1 | |
8/30 | Regular Expressions and Automata (print version) | Chapter 2, Perl Introduction by Patrick Ryan | Assignment 1 - Stock market Question Answering - due September 18th; Test File assign1-wsj_2300.txt |
9/04 | Finish up lecture 2 - Regular Expressions and Automata; a short lecture on Words and the Lexicon (print version); Morphology and Finite State Transducers (print version) | Chapter 3 | |
9/06 | Continue with Morphology and Finite State Transducers | Chapter 3 | |
9/11 | N-Grams (print version) | Chapter 4 (through 4.7) | |
9/13 | Continue with N-Grams | Chapter 4 (through 4.7) | |
9/18 | Finish N-Grams; Context-Free Grammars for English (print version) | Chapter 12.1-12.4 | ASSIGNMENT 1 DUE September 19th, noon |
9/20 | Assignment 1 Results -- Candy-Bar Competition; Continue with Context-Free Grammars | Chapter 12.1-12.4 | Assignment 2 due 10/10 - midnight; GetScanTime.zip Evaluation Script |
9/25 | Finish Context-Free Grammars for English | Chapter 12.1-12.4 | |
9/27 | Start Parsing with CFGs (print version) | Chapter 13 | |
10/02 | Some English Analysis (print version) | Chapter 13 | |
10/04 | More Parsing with CFGs; CKY and Earley Algorithms (print version) | Chapter 13 | |
10/09 | More Parsing; Start Statistical Parsing (print version) | Chapter 14 | |
10/11 | More Statistical Parsing | Chapter 14 | |
10/16 | Finish Statistical Parsing; Start Unification Grammars (print version) | | |
10/18 | Discussion of Assignment 2 -- Competition for NLP Belt | | |
10/23 | No Class - Kathy out of town | | |
10/25 | No Class - Kathy out of town | | |
10/30 | Postponed Class - University Classes Canceled due to Hurricane Sandy | | Midterm Exam Due - Extension Given to November 1st |
11/01 | Finish Unification Grammars; Representing Meaning (print version) | Chapter 17 | Midterm Exam Due |
11/06 | NO CLASS - Election Day! | | |
11/08 | Finish Representing Meaning; Semantic Analysis (print version) | Chapter 18 | |
11/13 | Finish up Semantic Analysis; Lexical Semantics (print version) | Chapter 19 | |
11/14 | *** Wednesday Evening Make-Up Class Marathon *** 6:30pm-9:00pm: Finish Lexical Semantics; Begin Question Answering, Information Retrieval, and Text Summarization (print version) | Chapter 23 | Assignment 3 Out... |
11/15 | More on Information Retrieval and Text Summarization | | |
11/20 | Discourse Coherence (print version) | Chapter 21 | |
11/22 | HAPPY THANKSGIVING!! | | |
11/27 | More Discourse: Rhetorical Structure Theory, Anaphora, Centering | | |
11/29 | Continue Anaphora/Coherence (print slides on Focusing/Centering) | | |
12/04 | Anaphora Resolution | | |
12/05 | *** Wednesday Evening Class *** 6:30pm-9:00pm; 102A Smith: CLASS PROJECT 3 PRESENTATIONS/EVALUATIONS | | |
12/13 | Take-Home Final Exam Due before 12:30pm | Final Exam Due | Final Exam |
Topic | Reading |
Course Overview, Introduction | Chapter 1 |
Regular Expressions and Automata | Chapter 2, Perl Introduction by Patrick Ryan |
Words and the Lexicon | Chapter 2 |
Morphology and Finite State Transducers | Chapter 3 |
N-Grams | Chapter 4 |
Word Classes and Part of Speech Tagging | Chapter 5 (through 5.6) |
Context-Free Grammars for English | Chapter 12 |
Parsing with CFGs | Chapter 13 |
Earley Algorithm | Chapter 13.4 |
Statistical Parsing | Chapter 14 |
Features and Unification | Chapter 15 |
Representing Meaning | Chapter 17 |
Semantic Analysis | Chapter 18 |
Lexical Semantics | Chapter 19 |
Word Sense Disambiguation | Chapter 20 |
Discourse | Chapter 21 |
Applications | Chapter 22, 23, 24 |
Assignments must be your own individual work, unless explicitly stated otherwise. You must do the work without undue help from other people, and you must not present material from resources such as the Web, books, papers, code listings, and other people as your own. You may talk to each other about concepts and techniques, but you must not discuss specific solutions or approaches to solutions. Web resources will be very useful in this course and we will encourage class discussion of the use of such resources with their proper citations. Copying or paraphrasing someone's work, or permitting your own work to be copied or paraphrased, even in part, is not allowed and will result in an automatic grade of 0 for the assignment.
Classic NLP programs
Appelt and Israel's information extraction tutorial (IJCAI-99).
Allen's Dialogue Modeling for Spoken Language Systems tutorial (ACL Workshop 1997).