Time: | T H 2:00-3:15 | Place: | 102A Smith Hall |
Professor: | Kathy McCoy | Office: | Room 201 77-79 E. Delaware Avenue |
Office Hours: | T 3:30-5:00, H 9:00-10:30, by appointment | ||
Email: | mccoy@cis.udel.edu | Phone: | 302-831-1956 |
This course provides an introduction to the field of computational linguistics, also called natural language processing (NLP) - the creation of computer programs that can understand and generate natural languages (such as English). We will use natural language understanding as a vehicle to introduce the three major subfields of NLP: syntax (which concerns itself with determining the structure of an utterance), semantics (which concerns itself with determining the explicit truth-functional meaning of a single utterance), and pragmatics (which concerns itself with deriving the context-dependent meaning of an utterance when it is used in a specific discourse context). The course will introduce both knowledge-based and statistical approaches to NLP, illustrate the use of NLP techniques and tools in a variety of application areas, and provide insight into many open research problems.
Prerequisites: CISC681 - Introduction to Artificial Intelligence
Textbook: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Second Edition, by Jurafsky and Martin. Please check the online errata for each chapter of the text as you read it.
As the course goes on, I will post the slides and materials for each lecture on the web. I will try to post them early so that you can print them out and take notes on them during class.
Please note that many of the materials/slides are borrowed from the NLP courses of Julia Hirschberg, Diane Litman, James Martin, and Johanna Moore. Thanks also to Owen Rambow for the introduction to CFGs.
CALENDAR
Date | Topic | Reading | Assignments |
9/01 | Course Overview, Introduction; Print of Course Overview, Introduction | Chapter 1 | |
9/03 | More Introduction... | | Assignment 1, due September 22nd |
9/08 | Regular Expressions and Automata; Print of Regular Expressions and Automata | Chapter 2, Perl Introduction by Patrick Ryan | |
9/10 | Regular Expressions and Automata (second part); Print of Regular Expressions and Automata (second part); Finite Automata, Words, and the Lexicon; Print of Finite Automata, Words, and the Lexicon | Chapter 2, Perl Introduction by Patrick Ryan | |
9/15 | Morphology and Finite State Transducers; Print of Morphology and Finite State Transducers | Chapter 3 | |
9/17 | N-Grams; Print of N-Grams | Chapter 4 (through 4.4?) | |
9/22 | Continue with N-Grams | | Assignment 2, due 10/13; ASSIGNMENT 1 DUE |
9/24 | Finish N-Grams; Word Classes and Part of Speech Tagging; Print of Word Classes and Part of Speech Tagging | Chapter 5 | |
9/29 | More on Part of Speech Tagging | Chapter 5 | |
10/01 | Finish Part of Speech Tagging; Context-Free Grammars for English; Print of Context-Free Grammars for English | Chapter 12 | |
10/06 | Finish Context-Free Grammars for English; Start Parsing with CFGs; Print of Parsing with CFGs | Chapter 13 | |
10/08 | Guest Lecturer: Keith Trnka!!! Words/N-Grams/Evaluation on Corpora | PDF of Keith's Language Modeling slides 6/page; PDF of Keith's Language Modeling slides 1/page | |
10/13 | More discussion of Context-Free Grammars for English | | Test Files for Assignment 2 Competition; ASSIGNMENT 2 TECHNICALLY DUE - Test files released. Prepare spreadsheets for Thursday's class discussion |
10/15 | Discussion of Assignment 2 -- Competition for NLP Belt | | |
10/20 | Finish Parsing with CFGs; Earley Algorithm; Print of Earley Algorithm | Chapter 13.4 | |
10/22 | Guest Lecturer: Charlie Greenbacker, Generating Referring Expressions | | |
10/27 | Guest Lecture #2: Keith Trnka!!!! Word Prediction and Topic/Style Modeling | | |
10/29 | Finish Earley Algorithm; Features and Unification; Print of Features and Unification | Chapter 15 | Midterm Exam Questions 1-3 Due |
11/03 | Representing Meaning; Print of Representing Meaning | Chapter 17 | Midterm Exam Question 4 Due |
11/05 | Representing Meaning | Chapter 17 | |
11/10 | Finish Representing Meaning; Semantic Analysis; Print of Semantic Analysis | Chapter 18 | |
11/12 | Finish up Semantic Analysis -- Intro to Compansion Project | | |
11/17 | Discourse Processing: resolving anaphora, focusing, centering | | |
11/19 | More Discourse: Centering, RAFT/RAPR, Pronoun Generation? | | |
11/24 | CLASS PROJECT PRESENTATIONS | | |
11/26 | HAPPY THANKSGIVING!! | | |
12/01 | CLASS PROJECT PRESENTATIONS | | |
12/03 | CLASS PROJECT PRESENTATIONS | | |
12/08 | CLASS PROJECT PRESENTATIONS | | |
??? | CLASS PROJECT PRESENTATIONS | | |
??? | Final Class Project Due before 1:00pm | Final Reports Due | Project Reports |
Topic | Reading | ||
Course Overview, Introduction | Chapter 1 | ||
Regular Expressions and Automata | Chapter 2, Perl Introduction by Patrick Ryan | ||
Words and the Lexicon | Chapter 2 | ||
Morphology and Finite State Transducers | Chapter 3 | ||
N-Grams | Chapter 6 (through 6.4) | ||
Word Classes and Part of Speech Tagging | Chapter 8 (through 8.4) | ||
Context-Free Grammars for English | Chapter 9 | ||
Parsing with CFGs | Chapter 10 | ||
Earley Algorithm | Chapter 10 | ||
Features and Unification | Chapter 11 | ||
Representing Meaning | Chapter 14 | ||
Semantic Analysis | Chapter 15 | ||
Discourse | Chapter 18 | ||
Natural Language Generation | Chapter 20 | ||
Project Presentations | |||
Probabilistic Models of Spelling | Chapter 5.1-5.6, and pieces of the rest of the chapter | ||
More on Part of Speech Tagging | Chapter 8.5 - 8.7 | ||
Lexicalized and Probabilistic Parsing | Chapter 12 | ||
Lexical Semantics | Chapter 16 | ||
Word Sense Disambiguation and Information Retrieval | Chapter 17 | ||
Dialogue and Conversational Agents | Chapter 19 | ||
Machine Translation | Chapter 21 |
Assignments must be your own individual work, unless explicitly stated otherwise. You must do the work without undue help from other people, and you must not present material from resources such as the Web, books, papers, code listings, and other people as your own. You may talk to each other about concepts and techniques, but you must not discuss specific solutions or approaches to solutions. Web resources will be very useful in this course and we will encourage class discussion of the use of such resources with their proper citations. Copying or paraphrasing someone's work, or permitting your own work to be copied or paraphrased, even in part, is not allowed and will result in an automatic grade of 0 for the assignment.
Classic NLP programs
Appelt and Israel's information extraction tutorial (IJCAI-99).
Allen's Dialogue Modeling for Spoken Language Systems tutorial (ACL Workshop 1997).