Here is a copy of the short PowerPoint presentation I gave at SIG-NEWGRAD. It contains a very brief introduction to some of my research interests.
My area of interest is natural language processing. More specifically, I am interested in natural language generation: developing algorithms that allow a computer to generate coherent texts similar to those a human would write. While I am studying all aspects of the generation process, my emphasis has been on developing algorithms to capture discourse coherence principles, including mechanisms for structuring text and for tracking and taking advantage of the focus of attention as it shifts through the text. In addition, I am interested in applying my work on generation to devices designed to aid people with disabilities. Of particular interest are language acquisition issues.
The long-term goal of this work is to develop a writing tool for native
users of American Sign Language (ASL). Envisioned is a computer system
that will take a text written by a deaf user, detect and analyze the
errors, engage that user in a tutorial dialogue, and generate appropriate
corrections to the text. Here we lay some of the necessary foundation for
the envisioned system. In this work we view written English as a second
language and are interested in identifying the role of ASL in the
acquisition of written English and investigating appropriate correction
strategies that are tailored to the individual user. Great emphasis
is placed on modeling the acquisition of English as a second language
and on tracking an individual writer through this process.
Links:
In this work we investigate how knowledge of language may be used to make
an intelligent communication device. One such device takes telegraphic
input and generates full sentences. The system uses word order and
rich semantic reasoning to determine the intended meaning, and it uses
syntactic information to generate the full sentence.
Links:
With the explosion of texts on the web, the ability to summarize
a document (so that a reader can decide whether or not it is relevant
to them) is becoming very important. In this work, we use "lexical chains"
to pull out the important information in a text. We have developed a
linear-time algorithm for lexical chain computation. Continuing work in
this area investigates how a coherent and informative summary can be
produced on the basis of these lexical chains.
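
As a rough illustration of the grouping idea behind lexical chains, the
sketch below collects words that share a concept and picks the largest
group as a signal of what the text is about. It is a simplified toy, not
the linear-time algorithm mentioned above: the TOY_CONCEPTS table, the
function names, and the "largest chain wins" scoring are all assumptions
made for this example, and no word sense disambiguation is attempted. A
real system would draw its relations from a lexical resource.

```python
from collections import defaultdict

# Hypothetical toy thesaurus: word -> concept labels. This table is an
# assumption for illustration; a real system would consult a lexical resource.
TOY_CONCEPTS = {
    "car": {"vehicle"},
    "truck": {"vehicle"},
    "engine": {"vehicle", "machine"},
    "robot": {"machine"},
    "apple": {"fruit"},
}


def build_chains(words):
    """Group word positions by shared concept in one pass over the text."""
    chains = defaultdict(list)
    for position, word in enumerate(words):
        for concept in TOY_CONCEPTS.get(word.lower(), ()):
            chains[concept].append((position, word))
    return dict(chains)


def strongest_chain(chains):
    """Return the (concept, members) pair with the most members."""
    return max(chains.items(), key=lambda item: len(item[1]))


text = "The car passed a truck before its engine failed near an apple stand".split()
chains = build_chains(text)
concept, members = strongest_chain(chains)
print(concept, [word for _, word in members])
```

Sentences containing members of the strongest chain would then be
candidates for inclusion in a summary.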
Links:
Some people must access information on the web through text-only
interfaces (e.g., people who are blind, people who are accessing the web
through interfaces such as cell phones). In such a situation, graphical
information is inaccessible. In this project we investigate summarizing
graphical information in text. To determine what should be included in a
textual summary, we investigate how to identify the intentions of the
graph's designer (as signaled by the choice of graph type, for example)
and the remarkable features of the particular graph.
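
To make "remarkable features" concrete, here is a minimal sketch that
looks for two such features in a data series (an overall trend and a
peak) and words them as a one-sentence summary. The function name, the
choice of features, and the sentence template are assumptions for this
example only; they are not the project's method, and inferring the
designer's intentions is not attempted.

```python
def summarize_series(label, points):
    """Describe a series given as (x_label, value) pairs in display order."""
    values = [value for _, value in points]
    peak_label, peak_value = max(points, key=lambda point: point[1])
    pairs = list(zip(values, values[1:]))
    if all(a < b for a, b in pairs):
        trend = "rises steadily"
    elif all(a > b for a, b in pairs):
        trend = "falls steadily"
    else:
        trend = "varies"
    return f"{label} {trend}, peaking at {peak_value} in {peak_label}."


summary = summarize_series(
    "Enrollment", [("2001", 120), ("2002", 150), ("2003", 190)]
)
print(summary)
```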
Links:
The field of Natural Language Generation can be seen as consisting of two parts. The first is "deep generation," where the system plans the content and structure of a coherent text. Generally the output of the deep generation component is a formal specification of what is to be conveyed by the text. The second is "sentence generation," which produces syntactically correct sentences from the formal specification output by the deep generation component. My work in this area spans both of these components.
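
The division of labor between the two components can be sketched as a
small pipeline. Everything here is a stand-in chosen for illustration:
the Proposition record plays the role of the formal specification,
plan_content is a trivial content selector, and realize is a single
fixed template rather than a grammar-based sentence generator.

```python
from dataclasses import dataclass


@dataclass
class Proposition:
    """Hypothetical formal specification of one sentence's content."""
    agent: str
    action: str   # past-tense verb, kept pre-inflected for simplicity
    patient: str


def plan_content(facts, topic):
    """'Deep generation' stand-in: keep facts that mention the topic, in order."""
    return [Proposition(*fact) for fact in facts if topic in (fact[0], fact[2])]


def realize(prop):
    """'Sentence generation' stand-in: one subject-verb-object template."""
    sentence = f"{prop.agent} {prop.action} {prop.patient}."
    return sentence[0].upper() + sentence[1:]


facts = [
    ("the system", "analyzed", "the text"),
    ("the user", "opened", "a file"),
    ("the system", "corrected", "the errors"),
]
plan = plan_content(facts, topic="the system")
output = [realize(prop) for prop in plan]
print(output)
```

The point of the separation is that the planner never deals with word
order or syntax, and the realizer never decides what to say.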
A sentence generator is responsible for generating a syntactically correct sentence given some formal specification of the semantic content of what is to be said and the discourse context. Tree Adjoining Grammars appear to be a grammatical formalism well suited for the generation task because the elementary structures of the grammar correspond to minimal semantic structures. My work in this area has concentrated on proposing a methodology for generating sentences defined by a Tree Adjoining Grammar given a functional specification of not only what is to be said, but also the current context.
My recent work in this area has begun to look at issues involved
in multi-lingual generation (and paraphrase). A goal is to develop
a generation architecture that relies on lexical-grammatical resources
that can easily and uniformly generate in any language.
Links:
The appropriate use of pronouns in text is an area that is not well
understood. In this work we look at a model of discourse structure
and how this structuring device may influence the use of pronouns in text.
We have produced a text annotation tool that enables texts to be annotated
and empirical studies to be done in order to test our hypotheses.
Links:
The focus of a sentence within a text is the object about which the
sentence is most centrally concerned. As a discourse progresses, the focus
moves in a relatively well-defined manner. Tracking the focus of attention
is not only useful for generating coherent texts, but is also useful in
pronoun (anaphora) interpretation and generation. Although some algorithms
for tracking the focus of attention have been developed, these algorithms
have only handled simple sentence types. This project has been concerned
with developing methods for tracking focus in more complicated sentence
structures. In addition, it is concerned with investigating the effects of
focus on pronoun generation.
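
One way to picture the interaction between focus and pronoun generation
is the toy rule below: refer to an entity with a pronoun only when that
entity was the focus of the previous sentence. The rule, the function
names, and the hand-supplied focus for each clause are assumptions made
for this sketch. It handles only simple subject-predicate clauses and is
not the tracking method developed in this project.

```python
def refer(entity, previous_focus, pronouns):
    """Use a pronoun only when the entity was the previous sentence's focus."""
    if entity == previous_focus and entity in pronouns:
        return pronouns[entity]
    return entity


def generate(clauses, pronouns):
    """Each clause is (focus_entity, predicate); track focus between clauses."""
    previous_focus = None
    sentences = []
    for focus, predicate in clauses:
        subject = refer(focus, previous_focus, pronouns)
        sentences.append(f"{subject[0].upper()}{subject[1:]} {predicate}.")
        previous_focus = focus
    return sentences


pronouns = {"the tutor": "it", "Maria": "she"}
clauses = [
    ("Maria", "wrote an essay"),
    ("Maria", "submitted the essay"),
    ("the tutor", "marked two errors"),
    ("the tutor", "explained each one"),
]
result = generate(clauses, pronouns)
print(result)
```

When the focus shifts (from Maria to the tutor), the full noun phrase is
used again, which is the behavior the toy rule is meant to show.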
Links: