Capturing the Evolution of Grammatical Knowledge in a CALL System for Deaf Learners of English

Lisa N. Michaud (lmichaud@wheatoncollege.edu)
Department of Mathematics and Computer Science, Wheaton College, Norton, MA 02766, USA

Kathleen F. McCoy (mccoy@cis.udel.edu)
Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716, USA

http://www.eecis.udel.edu/research/icicle

November 1, 2005

Abstract

The ICICLE project is a Computer-Assisted Language Learning (CALL) environment geared toward teaching English as a second language. This paper reports on an initial prototype application of the system for deaf learners of written English. A primary focus of the ICICLE effort has been enabling the system to adapt to a learning user over the evolution of his or her language proficiency. In this paper, we overview and motivate the design of our novel user modeling component, which integrates Selinker's Interlanguage theory and other research in Second Language Acquisition in order to accurately represent a learner's internal grammar as it changes over time. The objectives of this effort are two-fold: to accurately diagnose and respond to learner errors, and to focus tutorial feedback on those errors which are most relevant to the learner's acquisition process.

INTRODUCTION

We are currently developing the ICICLE system, a Computer-Assisted Language Learning (CALL) system which instructs on English as a second language through the paradigm of a writing tutor (Michaud, 2002; Michaud and McCoy, 2004; Michaud et al., 2005). The name ICICLE represents "Interactive Computer Identification and Correction of Language Errors." The system's long-term goal is to employ natural language processing and generation to tutor learners of English as a second language on grammatical components of their written texts. Our work to date toward this goal has focused on the correct analysis of the source and nature of user-generated language errors so that the system can generate tutorial feedback on student performance that is both correct and individualized.

ICICLE's interaction with the user is intended to take place over long periods of time as various pieces of writing are submitted to the system for analysis. The interaction concerning a single piece of writing is accomplished through a cycle of user input and system response. This cycle begins when a user submits the piece of writing for review by the system; the system performs an analysis of this writing, determines its grammatical errors, and constructs a response in the form of tutorial feedback. This feedback is aimed toward making the student aware of the nature of the errors found in the writing and toward giving him or her the information needed to correct them. When the student makes those corrections and/or other revisions to the piece, it is re-submitted for analysis and the cycle begins again.

Although ICICLE is a general framework and can be adapted to any population of English learners, our current implementation has been designed to be used with deaf students. Below we overview part of our motivation for selecting this particular user population.

Education Issues for the Deaf

Literacy figures for deaf students are poor to the extent of being shocking (cf. Paul, 1998; Quigley and King, 1982, for discussions of empirical studies), despite the fact that intelligence is distributed normally in this population (Swisher, 1989).
With this problem impacting every aspect of a deaf student's education and future (Loera and Meichenbaum, 1993; Moores, 1987), the search for an approach to improving the situation demands a second look at the unique needs of this learner group and at how a natural language system may be designed to meet them.

American Sign Language (ASL) is a communication form used by many deaf individuals; Charrow and Wilbur (1975) reported that at the time of their writing, ASL was the third most widely-used non-English language in the United States after Spanish and Italian. They also reported that most prelingually deaf adolescents and adults in the United States have ASL as their native language. We have therefore chosen to focus on native or near-native users of ASL as our target learner group for the ICICLE system.

Having a strong native language base is of great benefit in the acquisition of a second language, and observations suggest that those deaf individuals who have had the benefit of early and natural acquisition of ASL display increased aptitude for the acquisition of English (cf. Charrow and Fletcher, 1974; Charrow and Wilbur, 1975; Swisher, 1989). Nevertheless, ASL and English are two distinct languages, and the broad differences between them pose challenges for the learner attempting to transfer general language knowledge from one to the other. For instance, ASL is a visual-gestural language whose grammar is distinct from and independent of the grammar of English or any other spoken language (Baker and Cokely, 1980). The sign-order rules of ASL are not the same as the word-order rules of English, and ASL syntax includes systematic modulations to signs as well as non-manual behavior (e.g., posture and facial expression) that achieve a simultaneous mode of communication not possible with the completely sequential nature of written English (Baker and Cokely, 1980). We discussed these differences, plus the differences in cognitive processing techniques between spoken and manual languages, in (Michaud and McCoy, 1998) and (Michaud et al., 2000).

For ASL natives, English is a distinctly different and challenging language, motivating the need to view the process of acquiring fluency in written English as second language acquisition and to incorporate that view in a strategy for facilitating the learning process. By espousing this perspective, we are consistent with the "Bi-Bi" philosophy in deaf education, where deaf children are seen as being bilingual and bicultural. The primary mode of communication in a Bi-Bi classroom is ASL, and English is taught as a second language (Barnum, 1984; Drasgow, 1993; Erting, 1978; Gutierrez, 1994; Johnson et al., 1989; Quigley and Paul, 1984; Stevens, 1980; Strong, 1988; Swisher, 1989). This philosophy encourages the students to build on their strong ASL language foundation as they acquire written English.

ICICLE therefore attempts to address the many difficulties facing the deaf user of ASL by providing an environment in which he or she can practice the usage of written English and get grammatical feedback and instruction without the "loss of face" associated with a human tutor.

Figure 1: A screenshot of the current ICICLE implementation.

This target user group, however, is very heterogeneous (Stewart, 2001), spanning individuals who can produce utterances close to those of a native English speaker and individuals who struggle with basic English structures.
In order to deal appropriately with such a population, it is clear that ICICLE needs to incorporate adaptivity to the level of its user.

Current Implementation and Architecture

The current prototype implementation of ICICLE is a graphical application which provides the learner with several windows through which to work with the text. The learner may load the text of an essay into the main window shown in the upper left of the screen shot in Figure 1. The learner may then ask the system to analyze the text, and, once analyzed, the text will be re-displayed in the window to the right with the errors identified via underlining. In a future implementation, a tutorial response employing natural language generation will explain the errors to the user; in this prototype, the bottom window of the application reproduces all sentences with errors, and the user has the option of querying the nature of those errors, accessing canned one-sentence explanations. (The prototype also currently allows the user to access a graphical depiction of the parse trees for each sentence, something intended for the ICICLE developers only that will not be openly available if the system is deployed among the target users.) The text may then be edited and re-analyzed as needed.

In order to identify errors, ICICLE uses a text parser to syntactically interpret samples of user-written text. The analysis process relies on lexical information from the COMLEX Syntax 2.2 lexicon (Grishman et al., 1994), the Kimmo morphological processor (Karttunen, 1983; Antworth, 1990), and Allen's TRAINS parser version 4.0 (related to the parser in (Allen, 1995)). We have provided the parser with an augmented CFG grammar which has been developed specifically for ICICLE. This grammar currently consists of 321 rules representing standard English constructions, augmented by mal-rules (Sleeman, 1982), or bug rules, which represent commonly-committed grammatical errors, derived from an error taxonomy compiled out of actual writing samples from deaf college students (Suri and McCoy, 1993).[1]

[1] The mal-rules in the grammar may be dependent on the first language of the learner to some extent. We anticipate that in adapting a future implementation of ICICLE to another population of learners, we may retain some of the existing mal-rules, while new mal-rules may need to be added, according to the typical errors committed by the new target population.

Figure 2: The ICICLE system architecture.

Figure 2 depicts how the components of ICICLE work in concert to connect the work of this parser to the rest of the system. ICICLE analyzes each sentence of the learner's writing in turn using the parser. The output produced by this process is a (sometimes large) set of possible parse trees, each representing a different syntactic analysis of the sentence. Some of the potential parses for a given sentence may contain errors, and others may not; in addition, different parses may place the "blame" for grammatical errors on different constituents. Selecting a single parse that best represents the grammatical structure the learner most likely intended to use has been a large focus of this work and is discussed below. Since determining the nature and cause of student errors is an integral step in deciding how to approach student instruction (Matz, 1982), the parser must be able to make principled decisions between these options. The single parse that is most likely given the learner's current mastery of the language is the one selected by the Parse Tree Selector. If this parse used mal-rules, then the sentence will be highlighted via underlining in the output display.
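As a purely illustrative sketch of this arrangement (the rule names, data structures, and example error below are hypothetical stand-ins, not taken from ICICLE's 321-rule grammar), the following toy code shows how standard rules and mal-rules might coexist in one parsing grammar and how a single ungrammatical sentence can come back from the parser with several candidate parses, each blaming a different constituent:

```python
# Minimal sketch only: toy stand-ins for a parsing grammar and its candidate
# parses. Rule names and data shapes are hypothetical, not ICICLE's.
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    name: str          # e.g., "VP -> BE V_ing NP"
    structure: str     # the grammatical structure this rule realizes
    is_mal_rule: bool  # True if the rule models a commonly-committed error

GRAMMAR = [
    Rule("S  -> NP VP",         "basic clause",        False),
    Rule("VP -> V_pres NP",     "simple present",      False),
    Rule("VP -> BE V_ing NP",   "present progressive", False),
    Rule("VP -> BE V_base NP",  "present progressive", True),   # mal-rule: bare verb after BE
    Rule("VP -> BE V_base NP",  "passive voice",       True),   # mal-rule: missing past participle
]

@dataclass
class Parse:
    rules_used: list
    description: str

# Hypothetical parser output for a sentence with a bare verb after BE:
# two analyses of the same string, each blaming a different structure.
candidate_parses = [
    Parse([GRAMMAR[0], GRAMMAR[3]], "progressive intended; -ing morphology missing"),
    Parse([GRAMMAR[0], GRAMMAR[4]], "passive intended; past participle missing"),
]
```

The Parse Tree Selector's job, discussed in more detail later, is to pick one of these candidates based on what the user model says the learner is likely to have in his or her internal grammar.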
In the future, components shown with dashed lines will be fully implemented, so that an Error Filter will cull from all of the errors committed those which are most relevant to the learner's current acquisition status (as indicated by the User Model), and those errors will be passed to a Tutorial Generator so that natural language explanations of the errors may be provided. As mentioned earlier, the explanations currently provided are constructed of canned text, and no filtering has yet been applied to the production of explanations.

It has been said that a well-designed tutoring system actively undertakes two tasks: that of the diagnostician, discovering the nature and extent of the student's knowledge, and that of the strategist, planning a response (such as the communication of information) using its findings about the learner (Glaser et al., 1987; Spada, 1993). A user model typically serves as a repository for the information passing between these two processes, representing what has been discovered about the learner and making that data available to drive the decisions of the system when planning tutorial actions. In terms of this description of an intelligent tutoring system, the receipt of a new piece of writing and its subsequent analysis is where ICICLE plays the role of the diagnostician; the planning of a tutorial response is where the role of the strategist is played out. Since the system is intended to be used by an individual over time and across many pieces of writing, these roles and the cycle they represent will be performed many times with a user; therefore, it is clearly beneficial for ICICLE to maintain a model of its users, serving as the repository for the information gathered by, and referenced for, each of those two tasks. With the aid of such a model, ICICLE adapts itself to the changing needs of a student across the learning journey. Below, each step of ICICLE's cyclic interaction with its user is outlined in terms of what demands it places on a model of its user in order to function in an adaptive way.

Parse Disambiguation

As stated above, in order to obtain a correct analysis of the source and nature of student errors, the parse tree selection module of the ICICLE system needs to choose between multiple parses or interpretations of each utterance. This selection may significantly affect the feedback the system gives to the user because the different parses may place the blame for the ungrammaticality in different places. For example, the parser may encounter the following sentence in the writing of a student:

(1) * She is teach piano on Tuesdays.

Different parses of this sentence may involve different error-capturing mal-rules exhibiting fundamentally different errors. We claim that having a model of the student's current mastery of English will help disambiguate between these choices. For instance, the system must determine whether the student is perhaps a beginning learner who is over-applying the auxiliary BE and had intended to generate the simple present instead:

(2) [Interpretation] She teaches piano on Tuesdays.
Another possibility is that the student has mastered the morphology of the simple present but has trouble forming the morphology of the progressive tense:

(3) [Interpretation] She is teaching piano on Tuesdays.

A third possibility is that the student has actually mastered both present and progressive tenses but is struggling with forming the passive voice:

(4) [Interpretation] She is taught piano on Tuesdays.

To determine which of these possibilities is correct, it is necessary for the error analysis component to have at its disposal a model of the student's grammatical proficiency which indicates his or her mastery of such language concepts as the morphology of the present, progressive, and passive forms of verbs. This knowledge aids in choosing between structurally-differentiated parses by providing information on which grammatical constructs the user can be expected to use correctly or incorrectly (McCoy et al., 1996).[2]

[2] This is not to say that the user will not make mistakes in already-mastered material. What we wish to produce is the most likely parse given the current mastery of the language.

Selective Tutoring

In a future version of the ICICLE system, after identifying the errors committed by a user, the system will select a subset of these errors to send to the tutorial response component for the generation of instructive text. This will be done by the Error Filter, which will rely heavily on the user model component. While all errors that ICICLE identifies will be marked in the output window, we intend ICICLE to give instruction only on those language components which are at the user's current level of acquisition; errors on those above this level are likely to be beyond the user's understanding, while errors on those which are well-established are likely to be simple mistakes which do not require instruction. Generating a tutorial response to errors in either of these ranges is likely to cause frustration in the learner (Linton et al., 1996). At the same time, many researchers agree (cf. Corder, 1967; Rueda, 1990; Vygotsky, 1986) that the efficacy of second language (L2) instruction is greatly increased when the instruction is adapted to the needs of the learner in his or her current state of acquisition. Learnability theory holds that L2 learners can acquire only those features they are ready to learn (Bialystok, 1978; Ellis, 1993; Higgins, 1995), and specifically in the case of our target learner audience, Kelly (1987) argues that pointing out every error committed by the deaf writer has the potential to be counter-productively overwhelming. These are all very strong arguments for the system to target its instruction selectively. Therefore, when ICICLE decides which user errors should provoke instruction, it should narrow this choice to that "narrow shifting zone dividing the already-learned skills from the not-yet-learned ones" (Linton et al., 1996, p. 83). Focusing instruction on this range has been the goal of other instruction systems such as Meno-tutor (Woolf, 1984; Woolf and McDonald, 1984), MULEDS (Opwis, 1993; Ploetzner et al., 1990), and LEAP (Linton et al., 1996).

The need to be selective when deciding what should be tutored upon places an additional demand on the user model: not only must it show the user's command of each grammatical structure, but it also must indicate which structures are likely to be learned next. With such knowledge, the error analysis component may trim away those errors outside this indicated realm of accessible and productive learning.
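As a rough sketch of the intended filtering behavior (the Error Filter is not yet implemented, and the status labels, data shapes, and function name here are our own illustrative assumptions, not ICICLE's), pruning could amount to keeping only errors on structures the learner is currently acquiring:

```python
# Hypothetical sketch of the planned Error Filter behavior; statuses would
# ultimately come from the user model rather than being supplied by hand.
from typing import NamedTuple

class DetectedError(NamedTuple):
    sentence: str
    structure: str   # the grammatical structure the error involves
    status: str      # "acquired", "being_acquired", or "not_yet" per the user model

def filter_for_tutoring(errors: list[DetectedError]) -> list[DetectedError]:
    """Keep only errors on structures at the learner's current level of acquisition.

    Errors on acquired structures are treated as slips; errors on structures
    beyond the learner's current level are assumed to be out of reach for now.
    """
    return [e for e in errors if e.status == "being_acquired"]

errors = [
    DetectedError("She is teach piano.", "present progressive", "being_acquired"),
    DetectedError("He go to school yesterday.", "past tense morphology", "not_yet"),
]
print(filter_for_tutoring(errors))   # only the first error would be tutored upon
```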
A VIEW OF SECOND LANGUAGE ACQUISITION

So far, we have established that a user model that indicates (1) a student's current level of grammatical competence and (2) what grammatical structures the user is currently attempting to acquire would greatly benefit a CALL system. We turn now to some linguistic theories of the Second Language Acquisition process as a basis for our user model's design.

Interlanguage

In previous work on the ICICLE user model, we established its essential nature as a representation of the user's location along the path toward acquiring written English as a second language (Michaud and McCoy, 1999, 2000). Corder (1973) stressed the need for such a representation to go beyond the question, "Does the learner know structure X?" to model the answer to: "What language rules is the learner using?" To design a representation to contain such information, we turned to the Interlanguage theory of Second Language Acquisition (Selinker, 1971, 1972). In this theory, a learner produces utterances in the L2 from an internalized grammar which represents his or her hypothesis of the L2. From an initial hypothesis I1, the interlanguage is revised systematically over time as the learner acquires the L2, progressively moving toward a model which results in more native-like linguistic performance (Corder, 1974). The target of these revisions is some In, an approximation of the L2.[3]

[3] If the learning were perfect, the resulting grammar would be that of the L2. However, many researchers believe that L2 acquisition is almost always imperfect and that rule fossilization often occurs prior to actually achieving the L2, therefore resulting in an imperfect approximation.

A Language Hypothesis

If the learner's interlanguage were captured at a given moment in this progression, it would be a "complete" grammar of the L2 as the learner understands it at this time. Corder (1974) referred to this individual language hypothesis as a kind of "transitional" dialect which he labeled an "idiolect." Somewhere between the learner's earliest hypothesis I1 and the target goal In, the stage Ii is therefore a distinct interlanguage in and of itself (Schwartz, 1998; Schwartz and Sprouse, 1996), and is the grammar from which the learner is currently generating L2 utterances. The core of ICICLE's user model, a representation of current user language competence, strives to capture this Ii, a snapshot of the user's internalized grammar. The contents of this model reveal to the system the status of the learner's acquisition of the grammatical structures recognized by the system.

Interlanguage Transitions

Although we may speak of Ii, a given stage of a learner's interlanguage, we must also acknowledge that the interlanguage is not a static entity. As a hypothesis, it is under constant revision as the learner systematically examines portions of the hypothesis and updates them to reflect the L2 more closely. Over time, more of the interlanguage correctly models the target language, and less reflects incorrect assumptions. Many researchers refer to this transitional process as Hypothesis Testing, where the learner is actively engaged in the systematic comparison of a portion of his or her interlanguage hypothesis against the L2, inducing new rules and testing their validity (cf. Corder, 1974).[4]
It is understood that the learner perceives a difference between his or her productions (generated by his or her internal grammar, the interlanguage) and the input which he or she receives, leading to a revision or restructuring of the interlanguage grammar in favor of the target-like form (Brown and Hanlon, 1970; Carroll, 1995; Selinker, 1971, 1972, 1992). Ellis (1994) characterizes each "step" of the interlanguage grammar as sharing "rules" with the previous step, but differing in that some rules have been added or revised. This revision affects the portion of the interlanguage hypothesis that is currently the focus of the learner's Hypothesis Testing.

[4] Note that the selection of the portion of grammar to come under Hypothesis Testing is systematic and not a random choice. The uniformity of the sequence of acquisition across different learners is discussed in more depth later.

The Frontier of Acquisition

As interlanguage transitions occur between states Ii and Ii+1, the portion of the hypothesis which is the focus of the learner's attention is of interest to us because it represents those language structures the learner is currently in the process of acquiring, the very set of structures we discussed above as the ideal focus of effective tutorial instruction because they are neither already mastered nor beyond the user's reach. We have adapted Lev Vygotsky's concept of the Zone of Proximal Development (ZPD), the subset of material being mastered that is currently within the learner's grasp to acquire (Vygotsky, 1986). The general idea has already been applied to second language acquisition by researchers such as Washburn (1994) and Krashen (1982), who stated that when the learner is at some level i in acquiring the L2 grammar, there is some part of the grammar at level i+1 that the learner is "due to acquire." Applying the ZPD concept directly to the interlanguage domain, therefore, the ZPD corresponds to the portion of the interlanguage that is the center of Hypothesis Testing and is in the process of making a transition to the target (L2) grammar.

The identification of the ZPD for a given second language learner would be an ideal indication of the next language structures he or she will acquire, the "frontier" of the acquisition process, and consequently, those structures on which instruction would be most beneficial because they are neither well-established nor beyond his or her ability to learn at this time. Researchers in the acquisition of learner knowledge in general (cf. Ragnemalm, 1996) and language acquisition in particular (cf. Brown and Hanlon, 1970; Corder, 1967; Ellis, 1994; Krashen, 1983) have noted the "immature" or "transitional" performance which appears for a time before acquisition. From this it is possible to conclude that the changing errors observed in second language learners are indicative of the frontier of their acquisition process. While a single error may be relatively meaningless, the consistent occurrence of specific errors gives us information from which we can describe "a picture of the linguistic development of a learner" (Corder, 1974, p. 125), or the learner's current interlanguage state (Ii).
Toward Modeling the Interlanguage

We stated above that by modeling the user's language proficiency, we would empower the ICICLE system to discriminate between competing syntactic parses of user utterances on the basis of what L2 performance can be expected from this user. In order to describe how this would proceed, we must first establish the relationship between our user model representing a learner's interlanguage state Ii and the parses between which ICICLE must choose. These parses are made of syntactic constituents formed by rules in our parsing grammar. As discussed above, this grammar consists of rules reflecting both standard and error-containing English syntax. In effect, each "language structure" in the user's interlanguage is realized by some group of specific syntactic rules representing different possible realizations, both correct and incorrect.

As the learner forms and tests an interlanguage hypothesis, he or she focuses on certain structures and tests the hypothesis concerning these structures. If learning is completely successful, as acquisition progresses, a given structure which was first represented within the interlanguage hypothesis via one or more mal-rules will be revised to eventually be represented by good grammar rules that correctly generate it. Note that before this happens, an initial revision may be in favor of other mal-rules representing misconceptions, but ones which are closer on the path to a correct characterization of the L2, the intended target of the sequence of interlanguage revisions.

Our representation of a learner's Ii can therefore be characterized as consisting of a set of language "rules," each modeling how the language structures of the L2 are theorized by this learner at this point. Some rules will model standard L2 production, and some will model misconceptions. Since some of the structures involved are undergoing transition because they are in ZPDi, there will also be competing rules, both correct and incorrect, which cover the same structure. When working out the nature of the L2 during Hypothesis Testing, the learner's productions may vacillate between the competing realizations of a structure that is currently being acquired.

One Grammar, Many Users

Since the learner generates sentences in the L2 using the rules which are in his or her interlanguage, it is clear that our parsing grammar must include the rules that could be found in any learner's Ii in order to have the ability to recognize the syntax of any learner's writing. Two characteristics of ICICLE's users drive the determination of the contents of this parsing grammar. First, as mentioned earlier, ICICLE is designed to be used by a heterogeneous user group, spanning a broad range of English writing proficiency. Secondly, ICICLE is intended to follow a given user over time and the development of new language skill. Therefore, the parsing grammar cannot reflect a unique Ii but rather must reflect the union of all possible Ii. This means that it must contain all of those rules which could be in any Ii for any user of the system. This includes a complete, broad-coverage grammar representing the English language, to capture all of the structures which a learner may have already acquired. It also includes a broad range of mal-rules to cover any of the misconceptions a learner may have during the progress toward L2 competence.
Since this grammar is designed to ideally encompass all possible Ii, it is analogous to the concept of a "space" of possible student models mentioned by Sleeman (1982) and Opwis (1993), among others, usually defined, as in our case, as a set of rules and mal-rules. Our grammar can therefore be postulated as a grammatical space containing all possible interlanguages Ii. This space, which we will call I, contains:

* All correct grammatical rules which model standard L2 production.
* All possible incorrect rules modeling incorrect hypotheses of grammatical structures (mal-rules) exhibited by the population.[5]

[5] To develop a grammar to approximate I, we collected a large number of writing samples from the population. These samples spanned a wide range of writing proficiency and thus represented learners at various stages of acquisition. We have attempted to develop a grammar that could parse this corpus (Schneider and McCoy, 1998). The resulting grammar approximates I.

Given this definition of I, a specific user's interlanguage state Ii at any point in time i would consist of a subset of I, specifically:

* Those correct grammatical rules which realize grammatical structures he or she has acquired.
* Those incorrect rules (mal-rules) which represent remaining misconceptions which the process of Hypothesis Testing has not yet repaired. This represents the user's implementation of those grammatical structures still unacquired.
* Both correct and incorrect rules covering those structures currently undergoing Hypothesis Testing (the current ZPD, which we will label ZPDi).

This characterization of the user is illustrated in Figure 3. Those structures which are acquired are represented by correct rules (shaded squares), while those about which the user maintains misconceptions are shown as mal-rules (shaded "broken" shapes). Structures in this user's ZPDi would be represented by both types of elements as different realizations compete with each other during Hypothesis Testing.

Figure 3: The user's interlanguage as a subset of grammatical space.

Reflecting the Individual

Our modeling task is therefore to determine what rule subset forms a user's Ii. We do this by observing his or her written productions, in conjunction with an inferencing mechanism based on typical acquisition order which fills in missing knowledge.[6] Although the Interlanguage theory is based on verbal L2 production, observations by such researchers as Pennington (1996) have concluded that L2 writing is also reflective of a systematically changing cognitive interlanguage representation of the L2. Mangelsdorf (1989) holds that Hypothesis Testing is visible in written L2, and that learners performing Hypothesis Testing in a written format will also adjust their knowledge based on the feedback they receive, bringing their interlanguage closer to the L2. With respect to the L2 writing of deaf students, Russell et al. (1976) observe the existence of systematic rules which, while frequently representing a deviance from Standard English, approach a more standard model of the language over time. Given this and other work in acquisition (cf. Krashen, 1981; Krashen et al., 1978; Kelly, 1987; Tarone, 1982), we can conclude that the kinds of expository texts which are the expected input to the ICICLE system will indeed reflect what the user has truly acquired, and therefore are much truer indicators of the learner's Ii than if the system were based on elicitation exercises, where the learner would be more focused on grammar than communication.

[6] This inferencing mechanism is discussed in more depth later.
The implication is that we can expect structures which have been acquired prior to state Ii to occur in the learner's language usage without significant variation or error, while structures in the ZPD at state Ii (ZPDi) will exhibit the variation typical of transitional competence. Meanwhile, structures beyond the ZPD should either be absent from the learner's language production because of avoidance, or they should be used with consistent error because they cannot be avoided. At state Ii+1, many structures which were in ZPDi have now been acquired and should be part of the learner's area of competence, and a new ZPDi+1 will have been formed as new rules come under the focus of the learner's Hypothesis Testing.

Implementation

For our implementation of the ICICLE user model, one of the approaches we have incorporated into our design is that of an overlay model. The underlying idea of an overlay is that the user's knowledge is represented as a subset of some knowledge space stored in the system.[7] Since we have described our user model as essentially representing the user's Ii, the knowledge space here involved is the space of grammatical constructs in the English language.

[7] Sometimes in an overlay model this subset is marked strictly in a boolean fashion (the user has acquired a particular unit, or has not), and sometimes degrees of certainty or degrees of mastery are indicated.

This overlay-based aspect of ICICLE's user modeling component is called MOGUL, for Modeling Observed Grammar in the User's Language. We refer to the basic unit of information in MOGUL as a Knowledge Unit, or KU, a term borrowed from Desmarais et al. (1996), who define the concept as representing a meaningful unit of knowledge in the domain, where the user's mastery of a KU can be reliably assessed. In the domain of language mastery, we see each language "structure" as a Knowledge Unit. KUs are essentially associated with bundles of the grammatical rules in our parsing grammar. For example, one KU might be Subject Relative Clauses. Associated with this KU would be all of the rules from the parsing grammar which implement this concept. This includes not only those rules modeling correct execution of this type of relative clause, but also the mal-rules which realize the ways in which this structure is executed incorrectly by the learner population.

We assess the user's mastery of each KU as follows:

* Structures which the learner uses consistently correctly (reflecting grammar rules modeling standard English) are assessed as acquired.
* Structures which exhibit consistent error when present (generated from mal-rules, presumably) are assessed as unacquired.
* Structures which exhibit variable performance are determined to be in the ZPD.

This mastery is recorded on each KU of the MOGUL model by tagging KUs as "acquired," "unacquired," or "ZPD." In the next section, we address how we have extended this basic idea to compensate for possibly incomplete data on the individual user.
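The following sketch illustrates the kind of bookkeeping this tagging implies. It is ours rather than MOGUL's actual implementation: the KU fields, the counts, and the thresholds deciding what counts as "consistent" are hypothetical stand-ins.

```python
# Illustrative sketch of an overlay-style KU record and its tag assessment.
# Thresholds and field names are assumptions, not MOGUL's actual values.
from dataclasses import dataclass

@dataclass
class KnowledgeUnit:
    name: str            # e.g., "subject relative clauses"
    attempts: int = 0    # times the structure appeared in the user's writing
    correct: int = 0     # times it was executed without error
    tag: str = "untagged"

def assess(ku: KnowledgeUnit, min_evidence: int = 3,
           hi: float = 0.9, lo: float = 0.2) -> str:
    """Tag a KU as acquired, unacquired, or ZPD from observed usage.

    Consistently correct usage -> acquired; consistent error -> unacquired;
    variable performance -> ZPD.  Too little evidence leaves the KU untagged,
    to be filled in later by stereotype-based inference.
    """
    if ku.attempts < min_evidence:
        return "untagged"
    rate = ku.correct / ku.attempts
    if rate >= hi:
        return "acquired"
    if rate <= lo:
        return "unacquired"
    return "ZPD"

ku = KnowledgeUnit("present progressive", attempts=6, correct=3)
ku.tag = assess(ku)   # -> "ZPD": variable performance on this structure
```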
Incomplete Knowledge

As described so far, the data stored in MOGUL is derived by the system from the written language performance of the user. An inescapable fact of deriving language data from freeform written text, however, is that through lack of opportunity or deliberate avoidance, many language structures will be absent (Corder, 1973; Wilbur et al., 1976). This complicates our modeling process because we are forced to proceed on incomplete data about learner characteristics.

In the Grundy system, Rich (1979) also addressed the problem of basing system decisions on small amounts of possibly incomplete information about the user. In order to compensate for the fact that Grundy was able to collect data on only some characteristics of the user, Rich developed the idea of "stereotypes," which essentially represented populations of users and their associated typical characteristics. This allowed some of the data already collected on the user to "trigger" associated characteristics which could be inferred about the individual until such a time as other data contradicted that conclusion. An example in Grundy's domain (the suggestion of books to a library patron) would be the inference that because a patron is female, she may be interested in a romantic novel. If the patron indicated otherwise during her interaction with Grundy, this inference would be overridden.

In our application we have been inspired by Rich's use of stereotypes. Analogous to the user characteristics she describes, we have the tags (acquired, unacquired, ZPD) marked on various KUs in the MOGUL model. However, we distinguish our approach from Rich's in several key ways. We define our stereotypes not just as groups of users within a category but rather as collections of individuals who are generally at the same point on the path toward acquiring a language. Therefore, our collection of stereotypes is really a series of way-points a typical user passes through along this acquisition path. Rather than a specific characteristic of the user "triggering" a stereotype, the overall similarity of the user to a stereotype (i.e., looking at the collection of characteristics as a whole picture) is what causes stereotype data to be activated for that user (this will be discussed in more depth later). Finally, because of our definition of a stereotype, we are not merely concerned with enabling the system to repair an incorrect stereotype selection in order to home in on a static user, but with allowing for a user who is learning, so that the stereotype selection criteria must include adaptation to data which is changing as the user moves along the acquisition path to a new level.

One problem that we have faced is that if our stereotypes are to capture typical acquisition sequences, we must make explicit what that acquisition sequence is. While there has been some work in linguistics that indicates common patterns in L2 acquisition, no work has specified these patterns in sufficient detail to be able to capture them computationally. The next section addresses this problem.

The Order of Acquisition

In other user modeling systems, when data about a specific KU is lacking for a user, inferencing behavior is sometimes implemented to empower the system to fill in the missing data based on relationships between KUs about which the system has no knowledge (call these "empty") and those containing known tags. Typically, this is accomplished by modeling the acquisition relationships between the KUs. Often, the relationships indicated are prerequisite relationships; in other words, the model may show that KUm (for example, algebra) is a prerequisite concept to understanding KUn (for example, calculus). If, then, the user has exhibited a mastery of calculus, the system can infer that the user may have also mastered algebra even if there is no direct evidence for that conclusion.
In our domain, we can establish analogous relationships which exist in two dimensions: concurrent acquisition and order of acquisition. Researchers in L2 acquisition have suggested that the errors made by a language learner over time change in a systematic fashion (Dulay and Burt, 1974), and furthermore that there is support for a typical sequence of acquisition for language structures (Bailey et al., 1974; Dulay and Burt, 1975; Gass, 1979; Krashen, 1982; Larsen-Freeman, 1976; Schwartz and Sprouse, 1996; Schwartz, 1998), sometimes called a "built-in syllabus" for L2 acquisition (Corder, 1967; Higgins, 1995). There are competing models of the acquisition process which account for this, and the details of the particular sequence of acquisition (i.e., the built-in syllabus itself) are rather vague in the literature. Without espousing a particular model, we borrow from the consensus of these accounts the notion of the acquisition of language occurring in some stereotypical sequence. We describe here our computational model for capturing this idea. We refer the reader to (Michaud and McCoy, 2004) for descriptions of our empirical efforts, involving a corpus study of written samples from users at various levels of proficiency, to establish a concrete order for this syllabus.

We have incorporated the notion of a sequence of acquisition into an architecture which is based on a partial ordering of language structures (grouping together those structures which are acquired concurrently, and ordering these groups according to stereotypical sequences) in order to enable the ICICLE system to infer tags based on these acquisition relationships. Below, we describe how these acquisition orders are incorporated into our model of the user's interlanguage state.

SLALOM: An Organization on Grammatical Space

The name of the stereotype architecture which is coupled to the MOGUL model is SLALOM (Steps of Language Acquisition in a Layered Organization Model). The design was originally proposed in (McCoy et al., 1996) and has significantly evolved over the years (Michaud and McCoy, 1999, 2000; Michaud et al., 2001). A very simplified representation of SLALOM's basic structure can be found in Figure 4, where each KU is represented by a rectangular box.

Figure 4: SLALOM: Steps of Language Acquisition in a Layered Organization Model.

The first half of SLALOM's name, Steps of Language Acquisition, refers to how SLALOM captures the stereotypic linear order of acquisition within certain "hierarchies"[8] of morphological and/or syntactic structures such as negation, noun phrase construction, or relative clause formation. The figure depicts a hierarchy as a vertical stack of boxes. As an example, we illustrate a Morphology hierarchy based on the findings of Dulay and Burt (1975).[9] Within these hierarchies, a given morphosyntactic KU is expected to be acquired subsequent to those below it, and prior to those above it, according to the natural order of a stereotypical learner from this particular L1 acquiring English.[10] The figure's example represents the idea that "+ing progressive" is typically learned before "+s plural nouns," which is typically learned before "+ed past tense."

Since we wish to coordinate the acquisition stages between the hierarchies, dashed lateral connections in the figure represent the second dimension of relationship stored between the KUs in the model, namely that of concurrent acquisition.[11] We call these lateral groupings "layers," and the figure has one drawn in as an example.[12]

[8] The term "hierarchy" refers to the fact that certain KUs are learned before others.
[9] This is for example purposes only, since their morphology sequence is relatively simple. The sequence we actually use can be found in (Michaud, 2002).
[10] Although some research has shown high correlation between the acquisition orders of learners from different L1s, we wish to represent the most likely order possible and thus have restricted the model to representing learners from a specific L1, in our case ASL.
[11] This relationship may also be indicated within a hierarchy (not shown in this figure) to capture a partial ordering in which some structures in the same hierarchy are acquired together.
[12] The contents of this layer are not based on empirical findings and are for illustration only.
This is the source of the Layered Organization Model part of SLALOM's name, referring to these layered groupings of KUs which essentially illustrate a progression through the acquisitional sequence, representing those KUs being learned at certain stages of acquisition. In particular, we expect that students who are just beginning to learn written English will first struggle with those KUs in the "first layer" and then will progress "up" the hierarchies. A simple view of this progression can be seen in Figure 5. In reality, some KUs participate in more than one layer, which indicates the relative speed with which some concepts are acquired (i.e., KUs that span more than one layer take longer to master completely).

Figure 5: A simple view of stereotype progression in SLALOM.

Like Rich's stereotypes, SLALOM's acquisition relationships allow the ICICLE system to infer absent data on the user from that which has been recorded. The uniqueness of the SLALOM concept is that it captures a sequence of stereotypes, not as individual images, but implicitly within one interlinked architecture. Its layers each indicate the structures which may, at one stage in the learning process, form part of that stage's stereotypical ZPD. Those layers located "below" in the model contain structures which are typically mastered earlier, while those layers "above" contain what is typically acquired later in the learning process, at a later stereotypical stage. If a KU which is untagged in MOGUL represents some language structure which SLALOM shows to be "below" an Acquired structure or a ZPD structure, then the system can infer that the untagged KU is probably also Acquired, even if it has not occurred previously in the user's observed language production. Likewise, structures sharing the same layer in SLALOM should share the same tags in MOGUL, and structures "above" the ZPD layer can be inferred to be Unacquired. These indirect conclusions are not as strong as those based on actual empirical data from the specific user, but in the absence of stronger data they can be used to make planning decisions.

In order to represent the acquisition sequence in this way, we must "place" each KU into SLALOM in a way that illustrates its relationships to other KUs according to how a typical learner moves through the acquisition sequence. A prototype placement has been implemented, based on an investigation of acquisition data among deaf students summarized in (Quigley et al., 1977) and (Wilbur, 1977). We again refer the reader to (Michaud and McCoy, 2004) for a discussion of our ongoing work to develop SLALOM placements based on our own corpus of writing data.
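As an illustration of how such layer-based inference might work, the sketch below makes our own simplifying assumptions: a single linear sequence of layers, hypothetical KU names and layer contents, and no confidence weighting. An untagged KU receives a provisional tag by comparing its layer to the layer currently believed to hold the user's ZPD:

```python
# Sketch of stereotype-based inference over SLALOM-style layers.
# Layer contents and the single linear ordering are illustrative only.
LAYERS = [
    ["+ing progressive", "plural -s"],        # typically acquired earliest
    ["past tense -ed", "copula BE"],
    ["relative clauses", "passive voice"],    # typically acquired latest
]

def layer_of(ku_name: str) -> int:
    return next(i for i, layer in enumerate(LAYERS) if ku_name in layer)

def infer_tag(ku_name: str, zpd_layer: int) -> str:
    """Provisionally tag an unobserved KU relative to the stereotypical ZPD layer.

    Below the ZPD layer -> probably acquired; same layer -> probably in the ZPD;
    above -> probably unacquired.  Direct evidence from the user's own writing
    always overrides these weaker, stereotype-based conclusions.
    """
    layer = layer_of(ku_name)
    if layer < zpd_layer:
        return "acquired (inferred)"
    if layer == zpd_layer:
        return "ZPD (inferred)"
    return "unacquired (inferred)"

# If observed data places this user's ZPD in the middle layer, then:
print(infer_tag("+ing progressive", zpd_layer=1))  # acquired (inferred)
print(infer_tag("passive voice", zpd_layer=1))     # unacquired (inferred)
```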
Representing a Changing User

Many user modeling systems are more concerned with how to utilize the information stored within the model in order to drive system decisions than with the tasks of initially building up that information about the user (cf. Lesgold et al., 1992; Woolf, 1984) or with changing it to reflect a dynamic user over time (cf. Paris, 1988, 1993). In those systems, the user model is often static and assumed "given" from some outside source. Bull et al. (1995a) point out, however, that a student model must be dynamic in order to change as learning occurs. Since ICICLE's user model needs to capture fine details about the user derived from user performance, and ICICLE will be used by an individual over time and across the development of new skills, it is central to the model design to establish how this model is initialized and updated over the course of time.

Initialization

The initialization data for ICICLE's user model comes from the first parses performed by the error identification component and the subsequent first analysis of the user's performance on the KUs captured in the system. Once this information is received by the user modeling component, it can assign tags in the model according to how the user performed on his or her first writing sample and all of the language structures it contains.

One serious issue this raises is that the first parse of the first piece of writing by any new user must be performed without the use of the user model to help the system select between parses. This appears to be a non-trivial problem, involving the need for information on expected user performance without yet having any data on exhibited user performance. One approach taken by similar systems such as EDGE (Cawsey, 1993) is to base the initial state of the user model on a general classification of the user provided to the system. We have embraced a similar approach; a new user account is created with a classification of "Low," "Middle," or "High" written English proficiency, depending on the user's self-reported classification. This is a stereotype classification only, essentially placing the user's current acquisition level at one of the layers in SLALOM; it is not recorded within the MOGUL model, because it is not considered to be as reliable as genuine grammatical competency data specific to the individual. In the domain of language and writing proficiency, it is difficult to obtain self-assigned or even teacher-assigned classifications which are not highly subjective and variable according to individual standards. This classification suffices, however, to provide a "best guess" framework on which to base the first analysis of a new user's work.[13]

[13] For our discussion of an evaluation in which we test how well ICICLE adapts from an initial stereotype guess to a more accurate portrayal of the user, please see (Michaud et al., 2005).

Once ICICLE successfully performs this first analysis, an initial set of tags is placed on the KUs representing language structures in MOGUL. This information immediately takes precedence over the initial SLALOM placement and determines any future relationship between the user and any assumed stereotype, because it is our belief that the MOGUL model, being based directly on the user's writing in multi-sentence samples of potentially significant length, has a very rich and reliable source of input on the user's actual language proficiency.
In comparison to polling user knowledge, where one question is only likely to reveal one point of data (either the user understands or does not understand the concept asked about), even a short piece of writing is going to offer many points of data per utterance. Every grammatical construct successfully or unsuccessfully used, from determiner choice to verb tense, provides information about the user. These points can be correlated to provide a map of those constructs used correctly, those which are experiencing variation, those which are occurring with error, and those which are absent; therefore, even after only a single writing sample, the MOGUL profile of the user's interlanguage will contain a rich store of directly-derived data about this individual. This avoids the sometimes undesirable activity (Anderson et al., 1990; Rich, 1979) of having to question the user extensively in order to determine appropriate system actions. Finally, as discussed above, free-form writing for communicative purposes is a more reliable source of interlanguage data than grammar performance elicited through tasks such as translation or fill-in-the-blank exercises.

Updating

Every time ICICLE performs an analysis of user text, the system has new (and potentially different) observations of user performance to add to MOGUL, and that information is sent along so that it may cause the model to be updated accordingly. The relationship of the user model to the error analysis phase is represented in Figure 6.

Figure 6: The retrieval/update cycle between the error analysis phase and the user model.

The tags in MOGUL may be deemed incorrect for one of two reasons: an existing tag might have been based on a small sample of user performance and more data might show that it is wrong, or the user's proficiency may simply be changing, as we hope it will over time. In either case, a tag may need to be revised either "upward" (e.g., from "ZPD" to "Acquired") or "downward" (e.g., from "Acquired" to "ZPD"). ICICLE must therefore have a mechanism for re-evaluating these tags and updating their values. Glaser et al. (1987) note that learning assessment must be able to account for changes in performance as new stages of acquisition are manifested in learner performance. However, a model that can be overwritten over time gives rise to the question of whether new data should always take precedence over the old, as with the EDGE system, which always overwrites previously recorded tags (Cawsey, 1993).

The guidelines given thus far for how observations are recorded in the model have been that features used "consistently correctly" are acquired concepts, those used "with variation" are in the ZPD, and those which appear "consistently incorrectly" have not yet been acquired. As the system goes through more than one piece of the student's work and the amount of data increases, these judgments may change, particularly if one or more of the pieces is too short to contain several instances of certain language structures, or a structure is simply not a commonly used one.
Therefore, it makes sense for the model to track certain figures (the number of times a structure has been attempted, and the number of times it was executed without error) across more than one piece of writing, and to make distinctions between figures collected within the most recent piece of writing and those collected across others in the past (since the user's proficiency will not change within a given piece, but there may be change across a selection of them). This allows the system to examine as much data as possible, strengthening its ability to make these judgments. In this view, the user's writing is seen as a continuum of performance events over time from the first session to the most recent. But since the user's proficiency is also changing, the system should not always compute performance statistics which include events stretching back to the beginning of his or her use of the system, when the performance levels may have been different. Therefore, our model architecture includes the maintenance of a "sliding window" of performance history across the continuum of writing samples stretching into the past from the current sample (see Figure 7).

Figure 7: The "sliding window" of performance history.

Ideally, this window includes enough data to be robust, and yet is small enough to capture only the "current" statistics.[14] This latter requirement is particularly important for the system's self-evaluation and for deciding whether recent explanatory attempts have succeeded.

[14] We have not yet implemented the sliding window in the prototype system, because at this time we have not had the opportunity to collect significant longitudinal data or to deploy the system with an actual user over a period of time, as we intend for the future use of the system.

One advantage to using the "sliding window" approach is that the user model always reflects the user's "current" levels of proficiency, even if that proficiency changes in a backward direction. It is, of course, our hope that backsliding does not occur in our users, but in some cases it has been observed that a language learner who has been performing well on a structure will start to make errors again at a later time. By always tracking how the user is performing only in the "recent" past, the user modeling component will be able to revise its tags in either direction as that performance changes. There are several possible reasons for the re-introduction of errors in performance, including the over-generalization of rules, the discarding of unanalyzed formulae which formerly produced correct performance without the underlying grammatical knowledge, or the effect of the cognitive load created by learning new concepts or focusing more on discourse-building tasks. Although some of these situations represent a backsliding in performance while actual proficiency may be advancing, it should be noted that the objective of MOGUL is to represent a profile of the user's current performance in order to select the most appropriate parses to match. Therefore, a downward revision of a tag would be an accurate reflection of the current production of errors, even if conceptually it seems inappropriate to tag a structure as being Unacquired or in the ZPD if the learner is actually very close to full mastery. The tag will catch up to the learner's actual proficiency when the performance does.[15]

[15] We do acknowledge that a problem with "inaccurate" tags from a proficiency point of view is that the user modeling component may use SLALOM to draw incorrect inferences from these underestimated tags. This is another of the reasons why inferences based on the relationships in SLALOM are not as certain as those based directly on actual performance from the learner.
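Although the sliding window is not yet implemented in the prototype (see footnote [14]), the bookkeeping it implies is straightforward. The sketch below shows one possible shape for it; the window size, the per-sample counts, and the class name are our own assumptions rather than a description of the planned implementation.

```python
# Sketch of a sliding window over per-sample performance counts for one KU.
# Window size and data layout are assumptions made for illustration.
from collections import deque

class PerformanceWindow:
    """Keeps (attempts, correct) counts for the N most recent writing samples."""

    def __init__(self, max_samples: int = 5):
        self.samples = deque(maxlen=max_samples)  # older samples fall off the back

    def add_sample(self, attempts: int, correct: int) -> None:
        self.samples.append((attempts, correct))

    def totals(self) -> tuple[int, int]:
        attempts = sum(a for a, _ in self.samples)
        correct = sum(c for _, c in self.samples)
        return attempts, correct

# Recent samples dominate: early error-free usage eventually slides out of view,
# so a run of recent errors can pull a tag back down toward "ZPD" or "unacquired".
window = PerformanceWindow(max_samples=3)
for attempts, correct in [(4, 4), (5, 5), (6, 3), (5, 1)]:
    window.add_sample(attempts, correct)
print(window.totals())   # counts reflect only the three most recent samples
```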
APPLICATIONS OF THE MODEL

The information in the user model supplies two kinds of data to the various ICICLE processes: direct data on the user's performance from MOGUL, and indirect data which can be inferred at need from the relationships of untagged items to tagged ones in the SLALOM hierarchies. The latter data will be less reliable, and ICICLE will take note of this by weakening the conclusions drawn from it. Details on how the user model is and will be applied are discussed below.

Parse Disambiguation: Choosing a Parse Using Likely Rules

As discussed earlier, a given learner from our population generates written English from some Ii, or a subset of the parsing grammar which includes some standard rules and some mal-rules depending on his or her level of development. Our MOGUL model captures this Ii by indicating which grammatical structures are acquired (and thus are realized in Ii by standard rules), unacquired (realized in Ii by mal-rules), and in the ZPD (realized in Ii by competing rules which result in variable performance). This information is used by the Parse Tree Selector module of ICICLE to select a parse tree that uses rules that are most likely to be in Ii. Recall that a parse tree of a sentence is made up of the rules in the grammar that were used to parse the sentence. In the current implementation, we use both portions of our user model (the direct proficiency data in MOGUL, and the stereotypic acquisition data in SLALOM) to score each parse tree, with the intent of giving trees that are made of rules believed to be in the user's Ii higher scores than those which are constructed of rules that are not likely to be in the user's Ii (and therefore unlikely to appear in their language production). The scores are assigned as shown in Table 1.

Table 1: Calculating Node Scores.

  MOGUL Tag    Rule Type    In Ii?    Score
  Unacquired   Regular      no        -1
  Unacquired   Mal-Rule     yes       +1
  ZPD          Regular      yes       +1
  ZPD          Mal-Rule     yes       +1
  Acquired     Regular      yes       +1
  Acquired     Mal-Rule     no        -1

If a grammatical structure has not been acquired by the user, we do not expect that user to be able to execute it without error; therefore, a tree node formed by a mal-rule representing that structure is scored +1, while a node formed by a regular (correct) rule is scored -1. In the case of an acquired structure, the expectation is reversed: the user would be likely to use it correctly rather than commit an error while attempting it. Therefore, nodes formed by regular rules are scored high (+1) and nodes formed by mal-rules are scored low (-1). Structures in a user's ZPD are realized by competing correct and incorrect rules in Ii, so ZPD-marked structures are scored the same regardless of whether the structure is executed correctly or poorly. Given this assignment of scores throughout the nodes of the trees, each parse tree is assigned a single score which is the average score of all of the rules used in the parse tree. A tree which has the highest score[16] is selected as the single most likely parse.

[16] At this time there may be more than one parse tree receiving the highest score; we are working to further strengthen the differentiation power of the scoring procedure, as discussed in (Michaud et al., 2005).
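The scoring scheme in Table 1 reduces to a small computation, sketched below. The data structures, the handling of untagged structures, and the tie-breaking are our own assumptions; only the +1/-1 node scores and the per-tree averaging follow the table.

```python
# Sketch of the Table 1 scoring scheme. Types are illustrative stand-ins; only
# the +1/-1 scores and the per-tree averaging come from the description above.
from dataclasses import dataclass

@dataclass
class RuleUse:
    structure: str     # the KU (grammatical structure) this rule realizes
    is_mal_rule: bool

def node_score(rule: RuleUse, tags: dict[str, str]) -> int:
    """Score one tree node given the MOGUL tag of the structure it realizes."""
    tag = tags.get(rule.structure, "ZPD")   # untagged treated as ZPD here (an assumption)
    if tag == "ZPD":
        return 1                            # correct and erroneous realizations both plausible
    expected_mal = (tag == "unacquired")    # unacquired -> expect mal-rules
    return 1 if rule.is_mal_rule == expected_mal else -1

def tree_score(rules: list[RuleUse], tags: dict[str, str]) -> float:
    return sum(node_score(r, tags) for r in rules) / len(rules)

def select_parse(parses: list[list[RuleUse]], tags: dict[str, str]) -> list[RuleUse]:
    # Ties are broken arbitrarily here (first of the best); see footnote [16].
    return max(parses, key=lambda rules: tree_score(rules, tags))

# For a user who has acquired the simple present but not the progressive,
# the parse blaming progressive morphology outscores the one blaming the present.
tags = {"simple present": "acquired", "present progressive": "unacquired"}
parses = [
    [RuleUse("simple present", True)],        # error blamed on the simple present
    [RuleUse("present progressive", True)],   # error blamed on progressive morphology
]
best = select_parse(parses, tags)
```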
The value 0, being in between, would reflect an inability to state a belief about the rule's status. Although 0 is never assigned to a rule node [17], it is clear that later iterations of the scoring mechanism may use the strength of evidence behind a tag to produce a measurement of confidence in the score. If this confidence measure were expressed as a percentage, e.g., 90%, it could be applied against either "yes" or "no" scores by multiplication to affect the strength of the score. Low confidence levels would bring the score closer to 0, the neutral statement. We consider this to be a reasonable approach toward parse disambiguation given that true intelligent disambiguation requires recognition of learner intent (Corder, 1974); in the absence of this, we have only what Corder labeled "plausible interpretations." However, with this approach to a kind of "probabilistic" grammar (where rules in Ii are considered more likely), ICICLE has access to information determining which of these interpretations is more plausible than the others. This also addresses the problem, raised by Sleeman (1982) and Self (1988), of the immense search space that is daunting in this kind of user modeling, by quickly giving access to the most likely interpretation.

[16] At this time there may be more than one parse tree receiving the highest score; we are working to further strengthen the differentiation power of the scoring procedure, as discussed in (Michaud et al., 2005).
[17] Except in a small number of situations where a rule in the parsing grammar does not represent a grammatical structure, as is the case with some "fix-it" rules which merely insert certain key features. These nodes are not included in the tree average.

Focusing Instruction on the Frontier of Acquisition

We discussed earlier the importance of focusing generated tutorial text on the frontier of the learner's acquisition process, which was later equated to those structures in ZPDi. As established above, instruction and corrective feedback on aspects of the knowledge within ZPDi may be beneficial, while instruction dealing with knowledge outside of the Zone is likely to be ineffective or even detrimental. When passing the error list to the response module, therefore, the error identification module of ICICLE will consult the MOGUL tags described here in order to prune the errors so that the tutorial responses are focused only on those errors at the user's current level of language acquisition (in ZPDi).

EVALUATING THE MODEL

As mentioned above, ICICLE's user model has been implemented in the system with a MOGUL component and a prototype SLALOM architecture representing stereotypical acquisition sequences. We have performed several evaluations of the performance of the model and have published the complete discussion of those evaluations and their results in (Michaud et al., 2005). Here we briefly summarize the results. Currently the ICICLE system is a prototype implementation that has only been tested in the laboratory by researchers using testing data from our aforementioned collection of writing samples from students who are deaf. We acknowledge that in order to evaluate whether or not the ICICLE system's approach to language instruction is beneficial to a "real user," much additional work would need to be done to ensure system robustness in the face of naive users with unrestricted text input.
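Before turning to the evaluation results, the confidence weighting suggested earlier in this section and the ZPD-based pruning of the error list can both be sketched briefly. Since the paper proposes the confidence measure only as a possible later iteration, the functions, tag names, and data shapes below are illustrative assumptions rather than system code.

```python
# Two illustrative sketches: the confidence weighting proposed as a possible
# later refinement of the node scores, and the pruning step that passes the
# response module only those errors on structures in ZPDi. Tag names and data
# shapes are assumptions made for this example.

def weighted_node_score(base_score, confidence):
    """Scale a +1/-1 Table 1 score by the confidence in its MOGUL tag.

    A confidence of 1.0 leaves the score untouched; low confidence pulls the
    score toward 0, the neutral "no belief either way" value.
    """
    return base_score * confidence


def prune_to_zpd(errors, mogul_tags):
    """Keep only errors whose structures are currently tagged as in the ZPD.

    `errors` is a list of (error_description, structure_id) pairs and
    `mogul_tags` maps structure_id to "acquired", "zpd", or "unacquired".
    """
    return [(err, s) for err, s in errors if mogul_tags.get(s) == "zpd"]
```

For instance, a "yes" score held with 90% confidence would contribute 0.9 toward a tree's average, while the same score backed by sparse evidence (say, 20% confidence) would contribute only 0.2.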
Given the limits of the prototype, however, theoretical testing has illustrated that our approach to user modeling shows significant promise in parse disambiguation, adaptation to the changing learner, and personalization to a non-stereotypic acquisition journey.

Parse Selection Based on the User Model

We examined the ICICLE system's parse selection process to see if it successfully makes "correct" [18] parse selections based on the user model. We provided the system with 15 randomly-selected sentences to parse, and tested these sentences on each of three hypothetical, stereotype-based users: a Low-level learner, a Middle-level learner, and a High-level learner. We examined the parse selection choices for each level, which in each case accurately reflected the stereotypic performance expected of users at each of those phases in acquisition, rewarding those interpretations containing structures we expect to be in Ii for a learner of that level, while penalizing those we do not believe to be in Ii.

[18] As compared to a human judge.

Following the User's Learning Journey

In this evaluation, we provided ICICLE with an initial user stereotype placement, and then provided the system with a series of utterances representing higher performance than the stereotype placement indicated. By doing this, we sought to simulate a user whose proficiency was changing in a positive fashion, in order to see if the system would recognize this difference and update the model accordingly [19]. Recall that we expect our user's language proficiency to be dynamic as learning progresses, and that eventually the user model record for a user at step i in his or her language acquisition will adapt when the user is at some later step >i. We selected a batch of 20 sentences from samples in our corpus, all representing the successful execution of higher-level structures than would be expected from the user as described by the initial stereotype placement. The system successfully recognized the improvement in user performance and updated the user model information appropriately; in fact, when given the sentences in small increments, allowing for a user model update between each utterance, this update was achieved before the end of the 20 sentences and resulted in a higher level of parse selection accuracy [20].

[19] There is no bias toward upward revision of the user's learning status. Although we chose to illustrate a learner's upward transition in this example, the ability of the user modeling component to adjust the stereotype is the same whether it is being revised higher or lower.
[20] When compared against a human judge.

The Unique User

While some users may fit stereotypes well, each learner is unique, and may have a unique set of mastered skills and skills which he or she has yet to learn. In a final evaluation task, we explored the integration of the two facets of the user model; specifically, we tested ICICLE's ability to integrate the stereotype expectations which SLALOM provides with the specific history of parses recorded in the MOGUL model. For this task, we chose to create a profile representing an advanced learner in whose language mastery there exist certain fossilized structures that are executed with error in a fashion "atypical" of that level of performance [21]. We sought to illustrate that if the system has built up a performance history for this user illustrating these differences, it will select parses more appropriate to the user's actual interlanguage Ii even if that deviates from stereotype expectations.
For this evaluation, we chose from the corpus 35 example sentences to represent our learner with fossilized errors. These sentences exhibited advanced constructions, but contained errors in some basic elements normally acquired by an advanced learner. We trained the MOGUL model on portions of this selection of sentences and then tested the recorded user data on held-back sentences to see if the uniqueness of this user's language mastery was accurately reflected in MOGUL. While the stereotype expectations of SLALOM alone would not always have indicated the correct parse (because of the decreased likelihood of such "beginner errors"), once the stereotype profile was integrated with the MOGUL profile of the user's individual performance, the system was able to perform much better in the parse selection task.

[21] Fossilization is actually a common occurrence in second language learning, reflecting some low-level linguistic elements which remain unacquired even after the learner has significantly progressed.

COMPARISON TO OTHER SYSTEMS

Systems for the Deaf

The use of computers with the deaf has a long history, stretching back to the beginning of the 1960s (cf. Richardson, 1981, for a survey); as early as 1968, computers with writing programs were being used at NTID [22] to assist the development of English syntax (Watson, 1979). Overviewed here are two of ICICLE's peers in the domain of computer systems for the deaf, in order to contrast their approaches with ours.

[22] The National Technical Institute for the Deaf.

Ms. Plurabelle

The Ms. Plurabelle grammar-checker (Loritz et al., 1993; Loritz and Zambrano, 1990) was developed in conjunction with Gallaudet University (the only liberal arts college for the deaf) with the intention of creating a system capable of handling the written English of deaf students with a high degree of accuracy. Like ICICLE, it consisted of a parser and a feedback mechanism which pointed out errors to the user. Unlike ICICLE, it did nothing to model the individual user. Although the creators of Ms. Plurabelle did agree with the goal of avoiding pointing out errors beyond the ability of the user to correct (Loritz and Zambrano, 1990), they did not avoid instructing on concepts well-mastered by the student, as discussed earlier. Instead, they simply worked from an initialized proficiency level representing the student (a numerical value between 1.0 and 5.9) which allowed the system to prune error notifications it judged would be beyond the student's understanding. This proficiency level was static and was provided to the system at the onset of use. In testing with actual users, it was found that even the pruned error messages were so confusing to the users that they became mired in decoding the message content [23]. Users were actually more successful at determining the correct version of a sentence if they were simply allowed to experiment with alternative hypotheses without any feedback from the system (aside from flagging sentences with errors). Loritz and Zambrano (1990) admit there are many limitations to the Ms. Plurabelle system, including the fact that it stops parsing at the first error in the sentence (and is therefore unable to report on more than one error per sentence).

[23] This motivates one of our current directions in the ICICLE project: the implementation of an animated agent on the screen that can use some signed communication to express some of the feedback to the user.
For the domain of parsing the English of deaf students, however, it was a clear improvement over commercial grammar-checkers: because the parser (ENGPARS) was a generalized transition network trained on a corpus of writing from the population, it performed at a much higher level of accuracy than systems designed for hearing individuals writing in their native language. ENGPARS was reported to reach as high as 90% accuracy. Within the context of a grammar-checker, this performance improvement is significant and to be appreciated; we acknowledge that ICICLE's parser has not yet been similarly tested with its real user population. However, within the context of assisting students in improving their grammatical performance, adapting to individual students, or changing to meet the needs of a student over the learning process, Ms. Plurabelle's approach would fall far short. Although the system was trained to parse the writing of a population of users, it is a general population, and the approach fails to account for a heterogeneous group of learners whose different levels of grammatical competence might result in different parses being more likely. This type of adaptability is the core motivation behind our work with the ICICLE system.

Writing Safari

The "Writing Safari" system (Farrow et al., 1994) presents a somewhat complementary approach to that discussed above. Where Ms. Plurabelle concentrated on enabling a system to parse the grammatical productions of the deaf population without providing instruction to the students, Writing Safari is more of a learning environment than a parser, presenting to the user a kind of back-plot to the writing process, inviting him to accompany a set of characters on a choice of trips to Kadaku and exposing him to samples of writing from these characters in a variety of genres (journal entries, letters home, etc.), the selection of which depends on the trip chosen. It leaves the actual analysis of the writing in the hands of peers and teachers. The instructional areas of the system are independently navigated by the user; for instance, a special "danger zone" area of the knowledge presented in the system warns the writer of "deafisms," or common mistakes made in these contexts. Although Writing Safari in general represents a very different kind of environment and emphasis from that envisioned for ICICLE, the philosophies of the authors are not incompatible with ours. They stress that errors should be viewed as "a reflection of a developing understanding of how English grammar is used to express a certain set of ideas," a view very much in tune with our approach of using user performance as a key to describing his or her developing language proficiency. Like the creators of Ms. Plurabelle, Farrow et al. note that most grammar correction is wasted because the receiving student is not developmentally ready for it. They also mention that the ZPD theory illustrates the importance and efficacy of adjusting instruction to the student.

Discussion

A survey conducted by Rose and Waldron (1984) showed an explosion of computer use in programs for hearing-impaired and deaf students in the mid-80s;
McAnally et al. (1987) lamented, however, that while the use of computers with deaf children at that time was becoming very popular, the rising popularity had "outdistanced" the understanding of how computerized systems could help deaf students develop written language skills. It is our hope that ICICLE can prove this to no longer be the case. The direction ICICLE takes with respect to deaf education could be seen as sitting somewhere between the two approaches embodied by the Ms. Plurabelle and Writing Safari systems. We wish to capture the automated grammatical analysis that formed the core of the Ms. Plurabelle system, while providing instruction and an encouraging writing environment like Writing Safari. A top-down, communication-centered instructional emphasis like that modeled in Writing Safari could very easily be used in the classroom in conjunction with the syntactic feedback from ICICLE. Although Farrow et al. argue that grammar-checkers make deaf students over-reliant on the computer's ability to correct their errors [24], and one could see that becoming the case with students making use of a Ms. Plurabelle-style system, ICICLE's focus on attempting to instruct the user (and, presumably, provoking him to correct the sentence on his own without the system simply providing the answer) provides less of a crutch and more of an encouragement to learn the reasons behind the error well enough to avoid repeating it.

[24] The authors of this paper would argue that they do the same for hearing students.

Other Language Learning Systems

This section briefly discusses other CALL systems which are comparable to ICICLE and how they have addressed challenges similar to the ones ICICLE faces.

BELLOC

The designers of BELLOC (Chanier et al., 1992) share with us the approach of viewing the utterances produced by users as representative of sets of internal rules modeling language. In their application for English-speaking learners of French, users interact in written French within the domain of negotiating an inheritance. When a user produces an error, BELLOC seeks to determine the underlying rule (what we would refer to as a "mal-rule," and Chanier et al. refer to as an "applicable rule" or AR) in order to provide constructive and helpful feedback to the learner. As in ICICLE, their grammar is augmented by a set of typical "learner's rules," or error-containing rules from their population. They do not, however, take note of which of these rules (if any) a given user can be expected to use. If correct rules fail to parse a sentence, a theorem-prover attempts to select from the set of error-containing rules to achieve a parse. For each sentence it cannot parse with correct rules, the system narrows the candidate applicable rules down to a specific set which could explain the resultant utterance, and then explicitly queries the user for judgments of certain error-containing phrases to determine which one the user believes is acceptable French, using this selection to diagnose which underlying rule is causing the current problem. This extra negotiation with the user would clearly be overly intrusive if the system were interpreting an extended piece of text, particularly if the user produced several errors per utterance.

HyperTutor

The HyperTutor system (Schuster and Burckett-Picker, 1996) is a learning tool for Spanish-speaking learners of English of reasonable proficiency. It interacts with the student through a series of translation tasks, presenting Spanish sentences which the student then translates into English.
Like ICICLE, the system gives notification of correctness and explanations of the error(s) found, if any. Although the authors characterize the HyperTutor user model as modeling the learner's interlanguage, their approach is significantly different from ours. The essential nature of their model is a store of the language learning strategies the system has observed the student using, the idea being that a learner applies these strategies to build the interlanguage and to provoke transitions. Possible strategies include applying the L1 grammar to the L2 or over-generalizing an L2 construct to a larger set of instances than is appropriate. The HyperTutor model therefore consists not of an actual profile of the interlanguage Ii, but rather of the possible reasons that may lie behind an incorrect rule existing in Ii. While this supplies useful information to a tutorial component, it would fall far short of enabling the system to handle interpretation tasks outside of the prescribed domain of phrase translation.

German Tutor

German Tutor (Heift and McFetridge, 1999), a CALL system for English-speaking learners of German, accepts single sentences, parses them, and provides the user with feedback on the most salient error found. Its student modeling architecture is very similar to MOGUL; it utilizes a database containing all of the grammatical "constraints" the parser can recognize as met or broken (analogous to our KUs), each holding a score from 0 to 30 representing the user's knowledge of this constraint. A score from 0-9 represents expert knowledge, 10-20 is intermediate, and 21-30 is novice. As the system records user performance over time, these scores are incremented with each observed failure to execute a constraint correctly, and decremented with each success. The details of this user model, however, are lost in the parse selection process, where the student's proficiency scores on each constraint in the user model are averaged, yielding a general proficiency score for the student. The possible parses are then ordered from simple to difficult, and the student's general proficiency level is used as an index into that list. This technique fails to take advantage of the information about the student's individual strengths and weaknesses that was stored in the model.

Mr. Collins

The CALL system Mr. Collins [25] (COLLaboratively constructed, INSpectable student model) (Bull, 1994, 1997; Bull et al., 1995a,b) is perhaps ICICLE's closest kin in the field of CALL, although its domain is restricted to the acquisition of Portuguese clitic pronoun usage, and its interaction with the user is largely drill-and-test, missing-constituent (fill-in-the-gap) questions. Mr. Collins' instructional objective is also largely concerned with the learning strategies being employed by the learner. Relevant to this work, however, is the fact that Mr. Collins models the dynamic user through a sequence of student models illustrating the acquisition order in the domain of Portuguese clitic pronoun usage, spanning a spectrum from the novice to the expert (a native Portuguese speaker). As in ICICLE's user model, Mr. Collins combines this acquisition information with direct information about the user. When attempting to parse a sentence, an "expert model" is consulted
first, in order to try to parse the sentence as grammatical; if this fails, the current model of the user's misconceptions is attempted. If this fails as well, the system goes further back in the user's path along the acquisition sequence to attempt a parse with errors from earlier in the history. Subsequent to these attempts, the last place consulted is the realm of grammatical transfers from other languages which the student has learned.

[25] "Mr. Collins" actually refers to the user modeling component of a larger system, but the name is also used for the entire system for simplicity.

Discussion

The majority of the systems discussed here have far narrower goals than ICICLE, primarily addressing very specific aspects of the L2 within strictly-defined exercises such as translation. This strongly affects their user modeling requirements and objectives. BELLOC operates within a restricted domain, HyperTutor only parses strict translations of sentences it provides, and Mr. Collins instructs only upon specific syntactic categories and not the broader grammar of the language itself. Of the systems reviewed, only German Tutor accepts free-form text as ICICLE does. This is a very ambitious undertaking, and it is clear from our work so far that ICICLE has some growing yet to do before it can consider this task truly conquered. Perhaps because ICICLE is such an ambitious project, our user modeling effort is far more precise than those of the other systems discussed in this section. BELLOC does not track the user at all; it has to choose between competing interpretations by directly querying the student, a time-consuming and intrusive task when there are many user utterances and/or many errors per utterance. Although HyperTutor shares ICICLE's goal of interpreting user actions through a view of the interlanguage, it is clear that HyperTutor's portrayal of a collection of learner strategies gives a very coarse-grained view of what learner rules may be in that interlanguage. It proposes to offer theories about which types of hypotheses may exist there without giving specific information about which hypotheses are there. HyperTutor also does not address the possibility of ambiguous errors (where more than one parse tree, and therefore more than one "cause" of the error, could account for a user's utterance), which is a key component of the issues we face regularly with the ICICLE system. German Tutor's user modeling technique is similar to the MOGUL facet of our user model, but in their application they lose the precision represented by tracking user performance on each individual structure when they translate it into a global competence level. This approach completely ignores the usefulness of knowing that an individual may exhibit competence in or preference for specific structures while he or she struggles with others, and it relies heavily on the ability to "rank" potential parses from most complicated to least. It also fails to engage any kind of stereotypic inference structure such as is embodied in SLALOM. Finally, while Mr. Collins' approach is remarkably similar to ours, it captures such a tiny piece of the second language acquisition question (only pronouns, and a subset of pronoun usage at that) that it does not face many of the complexities that we have addressed in our user modeling component for ICICLE.
It is therefore possible to conclude that the ICICLE approach to user modeling through the MOGUL and SLALOM facets, while incorporating some aspects of user modeling and user interpretation which have been seen before, is nonetheless novel both in the precision of concept and application of its user modeling component and in the ambitious scope of the entire story of syntactic acquisition which is embodied within the model.

CONCLUSION

In this paper, we have discussed how we address the adaptivity needs of a CALL system for deaf learners of written English by modeling the user's current interlanguage grammar. This enables our system, ICICLE, to intelligently address the problem of natural language parse disambiguation, and provides strong evidence to direct selective tutorial efforts toward those domain elements on the edge of acquisition. We have presented the architecture of a user modeling component that allows us to infer the user's interlanguage on the basis of his or her language production recorded in the MOGUL component of the model. The user modeling component supplements this performance data with knowledge derived from the SLALOM architecture, which captures the stereotypical precedence and concurrence of acquisition among grammatical structures. The SLALOM architecture is used like a stereotype to infer a more complete image of user grammatical competence than is directly visible in his or her language productions. Our end goal is an intelligent tutoring system with the ability to respond appropriately to student learning difficulties, adjusting to the individual as it interacts with him or her across the learning journey.

ACKNOWLEDGMENTS

This work has been supported by NSF Grants #GER-9354869 and #IIS-9978021. We would like to thank the ICICLE group at the University of Delaware, in particular Rashida Davis, Christopher Pennington, Ezra Kissel, David Derman, H. Gregory Silber, and Michael Bloodgood, for their work on the ICICLE implementation.

References

Allen, J.: 1995, Natural Language Understanding. California: Benjamin/Cummings, second edition.
Anderson, J. R., C. F. Boyle, A. T. Corbett, and M. W. Lewis: 1990, "Cognitive Modeling and Intelligent Tutoring". Artificial Intelligence 42(1), 7-49.
Antworth, E. L.: 1990, PC-KIMMO: A two-level processor for morphological analysis, No. 16 in Occasional Publications in Academic Computing. Dallas, TX: Summer Institute of Linguistics.
Bailey, N., C. Madden, and S. D. Krashen: 1974, "Is there a 'natural sequence' in adult second language learning?". Language Learning 24(2), 235-243.
Baker, C. and D. Cokely: 1980, American Sign Language: A Teacher's Resource Text on Grammar and Culture. Silver Spring, MD: TJ Publishers.
Barnum, M.: 1984, "In Support of Bilingual/Bicultural Education for Deaf Children". American Annals of the Deaf 129, 404-408.
Bialystok, E.: 1978, "A theoretical model of second language learning". Language Learning 28(1), 69-83.
Brown, R. and C. Hanlon: 1970, "Derivational complexity and order of acquisition in child speech". In: J. R. Hayes (ed.): Cognition and the Development of Language. New York: John Wiley & Sons, Inc., Chapt. 1, pp. 11-54.
Bull, S.: 1994, "Student Modelling for Second Language Acquisition". Computers & Education 23(1/2), 13-20.
Bull, S.: 1997, "Promoting Effective Learning Strategy Use in CALL". Computer Assisted Language Learning 10(1), 3-39.
Bull, S., P. Brna, and H. Pain: 1995a, "Extending the scope of the student model". User Modeling and User-Adapted Interaction 5(1), 45-65.
Bull, S., H. Pain, and P. Brna: 1995b, "Mr. Collins: a collaboratively constructed, inspectable student model for intelligent computer assisted language learning". Instructional Science 23(13), 65-87.
Carroll, S. E.: 1995, "The irrelevance of verbal feedback to language learning". In: L. Eubank, L. Selinker, and M. Sharwood Smith (eds.): The Current State of Interlanguage: Studies in Honor of William E. Rutherford. Amsterdam and Philadelphia: John Benjamins Publishing Company, pp. 73-88.
Cawsey, A.: 1993, Explanation and Interaction: The Computer Generation of Explanatory Dialogues. Cambridge, MA: MIT Press.
Chanier, T., M. Pengelly, M. Twidale, and J. Self: 1992, "Conceptual modelling in error analysis in computer-assisted language learning systems". In: M. L. Swartz and M. Yazdani (eds.): Intelligent Tutoring Systems for Second-Language Learning, Vol. F80 of NATO ASI Series. Berlin Heidelberg: Springer-Verlag, pp. 125-150.
Charrow, V. R. and J. D. Fletcher: 1974, "English as the Second Language of Deaf Children". Developmental Psychology 10(4), 463-470.
Charrow, V. R. and R. B. Wilbur: 1975, "The deaf child as a linguistic minority". Theory into Practice 14(5), 353-359.
Corder, S. P.: 1967, "The significance of learners' errors". International Review of Applied Linguistics 5(4), 161-170.
Corder, S. P.: 1973, "The elicitation of interlanguage". In: G. Nickel (ed.): Special Issue of IRAL on the Occasion of Bertil Malmberg's 60th Birthday. Heidelberg, Germany: Julius Groos Verlag, pp. 51-63. International Review of Applied Linguistics.
Corder, S. P.: 1974, "Error Analysis". In: J. P. B. Allen and S. P. Corder (eds.): Techniques in Applied Linguistics, Vol. 3 of The Edinburgh Course in Applied Linguistics. Oxford University Press, Chapt. 5, pp. 122-154.
Desmarais, M. C., A. Maluf, and J. Jiu: 1996, "User-expertise modeling with empirically derived probabilistic implication networks". User Modeling and User-Adapted Interaction 5(3/4), 283-315.
Drasgow, E.: 1993, "Bilingual/Bicultural Deaf Education: An Overview". Sign Language Studies 80, 243-266.
Dulay, H. C. and M. K. Burt: 1974, "Errors and Strategies in Child Second Language Acquisition". TESOL Quarterly 8(2), 129-136.
Dulay, H. C. and M. K. Burt: 1975, "Natural Sequences in Child Second Language Acquisition". Language Learning 24(1).
Ellis, R.: 1993, "The structural syllabus and second language acquisition". TESOL Quarterly 27(1), 91-113.
Ellis, R.: 1994, The Study of Second Language Acquisition. New York: Oxford University Press.
Erting, C.: 1978, "Language Policy and Deaf Ethnicity in the United States". Sign Language Studies 19, 139-152.
Farrow, K., D. Power, and P. Freebody: 1994, "Computer-assisted writing development for deaf students: 'Writing Safari'". On-CALL Online 9(1). http://www.cltr.uq.edu.au/oncall/farrow91.html.
Gass, S.: 1979, "Language transfer and universal grammatical relations". Language Learning 29(2), 327-344.
Glaser, R., A. Lesgold, and S. Lajoie: 1987, "Toward a cognitive theory for the measurement of achievement". In: R. R. Ronning, J. A. Glover, J. C. Conoley, and J. C. Witt (eds.): The Influence of Cognitive Psychology on Testing, Vol. 3 of Buros-Nebraska Symposium on Measurement and Testing. New Jersey: Lawrence Erlbaum Associates, Chapt. 3, pp. 41-85.
Grishman, R., C. Macleod, and A. Meyers: 1994, "Comlex Syntax: Building a Computational Lexicon". In: Proceedings of the 15th International Conference on Computational Linguistics. Kyoto, Japan.
Gutierrez, P.: 1994, "A Preliminary Study of Deaf Educational Policy". Bilingual Research Journal 18(3-4), 85-113.
Heift, T. and P. McFetridge: 1999, "Exploiting the student model to emphasize language teaching in natural language processing". In: M. B. Olsen (ed.): Proceedings of Computer-Mediated Language Assessment and Evaluation in Natural Language Processing, an ACL-IALL Symposium. College Park, Maryland, pp. 55-61.
Higgins, J.: 1995, Computers and English Language Learning. Norwood, New Jersey: Ablex Publishing Corporation.
Johnson, R. E., S. Liddell, and C. Erting: 1989, "Unlocking the Curriculum: Principles for Achieving Access in Deaf Education". Gallaudet Research Institute Working Paper 89(3).
Karttunen, L.: 1983, "KIMMO: A general morphological processor". Texas Linguistic Forum 22, 163-186.
Kelly, L. P.: 1987, "The influence of syntactic anomalies on the writing of a deaf college student". In: A. Matsuhashi (ed.): Writing in Real Time: Modeling Production Processes. Norwood, New Jersey: Ablex Publishing, Chapt. 7, pp. 161-196.
Krashen, S. D.: 1981, Second Language Acquisition and Second Language Learning. New York: Pergamon Press.
Krashen, S. D.: 1982, Principles and Practice in Second Language Acquisition. New York: Pergamon Press.
Krashen, S. D.: 1983, "Newmark's 'Ignorance Hypothesis' and current second language theory". In: S. M. Gass and L. Selinker (eds.): Language Transfer in Language Learning, Series on Issues in Second Language Research. Rowley, Massachusetts: Newbury House Publishers, Inc., Chapt. 9, pp. 135-153.
Krashen, S. D., J. Butler, R. Birkbaum, and J. Robertson: 1978, "Two studies in language acquisition and language learning". ITL: Review of Applied Linguistics 39/40, 73-92.
Larsen-Freeman, D. E.: 1976, "An explanation for the morpheme acquisition order of second language learners". Language Learning 25(1), 125-135.
Lesgold, A., G. Eggan, S. Katz, and G. Rao: 1992, "Possibilities for assessment using computer-based apprenticeship environments". In: J. W. Regian and V. J. Shute (eds.): Cognitive Approaches to Automated Instruction. New Jersey: Lawrence Erlbaum Associates, Chapt. 3, pp. 49-80.
Linton, F., B. Bell, and C. Bloom: 1996, "The student model of the LEAP intelligent tutoring system". In: Proceedings of the Fifth International Conference on User Modeling. Kailua-Kona, Hawaii, pp. 83-90, User Modeling, Inc.
Loera, P. A. and D. Meichenbaum: 1993, "The 'potential' contributions of Cognitive Behavior Modification to literacy training for deaf students". American Annals of the Deaf 138(2), 87-95.
Loritz, D., A. Parhizgar, and R. Zambrano: 1993, "Diagnostic Parsing in CALL". CAELL Journal 1(4), 9-12.
Loritz, D. and R. Zambrano: 1990, "Using Artificial Intelligence to Teach English to Deaf People". Technical report, U.S. Department of Education, Office of Special Education and Rehabilitative Services, Technology, Educational Media, and Materials for the Handicapped Program. On Grant #H180P80020-89 to Georgetown University School of Languages and Linguistics, in consortium with Gallaudet University.
Mangelsdorf, K.: 1989, "Parallels between speaking and writing in second language acquisition". In: D. M. Johnson and D. H. Roen (eds.): Richness in Writing: Empowering ESL Students. New York: Longman, Chapt. 8, pp. 134-145.
Matz, M.: 1982, "Towards a process model for high school algebra errors". In: D. Sleeman and J. Brown (eds.): Intelligent Tutoring Systems, Computers and People Series. Academic Press, Chapt. 2, pp. 25-50.
McAnally, P. L., S. Rose, and S. P. Quigley: 1987, Language Learning Practices with Deaf Children. Boston: College-Hill Press.
McCoy, K. F., C. A. Pennington, and L. Z. Suri: 1996, "English error correction: A syntactic user model based on principled mal-rule scoring". In: Proceedings of the Fifth International Conference on User Modeling. Kailua-Kona, Hawaii, pp. 59-66, User Modeling, Inc.
Michaud, L. N.: 2002, "Modeling User Interlanguage in a Second Language Tutoring System for Deaf Users of American Sign Language". Ph.D. thesis, Dept. of Computer and Information Sciences, University of Delaware. Tech. Report #2002-08.
Michaud, L. N. and K. F. McCoy: 1998, "Planning Text in a System for Teaching English as a Second Language to Deaf Learners". In: Proceedings of Integrating Artificial Intelligence and Assistive Technology, an AAAI '98 Workshop. Madison, Wisconsin.
Michaud, L. N. and K. F. McCoy: 1999, "Modeling User Language Proficiency in a Writing Tutor for Deaf Learners of English". In: M. B. Olsen (ed.): Proceedings of Computer-Mediated Language Assessment and Evaluation in Natural Language Processing, an ACL-IALL Symposium. College Park, Maryland, pp. 47-54.
Michaud, L. N. and K. F. McCoy: 2000, "Supporting Intelligent Tutoring in CALL By Modeling the User's Grammar". In: Proceedings of the Thirteenth International Florida Artificial Intelligence Research Society Conference (FLAIRS-2000). Orlando, Florida, pp. 50-54.
Michaud, L. N. and K. F. McCoy: 2004, "Empirical Derivation of a Sequence of User Stereotypes". User Modeling and User-Adapted Interaction 14(4), 317-350.
Michaud, L. N., K. F. McCoy, and R. Z. Davis: 2005, "A Model to Disambiguate Natural Language Parses on the Basis of User Language Proficiency: Design and Evaluation". User Modeling and User-Adapted Interaction 15(1), 55-84. Special Issue on Language-Based Interaction.
Michaud, L. N., K. F. McCoy, and C. A. Pennington: 2000, "An Intelligent Tutoring System for Deaf Learners of Written English". In: Proceedings of the Fourth International ACM SIGCAPH Conference on Assistive Technologies (ASSETS 2000). Washington, D.C.
Michaud, L. N., K. F. McCoy, and L. A. Stark: 2001, "Modeling the Acquisition of English: an Intelligent CALL Approach". In: M. Bauer, P. J. Gmytrasiewicz, and J. Vassileva (eds.): Proceedings of the 8th International Conference on User Modeling, Vol. 2109 of Lecture Notes in Artificial Intelligence. Sonthofen, Germany, pp. 14-23, Springer.
Moores, D. F.: 1987, Educating the Deaf: Psychology, Principles, and Practices. Boston: Houghton Mifflin Company, third edition.
Opwis, K.: 1993, "The flexible use of multiple mental domain representations". In: D. M. Towne, T. de Jong, and H. Spada (eds.): Simulation-Based Experiential Learning, Vol. 122 of NATO ASI Series F: Computer and Systems Sciences. New York: Springer-Verlag, pp. 77-89.
Paris, C. L.: 1988, "Tailoring object descriptions to a user's level of expertise". Computational Linguistics 14(3), 64-78.
Paris, C. L.: 1993, User Modelling in Text Generation. Frances Pinter.
Paul, P. V.: 1998, Literacy and Deafness: The Development of Reading, Writing, and Literate Thought. Boston: Allyn and Bacon.
Pennington, M.: 1996, The Computer and the Non-Native Writer: A Natural Partnership, Written Language Series (Marcia Farr, series editor). Cresskill, New Jersey: Hampton Press.
Ploetzner, R., H. Spada, M. Stumpf, and K. Opwis: 1990, "Learning qualitative and quantitative reasoning in a microworld for elastic impacts". European Journal of Psychology of Education 5(4), 501-516.
Quigley, S. and P. Paul: 1984, "ASL and ESL?". Topics in Early Childhood Special Education 3(4), 17-26.
Quigley, S. P. and C. M. King: 1982, "The language development of deaf children and youth". In: S. Rosenberg (ed.): Handbook of Applied Psycholinguistics: Major Thrusts of Research and Theory. Hillsdale, NJ: Lawrence Erlbaum Associates, Chapt. 9, pp. 429-475.
Quigley, S. P., D. J. Power, and M. W. Steinkamp: 1977, "The language structure of deaf children". The Volta Review 79(2), 73-84.
Ragnemalm, E. L.: 1996, "Student diagnosis in practice; Bridging a gap". User Modeling and User-Adapted Interaction 5, 93-116.
Rich, E.: 1979, "User modeling via stereotypes". Cognitive Science 3, 329-354.
Richardson, J. E.: 1981, "Computer Assisted Instruction for the Hearing Impaired". The Volta Review 83, 328-335.
Rose, S. and M. Waldron: 1984, "Microcomputer use in programs for hearing-impaired children: A national survey". American Annals of the Deaf 129, 338-342.
Rueda, R.: 1990, "Assisted performance in writing instruction with learning-disabled students". In: L. C. Moll (ed.): Vygotsky and Education: Instructional Implications and Applications of Sociohistorical Psychology. New York: Cambridge University Press, Chapt. 17, pp. 403-426.
Russell, W. K., S. P. Quigley, and D. J. Power: 1976, Linguistics and Deaf Children. Washington, D. C.: Alexander Graham Bell Association for the Deaf.
Schneider, D. and K. F. McCoy: 1998, "Recognizing Syntactic Errors in the Writing of Second Language Learners". In: Proceedings of the Thirty-Sixth Annual Meeting of the Association for Computational Linguistics and the Seventeenth International Conference on Computational Linguistics, Vol. 2. Universite de Montreal, Montreal, Quebec, Canada, pp. 1198-1204, Morgan Kaufmann Publishers.
Schuster, E. and J. Burckett-Picker: 1996, "Interlanguage errors becoming the Target Language through student modeling". In: Proceedings of the Fifth International Conference on User Modeling. Kailua-Kona, Hawaii, pp. 99-103, User Modeling, Inc.
Schwartz, B. D.: 1998, "On two hypotheses of 'transfer' in L2A: Minimal Trees and Absolute L1 Influence". In: S. Flynn, G. Martohardjono, and W. O'Neil (eds.): The Generative Study of Second Language Acquisition. Mahwah, NJ: Lawrence Erlbaum, Chapt. 3, pp. 35-59.
Schwartz, B. D. and R. A. Sprouse: 1996, "L2 cognitive states and the Full Transfer/Full Access model". Second Language Research 12(1), 40-72.
Self, J. A.: 1988, "Bypassing the intractable problem of student modelling". In: Proceedings of the 1st International Conference on Intelligent Tutoring Systems (ITS-88). Montreal, Quebec, Canada, pp. 18-24.
Selinker, L.: 1971, "The psychologically relevant data of second-language learning". In: P. Pimsleur and T. Quinn (eds.): The Psychology of Second Language Learning: Papers from the Second International Congress of Applied Linguistics. Cambridge: University Press, Chapt. 4, pp. 35-43.
Selinker, L.: 1972, "Interlanguage". International Review of Applied Linguistics 10(3), 209-231.
Selinker, L.: 1992, Rediscovering Interlanguage. London and New York: Longman.
Sleeman, D.: 1982, "Inferring (mal) rules from pupil's protocols". In: Proceedings of ECAI '82. Orsay, France, pp. 160-164.
Spada, H.: 1993, "How the role of cognitive modeling for computerized instruction is changing". In: P. Brna, S. Ohlsson, and H. Pain (eds.): Proceedings of AI-ED '93, World Conference on Artificial Intelligence in Education. Edinburgh, Scotland, pp. 21-25. Invited talk.
Stevens, R.: 1980, "Education in Schools for Deaf Children". In: C. Baker and R. Battison (eds.): Sign Language and the Deaf Community: Essays in Honor of William C. Stokoe. National Association of the Deaf.
Stewart, D. A.: 2001, "Pearls of Wisdom: What Stokoe told us about teaching deaf children". Sign Language Studies 1(4), 344-361.
Strong, M.: 1988, "A bilingual approach to the education of young deaf children: ASL and English". In: M. Strong (ed.): Language Learning and Deafness. Cambridge: Cambridge University Press, pp. 113-129.
Suri, L. Z. and K. F. McCoy: 1993, "A Methodology for Developing an Error Taxonomy for a Computer Assisted Language Learning Tool for Second Language Learners". Technical Report TR-93-16, Department of Computer and Information Sciences, University of Delaware.
Swisher, M. V.: 1989, "The language-learning situation of deaf students". TESOL Quarterly 23(2), 239-257.
Tarone, E.: 1982, "Systematicity and attention in interlanguage". Language Learning 32(1), 69-84.
Vygotsky, L. S.: 1986, Thought and Language. Cambridge, Massachusetts: The MIT Press. Translation revised and edited by Alex Kozulin; originally published in 1934.
Washburn, G. N.: 1994, "Working in the ZPD: Fossilized and Nonfossilized Nonnative Speakers". In: J. P. Lantolf and G. Appel (eds.): Vygotskian Approaches to Second Language Research, Second Language Learning. Norwood, New Jersey: Ablex Publishing Corporation, Chapt. 4, pp. 69-81.
Watson, P.: 1979, "The utilization of the computer with the hearing impaired and the handicapped". American Annals of the Deaf 124, 670-680.
Wilbur, R. B.: 1977, "An explanation of deaf children's difficulty with certain syntactic structures of English". The Volta Review 79(2), 85-92.
Wilbur, R. B., D. S. Montanelli, and S. P. Quigley: 1976, "Pronominalization in the Language of Deaf Students". Journal of Speech and Hearing Research 19(1).
Woolf, B. and D. D. McDonald: 1984, "Building a Computer Tutor: Design Issues". IEEE Computer 17(9), 61-73.
Woolf, B. P.: 1984, "Context Dependent Planning in a Machine Tutor". Ph.D. thesis, Department of Computer and Information Science, University of Massachusetts at Amherst. COINS Technical Report 84-21.