CISC 889  -   Internet Information Gathering  -   Spring 2001

T/Th 11:00am--12:15pm, [Smith Hall 102a]

 
Instructor: Keith Decker 
Office: Room 204, The Green House (77 E. Delaware) 
Hours: W/F 3:00-4:00pm
Phone: 831-1959 
Email: decker@cis.udel.edu 

Course Description

It has been clear for some time that the Internet is a viable medium for supplying the data needed for making various types of decisions. As more useful data become available at different times and in multiple locations, however, it becomes more difficult and time-consuming for a person to collect and evaluate that data. This seminar will focus on current state-of-the-art approaches to information gathering, filtering, and integration including work in the heterogeneous database, information retrieval, and agent-oriented communities.

Format

Seminar. I'll teach the first few overview classes; after that we'll read important and new papers and take turns leading discussions. Students taking the course for credit should prepare 3 discussion questions (or, if needed, clarification questions) to be handed in after every class. Good discussion questions should compare and contrast approaches, consider how the work is evaluated, and relate the readings to your own research interests.

Grading:

Topics

Topics will be open to class interests, but here are some suggestions based on the class discussion:

Once we get through the overview, I can alter the topic list to focus on issues that you find interesting (again, trying to mix classic and new papers).

 
DATE TOPIC READING PERSON
2/6 Agents Overview Decker
2/8 Agents Keith Decker and Katia Sycara. Intelligent Adaptive Information Agents.
Journal of Intelligent Information Systems, 9:239--260, 1997.
Decker
2/13 Agents DECAF: A Flexible Multi-Agent System Architecture [Graham, Decker, Mersic]

Personal Assistants for the Web: an MIT perspective [Lieberman]. In Intelligent Information Agents [Klusch, ed.]

Decker
2/15 Agents ACLs and CIAs: [Dignum] In Intelligent Information Agents [Klusch, ed.]

Agent Communication Languages: FIPA
Communicative Act Library XC00037G.pdf
Interaction Protocol Library XC00025D.pdf
you can browse some others HERE.

2/20 IR:Background Information Retrieval by C. J. van Rijsbergen: 
     Chapter 1 (html) (pdf)
Readings in Information Retrieval [RIR] edited by K. Sparck Jones and P. Willett:
     Chapter 1 Overview, 
     Chapter 2 Overview, 
     Chapter 2 Paper 2 [Luhn], 
     Chapter 2 Paper 4 [Maron & Kuhns] 
2/22 IR:Evaluation RIR 
     Chapter 3 Overview 
     Chapter 4 Overview 
     Chapter 2 Paper 6 [Salton & Lesk] 
     Chapter 4 Paper 4 [Keen] 
IR by CJvR 
      Chapter 7
2/27 IR: Indexing IR by CJvR 
      Chapter 2
      Chapter 3, pages 22-36 only 
RIR
     Chapter 6 Overview
     Chapter 6, Porter

FINAL PROJECT PROPOSALS DUE

3/1 IR:Retrieval 1
(vector)
RIR 
     Chapter 5 Overview 
     Chapter 5, Salton, Wong, and Yang
     Chapter 6, Salton and Buckley, Term-Weighting Approaches in Automatic Text Retrieval
3/6 IR:Retrieval 2
(probabilistic)
IR by CJvR 
      Chapter 5 
      Chapter 6, to page 101 "Computational Details"
RIR
     Chapter 5, Robertson
3/8 IR:Retrieval 3
(other)
RIR 
     Chapter 5, Turtle and Croft, Inference Networks for Document Retrieval
     Chapter 6, Salton and Buckley, Improving Retrieval Performance by Relevance Feedback
3/13 Agents DECAF PROGRAMMING McGeary
3/15 Web Tech Web Naming and Addressing Overview (URIs, URLs, ...) (first page)
A Beginner's Guide to URLs

XML in 10 points
Taming the XML Beast I
Taming the XML Beast II
Well-formed XML documents

[For further info, see Extensible Markup Language (XML) at W3C]

3/20 Web Tech XML Namespaces by Example
Namespace Myths Exploded

RDF: Resource Description Framework
W3C Metadata Activity Statement

3/22 Ontology What is an Ontology? [Gruber] 

FIPA Ontology Spec XC00086C.pdf
[Read only Informative Annex A,
and Section 5.1 (The OKBC Knowledge Model);
we will read the rest later]

Ontolingua, KIF, OKBC:
A Programmatic Foundation for Knowledge Base Interoperability, ksl-98-08.pdf [Chaudhri, Farquhar, Fikes, Karp, & Rice] AAAI-98
OKBC, A Rich API on the Cheap [A.F.J. Rice]

FINAL PROJECT DETAILED PLAN DUE

3/27 SPRING BREAK
3/29 SPRING BREAK
4/3 Ontology Stefan Decker, Frank van Harmelen, Jeen Broekstra, Michael Erdmann, Dieter Fensel, Ian Horrocks, Michel Klein, Sergey Melnik: The Semantic Web - on the respective Roles of XML and RDF 

M.C.A. Klein et al.: The Relation between Ontologies and Schema-Languages: Translating OIL-Specifications to XML-Schema. In: Proceedings of the Workshop on Applications of Ontologies and Problem-solving Methods, 14th European Conference on Artificial Intelligence ECAI-00, Berlin, Germany, August 20-25, 2000.

OIL Frequently Asked Questions

OIL white paper

4/5 Ontology Rest of FIPA Ontology Spec XC00086C.pdf

DAML (www.daml.org)

DAML Overview: PDF

DAML Use Cases

daml+oil-walkthru.html

DAML reference.html

4/10 Ontology 2 ontology papers from CIA-4

Exploiting the ontological qualities of Web resources [Crowe, Shadbolt]

Automatic Ontology Construction for Multiagent based Software Gathering Service [Mena, Illarramendi, Goni]

4/12 Ontology Biological Ontology

Gene Ontology: tool for the unification of biology by the Gene Ontology Consortium

An evaluation of Ontology Exchange Languages for bioinformatics by McEntire, Karp, Abernathy, et al.

4/17 Ontology XML, bioinformatics, and data integration by Achard, Vaysseix, and Barillot

Comparison of functional annotation schemes for genomes by Rison, Hodgman, and Thornton

Gene Ontology presentation by Mike Cherry at Bio-ontologies workshop 2000

4/19 NO CLASS
4/24 Info Agents Integration of Information from Multiple Sources of Textual Data by Bergamaschi and Beneventano
4/26 Info Agents A Framework for a Scalable Agent Architecture of Cooperating Heterogeneous Knowledge Sources by Ouksel
5/1 Query Planning "Wrapper Induction for Information Extraction" [Kushmerick, Weld, Doorenbos] 
"Initial Results on Wrapping Semi-Structured Web Pages" [Hsu] 
"STALKER: Learning Extraction Rules for Semistructured, Web-based Information Sources" [Muslea, Minton, Knoblock] 
(this web site has code and examples)
5/3 Query Planning "Flexible and scalable cost-based query planning in mediators: A transformational approach" [Ambite & Knoblock]
5/8 Query Planning "Efficient Execution of Information-Gathering Plans" [Friedman & Weld]

"The Ariadne approach to web-based information integration" [Knoblock, Minton, Ambite, Ashish, Muslea, Philpot, Tejada]

5/10 Applications Bioinformatics:

UDel Multi-Agent Annotation System [Decker,Khan,Schmidt,Michaud]

GeneWeaver [copy from the folder]

5/15 Applications INFOSLEUTH [Unruh et al.]
BIG [Lesser et al.]
5/22 FINAL PROJECTS DEMO+REPORT DUE


May 14, 2001
decker@cis.udel.edu