This paper presents our research on identifying the paragraphs in a multimodal document that are most relevant to a constituent information graphic. Instead of taking the whole article into account when ranking an information graphic for retrieval, we explore considering only the paragraphs most relevant to the graphic.
While this paper is primarily about a system for providing sight-impaired individuals with access to information graphics in multimodal documents, it includes work whose original motivation was the retrieval of graphics from a digital library. The paper presents a methodology for identifying the paragraph in a multimodal article that is most relevant to an information graphic. Our mixture model for retrieving graphics from a digital library will consider only the paragraphs that are most relevant to an information graphic in deciding whether to retrieve the graphic in response to a user query.
This paper presents our research on a general methodology for disambiguating terms in short queries. Our method does not rely on prior identification of the entities to be disambiguated and can determine when a sequence of words should be disambiguated as a single entity rather than as a sequence of individual disambiguations. Furthermore, our method does not rely on capitalization since users are notoriously poor at correct capitalization of terms in their queries; this is in contrast to the text of formal documents where correct capitalization can be used to identify sequences of words that represent a named entity. The work presented in this paper has subsequently been extended for use in query expansion.
The portion of this PhD thesis that is relevant to the graph retrieval project is Chapter 8 on identifying the paragraph in a multimodal document that is relevant to an information graphic. Since only a small portion of a multimodal document's text may be relevant to a constituent information graphic, our mixture model for retrieving graphs from a digital library will use only the relevant paragraphs (along with information from the graphic itself) in deciding whether to retrieve a graphic in response to a user query.
This thesis presents an approach to disambiguating terms in short queries. The method does not rely on capitalization of named entities or prior identification of the terms to be disambiguated. The thesis also proposes extending the approach for use in query expansion.
This paper presents our learned models for analyzing full-sentence user queries and hypothesizing the content of the independent and dependent axes of bar charts and line graphs that might be relevant to the query. Natural language processing techniques are used to extract features from the query, and machine learning is employed to build models for hypothesizing the content of the axes. Results show that our models achieve accuracy higher than 80% on a corpus of collected user queries.
This paper presents our methodology for reasoning about the content of a graphic in deciding its relevance to a user query. In particular, the paper focuses on extracting from the user's full-sentence query the category of intended message of potentially relevant graphs. Our learned model achieves 81% accuracy on a corpus of collected user queries.
This paper presents our implemented and evaluated methodology for disambiguating terms in search queries and for augmenting queries with expansion terms. By exploiting Wikipedia articles and their reference relations, our method is able to disambiguate terms in particularly short queries with few context words and to effectively expand queries for retrieval of short documents such as tweets. Our strategy can determine when a sequence of words should be treated as a single entity rather than as a sequence of individual entities.
This thesis describes human subject experiments to collect a corpus of user-written queries for retrieving graphics. It also presents the design, implementation and evaluation of decision trees for analyzing a user query to identify the content of the independent and dependent axes and the category of intended message of graphs that might be relevant to the query.
This paper presents several mixture models for retrieving information graphics from a digital library. Experimental results show that the mixture model using both structural relevance and message relevance performed significantly better than a bag-of-words baseline model or models using just structural or message relevance.
This paper presents graph retrieval models built using learning-to-rank algorithms on 56 features. The results show that models developed from all 56 features are significantly better than models that do not take into account structural features and message content features. Analysis of the trees produced by the learning-to-rank algorithms showed that 70% of the top ten features were structural or message content features.
This paper presents a novel methodology for retrieving infographics from a digital library that takes into account a graphic's structural and message content. The retrieval methodology can be summarized as follows: 1) hypothesize requisite structural and message content from a natural language query, 2) measure the relevance of each candidate infographic to the requisite structural and message content hypothesized from the user query, and 3) integrate these relevance measurements via a linear combination model to produce a ranked list of infographics in response to the user query.
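Step 3 above can be sketched as a weighted sum of the per-graphic relevance measurements. This is a minimal illustration, not the paper's implementation: the function name, the default weights, and the relevance scores below are hypothetical placeholders, and the actual model would use weights fit to training data.

```python
def rank_infographics(candidates, structural_rel, message_rel,
                      w_struct=0.5, w_msg=0.5):
    """Rank candidate infographics by a linear combination of relevance scores.

    candidates: list of graphic identifiers
    structural_rel, message_rel: dicts mapping graphic id -> relevance in [0, 1]
    w_struct, w_msg: illustrative weights; a real system would learn these
    """
    scored = [
        (w_struct * structural_rel[g] + w_msg * message_rel[g], g)
        for g in candidates
    ]
    scored.sort(reverse=True)  # highest combined relevance first
    return [g for _, g in scored]

# Example with made-up relevance scores for two candidate graphics:
structural = {"g1": 0.9, "g2": 0.4}
message = {"g1": 0.2, "g2": 0.8}
print(rank_infographics(["g1", "g2"], structural, message))  # → ['g2', 'g1']
```

With equal weights, g2's combined score (0.6) beats g1's (0.55), so g2 ranks first; shifting weight toward structural relevance would reverse the ordering.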