Machine Learning

A new direction in our work is the investigation of machine learning techniques for processing language. In our work on understanding information graphics, we have developed a Bayesian network that recognizes the intended message of a bar chart. Further information on this project can be found in our work on Digital Libraries.

We have also investigated effective machine learning techniques for identifying dialogue acts. Although Transformation-Based Learning (TBL) has a number of attractive features, it had not previously been applied to discourse-level problems. This work, pursued with Ken Samuel and K. Vijay-Shanker, investigates modifications and enhancements of TBL and its application to the problem of recognizing dialogue acts. One limitation of TBL is that the algorithm quickly becomes intractable unless the number of potential rules under consideration is severely limited. This research produced a Monte Carlo version of TBL that overcomes this limitation by improving training-time efficiency without significant degradation in performance on unseen data. It also provided other enhancements, such as a committee method that enables TBL to associate confidence measures with the assigned dialogue act tags. Other contributions of the research include an entropy minimization approach to identifying useful dialogue act cues. Although additional improvements remain to be investigated, the modified TBL algorithm has already achieved a success rate equivalent to the best reported results on the dialogue act tagging problem.
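The idea of sampling rules rather than enumerating them can be illustrated with a small sketch. This is not the implementation from the publications below; it is a toy version under stated assumptions: each rule is a hypothetical (cue word, new tag) pair, utterances are sets of words, a rule's score is the number of tags it corrects minus the number it breaks, and the sampler draws a few candidate rules per iteration from utterances that are currently mis-tagged. All function names and the example data are invented for illustration.

```python
import random
from collections import Counter

# Assumption: a rule is (cue, new_tag), read as
# "if the utterance contains cue, change its tag to new_tag".

def apply_rule(rule, utterances, tags):
    cue, new_tag = rule
    return [new_tag if cue in words else tag
            for words, tag in zip(utterances, tags)]

def score(rule, utterances, tags, gold):
    """Tags corrected minus tags broken if the rule were applied now."""
    cue, new_tag = rule
    net = 0
    for words, tag, truth in zip(utterances, tags, gold):
        if cue in words and tag != new_tag:
            if new_tag == truth:
                net += 1
            elif tag == truth:
                net -= 1
    return net

def sample_rules(utterances, tags, gold, n_samples, rng):
    """Monte Carlo step: draw a few rules instead of enumerating all of them."""
    wrong = [i for i, (t, g) in enumerate(zip(tags, gold))
             if t != g and utterances[i]]
    rules = set()
    for _ in range(n_samples):
        if not wrong:
            break
        i = rng.choice(wrong)
        cue = rng.choice(sorted(utterances[i]))
        rules.add((cue, gold[i]))
    return rules

def train(utterances, gold, n_samples=5, max_rules=10, seed=0):
    rng = random.Random(seed)
    default = Counter(gold).most_common(1)[0][0]  # initial tagging
    tags = [default] * len(gold)
    learned = []
    for _ in range(max_rules):
        candidates = sample_rules(utterances, tags, gold, n_samples, rng)
        if not candidates:
            break
        best = max(sorted(candidates),
                   key=lambda r: score(r, utterances, tags, gold))
        if score(best, utterances, tags, gold) <= 0:
            break
        tags = apply_rule(best, utterances, tags)
        learned.append(best)
    return default, learned

def tag(utterances, default, learned):
    tags = [default] * len(utterances)
    for rule in learned:  # rules are applied in the order they were learned
        tags = apply_rule(rule, utterances, tags)
    return tags

# Invented toy data, for illustration only.
utterances = [
    {"what", "time"}, {"what", "day"},
    {"ok", "thanks"}, {"ok", "fine"},
    {"the", "meeting", "is", "monday"},
    {"the", "room", "is", "booked"},
    {"it", "is", "late"},
]
gold = ["QUESTION", "QUESTION", "ACCEPT", "ACCEPT",
        "INFORM", "INFORM", "INFORM"]

default, learned = train(utterances, gold)
predicted = tag(utterances, default, learned)
```

Because only `n_samples` candidate rules are scored per iteration, the cost of each iteration no longer grows with the full space of possible rules; the trade-off is that the best available rule may be missed in a given round.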

Relevant Publications

Stephanie Elzer, Sandra Carberry, Ingrid Zukerman, Daniel Chester, Nancy Green, and Seniz Demir. A Probabilistic Framework for Recognizing Intention in Information Graphics. Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence (IJCAI-05), 2005.

Sandra Carberry, Ken Samuel, K. Vijay-Shanker, and Andrew Wilson. Randomized Rule Selection in Transformation-Based Learning: A Comparative Study. Natural Language Engineering, 7(2), pp. 99-116, 2001.

Samuel, Ken, Sandra Carberry, and K. Vijay-Shanker. Automatically Selecting Useful Phrases for Dialogue Act Tagging. Proceedings of the Meeting of the Pacific Association for Computational Linguistics, 1999.

Samuel, Ken, Sandra Carberry, and K. Vijay-Shanker. An Investigation of Transformation-Based Learning in Discourse. Proceedings of the International Conference on Machine Learning (ICML), pp. 497-505, 1998.

Samuel, Ken, Sandra Carberry, and K. Vijay-Shanker. Dialogue Act Tagging with Transformation-Based Learning. Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 1150-1156, 1998.

Samuel, Ken, Sandra Carberry, and K. Vijay-Shanker. Computing Dialogue Acts from Features with Transformation-Based Learning. Proceedings of the AAAI Spring Symposium on Applying Machine Learning to Discourse Processing, pp. 90-97, 1998.