Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing (EMNLP 2009 at ACL/IJCNLP 2009), Suntec, Singapore, 2009
Abstract. We present a system that finds short definitions of terms on Web pages. It employs a Maximum Entropy classifier, but the classifier is trained on automatically generated examples; hence, the system is in effect unsupervised. We use ROUGE-W to generate training examples from encyclopedias and Web snippets, a method that outperforms an alternative centroid-based one. After training, our system can be used to find definitions of terms that are not covered by encyclopedias. The system outperforms a comparable publicly available system, as well as an earlier published version of our system.
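To make the idea of generating training examples with ROUGE-W concrete, the following is a minimal sketch, not the system's actual implementation. It computes a ROUGE-W F-score (weighted longest common subsequence with f(k) = k^alpha) between a Web snippet and each encyclopedia definition of the same term, and labels the snippet by its highest score. The function names, the whitespace tokenization, the thresholds `t_pos` and `t_neg`, and the values of `alpha` and `beta` are all illustrative assumptions; the abstract does not specify them.

```python
def wlcs(x, y, alpha=1.2):
    """Weighted LCS score of token lists x and y, using f(k) = k ** alpha.

    Consecutive matches are rewarded more than scattered ones.
    """
    f = lambda k: k ** alpha
    m, n = len(x), len(y)
    c = [[0.0] * (n + 1) for _ in range(m + 1)]  # accumulated WLCS score
    w = [[0] * (n + 1) for _ in range(m + 1)]    # length of current consecutive run
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                k = w[i - 1][j - 1]
                c[i][j] = c[i - 1][j - 1] + f(k + 1) - f(k)
                w[i][j] = k + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
                w[i][j] = 0
    return c[m][n]


def rouge_w(reference, candidate, alpha=1.2, beta=1.0):
    """ROUGE-W F-score between two strings (lowercased, whitespace-tokenized)."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    if not ref or not cand:
        return 0.0
    score = wlcs(ref, cand, alpha)
    if score == 0.0:
        return 0.0
    inv = lambda v: v ** (1.0 / alpha)  # inverse of f
    recall = inv(score / len(ref) ** alpha)
    precision = inv(score / len(cand) ** alpha)
    return (1 + beta ** 2) * recall * precision / (recall + beta ** 2 * precision)


def label_snippets(reference_definitions, snippets, t_pos=0.5, t_neg=0.2):
    """Label snippets as definitions (1) or non-definitions (0).

    Hypothetical thresholds: snippets scoring between t_neg and t_pos
    are treated as ambiguous and discarded.
    """
    labeled = []
    for snippet in snippets:
        sim = max((rouge_w(r, snippet) for r in reference_definitions), default=0.0)
        if sim >= t_pos:
            labeled.append((snippet, 1))
        elif sim <= t_neg:
            labeled.append((snippet, 0))
    return labeled


refs = ["a modem is a device that converts digital signals to analog signals"]
snippets = [
    "a modem is a device that converts digital signals to analog signals",
    "buy cheap modems online today",
]
examples = label_snippets(refs, snippets)
```

Under these assumptions, the labeled pairs in `examples` would then serve as training data for a classifier such as the Maximum Entropy model mentioned above, without any manual annotation.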