arXiv:1503.01190 [cs.CL]

Statistical modality tagging from rule-based annotations and crowdsourcing

Vinodkumar Prabhakaran, Michael Bloodgood, Mona Diab, Bonnie Dorr, Lori Levin, Christine D. Piatko, Owen Rambow, Benjamin Van Durme

Published 2015-03-04 (Version 1)

We explore training an automatic modality tagger. Modality is the attitude that a speaker might have toward an event or state. One of the main hurdles in training a linguistic tagger is gathering training data; this is particularly problematic for a modality tagger because modality triggers are sparse in the overwhelming majority of sentences. We investigate an approach in which we first gather candidate sentences using a simple, high-recall rule-based modality tagger and then provide these sentences to Mechanical Turk annotators for further annotation. We use the resulting training data to train a precise modality tagger, a multi-class SVM, that delivers good performance.
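The final stage of the pipeline above can be illustrated with a minimal sketch: a multi-class SVM trained on annotated sentences. This is not the authors' code; the modality labels, the toy sentences, and the bag-of-words features are illustrative assumptions, using scikit-learn's one-vs-rest `LinearSVC`.

```python
# Hedged sketch (not the paper's implementation): a multi-class SVM
# modality tagger over bag-of-words features. Labels and sentences
# below are hypothetical stand-ins for the crowdsourced annotations.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy annotated training data (labels are illustrative modality classes).
train_sentences = [
    "She can swim across the river.",
    "You must submit the form today.",
    "He wants to visit Paris.",
    "I believe the report is accurate.",
]
train_labels = ["ability", "requirement", "want", "belief"]

# One-vs-rest linear SVM over unigram counts.
tagger = make_pipeline(CountVectorizer(), LinearSVC())
tagger.fit(train_sentences, train_labels)

def tag_modality(sentence: str) -> str:
    """Predict a modality label for one sentence."""
    return tagger.predict([sentence])[0]
```

In practice the paper's feature set and label inventory differ; the point here is only the shape of the multi-class SVM training step after the rule-based filter and crowdsourced annotation have produced labeled sentences.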

Comments: 8 pages, 6 tables; in Proceedings of the Workshop on Extra-Propositional Aspects of Meaning in Computational Linguistics, pages 57-64, Jeju, Republic of Korea, July 2012. Association for Computational Linguistics
Categories: cs.CL, cs.LG, stat.ML
Subjects: I.2.7, I.2.6, I.5.1, I.5.4