Crowdsourcing is an accessible and cost-effective alternative to traditional methods of collecting and annotating data. The application of crowdsourcing to simple tasks has been well investigated. However, complex tasks like semantic annotation transfer require workers to make simultaneous decisions on chunk segmentation and labeling while acquiring domain-specific knowledge on the go. The increased task complexity may yield low judgment agreement and/or poor performance. The goal of this paper is to cope with these crowdsourcing requirements through semantic priming and unsupervised quality control mechanisms. We aim at automatic quality control that takes into account different levels of workers' expertise and annotation task performance. We investigate judgment selection and aggregation techniques on the task of cross-language semantic annotation transfer. We propose stochastic modeling techniques to estimate the task performance of a worker on a particular judgment with respect to the whole worker group. These estimates are used both to select the best judgments and to perform weighted consensus-based annotation aggregation. We demonstrate that the technique increases the quality of the collected annotations.
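The weighted consensus-based aggregation mentioned above can be illustrated with a minimal sketch. This is not the paper's actual method; it assumes per-worker reliability estimates are already available (here as a hypothetical `weights` dictionary) and combines labels by weighted majority vote:

```python
from collections import defaultdict

def weighted_consensus(judgments, weights):
    """Aggregate per-item labels by weighted majority vote.

    judgments: dict mapping item -> list of (worker, label) pairs
    weights:   dict mapping worker -> estimated reliability in [0, 1]
    Returns:   dict mapping item -> consensus label
    """
    consensus = {}
    for item, votes in judgments.items():
        scores = defaultdict(float)
        for worker, label in votes:
            # Each vote counts proportionally to the worker's estimated
            # task performance rather than uniformly.
            scores[label] += weights.get(worker, 0.0)
        consensus[item] = max(scores, key=scores.get)
    return consensus

# Toy example with hypothetical workers and semantic-role labels.
judgments = {
    "chunk_1": [("w1", "AGENT"), ("w2", "AGENT"), ("w3", "THEME")],
    "chunk_2": [("w1", "THEME"), ("w2", "AGENT"), ("w3", "THEME")],
}
weights = {"w1": 0.9, "w2": 0.4, "w3": 0.8}
result = weighted_consensus(judgments, weights)
```

Under these hypothetical weights, `chunk_2` resolves to `THEME` even though the votes are split, because the two more reliable workers agree; an unweighted majority vote could not break such ties in a principled way.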