UW Interactive Data Lab
Subgroups of annotators for each dataset, plotted using principal components analysis (PCA). Expert labels are shown in each figure as a larger gold dot. For each dataset, we observe that no single cluster surrounds the "expert" labels: subgroups of annotators differ from these points in systematic ways.
Crowdsourcing is a common strategy for collecting the "gold standard" labels required for many natural language applications. Crowdworkers differ in their responses for many reasons, but existing approaches often treat disagreements as "noise" to be removed through filtering or aggregation. In this paper, we introduce the workflow design pattern of crowd-parting: separating workers based on shared patterns in responses to a crowdsourcing task. We illustrate this idea using an automated clustering-based method to identify divergent, but valid, worker interpretations in crowdsourced entity annotations collected over two distinct corpora: Wikipedia articles and Tweets. We demonstrate how the intermediate-level view provided by crowd-parting analysis offers insight into sources of disagreement not easily gleaned from viewing either individual annotation sets or aggregated results. We discuss several concrete ways this approach could be applied directly to improve the quality and efficiency of crowdsourced annotation tasks.
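As a rough illustration of the crowd-parting idea, the minimal sketch below groups workers by their response patterns and projects them to two dimensions with PCA for plotting. It is not the method used in the paper: the toy response matrix, the choice of k-means with farthest-point initialization, the number of clusters, and all function names are assumptions made for this example only.

```python
import numpy as np

# Hypothetical toy data: rows are workers, columns are tokens;
# 1 means the worker marked that token as part of an entity.
responses = np.array([
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 0],
], dtype=float)


def farthest_point_init(X, k):
    """Deterministic initial centers: start at row 0, then repeatedly
    add the row farthest from all centers chosen so far."""
    centers = [X[0]]
    for _ in range(k - 1):
        dists = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[dists.argmax()])
    return np.array(centers, dtype=float)


def kmeans(X, k, n_iter=50):
    """Plain k-means (assumed clustering choice, for illustration)."""
    centers = farthest_point_init(X, k)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        # Distance from every worker to every center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def pca_project(X, n_components=2):
    """Project rows of X onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


# Part the crowd into subgroups, then compute 2-D coordinates for plotting.
labels = kmeans(responses, k=2)
coords = pca_project(responses)

for worker, (label, xy) in enumerate(zip(labels, coords)):
    print(f"worker {worker}: cluster {label}, PCA coords {np.round(xy, 2)}")
```

In this toy example the two subgroups differ systematically in how far they extend an entity span, which is the kind of divergent but internally consistent interpretation that aggregation alone would hide. An expert label vector could be projected with the same components to compare it against each subgroup.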