8,334 research outputs found
T-Crowd: Effective Crowdsourcing for Tabular Data
Crowdsourcing employs human workers to solve computer-hard problems, such as
data cleaning, entity resolution, and sentiment analysis. When crowdsourcing
tabular data, e.g., the attribute values of an entity set, a worker's answers
on the different attributes (e.g., the nationality and age of a celebrity star)
are often treated independently. This assumption is not always true and can
lead to suboptimal crowdsourcing performance. In this paper, we present the
T-Crowd system, which takes into consideration the intricate relationships
among tasks, in order to converge faster to their true values. Particularly,
T-Crowd integrates each worker's answers on different attributes to effectively
learn his/her trustworthiness and the true data values. The attribute
relationship information is also used to guide task allocation to workers.
Finally, T-Crowd seamlessly supports categorical and continuous attributes,
which are the two main datatypes found in typical databases. Our extensive
experiments on real and synthetic datasets show that T-Crowd outperforms
state-of-the-art methods in terms of truth inference and reducing the cost of
crowdsourcing
- …
