13 déc
13/12/2018 09:00

Sciences & Société

Soutenance de thèse : Tarek AWWAD

Context-Aware Worker Selection For Efficient Quality Control In Crowdsourcing

Soutenance en cotutelle internationale entre l’Université de Passau (Passau, Allemagne), et l’INSA Lyon

Doctorant : Tarek AWWAD

Laboratoire INSA :  LIRIS
Ecole doctorale : ED512 InfoMaths

Crowdsourcing has proved its ability to address large scale data collection tasks at a low cost and in a short time. However, due to the dependence on unknown workers, the quality of the crowdsourcing process is questionable and must be controlled. Indeed, maintaining the efficiency of crowdsourcing requires the time and cost overhead related to this quality control to stay low. Current quality control techniques suffer from high time and budget overheads and from their dependency on prior knowledge about individual workers. In this thesis, we address these limitation by proposing the CAWS (Context-Aware Worker Selection) method which operates in two phases: in an offline phase, the correlations between the worker declarative profiles and the task types are learned. Then, in an online phase, the learned profile models are used to select the most reliable online workers for the incoming tasks depending on their types. Using declarative profiles helps eliminate any probing process, which reduces the time and the budget while maintaining the crowdsourcing quality. In order to evaluate CAWS, we introduce an information-rich dataset called CrowdED (Crowdsourcing Evaluation Dataset). The generation of CrowdED relies on a constrained sampling approach that allows to produce a dataset which respects the requester budget and type constraints. Through its generality and richness, CrowdED helps also in plugging the benchmarking gap present in the crowdsourcing community. Using CrowdED, we evaluate the performance of CAWS in terms of the quality, the time and the budget gain. Results shows that automatic grouping is able to achieve a learning quality similar to job-based grouping, and that CAWS is able to outperform the state-of-theart profile-based worker selection when it comes to quality, especially when strong budget ant time constraints exist. Finally, we propose CREX (CReate Enrich eXtend) which provides the tools to select and sample input tasks and to automatically generate custom crowdsourcing campaign sites in order to extend and enrich CrowdED.