InterPoll: Crowd-Sourced Internet Polls

Authors: Benjamin Livshits, Todd Mytkowicz
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 1st Summit on Advances in Programming Languages (SNAPL 2015)
Publication date: 01/01/2015

Crowd-sourcing is increasingly being used to provide answers to online polls and surveys. However, existing systems, while taking care of the mechanics of attracting crowd workers, poll building, and payment, provide little to help the survey-maker or pollster obtain statistically significant results devoid of even the obvious selection biases. This paper proposes InterPoll, a platform for programming crowd-sourced polls. Pollsters express polls as embedded LINQ queries, and the runtime correctly reasons about uncertainty in those polls, polling only as many people as required to meet statistical guarantees. To optimize the cost of polls, InterPoll performs query optimization, as well as bias correction and power analysis. The goal of InterPoll is to provide a system that can be reliably used for research into marketing, social and political science questions. This paper highlights some of the existing challenges and how InterPoll is designed to address most of them. In this paper we summarize some of the work we have already done and give an outline for future work.
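The idea of polling only as many people as a statistical guarantee requires can be illustrated with a short sketch. This is not InterPoll's implementation (the abstract describes embedded LINQ queries, and gives no formulas); it is a generic Python illustration that assumes a yes/no question, a normal-approximation (Wald) confidence interval, and hypothetical function names `required_sample_size` and `poll_until_confident`.

```python
import math
from typing import Iterable, Optional, Tuple


def required_sample_size(margin: float, z: float = 1.96, p: float = 0.5) -> int:
    """Worst-case number of responses for a proportion estimate.

    Assumes a normal approximation; p = 0.5 maximizes the variance p(1 - p).
    """
    return math.ceil(z * z * p * (1.0 - p) / (margin * margin))


def poll_until_confident(
    responses: Iterable[bool],
    margin: float,
    z: float = 1.96,
    min_n: int = 30,
) -> Optional[Tuple[float, float, int]]:
    """Consume responses one at a time and stop once the interval is narrow enough.

    Returns (estimate, half_width, n), or None if the stream ends first.
    min_n guards against stopping on a tiny, unrepresentative sample.
    """
    yes = 0
    n = 0
    for answer in responses:
        n += 1
        if answer:
            yes += 1
        if n < min_n:
            continue
        p_hat = yes / n
        half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
        if half_width <= margin:
            return p_hat, half_width, n
    return None
```

For a 5-point margin at z = 1.96, `required_sample_size(0.05)` gives the familiar worst-case figure of 385 responses. The sequential version can stop earlier when answers are lopsided, which is where the cost saving comes from. Note that repeatedly checking the interval as data arrives weakens the nominal guarantee, and the Wald interval behaves poorly near 0 or 1, so a real system would need a more careful stopping rule.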

DROPS Dagstuhl Research Online Publication Server

Experimental Evidence of Chaotic Dynamics in Computer Hardware ; CU-CS-1031-07

Authors: Elizabeth Bradley, Amer Diwan, Todd Mytkowicz
Publication venue: CU Scholar
Publication date: 01/06/2007

CU Scholar Institutional Repository

Observer Effect and Measurement Bias in Performance Analysis ; CU-CS-1042-08

Authors: Amer Diwan, Matthias Hauswirth, Todd Mytkowicz, Peter Sweeney
Publication venue: CU Scholar
Publication date: 01/06/2008

CU Scholar Institutional Repository

Jumping the ORDER BY Barrier in Large-Scale Pattern Matching

Authors: Mike Barnett, Daniel Lupei, Saeed Maleki, Madan Musuvathi, Todd Mytkowicz
Publication date: 05/07/2017

Event-series pattern matching is a major component of large-scale data analytics pipelines, enabling a wide range of system diagnostics tasks. A precursor to pattern matching is an expensive "shuffle the world" stage wherein data are ordered by time and shuffled across the network. Because many existing systems treat the pattern matching engine as a black box, they are unable to optimize the entire data analytics pipeline, and in particular, this costly shuffle. This paper demonstrates how to optimize such queries. We first translate an expressive class of regular-expression-like patterns to relational queries such that they can benefit from decades of progress in relational optimizers, and then we introduce the technique of abstract pattern matching, a linear-time preprocessing step which, adapting ideas from symbolic execution and abstract interpretation, discards events from the input guaranteed not to appear in successful matches. Abstract pattern matching first computes a conservative representation of the output-relevant domain of every transition in a pattern based on the (unary) predicates of that transition. It then further refines these domains based on the structure of the pattern (i.e., paths through the pattern) as well as any of the pattern's join predicates across transitions. The outcome is an abstract filter that, when applied to the original stream, excludes events that are guaranteed not to participate in a match. We implemented and applied abstract pattern matching in COSMOS/Scope to an industrial benchmark where we obtained up to 3 orders of magnitude reduction in shuffled data and 1.23x average speedup in total processing time.
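The first step described above, deriving a conservative filter from the unary predicates of each transition, can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's algorithm: events are modeled as plain dictionaries, each transition is reduced to a single unary predicate, and the names `abstract_filter` and `prefilter` are hypothetical. The refinement by pattern structure and join predicates is not modeled.

```python
from typing import Callable, Dict, Iterable, List

Event = Dict[str, object]
Predicate = Callable[[Event], bool]


def abstract_filter(transition_predicates: List[Predicate]) -> Predicate:
    """Build a conservative filter from the unary predicates of a pattern.

    An event that satisfies no transition's predicate cannot take part in
    any match, so it is safe to drop. Events that pass may still fail to
    match; the filter over-approximates the relevant events.
    """
    def keep(event: Event) -> bool:
        return any(pred(event) for pred in transition_predicates)

    return keep


def prefilter(events: Iterable[Event], transition_predicates: List[Predicate]) -> List[Event]:
    """Single linear pass that discards irrelevant events before any shuffle."""
    keep = abstract_filter(transition_predicates)
    return [event for event in events if keep(event)]


# Hypothetical pattern: a failed login eventually followed by a successful one.
pattern = [
    lambda e: e.get("type") == "login_failed",
    lambda e: e.get("type") == "login_ok",
]

stream = [
    {"type": "login_failed", "user": "a"},
    {"type": "heartbeat"},
    {"type": "login_ok", "user": "a"},
    {"type": "page_view", "user": "b"},
]

relevant = prefilter(stream, pattern)
```

Here `relevant` keeps only the two login events, preserving their original order, while the heartbeat and page-view events are dropped before the expensive ordering and shuffling stage. Because the filter only ever removes events that satisfy no transition, running the real matcher on the filtered stream would find the same matches as on the full stream in this simplified setting.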

Infoscience - École polytechnique fédérale de Lausanne