Optimal Crowdsourced Classification with a Reject Option in the Presence of Spammers

Author: Li Qunwei
Varshney Pramod K.
Publication date: 26/10/2017

We explore the design of an effective crowdsourcing system for an M-ary classification task. Crowd workers complete simple binary microtasks whose results are aggregated to give the final decision. We consider the scenario where the workers have a reject option, so that they may skip microtasks when they are unable or choose not to respond. We present an aggregation approach using a weighted majority voting rule, where each worker's response is assigned an optimized weight to maximize the crowd's classification performance. Comment: submitted to ICASSP 201

arXiv.org e-Print Archive

Crossref

Multi-object Classification via Crowdsourcing with a Reject Option

Author: Li Qunwei
Varshney Lav R.
Varshney Pramod K.
Vempaty Aditya
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 10/06/2016

Consider designing an effective crowdsourcing system for an M-ary classification task. Crowd workers complete simple binary microtasks whose results are aggregated to give the final result. We consider the novel scenario where workers have a reject option so they may skip microtasks when they are unable or choose not to respond. For example, in mismatched speech transcription, workers who do not know the language may not be able to respond to microtasks focused on phonological dimensions outside their categorical perception. We present an aggregation approach using a weighted majority voting rule, where each worker's response is assigned an optimized weight to maximize the crowd's classification performance. We evaluate system performance in both exact and asymptotic forms. Further, we consider the setting where there may be a set of greedy workers who complete microtasks even when they are unable to perform them reliably. We consider an oblivious strategy and an expurgation strategy to deal with greedy workers, developing an algorithm to adaptively switch between the two based on the estimated fraction of greedy workers in the anonymous crowd. Simulation results show improved performance compared with conventional majority voting. Comment: two column, 15 pages, 8 figures, submitted to IEEE Trans. Signal Proces
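As a rough illustration of the aggregation idea in this abstract, the sketch below combines binary microtask answers with per-worker weights and treats a skipped microtask as contributing nothing. The function name, the encoding (+1, -1, and 0 for a skip), the tie-breaking rule, and the weight values are all assumptions made for illustration; the abstract does not state how the optimized weights are computed.

```python
from typing import Sequence


def weighted_majority_vote(responses: Sequence[int], weights: Sequence[float]) -> int:
    """Aggregate answers to one binary microtask.

    responses: +1 or -1 for a vote, 0 when the worker used the reject option.
    weights:   one non-negative weight per worker (hypothetical values here).
    """
    if len(responses) != len(weights):
        raise ValueError("responses and weights must have the same length")
    # Skipped microtasks (0) are left out of the weighted sum.
    score = sum(w * r for r, w in zip(responses, weights) if r != 0)
    # Ties, including the all-skipped case, go to +1; this is an arbitrary choice.
    return 1 if score >= 0 else -1


# Five workers; the third one skips the microtask.
responses = [1, -1, 0, -1, -1]
weights = [3.0, 0.5, 1.0, 0.5, 0.5]  # assumed weights, not from the paper

decision = weighted_majority_vote(responses, weights)
unweighted = weighted_majority_vote(responses, [1.0] * len(responses))
print(decision, unweighted)
```

With these made-up weights the single heavily weighted worker outvotes three lightly weighted ones, whereas equal weights reduce to conventional majority voting and give the opposite decision.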

arXiv.org e-Print Archive

Crossref

Bifacial dye-sensitized solar cells : a strategy to enhance overall efficiency based on transparent polyaniline electrode

Author: Huang Miaoliang
Li Yan
Lin Jianming
Meng Lijian
Tang Qunwei
Wu Jihui
Yue Gentian
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

The dye-sensitized solar cell (DSSC) is a promising solution to global energy and environmental problems because of its cleanliness, low cost, high efficiency, good durability, and easy fabrication. However, enhancing the efficiency of the DSSC is still an important issue. Here we devise a bifacial DSSC based on a transparent polyaniline (PANI) counter electrode (CE). Owing to simultaneous sunlight irradiation from the front and the rear sides, more dye molecules are excited and more carriers are generated, which enhances the short-circuit current density and therefore the overall conversion efficiency. The photoelectric properties of PANI can be improved by modification with 4-aminothiophenol (4-ATP). The bifacial DSSC with a 4-ATP/PANI CE achieves a light-to-electric energy conversion efficiency of 8.35%, an increase of approximately 24.6% compared to the DSSC irradiated from the front only. This new concept, along with the promising results, provides a new approach for enhancing the photovoltaic performance of solar cells. The authors acknowledge the joint financial support of the National High Technology Research and Development Program of China (No. 2009AA03Z217), the National Natural Science Foundation of China (nos. 90922028, U1205112, 51002053, 61306077), the Seed Fund from Ocean University of China, and the Fundamental Research Funds for the Central Universities (201313001)

Universidade do Minho: RepositoriUM

Repositório Científico do Instituto Politécnico do Porto

Crossref

PubMed Central

On Classification in Human-driven and Data-driven Systems

Author: Li Qunwei
Publication venue: SURFACE at Syracuse University
Publication date: 21/12/2018

Classification systems are ubiquitous, and the design of effective classification algorithms has been an even more active area of research since the emergence of machine learning techniques. Despite the significant efforts devoted to training and feature selection in classification systems, misclassifications do occur and their effects can be critical in various applications. The central goal of this thesis is to analyze classification problems in human-driven and data-driven systems with potentially unreliable components, and to design effective strategies that ensure reliable and effective classification algorithms in such systems. The components/agents in the system can be machines and/or humans. The system components can be unreliable for a variety of reasons, such as faulty machines, security attacks causing machines to send falsified information, unskilled human workers sending imperfect information, or human workers providing random responses. This thesis first quantifies the effect of such unreliable agents on the classification performance of the systems and then designs schemes that mitigate misclassifications and their effects by adapting the behavior of the classifier on samples from machines and/or humans, and that ensure an effective and reliable overall classification. In the first part of this thesis, we study the case when only humans are present in the systems, and consider crowdsourcing systems. Human workers in crowdsourcing systems observe the data and respond individually by providing label-related information to a fusion center in a distributed manner. In such systems, we consider the presence of unskilled human workers who have a reject option, so that they may choose not to provide information regarding the label of the data. To maximize the classification performance at the fusion center, an optimal aggregation rule is proposed to fuse the human workers' responses in a weighted majority voting manner.
Next, the presence of unreliable human workers, referred to as spammers, is considered. Spammers are human workers who provide random guesses regarding the data label information to the fusion center in crowdsourcing systems. The effect of spammers on the overall classification performance is characterized when the spammers can strategically respond to maximize their reward in reward-based crowdsourcing systems. For such systems, an optimal aggregation rule is proposed by adapting the classifier based on the responses from the workers. The next line of human-driven classification is considered in the context of social networks. The problem studied is to classify whether a person is influential in propagating information in social networks. Since knowledge of the social network structure is not always available, the influential agent classification problem is studied without knowing that structure. A multi-task low-rank linear influence model is proposed to exploit the relationships between different information topics. The proposed approach can simultaneously predict the volume of information diffusion for each topic and automatically classify the influential nodes for each topic. In the third part of the thesis, a data-driven decentralized classification framework is developed where machines interact with each other to perform complex classification tasks. However, the machines in the system can be unreliable for a variety of reasons, such as noise, faults, and attacks. Erroneous updates lead the classification process in the wrong direction and degrade the performance of decentralized classification algorithms. First, the effect of erroneous updates on the convergence of the classification algorithm is analyzed, and it is shown that the algorithm converges linearly to a neighborhood of the optimal classification solution. Next, guidelines are provided for network design to achieve faster convergence.
Finally, to mitigate the impact of unreliable machines, a robust variant of ADMM is proposed, and its resilience to unreliable machines is shown with exact convergence to the optimal classification result. The final part of the research in this thesis considers machine-only data-driven classification problems. First, the fundamentals of classification are studied in an information-theoretic framework. We investigate the nonparametric classification problem for arbitrary unknown composite distributions in the asymptotic regime where both the sample size and the number of classes grow exponentially large. The notion of discrimination capacity is introduced, which captures the largest exponential growth rate of the number of classes relative to the sample size such that there exists a test with asymptotically vanishing probability of error. Error exponent analysis using the maximum mean discrepancy is provided, and the discrimination rate, i.e., a lower bound on the discrimination capacity, is characterized. Furthermore, an upper bound on the discrimination capacity based on Fano's inequality is developed
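The maximum mean discrepancy (MMD) named in this abstract is a kernel-based distance between two distributions. The sketch below computes the standard biased (V-statistic) estimate of the squared MMD for two one-dimensional samples. The Gaussian kernel, the bandwidth of 1.0, and the sample values are assumptions chosen for illustration; the abstract does not say which kernel or estimator the thesis uses.

```python
import math
from typing import Sequence


def gaussian_kernel(x: float, y: float, bandwidth: float = 1.0) -> float:
    """Gaussian (RBF) kernel on real numbers; the bandwidth is an assumed parameter."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs: Sequence[float], ys: Sequence[float], bandwidth: float = 1.0) -> float:
    """Biased (V-statistic) estimate of the squared MMD between two samples."""

    def mean_kernel(a: Sequence[float], b: Sequence[float]) -> float:
        # Average kernel value over all pairs (u, v) with u in a and v in b.
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


same = mmd_squared([0.0, 1.0, 2.0], [0.0, 1.0, 2.0])
shifted = mmd_squared([0.0, 1.0, 2.0], [5.0, 6.0, 7.0])
print(same, shifted)
```

Identical samples give an estimate of zero, while well-separated samples give a clearly positive value, which is the property a test based on this statistic relies on.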

Syracuse University Research Facility and Collaborative Environment

Learning Graph Neural Networks with Approximate Gradient Descent

Author: Li Qunwei
Zhong Wenliang
Zou Shaofeng
Publication date: 06/12/2020

The first provably efficient algorithm for learning graph neural networks (GNNs) with one hidden layer for node information convolution is provided in this paper. Two types of GNNs are investigated, depending on whether labels are attached to nodes or graphs. A comprehensive framework for designing and analyzing convergence of GNN training algorithms is developed. The proposed algorithm is applicable to a wide range of activation functions, including ReLU, Leaky ReLU, Sigmoid, Softplus, and Swish. It is shown that the proposed algorithm guarantees a linear convergence rate to the underlying true parameters of GNNs. For both types of GNNs, sample complexity in terms of the number of nodes or the number of graphs is characterized. The impact of feature dimension and GNN structure on the convergence rate is also theoretically characterized. Numerical experiments are further provided to validate our theoretical analysis. Comment: 23 pages, accepted at AAAI 202
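For reference, the five activation functions listed in this abstract can be written with their usual textbook definitions, as in the sketch below. The Leaky ReLU slope of 0.01 and the Swish form x * sigmoid(x) are common defaults assumed here; the abstract does not give the exact parameterization the paper uses.

```python
import math


def relu(x: float) -> float:
    return max(0.0, x)


def leaky_relu(x: float, slope: float = 0.01) -> float:
    # The negative-side slope of 0.01 is an assumed default.
    return x if x >= 0 else slope * x


def sigmoid(x: float) -> float:
    # Branch on the sign so math.exp never receives a large positive argument.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def softplus(x: float) -> float:
    # Numerically stable form of log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def swish(x: float) -> float:
    # Swish with unit scaling: x * sigmoid(x).
    return x * sigmoid(x)


for f in (relu, leaky_relu, sigmoid, softplus, swish):
    print(f.__name__, f(-2.0), f(0.0), f(2.0))
```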

arXiv.org e-Print Archive

Crossref

Association for the Advancement of Artificial Intelligence: AAAI Publications