    An Army of Me: Sockpuppets in Online Discussion Communities

    In online discussion communities, users can interact and share information and opinions on a wide variety of topics. However, some users may create multiple identities, or sockpuppets, and engage in undesired behavior by deceiving others or manipulating discussions. In this work, we study sockpuppetry across nine discussion communities, and show that sockpuppets differ from ordinary users in terms of their posting behavior, linguistic traits, as well as social network structure. Sockpuppets tend to start fewer discussions, write shorter posts, use more personal pronouns such as "I", and have more clustered ego-networks. Further, pairs of sockpuppets controlled by the same individual are more likely to interact on the same discussion at the same time than pairs of ordinary users. Our analysis suggests a taxonomy of deceptive behavior in discussion communities. Pairs of sockpuppets can vary in their deceptiveness, i.e., whether they pretend to be different users, or their supportiveness, i.e., if they support arguments of other sockpuppets controlled by the same user. We apply these findings to a series of prediction tasks, notably, to identify whether a pair of accounts belongs to the same underlying user or not. Altogether, this work presents a data-driven view of deception in online discussion communities and paves the way towards the automatic detection of sockpuppets. Comment: 26th International World Wide Web conference 2017 (WWW 2017)

    User Intent Prediction in Information-seeking Conversations

    Conversational assistants are being progressively adopted by the general population. However, they are not capable of handling complicated information-seeking tasks that involve multiple turns of information exchange. Due to the limited communication bandwidth in conversational search, it is important for conversational assistants to accurately detect and predict user intent in information-seeking conversations. In this paper, we investigate two aspects of user intent prediction in an information-seeking setting. First, we extract features based on the content, structural, and sentiment characteristics of a given utterance, and use classic machine learning methods to perform user intent prediction. We then conduct an in-depth feature importance analysis to identify key features in this prediction task. We find that structural features contribute most to the prediction performance. Given this finding, we construct neural classifiers to incorporate context information and achieve better performance without feature engineering. Our findings can provide insights into the important factors and effective methods of user intent prediction in information-seeking conversations. Comment: Accepted to CHIIR 201

    SentiBench - a benchmark comparison of state-of-the-practice sentiment analysis methods

    In the last few years, thousands of scientific papers have investigated sentiment analysis, several startups that measure opinions on real data have emerged, and a number of innovative products related to this theme have been developed. There are multiple methods for measuring sentiments, including lexical-based and supervised machine learning methods. Despite the vast interest in the theme and the wide popularity of some methods, it is unclear which one is better for identifying the polarity (i.e., positive or negative) of a message. Accordingly, there is a strong need to conduct a thorough apples-to-apples comparison of sentiment analysis methods, as they are used in practice, across multiple datasets originating from different data sources. Such a comparison is key for understanding the potential limitations, advantages, and disadvantages of popular methods. This article aims at filling this gap by presenting a benchmark comparison of twenty-four popular sentiment analysis methods (which we call the state-of-the-practice methods). Our evaluation is based on a benchmark of eighteen labeled datasets, covering messages posted on social networks, movie and product reviews, as well as opinions and comments in news articles. Our results highlight the extent to which the prediction performance of these methods varies considerably across datasets. Aiming at boosting the development of this research area, we release the methods' code and the datasets used in this article, deploying them in a benchmark system, which provides an open API for accessing and comparing sentence-level sentiment analysis methods.

    Introduction – Regional Monitoring Programs

    There is increasing interest in the initiation of regional or statewide monitoring programs that are less extensive than national efforts such as the Breeding Bird Survey. A number of regional programs have been in existence for a decade or more, so the papers in this section represented an effort to bring together the collective experience of the people who had developed these programs, and to hear about the benefits and drawbacks of their particular designs. Speakers reviewed why they felt there was a need for a regional monitoring effort, examined the designs and response variables associated with their regional monitoring program, presented the short- and longer-term results from the program, discussed the logistic and scientific successes and failures of each program, and presented recommendations for those who might be interested in starting their own regional monitoring program. Below, we provide a brief overview of some important points that emerged from this session, and how these regional efforts might be included as integral parts of broader national monitoring efforts that seem to be emerging.

    Using Spatial Models To Map Bird Distributions Along The Madison River

    The Avian Science Center developed predictive maps of species distributions for the Madison River based on newly available riverine system data from the National Wetlands Inventory (NWI) and the Natural Heritage Program’s Landscape Integrity Model. We used a maximum entropy model (MaxEnt) to predict species distributions using species occurrence locations collected from 2003 to 2010. Models performed well for 13 species, demonstrating that available environmental data layers, including NWI, can be used to successfully predict species distributions along the Madison River for a number of important riparian bird species. These models allow fine-scale mapping of habitat suitability for riparian birds, which fills gaps in current data on species distributions, and can be used to prioritize riparian conservation and restoration projects.

    Maintaining Bird Diversity in Western Larch/Douglas-fir Forests

    Bird occurrences were evaluated under four stand conditions in western larch/Douglas-fir forests: clearcut, partial cut, unlogged (fragmented), and contiguous forest. Frequencies were noted for foraging guilds, tree gleaners, flycatchers, nesting guilds, tree drillers, and primary cavity nesters. Managers should consider a diversity of habitat conditions if maintaining habitat for bird species is an objective.

    Bird Populations in Logged and Unlogged Western Larch/Douglas-fir Forest in Northwestern Montana

    Of 32 species of abundant breeding birds, populations of 10 species differed significantly between small cutting units and adjacent uncut forest. Foliage foragers and tree gleaners were less abundant in cutting units, while flycatching species and ground foragers were more common there. Of nesting guilds, conifer tree nesters were least abundant in cutting units, and ground nesters were more common there. Results suggest that bird management should consider diverse community-level habitat needs and that, if maintenance of tree-dependent species is important, broadleaf trees and snags of all species should be retained.

    On-treatment follow-up in real-world studies of direct oral anticoagulants in atrial fibrillation: Association with treatment effects.

    Background: Numerous observational studies support the safety and effectiveness of direct oral anticoagulants (DOACs) for stroke prevention in atrial fibrillation (AF), but these data are often limited to short duration of follow-up. We aimed to assess the length of on-treatment follow-up in the accumulated real-world evidence and the relationship between follow-up duration and estimates of DOAC effectiveness and safety. Methods: We searched the literature for observational studies reporting comparative effectiveness and safety outcomes of DOACs versus warfarin. In random-effects meta-analyses, we assessed associations of specific DOACs vs warfarin for stroke/systemic embolism (SE) and major bleeding. In meta-regression analyses, we assessed the correlation between the reported on-treatment follow-up and the effect sizes for stroke/SE and major bleeding outcomes. Results: In 45 eligible observational studies, the average on-treatment follow-up was <1 year for all DOACs. In meta-analyses, all DOACs showed significantly lower risks of stroke/SE, but only dabigatran and apixaban showed lower risks for major bleeding compared to warfarin. There was no correlation between follow-up duration and magnitude of stroke/SE reduction for any of the DOACs. Longer follow-up correlated with greater major bleeding reduction for dabigatran (p = 0.006) and rivaroxaban (p = 0.033) as compared to warfarin, but it correlated with smaller major bleeding reduction for apixaban (p = 0.004). Conclusions: The numerous studies of DOAC effectiveness and safety in routine AF practice pertain to short treatment follow-up. Study follow-up correlates significantly with DOAC-specific vs warfarin associations for major bleeding.