
    Machine Learning in Automated Text Categorization

    The automated categorization (or classification) of texts into predefined categories has attracted booming interest in the last ten years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting in the manual definition of a classifier by domain experts) are very good effectiveness, considerable savings in terms of expert manpower, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely document representation, classifier construction, and classifier evaluation. Comment: Accepted for publication in ACM Computing Surveys.
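    The inductive process described in this abstract (building a classifier from preclassified documents) can be illustrated with a minimal sketch. It assumes a multinomial naive Bayes learner with Laplace smoothing and whitespace tokenization; the survey covers many learners, and this particular choice, the function names `train` and `classify`, and the toy training set are all illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def train(labeled_docs):
    """Learn per-category word statistics from (text, category) pairs."""
    word_counts = defaultdict(Counter)  # category -> word -> count
    doc_counts = Counter()              # category -> number of training documents
    vocab = set()
    for text, category in labeled_docs:
        tokens = text.lower().split()   # assumption: naive whitespace tokenization
        word_counts[category].update(tokens)
        doc_counts[category] += 1
        vocab.update(tokens)
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the category with the highest log-posterior for the text."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category in doc_counts:
        # Log prior: fraction of training documents in this category.
        score = math.log(doc_counts[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen during training
            # Laplace (add-one) smoothed word likelihood.
            likelihood = (word_counts[category][token] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical preclassified documents, for illustration only.
training_set = [
    ("stock markets fell sharply", "finance"),
    ("the bank raised interest rates", "finance"),
    ("the team won the final match", "sport"),
    ("the striker scored two goals", "sport"),
]
model = train(training_set)
print(classify("interest rates and stock prices", model))
```

    The three problems named in the abstract map loosely onto the sketch: tokenization stands in for document representation, `train` for classifier construction, and comparing `classify` output against held-out labels would be the evaluation step.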

    Beyond chance? The persistence of performance in online poker

    A major issue in the widespread controversy about the legality of poker and the appropriate taxation of winnings is whether poker should be considered a game of skill or a game of chance. To inform this debate we present an analysis of the role of skill in the performance of online poker players, using a large database with hundreds of millions of player-hand observations from real money ring games at three different stakes levels. We find that players whose earlier profitability was in the top (bottom) deciles perform better (worse) and are substantially more likely to end up in the top (bottom) performance deciles of the following time period. Regression analyses of performance on historical performance and other skill-related proxies provide further evidence for persistence and predictability. Simulations indicate that skill dominates chance when performance is measured over 1,500 or more hands of play.

    Design and development of a complex narrative intervention delivered by text messages to reduce binge drinking among socially disadvantaged men

    Background: Socially disadvantaged men are at high risk of suffering from alcohol-related harm. Disadvantaged groups are less likely to engage with health promotion. There is a need for interventions that reach large numbers at low cost and which promote high levels of engagement with the behaviour change process. The aim of this study was to design a theoretically and empirically based text message intervention to reduce binge drinking by socially disadvantaged men. Results: Following MRC guidance, the intervention was developed in four stages. Stage 1 developed a detailed behaviour change strategy based on existing literature and theory from several areas. These included the psychological theory that would underpin the intervention, alcohol brief interventions, text message interventions, effective behaviour change techniques, narratives in behaviour change interventions and communication theory. In addition, formative research was carried out. A logic model was developed to depict the pathways between intervention inputs, processes and outcomes for behaviour change. Stage 2 created a narrative which illustrated and modelled key steps in the strategy. Stage 3 rendered the intervention into a series of text messages and ensured that appropriate behavioural change techniques were incorporated. Stage 4 revised the messages to ensure comprehensive coverage of the behaviour change strategy and coherence of the narrative. It also piloted the intervention and made final revisions to it. Conclusions: The structured, systematic approach to design created a narrative intervention which had a strong theoretical and empirical basis. The use of a narrative helped make the intervention realistic and allowed key behaviour change techniques to be modelled by characters. The narrative was intended to promote engagement with the intervention. 
    The intervention was rendered into a series of short text messages, and subsequent piloting showed they were acceptable in the target group. Delivery of an intervention by text message offers a low-cost, low-demand method that can reach large numbers of people. This approach provides a framework for the design of behaviour change interventions which could be used for interventions to tackle other health behaviours.

    Speech Communication

    Contains reports on five research projects. Sponsors: C.J. Lebel Fellowship; National Institutes of Health (Grant 5 T32 NSO7040); National Institutes of Health (Grant 5 R01 NS04332); National Institutes of Health (Grant 5 R01 NS21183); National Institutes of Health (Grant 5 P01 NS13126); National Institutes of Health (Grant 1 PO1-NS23734); National Science Foundation (Grant BNS 8418733); U.S. Navy - Naval Electronic Systems Command (Contract N00039-85-C-0254); U.S. Navy - Naval Electronic Systems Command (Contract N00039-85-C-0341); U.S. Navy - Naval Electronic Systems Command (Contract N00039-85-C-0290); National Institutes of Health (Grant RO1-NS21183), subcontract with Boston University; National Institutes of Health (Grant 1 PO1-NS23734), subcontract with the Massachusetts Eye and Ear Infirmary.

    Speech Communication

    Contains reports on five research projects. Sponsors: C.J. Lebel Fellowship; National Institutes of Health (Grant 5 T32 NS07040); National Institutes of Health (Grant 5 R01 NS04332); National Science Foundation (Grant 1ST 80-17599); U.S. Navy - Naval Electronic Systems Command Contract (N00039-85-C-0254); U.S. Navy - Naval Electronic Systems Command Contract (N00039-85-C-0341); U.S. Navy - Naval Electronic Systems Command Contract (N00039-85-C-0290).

    Automatic construction of rule-based ICD-9-CM coding systems

    Background: In this paper we focus on the problem of automatically constructing ICD-9-CM coding systems for radiology reports. ICD-9-CM codes are used for billing purposes by health institutes and are assigned to clinical records manually following clinical treatment. Since this labeling task requires expert knowledge in the field of medicine, the process itself is costly and is prone to errors as human annotators have to consider thousands of possible codes when assigning the right ICD-9-CM labels to a document. In this study we use the datasets made available for training and testing automated ICD-9-CM coding systems by the organisers of an International Challenge on Classifying Clinical Free Text Using Natural Language Processing in spring 2007. The challenge itself was dominated by entirely or partly rule-based systems that solve the coding task using a set of hand crafted expert rules. Since the feasibility of the construction of such systems for thousands of ICD codes is indeed questionable, we decided to examine the problem of automatically constructing similar rule sets that turned out to achieve a remarkable accuracy in the shared task challenge. Results: Our results are very promising in the sense that we managed to achieve comparable results with purely hand-crafted ICD-9-CM classifiers. Our best model got a 90.26 % F measure on the training dataset and an 88.93 % F measure on the challenge test dataset, using the micro-averaged Fβ=1 measure, the official evaluatio
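    The micro-averaged Fβ=1 measure mentioned in this abstract pools true positives, false positives and false negatives over all documents before computing precision and recall. The sketch below shows that computation under the assumption that each document carries a set of codes; the function name `micro_f_beta` and the placeholder labels are hypothetical and are not data from the challenge.

```python
def micro_f_beta(gold, predicted, beta=1.0):
    """Micro-averaged F-beta over documents with sets of labels.

    gold, predicted: equal-length lists of sets of label strings.
    Counts are pooled across all documents before P and R are computed.
    """
    tp = fp = fn = 0
    for gold_codes, pred_codes in zip(gold, predicted):
        tp += len(gold_codes & pred_codes)   # codes correctly assigned
        fp += len(pred_codes - gold_codes)   # codes assigned but not in gold
        fn += len(gold_codes - pred_codes)   # gold codes that were missed
    if tp == 0:
        return 0.0  # avoids division by zero when nothing is correct
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Placeholder labels standing in for ICD-9-CM codes (illustrative only).
gold = [{"code_a"}, {"code_b", "code_c"}, {"code_d"}]
predicted = [{"code_a"}, {"code_c"}, {"code_d", "code_e"}]
print(micro_f_beta(gold, predicted))
```

    With these placeholder sets there are 3 true positives, 1 false positive and 1 false negative, so precision and recall are both 0.75 and the score is 0.75. Because counts are pooled, frequent codes weigh more heavily than rare ones, unlike a macro average.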

    Accreditation Standard Guideline Initiative for Tai Chi and Qigong Instructors and Training Institutions.

    Evidence of the health and wellbeing benefits of Tai Chi and Qigong (TQ) has emerged in the past two decades, but TQ is underutilized in modern health care in Western countries due to a lack of promotion and of professionally qualified TQ instructors. To date, there are no government regulations for TQ instructors or for training institutions in China and Western countries, even though TQ is considered to be a part of Traditional Chinese medicine that has the potential to manage many chronic diseases. Based on an integrative health care approach, the accreditation standard guideline initiative for TQ instructors and training institutions was developed in collaboration with health professionals, integrative medicine academics, Tai Chi and Qigong master instructors and consumers including public safety officers from several countries, such as Australia, Canada, China, Germany, Italy, Korea, Sweden and USA. In this paper, the rationale for organizing the Medical Tai Chi and Qigong Association (MTQA) is discussed and the accreditation standard guideline for TQ instructors and training institutions developed by the committee members of MTQA is presented. The MTQA acknowledges that the proposed guidelines are broad, so that the diversity of TQ instructors and training institutions can be integrated with recognition that these guidelines can be developed with further refinement. Additionally, these guidelines face challenges in understanding the complexity of TQ associated with different principles, philosophies and schools of thought. Nonetheless, these guidelines represent a necessary first step as a primary resource to serve and guide health care professionals and consumers, as well as the TQ community.

    Increasing recruitment to randomised trials: a review of randomised controlled trials

    BACKGROUND: Poor recruitment to randomised controlled trials (RCTs) is a widespread and important problem. With poor recruitment being such an important issue with respect to the conduct of randomised trials, a systematic review of controlled trials on recruitment methods was undertaken in order to identify strategies that are effective. METHODS: We searched the register of trials in the Cochrane Library from 1996 to the end of 2004. We also searched Web of Science for 2004. Additional trials were identified from personal knowledge. Included studies had to use random allocation and participants had to be allocated to different methods of recruitment to a 'real' randomised trial. Trials that randomised participants to 'mock' trials and trials of recruitment to non-randomised studies (e.g., case control studies) were excluded. Information on the study design, intervention and control, and number of patients recruited was extracted by the 2 authors. RESULTS: We identified 14 papers describing 20 different interventions. Effective interventions included: telephone reminders; questionnaire inclusion; monetary incentives; using an 'open' rather than placebo design; and making trial materials culturally sensitive. CONCLUSION: Few trials have been undertaken to test interventions to improve trial recruitment. There is an urgent need for more RCTs of recruitment strategies.

    Automatic medical encoding with SNOMED categories

    BACKGROUND: In this paper, we describe the design and preliminary evaluation of a new type of tool to speed up the encoding of episodes of care using the SNOMED CT terminology. METHODS: The proposed system can be used either as a search tool to browse the terminology or as a categorization tool to support automatic annotation of textual contents with SNOMED concepts. The general strategy is similar for both tools and is based on the fusion of two complementary retrieval strategies with thesaural resources. The first classification module uses a traditional vector-space retrieval engine which has been fine-tuned for the task, while the second classifier is based on regular variations of the term list. For evaluating the system, we use a sample of MEDLINE. SNOMED CT categories have been restricted to Medical Subject Headings (MeSH) using the SNOMED-MeSH mapping provided by the UMLS (version 2006). RESULTS: Consistent with previous investigations applied to biomedical terminologies, our results show that the performance of the hybrid system is significantly improved compared to each single module. For top returned concepts, a precision at high ranks (P0) of more than 80% is observed. In addition, a manual and qualitative evaluation on a dozen MEDLINE abstracts suggests that SNOMED CT could represent an improvement compared to existing medical terminologies such as MeSH. CONCLUSION: Although the precision of the SNOMED categorizer seems sufficient to help professional encoders, it is concluded that clinical benchmarks as well as usability studies are needed to assess the impact of our SNOMED encoding method in real settings. AVAILABILITY: The system is available for research purposes at: http://eagl.unige.ch/SNOCat