Four-Letter Super Connoisseur's Ladders
Four-letter words are famously well connected to each other. Fewer than one per cent of words connect to no other, whereas over 70 per cent connect in each of the four possible positions; on average, there are 23 neighbours for each word. For our present purpose, note that more than three-quarters are heterograms. This means that Connoisseur's Ladders (those with sequential replacement between heterograms, plus a relationship between the first and last words) become commonplace. On the other hand, the number of such ladders is restricted by the relatively small number (fewer than 20,000) of four-letter words available
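As a rough illustration of the ladder condition described in this abstract, the sketch below checks that every word is a heterogram and that the letters are replaced one position at a time, left to right. Reading "sequential replacement" as left-to-right replacement is an assumption, the function names are hypothetical, and the unspecified "relationship between the first and last words" is not checked.

```python
def is_heterogram(word):
    """A heterogram has no repeated letters."""
    return len(set(word)) == len(word)


def is_sequential_ladder(ladder):
    """Return True if step i changes only the letter at position i (0-based)
    and every word in the ladder is a heterogram.

    Assumption: "sequential replacement" means positions are replaced
    left to right, so an n-letter ladder has exactly n + 1 words.
    """
    if not ladder:
        return False
    length = len(ladder[0])
    if any(len(word) != length for word in ladder):
        return False
    if len(ladder) != length + 1:
        return False
    if not all(is_heterogram(word) for word in ladder):
        return False
    for i, (a, b) in enumerate(zip(ladder, ladder[1:])):
        changed = [k for k in range(length) if a[k] != b[k]]
        if changed != [i]:
            return False
    return True


# Illustrative ladder chosen for this sketch, not taken from the article.
example = ["cold", "bold", "bald", "band", "bank"]
print(is_sequential_ladder(example))
```

A real search would run this check over a full four-letter word list, which is not reproduced here.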
Systematic identification and correction of spelling errors in the Foundational Model of Anatomy
We describe a method for automating the detection and correction of spelling errors in the Foundational Model of Anatomy (FMA). The FMA was tokenized into 4893 distinct words; misspellings were identified and corrected using the National Library of Medicine’s SPECIALIST GSpell Spelling Suggestion API. We identified 43 errors occurring in 97 terms, and 6 words of questionable or inconsistent spelling occurring in 26 terms. These errors are replicated in other reference terminologies that use the FMA. Our approach may be useful as part of a quality assurance process for other large-scale biomedical knowledge resources
A lightweight, pattern-based approach to identification and formalisation of TimeML expressions in clinical narratives
General Architecture for Text Engineering (GATE) components for identifying clinical events and temporal expressions are developed and evaluated against a corpus of 120 discharge summaries
The therapeutic use of videogames within secure forensic settings: a review of the literature and application to practice
Engagement in leisure pursuits that involves the use of tools and objects and the exploration of a new environment can provide a success experience that leads to increased feelings of competence and mastery. Such experiences are considered important in the rehabilitation of forensic clients. The findings from videogame research within a general population are compared with those among mental health and forensic clients. Within the general population, videogames may provide opportunities for social interaction and the expression of creativity and humour as well as offering a graded approach to building computer skills. Within a forensic population, videogames have been found to be a normalising, age-appropriate and culturally appropriate activity, useful in engaging clients and improving self-concept and locus of control. The findings suggest that videogame play offers access to a safe virtual environment that encourages exploration and mastery and that it may be a useful therapeutic tool in secure settings where such opportunities are often limited. The use and potential contraindications of videogames within a forensic setting, the content of certain games and their possible influence on behaviour and the implications for future research are also discussed
Creating sustainability through Smart City Projects
Smart Cities are a key mechanism for facilitating sustainability – be that in the use of resources (e.g. energy, water), the running of city infrastructure (e.g. transport) or in terms of social policy (e.g. politics). Using our experience of a Smart City project, MK:Smart, we describe what role citizen-led innovation could have in promoting long-term sustainable change. Beyond this we detail some of the barriers to success we have identified in the hope that design patterns might help us address these challenges
Coreference resolution in clinical discharge summaries, progress notes, surgical and pathology reports: a unified lexical approach
We developed a lexical rule-based system that uses a unified approach to resolving coreference across a wide variety of clinical records comprising discharge summaries, progress notes, pathology, radiology and surgical reports from two corpora (Ontology Development and Information Extraction (ODIE) and i2b2/VA) provided for the fifth i2b2/VA shared task. Taking the unweighted mean between 4 coreference metrics, validation of the system against the i2b2/VA corpus attained an overall F-score of 87.7% across all mention classes, with a maximum of 93.1% for coreference of persons, and a minimum of 77.2% for coreference of tests. For the ODIE corpus the overall F-score across all mention classes was 79.4%, with a maximum of 82.0% for coreference of persons and a minimum of 13.1% for coreference of diagnostic reagents. For the ODIE corpus our results are comparable to the mean reported inter-annotator agreement with the gold standard. We discuss the four categories of errors we identified, and how these might be addressed. The system uses a number of reusable modules and techniques that may be of benefit to the research community
A tool for enhancing MetaMap performance when annotating clinical guideline documents with UMLS concepts
We developed a tool that integrates the National Library of Medicine's MetaMap software with GATE, an open-source text analytics framework. The tool allows non-ASCII encoded documents of numerous formats to be annotated with UMLS concepts. We created a GATE pipeline to chunk cardiovascular disease guideline text into default segments (blank-line delimited), XML element content, sentences and phrases, which were sequentially submitted to MetaMap for annotation. XML element, sentence and phrase chunking allowed term extraction and mapping to be completed in around 1/3 of the time taken with default chunking, although with slight loss of accuracy (F1.0s=0.94-0.99). However, phrase chunking allows more complex input to be processed in real time, which is not possible with the other approaches. We discuss the results in relation to use of MetaMap's --term processing option for generating pre- and post-coordinated mappings from composite phrases
Six-Letter Connoisseur's Ladders
By the time we reach six-letter words, conditions for superior ladders are much improved: nearly one-half of six-letter words are heterograms (all letters are different), the fraction of isolanos (words which have no neighbours) decreases to less than one-tenth, there is a reasonable number of onalosis (words which have neighbours for each letter change), and each word has on average almost six neighbours (one for each letter). In this article, we therefore only consider ladders in which the terminal words are heterograms, with corresponding letters different, and with letters replaced in order. Even so, there are about fifty thousand of them
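The neighbour terminology in this abstract can be made concrete with a small sketch. It assumes a lowercase a-z lexicon held in a Python set; the function names and the toy three-letter lexicon are hypothetical and chosen only to keep the example short.

```python
from string import ascii_lowercase


def neighbours(word, lexicon):
    """For each letter position, return the set of lexicon words that
    differ from `word` only at that position."""
    per_position = []
    for i in range(len(word)):
        candidates = {
            word[:i] + letter + word[i + 1:]
            for letter in ascii_lowercase
            if letter != word[i]
        }
        per_position.append(candidates & lexicon)
    return per_position


def is_isolano(word, lexicon):
    """An isolano has no neighbours at any position."""
    return not any(neighbours(word, lexicon))


def has_neighbour_at_every_position(word, lexicon):
    """True for words that have a neighbour for each letter change
    (the abstract calls these onalosis)."""
    return all(neighbours(word, lexicon))


# Toy lexicon for illustration only; the article works with six-letter words.
toy_lexicon = {"cat", "bat", "cot", "cab", "dog", "emu"}
print(neighbours("cat", toy_lexicon))
print(is_isolano("dog", toy_lexicon))
print(has_neighbour_at_every_position("cat", toy_lexicon))
```

Averaging the number of neighbours over a full word list would give the kind of per-word figure quoted in the abstract.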
A study’s got to know its limitations
Background: All research has room for improvement, but authors do not always clearly acknowledge the limitations of their work. In this brief report, we sought to identify the prevalence of limitations statements in the medRxiv COVID-19 SARS-CoV-2 dataset. Methods: We combined automated methods with manual review to analyse manuscripts for the presence, or absence, either of a defined limitations section in the text, or as part of the general discussion. Results: We identified a structured limitations statement in 28% of the manuscripts, and overall 52% contained at least one mention of a study limitation. Over one-third of manuscripts contained none of the terms that might typically be associated with reporting of limitations. Overall our method performed with precision of 0.97 and recall of 0.91. Conclusion: The presence or absence of limitations statements can be identified with reasonable confidence using automated tools. We suggest that it might be beneficial to require a defined, structured statement about study limitations, either as part of the submission process, or clearly delineated within the manuscript
What Others Say About This Work? Scalable Extraction of Citation Contexts from Research Papers
This work presents a new, scalable solution to the problem of extracting citation contexts: the textual fragments surrounding citation references. These citation contexts can be used to navigate digital libraries of research papers to help users in deciding what to read. We have developed a prototype system which can retrieve, on-demand, citation contexts from the full text of over 15 million research articles in the Mendeley catalog for a given reference research paper. The evaluation results show that our citation extraction system provides additional functionality over existing tools, has two orders of magnitude faster runtime performance, while providing a 9% improvement in F-measure over the current state-of-the-art
