2,410 research outputs found

Optimism in Active Learning with Gaussian Processes

Author: A Kapoor
B Settles
DD Lewis
Publication venue: HAL CCSD
Publication date: 09/11/2015

In the context of Active Learning for classification, the classification error depends on the joint distribution of samples and their labels, which is initially unknown. Minimizing this error requires estimating this distribution. Online estimation of this distribution involves a trade-off between exploration and exploitation. This is a common problem in machine learning, for which multi-armed bandit theory, building upon Optimism in the Face of Uncertainty, has proven very efficient in recent years. We introduce two novel algorithms that use Optimism in the Face of Uncertainty along with Gaussian Processes for the Active Learning problem. The evaluation conducted on real-world datasets shows that these new algorithms compare favorably to state-of-the-art methods.

HAL-CentraleSupelec

CiteSeerX

HAL - Université de Franche-Comté

Crossref

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Portail HAL UNIV-RENNES

Hierarchical Re-estimation of Topic Models for Measuring Topical Diversity

Author: A Solow
C Rao
CD Manning
DD Lewis
DM Blei
DQ Nguyen
H Azarbonyad
H Soleimani
M Dehghani
Publication date: 01/01/2017

A high degree of topical diversity is often considered to be an important characteristic of interesting text documents. A recent proposal for measuring topical diversity identifies three elements for assessing diversity: words, topics, and documents as collections of words. Topic models play a central role in this approach. Using standard topic models for measuring diversity of documents is suboptimal due to generality and impurity. General topics only include common information from a background corpus and are assigned to most of the documents in the collection. Impure topics contain words that are not related to the topic; impurity lowers the interpretability of topic models and impure topics are likely to get assigned to documents erroneously. We propose a hierarchical re-estimation approach for topic models to combat generality and impurity; the proposed approach operates at three levels: words, topics, and documents. Our re-estimation approach for measuring documents' topical diversity outperforms the state of the art on the PubMed dataset, which is commonly used for diversity experiments. Comment: Proceedings of the 39th European Conference on Information Retrieval (ECIR2017)

arXiv.org e-Print Archive

Crossref

UvA-DARE

International Migration, Integration and Social Cohesion online publications

T$^2$K$^2$: The Twitter Top-K Keywords Benchmark

Author: A Guille
AE Gattiker
CD Manning
D Kılınç
DD Lewis
F Ravat
J Darmont
J Ferrarons
J Gray
J O’Shea
JD Cooper
K Spärck Jones
K Spärck Jones
L Wang
S Bringay
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/09/2017

Information retrieval from textual data focuses on the construction of vocabularies that contain weighted term tuples. Such vocabularies can then be exploited by various text analysis algorithms to extract new knowledge, e.g., top-k keywords, top-k documents, etc. Top-k keywords are casually used for various purposes, are often computed on-the-fly, and thus must be efficiently computed. To compare competing weighting schemes and database implementations, benchmarking is customary. To the best of our knowledge, no benchmark currently addresses these problems. Hence, in this paper, we present a top-k keywords benchmark, T$^2$K$^2$, which features a real tweet dataset and queries with various complexities and selectivities. T$^2$K$^2$ helps evaluate weighting schemes and database implementations in terms of computing performance. To illustrate T$^2$K$^2$'s relevance and genericity, we successfully performed tests on the TF-IDF and Okapi BM25 weighting schemes, on one hand, and on different relational (Oracle, PostgreSQL) and document-oriented (MongoDB) database implementations, on the other hand.

arXiv.org e-Print Archive

Crossref

HAL Descartes

HAL

HAL: Hyper Article en Ligne

Hal-Diderot

Prevalence of Disorders Recorded in Dogs Attending Primary-Care Veterinary Practices in England

Author: A Agresti
A Egenvall
A Egenvall
A Egenvall
A Egenvall
A Hillier
A Kathrani
A Neuber
AB Stone
AJ German
AJ German
AM Lourenço-Martins
B Wilson
BN Bonnett
C Albuquerque
C Mellersh
Cheryl S. Rosenfeld
CL Aragon
Dan G. O'Neill
Dave C. Brodbelt
David B. Church
DD Sleator
DN Irion
DR Ownby
E Friedmann
EM Lund
F Walsh
FC Calboli
FD McMillan
FJD Smith
G Chodick
G Dank
G Leroy
GC Knight
GM Gobar
GP Page
H Soll-Johanning
HD Pedersen
HE Hein
J Virués-Ortega
JE Houlton
JF Summers
JI Hudson
JK Murray
JLN Wood
JM Fleming
JS Rand
KD Mandl
L Asher
L Asher
L Kearsley-Fleet
LM Collins
LM Collins
M Aickin
MB Willis
MK Rust
MP Starkey
MR Slater
MW Neff
MY Powers
N Pearce
NJ Rooney
NJ Rooney
P Froom
P Royston
Paul D. McGreevy
PC Bartlett
PD McGreevy
PD McGreevy
Peter C. Thomson
PM Mountziaris
R Bender
R Feise
R Marsella
R Medzhitov
RM Batt
RMA Packer
S Brady
S Crispin
S Greenland
S Platt
SFJ Hodgman
T Lewis
T Lewis
TP Bellumori
TW Lewis
U John
VJ Adams
WB Lober
Å Vilson
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/03/2014

Purebred dog health is thought to be compromised by an increasing occurrence of inherited diseases, but inadequate prevalence data on common disorders have hampered efforts to prioritise health reforms. Analysis of primary veterinary practice clinical data has been proposed for reliable estimation of disorder prevalence in dogs. Electronic patient record (EPR) data were collected on 148,741 dogs attending 93 clinics across central and south-eastern England. Analysis in detail of a random sample of EPRs relating to 3,884 dogs from 89 clinics identified the most frequently recorded disorders as otitis externa (prevalence 10.2%, 95% CI: 9.1-11.3), periodontal disease (9.3%, 95% CI: 8.3-10.3) and anal sac impaction (7.1%, 95% CI: 6.1-8.1). Using syndromic classification, the most prevalent body location affected was the head-and-neck (32.8%, 95% CI: 30.7-34.9), the most prevalent organ system affected was the integument (36.3%, 95% CI: 33.9-38.6) and the most prevalent pathophysiologic process diagnosed was inflammation (32.1%, 95% CI: 29.8-34.3). Among the twenty most-frequently recorded disorders, purebred dogs had a significantly higher prevalence compared with crossbreds for three: otitis externa (P = 0.001), obesity (P = 0.006) and skin mass lesion (P = 0.033), and popular breeds differed significantly from each other in their prevalence for five: periodontal disease (P = 0.002), overgrown nails (P = 0.004), degenerative joint disease (P = 0.005), obesity (P = 0.001) and lipoma (P = 0.003). These results fill a crucial data gap in disorder prevalence information and assist with disorder prioritisation. The results suggest that, for maximal impact, breeding reforms should target commonly-diagnosed complex disorders that are amenable to genetic improvement and should place special focus on at-risk breeds. Future studies evaluating disorder severity and duration will augment the usefulness of the disorder prevalence information reported herein.

Humane Society Institute for Science and Policy

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

WBI Studies Repository (WellBeing International)

A Compromise between Neutrino Masses and Collider Signatures in the Type-II Seesaw Model

Author: A Bookstein
BK Ghosh
CJ Rijsbergen van
D Angluin
D Cohn
DD Lewis
DJC MacKay
DT Davis
G Salton
HS Seung
J Hwang
M Plutowski
ME Maron
N Fuhr
N Fuhr
P Biebricher
P McCullagh
P. E. Hart
PE Utgoff
PJ Hayes
RO Duda
S Robertson
TM Mitchell
WA Gale
WB Croft
WG Cochran
WS Cooper
WS Cooper
Y Freund
Publication date: 01/01/1994

A natural extension of the standard $SU(2)_{\rm L} \times U(1)_{\rm Y}$ gauge model to accommodate massive neutrinos is to introduce one Higgs triplet and three right-handed Majorana neutrinos, leading to a $6\times 6$ neutrino mass matrix which contains three $3\times 3$ sub-matrices $M_{\rm L}$, $M_{\rm D}$ and $M_{\rm R}$. We show that three light Majorana neutrinos (i.e., the mass eigenstates of $\nu_e$, $\nu_\mu$ and $\nu_\tau$) are exactly massless in this model, if and only if $M_{\rm L} = M_{\rm D} M_{\rm R}^{-1} M_{\rm D}^T$ exactly holds. This no-go theorem implies that small but non-vanishing neutrino masses may result from a significant but incomplete cancellation between $M_{\rm L}$ and $M_{\rm D} M_{\rm R}^{-1} M_{\rm D}^T$ terms in the Type-II seesaw formula, provided three right-handed Majorana neutrinos are of ${\cal O}(1)$ TeV and experimentally detectable at the LHC. We propose three simple Type-II seesaw scenarios with the $A_4 \times U(1)_{\rm X}$ flavor symmetry to interpret the observed neutrino mass spectrum and neutrino mixing pattern. Such a TeV-scale neutrino model can be tested in two complementary ways: (1) searching for possible collider signatures of lepton number violation induced by the right-handed Majorana neutrinos and doubly-charged Higgs particles; and (2) searching for possible consequences of unitarity violation of the $3\times 3$ neutrino mixing matrix in the future long-baseline neutrino oscillation experiments. Comment: RevTeX 19 pages, no figure

arXiv.org e-Print Archive

Crossref

One-carbon metabolism in cancer

Author: AJ MacFarlane
Alice C Newman
AS Tibbetts
B Chaneton
BC Blount
C Commisso
CA Lewis
CF Labuschagne
CF Woeller
D Hanahan
D Hanahan
D Kim
DD Anderson
E Currie
E Mullarky
F Daidone
F Kottakis
G Kikuchi
GM DeNicola
Gregory S. Ducker
GS Ducker
J Liu
J Meiser
J Ye
Jason W. Locasale
JJ Kamphorst
JW Locasale
K Snell
K Snell
K Snell
Katherine R. Mattaini
L Galluzzi
M Goulian
M Jain
M Kulis
Mahya Mehrmohamadi
ME Pacold
Ming Yang
MJ Osborn
ML Rose
ML Rose
OD Maddocks
OD Maddocks
Oliver D K Maddocks
Oliver D. K. Maddocks
PM Ueland
R Possemato
S Farber
S Kit
S Pollari
SJ Mentch
Swetha Bolusani
T Shlomi
WC Zhang
Y Fu
Z Ser
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 04/05/2017

Cells require one-carbon units for nucleotide synthesis, methylation and reductive metabolism, and these pathways support the high proliferative rate of cancer cells. As such, anti-folates, drugs that target one-carbon metabolism, have long been used in the treatment of cancer. Amino acids, such as serine, are a major one-carbon source, and cancer cells are particularly susceptible to deprivation of one-carbon units by serine restriction or inhibition of de novo serine synthesis. Recent work has also begun to decipher the specific pathways and sub-cellular compartments that are important for one-carbon metabolism in cancer cells. In this review we summarise the historical understanding of one-carbon metabolism in cancer, describe the recent findings regarding the generation and usage of one-carbon units and explore possible future therapeutics that could exploit the dependency of cancer cells on one-carbon metabolism.

Crossref

Enlighten

The Search for Invariance: Repeated Positive Testing Serves the Goals of Causal Learning

Author: A Coenen
A Gopnik
A Gopnik
A Gopnik
A Gopnik
A Karmiloff-Smith
AM Johnston
B Inhelder
B Schwartz
B Sodian
B Weslake
C Cook
C Hitchcock
C Zimmerman
C Zimmerman
C Zimmerman
CM Walker
CRM McKenzie
D Klahr
D Klahr
D Klahr
D Kuhn
D Kuhn
D Kuhn
D Lewis
DD Tukey
DJ Navarro
EB Bonawitz
GD Heyman
GL Wells
HJ Einhorn
J Baron
J Friedrich
J Klayman
J Woodward
J Woodward
J Woodward
J Woodward
JE Tschirgi
JJ Gibson
JL Mackie
JR Saffran
K Dunbar
KS Kendler
L Schauble
L Schauble
L Schauble
M Friedman
M Oaksford
M Redhead
M Strevens
ME Gorman
MJ Mahoney
N Valanides
N Vasilyeva
NE Wetherick
P Kitcher
P Ylikoski
PC Wason
PC Wason
PC Wason
PC Wason
PG Devine
PN Johnson-Laird
R Vogel
R Wu
RB Skov
RD Tweney
RS Nickerson
RS Siegler
S Carey
S Carey
S Croker
SA Gelman
SA Siler
SA Sloman
SA Sloman
SA Sloman
SC Yang
T Blanchard
T Gerstenberg
T Lombrozo
TF Icard
TJP Schijndel van
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

Positive testing is characteristic of exploratory behavior, yet it seems to be at odds with the aim of information seeking. After all, repeated demonstrations of one’s current hypothesis often produce the same evidence and fail to distinguish it from potential alternatives. Research on the development of scientific reasoning and adult rule learning have both documented and attempted to explain this behavior. The current chapter reviews this prior work and introduces a novel theoretical account—the Search for Invariance (SI) hypothesis—which suggests that producing multiple positive examples serves the goals of causal learning. This hypothesis draws on the interventionist framework of causal reasoning, which suggests that causal learners are concerned with the invariance of candidate hypotheses. In a probabilistic and interdependent causal world, our primary goal is to determine whether, and in what contexts, our causal hypotheses provide accurate foundations for inference and intervention—not to disconfirm their alternatives. By recognizing the central role of invariance in causal learning, the phenomenon of positive testing may be reinterpreted as a rational information-seeking strategy

Crossref

eScholarship - University of California

The epidemiology of osteonecrosis: findings from the GPRD and THIN databases in the UK

Author: C. Cooper
M. Steinbuch
R. Stevenson
R. Miday
N. B. Watts
Y Assouline-Dayan
MA Mont
DD Gladman
M Abu-Shakra
CC Mok
J Calvo-Alén
M Etminan
TP Staa van
TP Staa van
TP Staa van
KE Wurst
SL Thomas
JD Lewis
JC Fink
R Rizzoli
SB Woo
SL Ruggiero
RE Marx
G Talamo
Publication venue: Springer Nature
Publication date: 01/01/2009

Summary: We conducted a case–control study to examine osteonecrosis (ON) incidence, patient characteristics, and selected potential risk factors using two health record databases in the UK. Statistically significant risk factors for ON included systemic corticosteroid use, hospitalization, referral or specialist visit, bone fracture, any cancer, osteoporosis, connective tissue disease, and osteoarthritis. Introduction: The purpose of this case–control study was to examine the incidence of osteonecrosis (ON), patient characteristics, and selected potential risk factors for ON using two health record databases in the UK: the General Practice Research Database and The Health Improvement Network. Methods: ON cases (n = 792) were identified from 1989 to 2003 and individually matched (age, sex, and medical practice) to up to six controls (n = 4,660) with no record of ON. Possible risk factors were considered for inclusion based on a review of published literature. Annual incidence rates were computed, and a multivariable logistic regression model was derived to evaluate selected risk factors. Results: ON of the hip represented the majority of cases (75.9%). Statistically significant risk factors for ON were systemic corticosteroid use in the previous 2 years, hospitalization, referral or specialist visit, bone fracture, any cancer, osteoporosis, connective tissue disease, and osteoarthritis within the past 5 years. Only 4.4% of ON cases were exposed to bisphosphonates within the previous 2 years. Conclusions: This study provides further perspective on the descriptive epidemiology of ON. Studies utilizing more recent data may further elucidate the understanding of ON key predictors.

Southampton (e-Prints Soton)

Crossref

Springer - Publisher Connector

PubMed Central

Oxford University Research Archive