Acid rain: Mesoscale model
A mesoscale numerical model of the Florida peninsula was formulated and applied to a dry, neutral atmosphere. The prospective use of the STAR-100 computer for the submesoscale model is discussed. The numerical model presented is tested under synoptically undisturbed conditions. Two cases, differing only in the direction of the prevailing geostrophic wind, are examined: a prevailing southwest wind and a prevailing southeast wind, both initially 6 m/sec at all levels.
Calculating and understanding the value of any type of match evidence when there are potential testing errors
It is well known that Bayes’ theorem (with likelihood ratios) can be used to calculate the impact of evidence, such as a ‘match’ of some feature of a person. Typically the feature of interest is the DNA profile, but the method applies in principle to any feature of a person or object, including not just DNA, fingerprints, or footprints, but also more basic features such as skin colour, height, hair colour or even name. Notwithstanding concerns about the extensiveness of databases of such features, a serious challenge to the use of Bayes in such legal contexts is that its standard formulaic representations are not readily understandable to non-statisticians. Attempts to get round this problem usually involve representations based around some variation of an event tree. While this approach works well in explaining the most trivial instance of Bayes’ theorem (involving a single hypothesis and a single piece of evidence), it does not scale up to realistic situations. In particular, even with a single piece of match evidence, if we wish to incorporate the possibility that errors (both false positives and false negatives) are introduced at any stage in the investigative process, matters become very complex. As a result we have observed expert witnesses (in different areas of speciality) routinely ignore the possibility of errors when presenting their evidence. To counter this, we produce what we believe is the first full probabilistic solution of the simple case of generic match evidence incorporating both classes of testing errors. Unfortunately, the resultant event tree solution is too complex for intuitive comprehension. Crucially, the event tree also fails to represent the causal information that underpins the argument. In contrast, we also present a simple-to-construct graphical Bayesian Network (BN) solution that automatically performs the calculations and may also be intuitively simpler to understand. Although there have been multiple previous applications of BNs for analysing forensic evidence (including very detailed models for the DNA matching problem), these models have not widely penetrated the expert witness community, nor have they addressed the basic generic match problem incorporating the two types of testing error. Hence we believe our basic BN solution provides an important mechanism for convincing experts, and eventually the legal community, that it is possible to rigorously analyse and communicate the full impact of match evidence on a case in the presence of possible errors.
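To make this concrete, below is a minimal Python sketch, not the paper's Bayesian Network, of the likelihood ratio for generic match evidence once both classes of testing error are allowed for. The function names and every numerical value (random match probability, error rates, prior) are illustrative assumptions, not figures from the paper.

```python
# A minimal sketch (not the paper's model) of Bayes' theorem with likelihood
# ratios for match evidence when the test can give false positives and false
# negatives. All numbers are illustrative assumptions.

def match_likelihood_ratio(match_prob, fpr, fnr):
    """Likelihood ratio for 'the test reports a match' under H (the suspect is
    the source) versus not-H (someone else is the source).

    match_prob: probability an unrelated person's feature matches by chance
    fpr:        probability of reporting a match when the features do not match
    fnr:        probability of reporting no match when the features do match
    """
    p_report_given_h = 1.0 - fnr
    p_report_given_not_h = match_prob * (1.0 - fnr) + (1.0 - match_prob) * fpr
    return p_report_given_h / p_report_given_not_h

def posterior(prior, lr):
    """Turn a prior probability of H and a likelihood ratio into a posterior."""
    odds = (prior / (1.0 - prior)) * lr
    return odds / (1.0 + odds)

lr_no_errors = match_likelihood_ratio(match_prob=1e-6, fpr=0.0, fnr=0.0)
lr_with_errors = match_likelihood_ratio(match_prob=1e-6, fpr=1e-3, fnr=0.01)
print(f"LR ignoring testing errors: {lr_no_errors:,.0f}")    # 1,000,000
print(f"LR with testing errors:     {lr_with_errors:,.0f}")  # roughly 1,000
print(f"Posterior from a 1/10,000 prior: {posterior(1e-4, lr_with_errors):.3f}")
```

With these assumed numbers the false positive rate, rather than the one-in-a-million random match probability, dominates the likelihood ratio and reduces it by roughly three orders of magnitude; this is exactly the effect that is lost when experts ignore the possibility of testing errors.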
Random design analysis of ridge regression
This work gives a simultaneous analysis of both the ordinary least squares
estimator and the ridge regression estimator in the random design setting under
mild assumptions on the covariate/response distributions. In particular, the
analysis provides sharp results on the "out-of-sample" prediction error, as opposed to the "in-sample" (fixed design) error. The analysis also reveals the effect of errors in the estimated covariance structure, as well as the effect of modeling errors, neither of which is present in the fixed design setting. The proofs of the main results are based on a simple decomposition lemma combined with concentration inequalities for random vectors and matrices.
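The distinction between "in-sample" (fixed design) and "out-of-sample" (random design) prediction error can be illustrated with a short simulation. The sketch below is an illustration only: the data-generating process, sample sizes and ridge penalty are assumptions, not the quantities analysed in the paper.

```python
# Illustrative sketch: OLS versus ridge, comparing fixed-design (in-sample) and
# random-design (out-of-sample) prediction error. All constants are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 100, 40, 1.0

def fit(X, y, penalty=0.0):
    # Ridge estimator; penalty = 0 gives ordinary least squares.
    return np.linalg.solve(X.T @ X + penalty * np.eye(X.shape[1]), X.T @ y)

beta = rng.normal(size=d) / np.sqrt(d)          # true coefficients
X = rng.normal(size=(n, d))                     # random design
y = X @ beta + 0.5 * rng.normal(size=n)         # noisy responses

estimates = {"ols": fit(X, y), "ridge": fit(X, y, lam)}

# In-sample (fixed design) error: evaluated on the same X used for fitting.
in_sample = {k: np.mean((X @ b - X @ beta) ** 2) for k, b in estimates.items()}

# Out-of-sample (random design) error: evaluated on fresh covariate draws.
X_new = rng.normal(size=(10_000, d))
out_of_sample = {k: np.mean((X_new @ b - X_new @ beta) ** 2)
                 for k, b in estimates.items()}

print("in-sample    :", in_sample)
print("out-of-sample:", out_of_sample)
```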
Identifiability and Unmixing of Latent Parse Trees
This paper explores unsupervised learning of parsing models along two
directions. First, which models are identifiable from infinite data? We use a
general technique for numerically checking identifiability based on the rank of
a Jacobian matrix, and apply it to several standard constituency and dependency
parsing models. Second, for identifiable models, how do we estimate the
parameters efficiently? EM suffers from local optima, while recent work using
spectral methods cannot be directly applied since the topology of the parse
tree varies across sentences. We develop a strategy, unmixing, which deals with
this additional complexity for restricted classes of parsing models.
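The numerical identifiability check described above can be sketched in a few lines: map parameters to the observable distribution, form the Jacobian by finite differences, and compare its rank with the number of parameters. The toy model below, a two-state mixture of three independent Bernoullis, is an illustrative assumption and not one of the parsing models studied in the paper.

```python
# Sketch of a numerical identifiability check via the rank of a Jacobian.
# The toy model (2-state mixture of three independent Bernoullis) is illustrative.
import itertools
import numpy as np

def moments(theta):
    """Map parameters to the 2^3 joint probabilities of three binary observations."""
    pi = theta[0]                    # weight of latent state 0
    p = theta[1:].reshape(2, 3)      # p[k, j] = P(x_j = 1 | latent state k)
    probs = []
    for x in itertools.product([0, 1], repeat=3):
        x = np.array(x)
        per_state = np.prod(np.where(x == 1, p, 1 - p), axis=1)
        probs.append(pi * per_state[0] + (1 - pi) * per_state[1])
    return np.array(probs)

def jacobian_rank(theta, eps=1e-6):
    """Finite-difference Jacobian of the moment map; full column rank at a
    generic point indicates local identifiability."""
    base = moments(theta)
    cols = []
    for i in range(len(theta)):
        bump = np.zeros_like(theta)
        bump[i] = eps
        cols.append((moments(theta + bump) - base) / eps)
    return np.linalg.matrix_rank(np.column_stack(cols))

theta = np.array([0.3, 0.2, 0.7, 0.6, 0.8, 0.4, 0.1])   # a generic parameter point
print("parameters:", theta.size, "| Jacobian rank:", jacobian_rank(theta))
```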
Defining and Estimating Intervention Effects for Groups that will Develop an Auxiliary Outcome
It has recently become popular to define treatment effects for subsets of the
target population characterized by variables not observable at the time a
treatment decision is made. Characterizing and estimating such treatment
effects is tricky; the most popular but naive approach inappropriately adjusts
for variables affected by treatment and so is biased. We consider several
appropriate ways to formalize the effects: principal stratification,
stratification on a single potential auxiliary variable, stratification on an
observed auxiliary variable and stratification on expected levels of auxiliary
variables. We then outline identifying assumptions for each type of estimand.
We evaluate the utility of these estimands and estimation procedures for
decision making and understanding causal processes, contrasting them with the
concepts of direct and indirect effects. We motivate our development with
examples from nephrology and cancer screening, and use simulated data and real
data on cancer screening to illustrate the estimation methods.
Comment: Published at http://dx.doi.org/10.1214/088342306000000655 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
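For orientation, the first of the estimands listed above can be written in potential-outcome notation; the sketch below uses conventional notation and is not copied from the paper.

```latex
% Sketch in standard notation (an assumption, not necessarily the paper's).
% Z is treatment, Y(z) the potential outcome and S(z) the potential auxiliary
% outcome under assignment z. The principal-stratum effect conditions on the
% pair of potential auxiliary outcomes, which is unaffected by treatment:
\[
  \mathrm{PSE}(s_1, s_0) = \mathbb{E}\bigl[Y(1) - Y(0) \mid S(1) = s_1,\, S(0) = s_0\bigr],
\]
% whereas the naive approach conditions on the observed S, a post-treatment
% variable, and therefore mixes principal strata:
\[
  \mathbb{E}[Y \mid Z = 1, S = s] - \mathbb{E}[Y \mid Z = 0, S = s].
\]
```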
On the volatility of volatility
The Chicago Board Options Exchange (CBOE) Volatility Index, VIX, is
calculated based on prices of out-of-the-money put and call options on the S&P
500 index (SPX). Sometimes called the "investor fear gauge," the VIX is a
measure of the implied volatility of the SPX, and is observed to be correlated
with the 30-day realized volatility of the SPX. Changes in the VIX are observed
to be negatively correlated with changes in the SPX. However, no significant correlation between changes in the VIX and changes in the 30-day realized volatility of the SPX is observed. We investigate whether this indicates a mispricing of options following large VIX moves, and examine the relation to excess returns from variance swaps.
Comment: 15 pages, 12 figures, LaTeX.
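For readers unfamiliar with how the index is constructed, the variance formula underlying a CBOE-style volatility index is sketched below; this paraphrases the publicly documented CBOE methodology and is not taken from the paper.

```latex
% Sketch of the CBOE-style model-free variance from out-of-the-money options
% (paraphrased from the public VIX methodology; notation is an assumption here).
% T: time to expiration, F: forward index level, K_0: first strike below F,
% K_i: strike of the i-th out-of-the-money option, \Delta K_i: strike spacing,
% R: risk-free rate, Q(K_i): midpoint of that option's bid-ask quote.
\[
  \sigma^2 = \frac{2}{T} \sum_i \frac{\Delta K_i}{K_i^{2}}\, e^{RT} Q(K_i)
             - \frac{1}{T}\left(\frac{F}{K_0} - 1\right)^{2},
  \qquad \mathrm{VIX} = 100\,\sigma,
\]
% where the variances of the near- and next-term expirations are interpolated
% to a constant 30-day horizon before taking the square root.
```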
Learning Topic Models and Latent Bayesian Networks Under Expansion Constraints
Unsupervised estimation of latent variable models is a fundamental problem
central to numerous applications of machine learning and statistics. This work
presents a principled approach for estimating broad classes of such models,
including probabilistic topic models and latent linear Bayesian networks, using
only second-order observed moments. The sufficient conditions for
identifiability of these models are primarily based on weak expansion
constraints on the topic-word matrix, for topic models, and on the directed
acyclic graph, for Bayesian networks. Because no assumptions are made on the
distribution among the latent variables, the approach can handle arbitrary
correlations among the topics or latent factors. In addition, a tractable
learning method via optimization is proposed and studied in numerical experiments.
Comment: 38 pages, 6 figures, 2 tables; applications in topic models and Bayesian networks are studied. Simulation section is added.
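For topic models, the second-order observed moment referred to above is the matrix of pairwise word co-occurrence probabilities. The Python sketch below shows one standard way to estimate it from a document-term count matrix; the function name and toy corpus are illustrative assumptions, not code from the paper.

```python
# Sketch: empirical second-order moment E[x1 x2^T] for a topic model, where x1
# and x2 are two distinct word tokens drawn from the same document.
import numpy as np

def pairwise_cooccurrence(counts):
    """counts: (num_docs, vocab_size) array of per-document word counts."""
    counts = np.asarray(counts, dtype=float)
    vocab = counts.shape[1]
    M2 = np.zeros((vocab, vocab))
    used = 0
    for c in counts:
        n = c.sum()
        if n < 2:
            continue                          # need at least two tokens
        # Unbiased per-document estimate: all ordered pairs of distinct tokens.
        M2 += (np.outer(c, c) - np.diag(c)) / (n * (n - 1))
        used += 1
    return M2 / used

# Tiny illustrative corpus over a 5-word vocabulary.
docs = np.array([[3, 1, 0, 0, 1],
                 [0, 0, 2, 2, 1],
                 [1, 2, 0, 1, 0]])
M2 = pairwise_cooccurrence(docs)
print(np.round(M2, 3))
print("entries sum to 1:", bool(np.isclose(M2.sum(), 1.0)))
```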
