Affine Hirsch foliations on 3-manifolds
This paper is devoted to affine Hirsch foliations on 3-manifolds. First, we
prove that, up to isotopic leaf-conjugacy, every closed orientable 3-manifold
admits 0, 1 or 2 affine Hirsch foliations. Furthermore, every case is
possible.
Then, we analyze the 3-manifolds admitting two affine Hirsch foliations
(abbreviated as Hirsch manifolds). On the one hand, we construct Hirsch
manifolds by using doubles of exchangeable braided links (abbreviated as DEBL
Hirsch manifolds); on the other hand, we show that every Hirsch manifold is
virtually a DEBL Hirsch manifold.
Finally, we show that for every positive integer n, there are only finitely
many Hirsch manifolds with strand number n. Here the strand number of a
Hirsch manifold is a positive integer defined by using strand numbers of
braids.
Comment: 30 pages, 4 figures, to appear in Algebr. Geom. Topol.
Monitoring Networked Applications With Incremental Quantile Estimation
[arXiv:0708.0302]
Comment: Published at http://dx.doi.org/10.1214/088342306000000628 in
Statistical Science (http://www.imstat.org/sts/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Number of paths versus number of basis functions in American option pricing
An American option grants the holder the right to select the time at which to
exercise the option, so pricing an American option entails solving an optimal
stopping problem. Difficulties in applying standard numerical methods to
complex pricing problems have motivated the development of techniques that
combine Monte Carlo simulation with dynamic programming. One class of methods
approximates the option value at each time using a linear combination of basis
functions, and combines Monte Carlo with backward induction to estimate optimal
coefficients in each approximation. We analyze the convergence of such a method
as both the number of basis functions and the number of simulated paths
increase. We get explicit results when the basis functions are polynomials and
the underlying process is either Brownian motion or geometric Brownian motion.
We show that the number of paths required for worst-case convergence grows
exponentially in the degree of the approximating polynomials in the case of
Brownian motion and faster in the case of geometric Brownian motion.
Comment: Published at http://dx.doi.org/10.1214/105051604000000846 in the
Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute
of Mathematical Statistics (http://www.imstat.org)
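The regression-based method described in this abstract can be sketched with a
small numpy implementation in the spirit of least-squares Monte Carlo: at each
exercise date, a polynomial basis is fit by least squares to discounted
continuation values, and exercise occurs when the immediate payoff exceeds the
estimated continuation value. The function name, parameter values, and the
choice of a Bermudan put on geometric Brownian motion are illustrative, not
taken from the paper.

```python
import numpy as np

def lsm_american_put(s0=100.0, strike=100.0, r=0.06, sigma=0.2,
                     maturity=1.0, n_steps=50, n_paths=20000,
                     degree=3, seed=0):
    """Sketch of regression-based pricing of a Bermudan put.

    Basis functions are monomials up to `degree`; coefficients are
    estimated from simulated paths by backward induction.
    """
    rng = np.random.default_rng(seed)
    dt = maturity / n_steps
    disc = np.exp(-r * dt)

    # Simulate geometric Brownian motion paths (t = dt, 2*dt, ..., T).
    z = rng.standard_normal((n_paths, n_steps))
    log_paths = np.cumsum((r - 0.5 * sigma**2) * dt
                          + sigma * np.sqrt(dt) * z, axis=1)
    s = s0 * np.exp(log_paths)
    payoff = np.maximum(strike - s, 0.0)

    # Cash flow at maturity, then step backwards through exercise dates.
    value = payoff[:, -1]
    for t in range(n_steps - 2, -1, -1):
        value *= disc  # discount one step back
        itm = payoff[:, t] > 0.0  # regress only on in-the-money paths
        if itm.sum() > degree + 1:
            coeff = np.polyfit(s[itm, t], value[itm], degree)
            continuation = np.polyval(coeff, s[itm, t])
            exercise = payoff[itm, t] > continuation
            idx = np.where(itm)[0][exercise]
            value[idx] = payoff[idx, t]
    return disc * value.mean()
```

The abstract's point about path counts shows up directly here: raising `degree`
without also raising `n_paths` makes the per-date regressions increasingly
noisy, so the estimated exercise rule (and hence the price) degrades.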
Boosting with early stopping: Convergence and consistency
Boosting is one of the most significant advances in machine learning for
classification and regression. In its original and computationally flexible
version, boosting seeks to minimize empirically a loss function in a greedy
fashion. The resulting estimator takes an additive function form and is built
iteratively by applying a base estimator (or learner) to updated samples
depending on the previous iterations. An unusual regularization technique,
early stopping, is employed, based on cross-validation or a held-out test set.
This paper studies
numerical convergence, consistency and statistical rates of convergence of
boosting with early stopping, when it is carried out over the linear span of a
family of basis functions. For general loss functions, we prove the convergence
of boosting's greedy optimization to the infimum of the loss function over
the linear span. Using the numerical convergence result, we find early-stopping
strategies under which boosting is shown to be consistent based on i.i.d.
samples, and we obtain bounds on the rates of convergence for boosting
estimators. Simulation studies are also presented to illustrate the relevance
of our theoretical results for providing insights to practical aspects of
boosting. As a side product, these results also reveal the importance of
restricting the greedy search step-sizes, as known in practice through the work
of Friedman and others. Moreover, our results lead to a rigorous proof that for
a linearly separable problem, AdaBoost with step-size \epsilon \to 0 becomes
an L^1-margin maximizer when left to run to convergence.
Comment: Published at http://dx.doi.org/10.1214/009053605000000255 in the
Annals of Statistics (http://www.imstat.org/aos/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
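One concrete member of the family this abstract studies is componentwise
L2-boosting over the linear span of coordinate functions, with a restricted
step-size and early stopping on a validation set. The sketch below is a
generic illustration of that scheme, not the paper's algorithm; all names and
default values are made up for the example.

```python
import numpy as np

def l2_boost(x_train, y_train, x_val, y_val,
             step=0.1, max_iter=500, patience=10):
    """Componentwise L2-boosting with validation-based early stopping.

    At each iteration the base learner is a least-squares fit on the
    single coordinate most correlated with the current residual; the
    update is shrunk by `step` (the restricted step-size the abstract
    highlights). Boosting stops once validation loss fails to improve
    for `patience` consecutive iterations.
    """
    n, p = x_train.shape
    beta = np.zeros(p)
    residual = y_train.astype(float).copy()
    col_norms = (x_train**2).sum(axis=0)

    best_beta, best_loss, wait = beta.copy(), np.inf, 0
    for _ in range(max_iter):
        corr = x_train.T @ residual
        j = np.argmax(np.abs(corr))          # greedy coordinate choice
        gamma = corr[j] / col_norms[j]       # base-learner fit
        beta[j] += step * gamma              # shrunken update
        residual -= step * gamma * x_train[:, j]

        val_loss = np.mean((y_val - x_val @ beta)**2)
        if val_loss < best_loss:
            best_loss, best_beta, wait = val_loss, beta.copy(), 0
        else:
            wait += 1
            if wait >= patience:             # early stopping
                break
    return best_beta
```

Running the greedy search with `step=1.0` instead of a small value typically
overfits sooner, which is the practical face of the step-size restriction the
abstract attributes to Friedman and others.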
Impact of regularization on Spectral Clustering
The performance of spectral clustering can be considerably improved via
regularization, as demonstrated empirically in Amini et al. (2012). Here we
attempt to quantify this improvement through theoretical
analysis. Under the stochastic block model (SBM), and its extensions, previous
results on spectral clustering relied on the minimum degree of the graph being
sufficiently large for its good performance. By examining the scenario where
the regularization parameter is large we show that the minimum degree
assumption can potentially be removed. As a special case, for an SBM with two
blocks, the results require the maximum degree to be large (growing faster
than log n) as opposed to the minimum degree.
More importantly, we show the usefulness of regularization in situations
where not all nodes belong to well-defined clusters. Our results rely on a
`bias-variance'-like trade-off that arises from understanding the concentration
of the sample Laplacian and the eigen gap as a function of the regularization
parameter. As a byproduct of our bounds, we propose a data-driven technique,
DKest (standing for estimated Davis-Kahan bounds), for choosing the
regularization parameter. This technique is shown to work well through
simulations and on a real data set.
Comment: 37 pages
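The regularization discussed in this abstract amounts to replacing the
adjacency matrix A with A + (tau/n) * J before normalizing, where J is the
all-ones matrix. A minimal two-block sketch follows; taking tau equal to the
average degree is a common heuristic used here for illustration only, and is
not the paper's DKest procedure.

```python
import numpy as np

def regularized_spectral_bipartition(adj, tau=None):
    """Two-way spectral clustering with a regularized normalized adjacency.

    Forms A_tau = A + (tau/n) J, symmetrically normalizes it by the
    regularized degrees, and splits nodes by the sign of the
    second-leading eigenvector.
    """
    n = adj.shape[0]
    if tau is None:
        tau = adj.sum() / n              # average degree (heuristic default)
    a_tau = adj + tau / n                # A + (tau/n) * ones
    d_tau = a_tau.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d_tau)
    l_tau = d_inv_sqrt[:, None] * a_tau * d_inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order; the second-largest
    # eigenvector carries the two-block structure.
    _, eigvecs = np.linalg.eigh(l_tau)
    return (eigvecs[:, -2] > 0).astype(int)
```

On a sparse SBM, setting `tau=0` recovers plain spectral clustering, whose
behavior degrades as low-degree nodes appear, which is the phenomenon the
abstract's minimum-degree discussion is about.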
Superheat: An R package for creating beautiful and extendable heatmaps for visualizing complex data
The technological advancements of the modern era have enabled the collection
of huge amounts of data in science and beyond. Extracting useful information
from such massive datasets is an ongoing challenge as traditional data
visualization tools typically do not scale well in high-dimensional settings.
An existing visualization technique that is particularly well suited to
visualizing large datasets is the heatmap. Although heatmaps are extremely
popular in fields such as bioinformatics for visualizing large gene expression
datasets, they remain a severely underutilized visualization tool in modern
data analysis. In this paper we introduce superheat, a new R package that
provides an extremely flexible and customizable platform for visualizing large
datasets using extendable heatmaps. Superheat enhances the traditional heatmap
by providing a platform to visualize a wide range of data types simultaneously,
adding to the heatmap a response variable as a scatterplot, model results as
boxplots, correlation information as barplots, text information, and more.
Superheat allows the user to explore their data to greater depths and to take
advantage of the heterogeneity present in the data to inform analysis
decisions. The goal of this paper is two-fold: (1) to demonstrate the potential
of the heatmap as a default visualization method for a wide range of data types
using reproducible examples, and (2) to highlight the customizability and ease
of implementation of the superheat package in R for creating beautiful and
extendable heatmaps. The capabilities and fundamental applicability of the
superheat package will be explored via three case studies, each based on
publicly available data sources and accompanied by a file outlining the
step-by-step analytic pipeline (with code).
Comment: 26 pages, 10 figures
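Superheat itself is an R package, so its API is not reproduced here. Purely to
illustrate the layout idea the abstract describes (a central heatmap with
aligned marginal panels such as a response scatterplot and summary barplots),
here is a hypothetical matplotlib sketch; every name in it is invented for the
example.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

def extended_heatmap(data, col_response, row_summary):
    """Superheat-style layout (illustration only, not the superheat API).

    Central heatmap of `data`, a scatterplot of a per-column response
    above it, and per-row summaries as a horizontal barplot to its right,
    all sharing axes via a grid layout.
    """
    fig = plt.figure(figsize=(6, 6))
    grid = fig.add_gridspec(2, 2, width_ratios=(4, 1), height_ratios=(1, 4),
                            hspace=0.05, wspace=0.05)
    ax_top = fig.add_subplot(grid[0, 0])
    ax_main = fig.add_subplot(grid[1, 0])
    ax_right = fig.add_subplot(grid[1, 1])

    ax_main.imshow(data, aspect="auto", cmap="viridis")
    ax_top.scatter(np.arange(data.shape[1]), col_response, s=10)
    ax_right.barh(np.arange(data.shape[0]), row_summary)
    ax_top.set_xticks([])
    ax_right.set_yticks([])
    return fig
```

The point of such a layout, as in the abstract, is that heterogeneous data
types (matrix values, a response variable, row/column summaries) are read in
one aligned display rather than in separate plots.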
