A Theoretical Analysis of NDCG Type Ranking Measures
A central problem in ranking is to design a ranking measure for the evaluation of ranking functions. In this paper we study, from a theoretical perspective, the widely used Normalized Discounted Cumulative Gain (NDCG)-type ranking measures. Although there are extensive empirical studies of NDCG, little is known about its theoretical properties. We first show that, whatever the ranking function is, the standard NDCG, which adopts a logarithmic discount, converges to 1 as the number of items to rank goes to infinity. At first sight, this result is very surprising. It seems to imply that NDCG cannot differentiate between good and bad ranking functions, contradicting the empirical success of NDCG in many applications. In order to gain a deeper understanding of ranking measures in general, we propose a notion referred to as consistent distinguishability. This notion captures the intuition that a ranking measure should have the following property: for every pair of substantially different ranking functions, the ranking measure can decide which one is better in a consistent manner on almost all datasets. We show that NDCG with logarithmic discount has consistent distinguishability although it converges to the same limit for all ranking functions. We next characterize the set of all feasible discount functions for NDCG according to the concept of consistent distinguishability. Specifically, we show that whether NDCG has consistent distinguishability depends on how fast the discount decays, and 1/r is a critical point. We then turn to the cut-off version of NDCG, i.e., NDCG@k. We analyze the distinguishability of NDCG@k for various choices of k and the discount functions. Experimental results on real Web search datasets agree well with the theory. Comment: COLT 2013
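To make the role of the discount function concrete, here is a minimal sketch (not code from the paper) of NDCG with a pluggable discount, contrasting the standard logarithmic discount 1/log2(r+1) with the polynomial discount 1/r that the analysis identifies as the critical decay rate; the function names, relevance grades, and cut-off are illustrative.

```python
import numpy as np

def dcg(relevances, discount):
    """Discounted cumulative gain of a ranked list of graded relevances."""
    ranks = np.arange(1, len(relevances) + 1)
    gains = 2.0 ** np.asarray(relevances, dtype=float) - 1.0
    return float(np.sum(gains * discount(ranks)))

def ndcg(relevances, discount, k=None):
    """NDCG (optionally NDCG@k): DCG divided by the DCG of the ideal ordering."""
    rel = np.asarray(relevances, dtype=float)
    ideal = np.sort(rel)[::-1]
    if k is not None:                                # cut-off version, NDCG@k
        rel, ideal = rel[:k], ideal[:k]
    ideal_dcg = dcg(ideal, discount)
    return dcg(rel, discount) / ideal_dcg if ideal_dcg > 0 else 0.0

log_discount  = lambda r: 1.0 / np.log2(r + 1)       # standard logarithmic discount
poly_discount = lambda r: 1.0 / r                    # 1/r, the critical decay rate

grades = [3, 2, 3, 0, 1, 2]                          # relevance grades in ranked order
print(ndcg(grades, log_discount))                    # full NDCG, logarithmic discount
print(ndcg(grades, poly_discount, k=3))              # NDCG@3 with the 1/r discount
```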
Study on coalescence dynamics of unequal-sized microbubbles captive on solid substrate
The dynamics of bubble coalescence are important for a number of industrial processes, in which the size inequality of the parent bubbles plays a significant role in mass transport, topological change and overall motion. In this study, the coalescence of unequal-sized microbubbles captive on a solid substrate was observed in cross-sectional view using a synchrotron high-speed imaging technique and a microfluidic gas generation device. The bridging neck growth and surface wave propagation at the early stage of coalescence were investigated by experimental and numerical methods. The results show that the theoretical half-power law for the neck growth rate remains valid when the viscous effect is neglected. However, the inertial-capillary time scale is set by the initial radius of the smaller parent microbubble. The surface wave propagation rate on the larger parent microbubble is proportional to the inertial-capillary time scale.
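For reference, the quantities named above can be written out as in the generic sketch below: the classical inertial-capillary time scale tau = sqrt(rho R^3 / sigma) built on the smaller parent radius, and half-power-law neck growth in the inviscid regime. This is not the study's own model; the prefactor and the fluid properties are assumed values.

```python
import math

def inertial_capillary_time(rho, radius_small, sigma):
    """tau = sqrt(rho * R^3 / sigma), using the smaller parent bubble's radius."""
    return math.sqrt(rho * radius_small**3 / sigma)

def neck_radius(t, rho, radius_small, sigma, prefactor=1.0):
    """Half-power-law neck growth, r_neck ~ C * R_s * sqrt(t / tau), inviscid regime."""
    tau = inertial_capillary_time(rho, radius_small, sigma)
    return prefactor * radius_small * math.sqrt(t / tau)

# Example: a 50-micron bubble in water (rho ~ 998 kg/m^3, sigma ~ 0.072 N/m)
tau = inertial_capillary_time(998.0, 50e-6, 0.072)
print(f"inertial-capillary time ~ {tau * 1e6:.1f} microseconds")
print(f"neck radius at t = tau/10 ~ {neck_radius(tau / 10, 998.0, 50e-6, 0.072) * 1e6:.1f} microns")
```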
Understanding Microbubble Coalescence Using High-Speed Imaging and Lattice Boltzmann Method Simulation
Poster abstract: Microbubble coalescence is one of the important research areas of bubble dynamics. The purpose of this research is to seek a deeper understanding of, and a quantitative mathematical relation for, microbubble coalescence. To that end, we conducted both experiments and simulations. For the experiments, we fabricated a microfluidic gas generator with improved performance for the corresponding fluidic chemical reaction. We then used the ultrafast synchrotron X-ray imaging facility at the Advanced Photon Source of Argonne National Laboratory to capture gas generation and microbubble merging with high-speed imaging. These experiments show how microbubbles with the same size ratio contact and merge in the reaction channel at different reactant concentrations. For the simulations, we used the lattice Boltzmann method to model microbubble coalescence in water for unequal diameter ratios. The focus is on the effects of the size inequality of the parent bubbles on the coalescence geometry and time. The "coalescence preference", whereby the coalesced bubble sits closer to the larger parent bubble, is well captured. A power-law relation between the preferential relative distance and the size inequality is consistent with recent experimental observations. Meanwhile, the coalescence time also exhibits power-law scaling, indicating that unequal bubbles coalesce faster than equal bubbles.
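As an illustration of the kind of power-law relation mentioned above, the following sketch fits an exponent in log-log space to hypothetical (made-up) measurements of the coalesced bubble's relative distance versus the parent size ratio; it is not the study's data or code.

```python
import numpy as np

# Hypothetical measurements: parent-bubble size ratio gamma and the relative
# distance chi of the coalesced bubble from the larger parent (illustrative only).
gamma = np.array([1.2, 1.5, 2.0, 2.5, 3.0])
chi   = np.array([0.45, 0.21, 0.08, 0.045, 0.028])

# A power law chi ~ A * gamma**(-p) is a straight line in log-log space,
# so a linear fit of log(chi) against log(gamma) recovers the exponent p.
slope, intercept = np.polyfit(np.log(gamma), np.log(chi), 1)
print(f"fitted exponent p ~ {-slope:.2f}, prefactor A ~ {np.exp(intercept):.3f}")
```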
Dynamic post-earthquake image segmentation with an adaptive spectral-spatial descriptor
The region merging algorithm is a widely used segmentation technique for very high resolution (VHR) remote sensing images. However, the segmentation of post-earthquake VHR images is more difficult due to the complexity of these images, especially the high intra-class and low inter-class variability among damage objects. Two key issues must be resolved here: the first is to find an appropriate descriptor to measure the similarity of two adjacent regions, since the diverse damage objects, such as landslides, debris flows, and collapsed buildings, exhibit high complexity. The second is how to solve the over-segmentation and under-segmentation problems that are commonly encountered with conventional merging strategies due to their strong dependence on local information. To tackle these two issues, an adaptive dynamic region merging approach (ADRM) is introduced, which combines an adaptive spectral-spatial descriptor and a dynamic merging strategy to adapt to the changes of the merging regions and successfully detect objects scattered globally in a post-earthquake image. In the new descriptor, the spectral similarity and spatial similarity of any two adjacent regions are automatically combined to measure their overall similarity. Accordingly, the new descriptor offers adaptive semantic descriptions of geo-objects and thus is capable of characterizing different damage objects. In addition, in the dynamic region merging strategy, the adaptive spectral-spatial descriptor is embedded in the defined testing order and combined with graph models to construct a dynamic merging strategy. The new strategy can find the globally optimal merging order and ensures that the most similar regions are merged first. By combining the two strategies, ADRM can identify spatially scattered objects and alleviate over-segmentation and under-segmentation. The performance of ADRM has been evaluated by comparison with four state-of-the-art segmentation methods, including the fractal net evolution approach (FNEA, as implemented in the eCognition software, Trimble Inc., Westminster, CO, USA), the J-value segmentation (JSEG) method, the graph-based segmentation (GSEG) method, and the statistical region merging (SRM) approach. The experiments were conducted on six VHR subarea images captured by RGB sensors mounted on aerial platforms, acquired after the 2008 Wenchuan Ms 8.0 earthquake. Quantitative and qualitative assessments demonstrate that the proposed method offers high feasibility and improved accuracy in the segmentation of post-earthquake VHR aerial images.
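A simplified illustration of the idea of a combined spectral-spatial similarity between two adjacent regions is sketched below; the specific terms, the exponential scaling, and the fixed weight w are assumptions made for illustration and are not the paper's actual adaptive descriptor or merging order.

```python
import numpy as np

def spectral_distance(mean_a, mean_b):
    """Euclidean distance between the per-band mean vectors of two regions."""
    diff = np.asarray(mean_a, dtype=float) - np.asarray(mean_b, dtype=float)
    return float(np.linalg.norm(diff))

def spatial_affinity(shared_boundary, perimeter_a, perimeter_b):
    """Fraction of the smaller region's perimeter shared with its neighbour."""
    return shared_boundary / min(perimeter_a, perimeter_b)

def region_similarity(mean_a, mean_b, shared_boundary, perimeter_a, perimeter_b, w=0.5):
    """Combined spectral-spatial similarity; the most similar pair merges first."""
    d_spec = spectral_distance(mean_a, mean_b)              # smaller is more similar
    a_spat = spatial_affinity(shared_boundary, perimeter_a, perimeter_b)
    return w * np.exp(-d_spec / 100.0) + (1.0 - w) * a_spat # arbitrary 0-1 scaling

# Two hypothetical adjacent regions in an RGB aerial image
print(region_similarity([120, 110, 95], [125, 108, 99],
                        shared_boundary=40, perimeter_a=180, perimeter_b=260))
```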
Study on the Application of Expressway Construction Based on Sponge City Concept
The sponge city concept, currently in an active trial stage in China, has been applied in many low-impact development facilities on expressways, but many of these applications are not yet widely adopted. From the four perspectives of expressway pavement, slope, interchange and service area, this paper explains the adverse effects of rainwater, proposes several feasible application schemes, and carries out scenario analyses combined with actual engineering projects. Finally, the current state of the sponge city concept is summarized, and it is concluded that optimal planning and design can be achieved only after establishing a reliable mathematical model and carrying out quantitative analysis.
Shape-selective formation of monodisperse copper nanospheres and nanocubes via disproportionation reaction route and their optical properties
Synthesis of stable and monodisperse Cu nanocrystals of controlled morphology has been a long-standing challenge. In this Article, we report a facile disproportionation reaction approach for the synthesis of such nanocrystals in organic solvents. Either spherical or cubic shapes can be produced, depending on conditions. The typical Cu nanospheres are single crystals with a size of 23.4 ± 1.5 nm and can self-assemble into three-dimensional (3D) nanocrystal superlattices on a large scale. By manipulating the chemical additives, monodisperse Cu nanocubes with tailorable sizes have also been obtained. The probable formation mechanism of these Cu nanocrystals is discussed. The narrow size distribution results in strong surface plasmon resonance (SPR) peaks even though the resonance is located in the interband transition region. Double SPR peaks are observed in the extinction spectra of the Cu nanocubes with relatively large sizes. Theoretical simulation of the extinction spectra indicates that the SPR band located at longer wavelengths is caused by the assembly of Cu nanocubes into more complex structures. The synthesis procedure that we report here is expected to foster systematic investigations of the physical properties and self-assembly of Cu nanocrystals with uniform shape and size for potential applications in photonic and nanoelectronic devices. © 2014 American Chemical Society
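As a rough illustration of how extinction spectra of small metal spheres are commonly estimated, the sketch below uses the quasi-static dipole approximation with a simple Drude dielectric function. The Drude parameters and the medium permittivity are illustrative placeholders and do not capture Cu's interband transitions, so this is not the simulation used in the paper.

```python
import numpy as np

def drude_epsilon(energy_ev, eps_inf=3.7, wp_ev=8.8, gamma_ev=0.1):
    """Simple Drude dielectric function; parameters are illustrative, not a Cu fit."""
    return eps_inf - wp_ev**2 / (energy_ev**2 + 1j * gamma_ev * energy_ev)

def extinction_cross_section(energy_ev, radius_m, eps_medium=2.0):
    """Quasi-static dipole estimate: sigma_ext = k * Im(alpha),
    with alpha = 4*pi*R^3 * (eps - eps_m) / (eps + 2*eps_m)."""
    eps = drude_epsilon(energy_ev)
    wavelength_m = 1239.84e-9 / energy_ev                  # E [eV] -> lambda [m]
    k = 2.0 * np.pi * np.sqrt(eps_medium) / wavelength_m   # wavenumber in the medium
    alpha = 4.0 * np.pi * radius_m**3 * (eps - eps_medium) / (eps + 2.0 * eps_medium)
    return k * np.imag(alpha)

# Extinction of an ~11.7 nm radius (23.4 nm diameter) sphere across the visible range
for energy in np.linspace(1.8, 3.2, 8):
    print(f"{energy:.2f} eV  sigma_ext ~ {extinction_cross_section(energy, 11.7e-9):.3e} m^2")
```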
Provably learning a multi-head attention layer
The multi-head attention layer is one of the key components of the transformer architecture that sets it apart from traditional feed-forward models. Given a sequence length $k$, attention matrices $\Theta_1,\ldots,\Theta_m \in \mathbb{R}^{d\times d}$, and projection matrices $W_1,\ldots,W_m \in \mathbb{R}^{d\times d}$, the corresponding multi-head attention layer $F:\mathbb{R}^{k\times d}\to\mathbb{R}^{k\times d}$ transforms length-$k$ sequences of $d$-dimensional tokens $X\in\mathbb{R}^{k\times d}$ via $F(X) = \sum_{i=1}^{m}\mathrm{softmax}(X\Theta_i X^\top)\,X W_i$. In this work, we initiate the study of provably learning a multi-head attention layer from random examples and give the first nontrivial upper and lower bounds for this problem:
- Provided $\{W_i,\Theta_i\}_{i=1}^m$ satisfy certain non-degeneracy conditions, we give a $(dk)^{O(m^3)}$-time algorithm that learns $F$ to small error given random labeled examples drawn uniformly from $\{\pm 1\}^{k\times d}$.
- We prove computational lower bounds showing that in the worst case, exponential dependence on $m$ is unavoidable.
We focus on Boolean $X$ to mimic the discrete nature of tokens in large language models, though our techniques naturally extend to standard continuous settings, e.g. Gaussian. Our algorithm, which is centered around using examples to sculpt a convex body containing the unknown parameters, is a significant departure from existing provable algorithms for learning feedforward networks, which predominantly exploit algebraic and rotation invariance properties of the Gaussian distribution. In contrast, our analysis is more flexible as it primarily relies on various upper and lower tail bounds for the input distribution and "slices" thereof. Comment: 105 pages, comments welcome
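For concreteness, here is a minimal NumPy sketch of the layer defined above, $F(X) = \sum_i \mathrm{softmax}(X\Theta_i X^\top)XW_i$ with row-wise softmax. The shapes, random parameters, and Boolean inputs are illustrative; this is only the forward map, not the paper's learning algorithm.

```python
import numpy as np

def row_softmax(scores):
    """Row-wise softmax of a k x k score matrix."""
    scores = scores - scores.max(axis=-1, keepdims=True)    # numerical stability
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum(axis=-1, keepdims=True)

def multi_head_attention(X, thetas, projections):
    """F(X) = sum_i softmax(X @ Theta_i @ X.T) @ X @ W_i for a k x d input X."""
    out = np.zeros_like(X, dtype=float)
    for Theta, W in zip(thetas, projections):
        attn = row_softmax(X @ Theta @ X.T)                  # k x k attention pattern
        out += attn @ X @ W                                  # k x d head contribution
    return out

rng = np.random.default_rng(0)
k, d, m = 8, 16, 2                                           # sequence length, token dim, heads
X = rng.choice([-1.0, 1.0], size=(k, d))                     # Boolean (+/-1) tokens
thetas = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(m)]
projections = [rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(m)]
print(multi_head_attention(X, thetas, projections).shape)    # -> (8, 16)
```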
