183 research outputs found
Experimental test of Heisenberg's measurement uncertainty relation based on statistical distances
Incompatible observables can be approximated by compatible observables in
joint measurement or measured sequentially, with constrained accuracy as
implied by Heisenberg's original formulation of the uncertainty principle.
Recently, Busch, Lahti, and Werner proposed inaccuracy trade-off relations
based on statistical distances between probability distributions of measurement
outcomes [Phys. Rev. Lett. 111, 160405 (2013); Phys. Rev. A 89, 012129 (2014)].
Here we reformulate their theoretical framework, derive an improved relation for
qubit measurement, and perform an experimental test on a spin system. The
relation reveals that the worst-case inaccuracy is tightly bounded from below
by the incompatibility of target observables, and is verified by the experiment
employing joint measurement in which two compatible but typically
non-commutative observables on one qubit are measured simultaneously.
Confidence-aware Non-repetitive Multimodal Transformers for TextCaps
When describing an image, reading text in the visual scene is crucial to
understand the key information. Recent work explores the TextCaps task, i.e.
image captioning with reading Optical Character Recognition (OCR) tokens, which
requires models to read text and cover it in generated captions. Existing
approaches fail to generate accurate descriptions because of their (1) poor
reading ability; (2) inability to choose the crucial words among all extracted
OCR tokens; (3) repetition of words in predicted captions. To this end, we
propose Confidence-aware Non-repetitive Multimodal Transformers (CNMT) to
tackle the above challenges. Our CNMT consists of reading, reasoning, and
generation modules, in which the Reading Module employs better OCR systems to
enhance text reading ability and a confidence embedding to select the most
noteworthy tokens. To address the issue of word redundancy in captions, our
Generation Module includes a repetition mask to avoid predicting repeated words
in captions. Our model outperforms state-of-the-art models on the TextCaps dataset,
improving from 81.0 to 93.0 in CIDEr. Our source code is publicly available.
Comment: 9 pages; Accepted by AAAI 202
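The repetition mask mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not specify the mechanism, so this assumes the mask sets the scores of already-generated tokens to negative infinity before greedy selection. The function names, the score lists, and the set of tokens exempt from masking are all hypothetical.

```python
import math


def apply_repetition_mask(scores, generated_ids, exempt_ids=frozenset()):
    """Return a copy of `scores` with already-generated token ids masked out.

    scores: list of floats, one per vocabulary id (hypothetical decoder output).
    generated_ids: token ids emitted so far in the caption.
    exempt_ids: ids allowed to repeat, e.g. function words (an assumption).
    """
    masked = list(scores)
    for tok in set(generated_ids) - set(exempt_ids):
        masked[tok] = -math.inf  # a masked token can never win the argmax
    return masked


def greedy_decode(step_scores, exempt_ids=frozenset()):
    """Pick one token per step, masking tokens that were already chosen."""
    caption = []
    for scores in step_scores:
        masked = apply_repetition_mask(scores, caption, exempt_ids)
        caption.append(max(range(len(masked)), key=masked.__getitem__))
    return caption


if __name__ == "__main__":
    steps = [[0.1, 0.9, 0.2], [0.1, 0.8, 0.3], [0.5, 0.9, 0.6]]
    print(greedy_decode(steps))                  # token 1 is chosen only once
    print(greedy_decode(steps, exempt_ids={1}))  # token 1 may repeat
```

Without the mask, token 1 would be selected at every step in this toy input; with it, each step falls back to the best token not yet used.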
Quantum Anomaly Detection with a Spin Processor in Diamond
In quantum computation, analyzing and learning the patterns
in quantum data is essential for many tasks. Quantum machine learning
algorithms can deal not only with the quantum states generated in the preceding
quantum procedures, but also with quantum registers encoding classical problems.
In this work, we experimentally demonstrate the anomaly detection of quantum
states encoding audio samples with a three-qubit quantum processor consisting
of solid-state spins in diamond. After training on a few
normal samples, the quantum machine detects anomalous samples with a
minimum error rate of 15.4%. These results show the power of quantum anomaly
detection in dealing with machine learning tasks and the potential to detect
abnormal output of quantum devices.
Comment: 10 pages, 8 figures
Video Background Music Generation: Dataset, Method and Evaluation
Music is essential when editing videos, but selecting music manually is
difficult and time-consuming. Thus, we seek to automatically generate
background music tracks given video input. This is a challenging task since it
requires plenty of paired videos and music to learn their correspondence.
Unfortunately, there exist no such datasets. To close this gap, we introduce a
dataset, benchmark model, and evaluation metric for video background music
generation. We introduce SymMV, a video and symbolic music dataset, along with
chord, rhythm, melody, and accompaniment annotations. To the best of our
knowledge, it is the first video-music dataset with high-quality symbolic music
and detailed annotations. We also propose a benchmark video background music
generation framework named V-MusProd, which utilizes music priors of chords,
melody, and accompaniment along with video-music relations of semantic, color,
and motion features. To address the lack of objective metrics for video-music
correspondence, we propose a retrieval-based metric VMCP built upon a powerful
video-music representation learning model. Experiments show that with our
dataset, V-MusProd outperforms the state-of-the-art method in both music
quality and correspondence with videos. We believe our dataset, benchmark
model, and evaluation metric will boost the development of video background
music generation.
EA-BEV: Edge-aware Bird's-Eye-View Projector for 3D Object Detection
In recent years, great progress has been made in Lift-Splat-Shot-based
(LSS-based) 3D object detection methods, which convert features of the 2D camera
view and the 3D lidar view to Bird's-Eye-View (BEV) for feature fusion. However,
inaccurate depth estimation (e.g. the 'depth jump' problem) is an obstacle to
developing LSS-based methods. To alleviate the 'depth jump' problem, we propose the
Edge-Aware Bird's-Eye-View (EA-BEV) projector. By coupling the proposed edge-aware
depth fusion module with a depth estimation module, the EA-BEV projector
solves the problem and enforces refined supervision on depth. In addition, we
propose sparse depth supervision and gradient edge depth supervision to
constrain learning on global depth and local marginal depth information. Our
EA-BEV projector is a plug-and-play module for any LSS-based 3D object
detection model, and effectively improves the baseline performance. We
demonstrate its effectiveness on nuScenes: the proposed EA-BEV projector
boosts several state-of-the-art LSS-based baselines on the nuScenes 3D object
detection benchmark (validation set) and the nuScenes BEV map segmentation
benchmark, with a negligible increase in inference time.
Resonant Quantum Principal Component Analysis
Principal component analysis has been widely adopted to reduce the dimension
of data while preserving the information. The quantum version of PCA (qPCA) can
be used to analyze an unknown low-rank density matrix by rapidly revealing its
principal components, i.e. the eigenvectors of the density matrix with the
largest eigenvalues. However, due to the substantial resource requirement, its
experimental implementation remains challenging. Here, we develop a resonant
analysis algorithm with minimal ancillary-qubit resources, in which
only one frequency-scanning probe qubit is required to extract the principal
components. In the experiment, we demonstrate the distillation of the first
principal component of a 4×4 density matrix, with an efficiency of
86.0% and a fidelity of 0.90. This work shows the speed-up ability of quantum
algorithms in dimension reduction of data and thus could be used as part of
quantum artificial intelligence algorithms in the future.
Comment: 10 pages, 7 figures, have been waiting for the reviewers' responses
for over 3 months
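For reference, the quantity that qPCA targets, the eigenvector of the density matrix with the largest eigenvalue, can be computed classically for a small matrix. The sketch below uses power iteration in plain Python on a real symmetric matrix. It is a classical illustration only, not the resonant quantum algorithm of the paper, and the example matrix is made up for the demonstration.

```python
def power_iteration(matrix, iterations=200):
    """Estimate the largest eigenvalue and its eigenvector of a real symmetric matrix.

    Assumes the uniform start vector is not orthogonal to the principal
    component; otherwise the iteration would not converge to it.
    """
    n = len(matrix)
    vec = [1.0 / n ** 0.5] * n  # normalized uniform start vector
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in nxt) ** 0.5
        vec = [x / norm for x in nxt]
    # Rayleigh quotient v^T M v gives the eigenvalue estimate for unit v.
    eigval = sum(
        vec[i] * sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)
    )
    return eigval, vec


if __name__ == "__main__":
    # Hypothetical 4x4 density matrix (trace 1) with eigenvalues 0.7, 0.1, 0.1, 0.1.
    rho = [
        [0.4, 0.3, 0.0, 0.0],
        [0.3, 0.4, 0.0, 0.0],
        [0.0, 0.0, 0.1, 0.0],
        [0.0, 0.0, 0.0, 0.1],
    ]
    eigval, vec = power_iteration(rho)
    print(eigval, vec)
```

For this example matrix the principal component is (1, 1, 0, 0)/√2 with eigenvalue 0.7, which the iteration approaches quickly because the remaining eigenvalues are much smaller.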
Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft
Many reinforcement learning environments (e.g., Minecraft) provide only
sparse rewards that indicate task completion or failure with binary values. The
challenge in exploration efficiency in such environments makes it difficult for
reinforcement-learning-based agents to learn complex tasks. To address this,
this paper introduces an advanced learning system, named Auto MC-Reward, that
leverages Large Language Models (LLMs) to automatically design dense reward
functions, thereby enhancing the learning efficiency. Auto MC-Reward consists
of three important components: Reward Designer, Reward Critic, and Trajectory
Analyzer. Given the environment information and task descriptions, the Reward
Designer first designs the reward function by coding an executable Python
function with predefined observation inputs. Then, the Reward Critic
verifies the code, checking whether it is
self-consistent and free of syntax and semantic errors. Further, the Trajectory
Analyzer summarizes possible failure causes and provides refinement suggestions
according to collected trajectories. In the next round, the Reward Designer
refines the dense reward function based on this feedback.
Experiments demonstrate a significant improvement in the success rate and
learning efficiency of our agents in complex tasks in Minecraft, such as
obtaining diamonds while efficiently avoiding lava, and efficiently
exploring for trees and animals that are sparse in the plains biome.
Comment: Accepted by CVPR202
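To make the idea of a dense reward coded as an executable Python function concrete, here is a minimal sketch. It is not taken from the paper: the observation keys (`distance_to_target`, `lava_nearby`, `health`, `task_done`) and the weights are hypothetical, and the syntax check stands in for only the simplest part of what a Reward Critic might verify.

```python
import ast


def dense_reward(prev_obs, obs):
    """Hypothetical dense reward: progress toward a target, with safety penalties."""
    # Shaping term: positive when the agent moved closer to the target.
    reward = prev_obs["distance_to_target"] - obs["distance_to_target"]
    # Penalize being next to lava (assumed boolean observation).
    if obs["lava_nearby"]:
        reward -= 1.0
    # Penalize any health lost since the previous step.
    reward -= 0.5 * max(0.0, prev_obs["health"] - obs["health"])
    # Keep the original sparse success signal as a bonus.
    if obs["task_done"]:
        reward += 10.0
    return reward


def passes_syntax_check(source):
    """Return True if `source` parses as Python; semantic checks are out of scope."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    prev = {"distance_to_target": 5.0, "lava_nearby": False, "health": 20.0, "task_done": False}
    safe = {"distance_to_target": 3.0, "lava_nearby": False, "health": 20.0, "task_done": False}
    print(dense_reward(prev, safe))
    print(passes_syntax_check("def f(x):\n    return x"))
```

The point of the shaping terms is that the agent receives a signal at every step rather than only on task completion, which is the sparse-reward problem the abstract describes.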
Enhanced volcanic activity and long-term warmth in the middle Eocene revealed by mercury and osmium isotopes from IODP Expedition 369 Site U1514
Rapid plate reorganization may have influenced global climate during the Eocene; however, its linkage remains poorly constrained, particularly during the middle Eocene. To elucidate this tectonic–climatic relationship, here, we conducted a comprehensive analysis based on high-resolution mercury (Hg) and osmium (Os) abundance and isotope data obtained from the complete Eocene sedimentary sequence of Site U1514, drilled in the Mentelle Basin off southwest Australia. The Hg signals in this sedimentary sequence, which are characterized by significantly high enrichment and an insignificant mass-independent fractionation (Δ199Hg) signal, confirm that the middle Eocene (∼45–38 Ma) was a period of persistent, increased volcanism, accompanied by intense tectonic activity. In particular, a remarkable seafloor volcanic eruption persisted for approximately 1.5 million years (∼42.0–40.5 Ma), immediately preceding the Middle Eocene Climate Optimum (MECO). Contemporaneously, the trend toward a slightly more radiogenic seawater 187Os/188Os (Osi) composition denotes the prevalence of intensified continental weathering under a warm, humid climate during the middle Eocene, a phenomenon particularly evident during the MECO. Importantly, the Hg and Os records from Site U1514 reveal the occurrence of a multi-million-year warming reversal amid the long-term Eocene cooling trend, which likely contributed to significant CO2 reduction during the late Eocene. These findings significantly enhance our understanding of Eocene climate dynamics, which are fundamentally linked to intensive tectonic-driven volcanic activity and associated continental chemical weathering.
