
183 research outputs found

Experimental test of Heisenberg's measurement uncertainty relation based on statistical distances

Author: Chen Zhihua
Du Jiangfeng
Fei Shao-Ming
Kong Fei
Li Zhaokai
Liu Ying
Ma Wenchao
Ma Zhihao
Peng Xinhua
Shi Fazhan
Shi Mingjun
Wang Hengyan
Publication venue: American Physical Society (APS)
Publication date: 22/04/2016

Incompatible observables can be approximated by compatible observables in joint measurement or measured sequentially, with constrained accuracy as implied by Heisenberg's original formulation of the uncertainty principle. Recently, Busch, Lahti, and Werner proposed inaccuracy trade-off relations based on statistical distances between probability distributions of measurement outcomes [Phys. Rev. Lett. 111, 160405 (2013); Phys. Rev. A 89, 012129 (2014)]. Here we reform their theoretical framework, derive an improved relation for qubit measurement, and perform an experimental test on a spin system. The relation reveals that the worst-case inaccuracy is tightly bounded from below by the incompatibility of target observables, and is verified by the experiment employing joint measurement in which two compatible but typically non-commutative observables on one qubit are measured simultaneously.

arXiv.org e-Print Archive

Crossref

Confidence-aware Non-repetitive Multimodal Transformers for TextCaps

Author: Bao Renda
Liu Si
Wang Zhaokai
Wu Qi
Publication date: 21/03/2021

When describing an image, reading text in the visual scene is crucial to understanding the key information. Recent work explores the TextCaps task, i.e., image captioning with reading Optical Character Recognition (OCR) tokens, which requires models to read text and cover it in generated captions. Existing approaches fail to generate accurate descriptions because of their (1) poor reading ability; (2) inability to choose the crucial words among all extracted OCR tokens; (3) repetition of words in predicted captions. To this end, we propose Confidence-aware Non-repetitive Multimodal Transformers (CNMT) to tackle the above challenges. Our CNMT consists of reading, reasoning, and generation modules, in which the Reading Module employs better OCR systems to enhance text reading ability and a confidence embedding to select the most noteworthy tokens. To address the issue of word redundancy in captions, our Generation Module includes a repetition mask to avoid predicting repeated words in captions. Our model outperforms state-of-the-art models on the TextCaps dataset, improving from 81.0 to 93.0 in CIDEr. Our source code is publicly available. Comment: 9 pages; Accepted by AAAI 202

arXiv.org e-Print Archive

Crossref

Association for the Advancement of Artificial Intelligence: AAAI Publications

Quantum Anomaly Detection with a Spin Processor in Diamond

Author: Chai Zihua
Du Jiangfeng
Guo Yuhang
Li Zhaokai
Liu Ying
Shi Fazhan
Wang Mengqi
Wang Ya
Publication date: 02/03/2024

In the processing of quantum computation, analyzing and learning the patterns of quantum data are essential for many tasks. Quantum machine learning algorithms can deal not only with the quantum states generated in the preceding quantum procedures, but also with the quantum registers encoding classical problems. In this work, we experimentally demonstrate the anomaly detection of quantum states encoding audio samples with a three-qubit quantum processor consisting of solid-state spins in diamond. After training with a few normal samples, the quantum machine can detect the anomalous samples with a minimum error rate of 15.4%. These results show the power of quantum anomaly detection in dealing with machine learning tasks and the potential to detect abnormal output of quantum devices. Comment: 10 pages, 8 figures

arXiv.org e-Print Archive

Video Background Music Generation: Dataset, Method and Evaluation

Author: Bao Chenxi
Li Xiaobo
Liao Yue
Liu Si
Lu Miao
Peng Stanley
Wang Baisen
Wang Zhaokai
Zhuo Le
Publication date: 21/11/2022

Music is essential when editing videos, but selecting music manually is difficult and time-consuming. Thus, we seek to automatically generate background music tracks given video input. This is a challenging task since it requires plenty of paired videos and music to learn their correspondence. Unfortunately, no such datasets exist. To close this gap, we introduce a dataset, benchmark model, and evaluation metric for video background music generation. We introduce SymMV, a video and symbolic music dataset, along with chord, rhythm, melody, and accompaniment annotations. To the best of our knowledge, it is the first video-music dataset with high-quality symbolic music and detailed annotations. We also propose a benchmark video background music generation framework named V-MusProd, which utilizes music priors of chords, melody, and accompaniment along with video-music relations of semantic, color, and motion features. To address the lack of objective metrics for video-music correspondence, we propose a retrieval-based metric VMCP built upon a powerful video-music representation learning model. Experiments show that with our dataset, V-MusProd outperforms the state-of-the-art method in both music quality and correspondence with videos. We believe our dataset, benchmark model, and evaluation metric will boost the development of video background music generation.

arXiv.org e-Print Archive

EA-BEV: Edge-aware Bird's-Eye-View Projector for 3D Object Detection

Author: Feng Tianpeng
Hu Haotian
Hu Laifeng
Su Jingwen
Wang Fanyi
Zhang Wangzhi
Zhang Zhaokai
Publication date: 31/03/2023

In recent years, great progress has been made in Lift-Splat-Shot-based (LSS-based) 3D object detection methods, which convert features of the 2D camera view and the 3D lidar view to Bird's-Eye-View (BEV) for feature fusion. However, inaccurate depth estimation (e.g., the 'depth jump' problem) is an obstacle to developing LSS-based methods. To alleviate the 'depth jump' problem, we propose the Edge-Aware Bird's-Eye-View (EA-BEV) projector. By coupling the proposed edge-aware depth fusion module and depth estimation module, the EA-BEV projector solves the problem and enforces refined supervision on depth. In addition, we propose sparse depth supervision and gradient edge depth supervision to constrain learning on global depth and local marginal depth information. Our EA-BEV projector is a plug-and-play module for any LSS-based 3D object detection model, and effectively improves the baseline performance. We demonstrate its effectiveness on the nuScenes benchmark: the proposed EA-BEV projector boosts several state-of-the-art LSS-based baselines on the nuScenes 3D object detection benchmark and the nuScenes BEV map segmentation benchmark with a negligible increase in inference time.

arXiv.org e-Print Archive

Resonant Quantum Principal Component Analysis

Author: Chai Zihua
Du Jiangfeng
Guo Yuhang
Ji Wentao
Li Zhaokai
Lloyd Seth
Shi Fazhan
Wang Mengqi
Wang Ya
Publication date: 06/04/2021

Principal component analysis has been widely adopted to reduce the dimension of data while preserving the information. The quantum version of PCA (qPCA) can be used to analyze an unknown low-rank density matrix by rapidly revealing the principal components of it, i.e. the eigenvectors of the density matrix with largest eigenvalues. However, due to the substantial resource requirement, its experimental implementation remains challenging. Here, we develop a resonant analysis algorithm with the minimal resource for ancillary qubits, in which only one frequency scanning probe qubit is required to extract the principal components. In the experiment, we demonstrate the distillation of the first principal component of a 4 × 4 density matrix, with the efficiency of 86.0% and fidelity of 0.90. This work shows the speed-up ability of quantum algorithm in dimension reduction of data and thus could be used as part of quantum artificial intelligence algorithms in the future. Comment: 10 pages, 7 figures, have been waiting for the reviewers' responses for over 3 months

arXiv.org e-Print Archive

DSpace@MIT

Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft

Author: Dai Jifeng
Li Hao
Li Hongsheng
Lu Lewei
Qiao Yu
Wang Xiaogang
Wang Zhaokai
Yang Xue
Zhou Jie
Zhu Xizhou
Publication date: 30/03/2024

Many reinforcement learning environments (e.g., Minecraft) provide only sparse rewards that indicate task completion or failure with binary values. The challenge in exploration efficiency in such environments makes it difficult for reinforcement-learning-based agents to learn complex tasks. To address this, this paper introduces an advanced learning system, named Auto MC-Reward, that leverages Large Language Models (LLMs) to automatically design dense reward functions, thereby enhancing the learning efficiency. Auto MC-Reward consists of three important components: Reward Designer, Reward Critic, and Trajectory Analyzer. Given the environment information and task descriptions, the Reward Designer first designs the reward function by coding an executable Python function with predefined observation inputs. Then, the Reward Critic verifies the code, checking whether it is self-consistent and free of syntax and semantic errors. Further, the Trajectory Analyzer summarizes possible failure causes and provides refinement suggestions according to collected trajectories. In the next round, the Reward Designer further refines and iterates the dense reward function based on feedback. Experiments demonstrate a significant improvement in the success rate and learning efficiency of our agents in complex tasks in Minecraft, such as obtaining diamond with the efficient ability to avoid lava, and efficiently exploring trees and animals that are sparse in the plains biome. Comment: Accepted by CVPR202

arXiv.org e-Print Archive

Enhanced volcanic activity and long-term warmth in the middle Eocene revealed by mercury and osmium isotopes from IODP Expedition 369 Site U1514

Author: Chang Taesoo
Kim Jihun
Lim Dhongil
Ownsworth Emma
Selby David
Wang Wei
Xu Zhaokai
Yin Runsheng
Publication venue: Elsevier
Publication date: 09/01/2024

Rapid plate reorganization may have influenced global climate during the Eocene; however, its linkage remains poorly constrained, particularly during the middle Eocene. To elucidate this tectonic–climatic relationship, here, we conducted a comprehensive analysis based on high-resolution mercury (Hg) and osmium (Os) abundance and isotope data obtained from the complete Eocene sedimentary sequence of Site U1514, drilled in the Mentelle Basin off southwest Australia. The Hg signals in this sedimentary sequence, which are characterized by significantly high enrichment and an insignificant mass-independent fractionation (Δ199Hg) signal, confirm that the middle Eocene (∼45–38 Ma) was a period of persistent, increased volcanism, accompanied by intense tectonic activity. In particular, a remarkable seafloor volcanic eruption persisted for approximately 1.5 million years (∼42.0–40.5 Ma), immediately preceding the Middle Eocene Climate Optimum (MECO). Contemporaneously, the trends toward a slightly more radiogenic seawater 187Os/188Os (Osi) composition denote the prevalence of intensified continental weathering under a warm, humid climate during the middle Eocene, a phenomenon particularly evident during the MECO. Importantly, the Hg and Os records from Site U1514 reveal the occurrence of a multi-million-year warming reversal amid the long-term Eocene cooling trend, which likely contributed to significant CO2 reduction during the late Eocene. These findings significantly enhance our understanding of Eocene climate dynamics, which are fundamentally linked to intensive tectonic-driven volcanic activity and associated continental chemical weathering.

Durham Research Online