Device-free Localization using Received Signal Strength Measurements in Radio Frequency Network
Device-free localization (DFL) based on received signal strength (RSS)
measurements of radio frequency (RF) links is a method that uses the RSS
variation caused by the presence of a target to localize the target without
requiring it to carry any device. The majority of DFL methods utilize the fact
that a link experiences great attenuation when obstructed. Thus, the
localization accuracy depends on the model that describes the relationship
between the RSS loss caused by obstruction and the position of the target.
Existing models are too coarse to explain some phenomena observed in the
experimental measurements. In this paper, we propose a new model based on
diffraction theory in which the target is modeled as a cylinder instead of a
point mass. The proposed model fits the experimental measurements well and
explains cases such as link crossing and walking along the link line. Because
the measurement model is nonlinear, particle filtering is used to recursively
obtain the approximate Bayesian estimate of the position. The posterior
Cramér-Rao lower bound (PCRLB) of the proposed tracking method is also derived.
The results of field experiments with 8 radio sensors and a monitored area of
3.5 m × 3.5 m show that the tracking error of the proposed model is improved by
at least 36 percent in the single-target case and 25 percent in the two-target
case compared with other models.
Comment: This paper has been withdrawn by the author due to a mistake.
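As a rough illustration of the recursive Bayesian estimation described above, the sketch below implements a bootstrap particle filter with a simplified exponential RSS-attenuation measurement model. The sensor layout, model parameters, and random-walk motion model are illustrative assumptions; it does not reproduce the diffraction-based cylinder model proposed in the paper.

# Illustrative bootstrap particle filter for RSS-based device-free tracking.
# The measurement model is a simplified exponential attenuation model,
# NOT the diffraction-based cylinder model proposed in the paper.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical geometry: 8 sensors on the border of a 3.5 m x 3.5 m area.
sensors = np.array([[0, 0], [1.75, 0], [3.5, 0], [3.5, 1.75],
                    [3.5, 3.5], [1.75, 3.5], [0, 3.5], [0, 1.75]])
links = [(i, j) for i in range(8) for j in range(i + 1, 8)]

def rss_loss(pos, phi=5.0, sigma_lambda=0.2):
    """Predicted RSS loss (dB) on every link for a target at `pos`.

    Loss decays with the excess of d1 + d2 over the link length.
    phi and sigma_lambda are assumed parameters."""
    out = np.empty(len(links))
    for k, (i, j) in enumerate(links):
        d1 = np.linalg.norm(pos - sensors[i])
        d2 = np.linalg.norm(pos - sensors[j])
        d = np.linalg.norm(sensors[i] - sensors[j])
        out[k] = phi * np.exp(-(d1 + d2 - d) / sigma_lambda)
    return out

def particle_filter(measurements, n_particles=2000, step_std=0.15, meas_std=1.0):
    """Recursive Bayesian position estimate from a sequence of RSS-loss vectors."""
    particles = rng.uniform(0.0, 3.5, size=(n_particles, 2))
    estimates = []
    for z in measurements:
        # Random-walk motion model (assumed), confined to the monitored area.
        particles += rng.normal(0.0, step_std, size=particles.shape)
        particles = np.clip(particles, 0.0, 3.5)
        # Likelihood of the observed RSS losses under the measurement model.
        pred = np.array([rss_loss(p) for p in particles])
        log_w = -0.5 * np.sum((z - pred) ** 2, axis=1) / meas_std ** 2
        weights = np.exp(log_w - log_w.max())
        weights /= weights.sum()
        estimates.append(weights @ particles)          # posterior mean
        # Multinomial resampling.
        idx = rng.choice(n_particles, size=n_particles, p=weights)
        particles = particles[idx]
    return np.array(estimates)

# Usage: simulate a target walking along the diagonal and track it.
truth = np.linspace([0.5, 0.5], [3.0, 3.0], 30)
zs = [rss_loss(p) + rng.normal(0, 1.0, len(links)) for p in truth]
print(particle_filter(zs)[-1])  # estimate of the final position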
Unabridged phase diagram for single-phased FeSexTe1-x thin films
A complete phase diagram and its corresponding physical properties are
essential prerequisites for understanding the underlying mechanism of
iron-based superconductivity. For the structurally simplest 11 (FeSeTe) system,
earlier attempts using bulk samples have not been able to provide this due to
fabrication difficulties. Here, thin FeSexTe1-x films with the Se content
covering the full range were fabricated using the pulsed laser deposition
method. Crystal structure analysis shows that all films retain the tetragonal
structure at room temperature. Significantly, the highest superconducting
transition temperature (TC = 20 K) occurs in the newly discovered domain,
x = 0.6 - 0.8. The single-phased superconducting dome over the full Se doping
range is the first of its kind in iron chalcogenide superconductors. Our
results present a new avenue to explore novel physics as well as to optimize
superconductors.
MMFace4D: A Large-Scale Multi-Modal 4D Face Dataset for Audio-Driven 3D Face Animation
Audio-driven face animation is an eagerly anticipated technique for
applications such as VR/AR, games, and movie making. With the rapid development
of 3D engines, there is an increasing demand for driving 3D faces with audio.
However, currently available 3D face animation datasets are either limited in
scale or unsatisfactory in quality, which hampers further development of
audio-driven 3D face animation. To address this challenge, we propose MMFace4D,
a large-scale multi-modal 4D (3D sequence) face dataset consisting of 431
identities, 35,904 sequences, and 3.9 million frames. MMFace4D exhibits two
compelling characteristics: 1) it covers a remarkably diverse set of subjects
and corpus, encompassing actors spanning ages 15 to 68 and recorded sentences
with durations ranging from 0.7 to 11.4 seconds; 2) it features synchronized
audio and 3D mesh sequences with high-resolution face details. To capture the
subtle nuances of 3D facial expressions, we leverage three synchronized RGBD
cameras during the recording process. Building upon MMFace4D, we construct a
non-autoregressive framework for audio-driven 3D face animation. Our framework
considers the regional and composite natures of facial animations, and
surpasses contemporary state-of-the-art approaches both qualitatively and
quantitatively. The code, model, and dataset will be publicly available.
Comment: 10 pages, 8 figures. This paper has been submitted to IEEE
Transactions on Multimedia and is an extension of our MM2023 paper
arXiv:2308.05428. The dataset is now publicly available; see the project page
at https://wuhaozhe.github.io/mmface4d
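As a loose sketch of what a non-autoregressive audio-driven animation model can look like, the following PyTorch snippet predicts per-frame vertex positions for all frames in parallel from audio features. The layer sizes, vertex count, and audio-feature dimension are assumptions for illustration only, not the MMFace4D reference framework.

# Minimal sketch of a non-autoregressive audio-to-3D-face regressor.
# All layer sizes, the vertex count, and the audio-feature dimension are
# illustrative assumptions; this is not the MMFace4D reference model.
import torch
import torch.nn as nn

class AudioToFace(nn.Module):
    def __init__(self, audio_dim=80, n_vertices=5023, d_model=256):
        super().__init__()
        self.audio_proj = nn.Linear(audio_dim, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=4, batch_first=True)
        # No causal mask: every frame is predicted in parallel,
        # which is what makes the decoder non-autoregressive.
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=4)
        self.vertex_head = nn.Linear(d_model, n_vertices * 3)

    def forward(self, audio_feats):            # (batch, frames, audio_dim)
        h = self.audio_proj(audio_feats)
        h = self.encoder(h)                    # full bidirectional attention
        offsets = self.vertex_head(h)          # (batch, frames, n_vertices*3)
        return offsets.reshape(*offsets.shape[:2], -1, 3)

# Usage: 2 clips, 100 audio frames of 80-dim features each.
model = AudioToFace()
out = model(torch.randn(2, 100, 80))
print(out.shape)  # torch.Size([2, 100, 5023, 3])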
Towards Seamless Adaptation of Pre-trained Models for Visual Place Recognition
Recent studies show that vision models pre-trained in generic visual learning
tasks with large-scale data can provide useful feature representations for a
wide range of visual perception problems. However, few attempts have been made
to exploit pre-trained foundation models in visual place recognition (VPR). Due
to the inherent difference in training objectives and data between the tasks of
model pre-training and VPR, how to bridge the gap and fully unleash the
capability of pre-trained models for VPR is still a key issue to address. To
this end, we propose a novel method to realize seamless adaptation of
pre-trained models for VPR. Specifically, to obtain both global and local
features that focus on salient landmarks for discriminating places, we design a
hybrid adaptation method to achieve both global and local adaptation
efficiently, in which only lightweight adapters are tuned without adjusting the
pre-trained model. In addition, to guide effective adaptation, we propose a
mutual nearest neighbor local feature loss, which ensures that proper dense
local features are produced for local matching and avoids time-consuming
spatial verification in re-ranking. Experimental results show that our method
outperforms state-of-the-art methods with less training data and training time,
and uses only about 3% of the retrieval runtime of two-stage VPR methods with
RANSAC-based spatial verification. It ranks 1st on the MSLS challenge
leaderboard (at the time of submission). The code is released at
https://github.com/Lu-Feng/SelaVPR.
Comment: ICLR202
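To make the adapter idea concrete, the following PyTorch sketch freezes a backbone and tunes only small bottleneck adapters inserted after each block. The adapter design, dimensions, and toy backbone are assumptions, not the SelaVPR implementation.

# Illustrative sketch of adapting a frozen pre-trained backbone with small,
# trainable bottleneck adapters; dimensions and the block type are assumptions.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, GELU, up-project, residual connection."""
    def __init__(self, dim=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.GELU()

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))

class AdaptedBackbone(nn.Module):
    """Frozen backbone blocks with one trainable adapter after each block."""
    def __init__(self, blocks, dim=768):
        super().__init__()
        self.blocks = blocks
        for p in self.blocks.parameters():
            p.requires_grad = False            # pre-trained weights stay fixed
        self.adapters = nn.ModuleList([Adapter(dim) for _ in blocks])

    def forward(self, tokens):                 # (batch, n_tokens, dim)
        for block, adapter in zip(self.blocks, self.adapters):
            tokens = adapter(block(tokens))
        return tokens

# Usage with a toy "backbone" of transformer encoder layers.
backbone = nn.ModuleList([
    nn.TransformerEncoderLayer(d_model=768, nhead=8, batch_first=True)
    for _ in range(4)])
model = AdaptedBackbone(backbone)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(trainable)                               # only adapter parameters train
out = model(torch.randn(2, 197, 768))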
CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition
Over the past decade, most methods in visual place recognition (VPR) have
used neural networks to produce feature representations. These networks
typically produce a global representation of a place image using only this
image itself and neglect the cross-image variations (e.g. viewpoint and
illumination), which limits their robustness in challenging scenes. In this
paper, we propose a robust global representation method with cross-image
correlation awareness for VPR, named CricaVPR. Our method uses the attention
mechanism to correlate multiple images within a batch. These images can be
taken in the same place with different conditions or viewpoints, or even
captured from different places. Therefore, our method can utilize the
cross-image variations as a cue to guide the representation learning, which
ensures that more robust features are produced. To further enhance robustness,
we propose a multi-scale convolution-enhanced adaptation method to adapt
pre-trained visual foundation models to the VPR task, which introduces
multi-scale local information to further strengthen the cross-image
correlation-aware representation. Experimental results show that our method
outperforms state-of-the-art methods by a large margin with significantly less
training time. The code is released at https://github.com/Lu-Feng/CricaVPR.
Comment: Accepted by CVPR202
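A minimal sketch of cross-image correlation within a batch, assuming per-image global descriptors refined by self-attention across the batch; the dimensions and single-attention-layer design are illustrative, not the released CricaVPR code.

# Per-image descriptors in a batch attend to each other via self-attention,
# so each representation is refined using the other images in the batch.
import torch
import torch.nn as nn

class CrossImageCorrelation(nn.Module):
    def __init__(self, dim=1024, nhead=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, nhead, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, descriptors):            # (batch, dim) per-image features
        x = descriptors.unsqueeze(0)           # treat the batch as one sequence
        refined, _ = self.attn(x, x, x)        # every image attends to all others
        return self.norm(x + refined).squeeze(0)

# Usage: a batch of 16 image descriptors correlated with each other.
module = CrossImageCorrelation()
refined = module(torch.randn(16, 1024))
print(refined.shape)                           # torch.Size([16, 1024])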
Strip-MLP: Efficient Token Interaction for Vision MLP
The token interaction operation is one of the core modules in MLP-based models to
exchange and aggregate information between different spatial locations.
However, the power of token interaction on the spatial dimension is highly
dependent on the spatial resolution of the feature maps, which limits the
model's expressive ability, especially in deep layers where the features are
down-sampled to a small spatial size. To address this issue, we present a novel
method called \textbf{Strip-MLP} to enrich the token interaction power in three
ways. Firstly, we introduce a new MLP paradigm called Strip MLP layer that
allows the token to interact with other tokens in a cross-strip manner,
enabling the tokens in a row (or column) to contribute to the information
aggregations in adjacent but different strips of rows (or columns). Secondly, a
\textbf{C}ascade \textbf{G}roup \textbf{S}trip \textbf{M}ixing \textbf{M}odule
(CGSMM) is proposed to overcome the performance degradation caused by small
spatial feature size. The module allows tokens to interact more effectively in
both within-patch and cross-patch manners, independent of the
feature spatial size. Finally, based on the Strip MLP layer, we propose a novel
\textbf{L}ocal \textbf{S}trip \textbf{M}ixing \textbf{M}odule (LSMM) to boost
the token interaction power in the local region. Extensive experiments
demonstrate that Strip-MLP significantly improves the performance of MLP-based
models on small datasets and obtains comparable or even better results on
ImageNet. In particular, Strip-MLP models achieve higher average Top-1 accuracy
than existing MLP-based models by +2.44\% on Caltech-101 and +2.16\% on
CIFAR-100. The source code will be available
at~\href{https://github.com/Med-Process/Strip_MLP}{https://github.com/Med-Process/Strip\_MLP}.
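As one simple reading of strip-wise token mixing, the following sketch mixes tokens along the row axis and the column axis with separate shared linear layers, in an MLP-Mixer style; it is an illustrative approximation, not the Strip MLP layer or the CGSMM/LSMM modules released by the authors.

# Illustrative axis-wise token mixing; an approximation of strip mixing,
# not the authors' Strip-MLP implementation.
import torch
import torch.nn as nn

class StripMix(nn.Module):
    def __init__(self, height, width, channels):
        super().__init__()
        self.row_mix = nn.Linear(width, width)      # mixes tokens within each row
        self.col_mix = nn.Linear(height, height)    # mixes tokens within each column
        self.norm = nn.LayerNorm(channels)

    def forward(self, x):                            # (batch, H, W, C)
        y = self.norm(x)
        # Row mixing: shared linear map along W for every row and channel.
        y_row = self.row_mix(y.permute(0, 1, 3, 2)).permute(0, 1, 3, 2)
        # Column mixing: shared linear map along H.
        y_col = self.col_mix(y.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
        return x + y_row + y_col

# Usage on a 14x14 feature map with 96 channels.
layer = StripMix(height=14, width=14, channels=96)
out = layer(torch.randn(2, 14, 14, 96))
print(out.shape)                                     # torch.Size([2, 14, 14, 96])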
