Invariant Teacher and Equivariant Student for Unsupervised 3D Human Pose Estimation
We propose a novel method based on a teacher-student learning framework for 3D
human pose estimation without any 3D annotation or side information. To solve
this unsupervised-learning problem, the teacher network adopts
pose-dictionary-based modeling for regularization to estimate a physically
plausible 3D pose. To handle the decomposition ambiguity in the teacher
network, we propose a cycle-consistent architecture promoting a 3D
rotation-invariant property to train the teacher network. To further improve
the estimation accuracy, the student network adopts a novel graph convolution
network for flexibility to directly estimate the 3D coordinates. Another
cycle-consistent architecture promoting a 3D rotation-equivariant property is
adopted to exploit geometry consistency, together with knowledge distillation
from the teacher network to improve the pose estimation performance. We conduct
extensive experiments on Human3.6M and MPI-INF-3DHP. Our method reduces the 3D
joint prediction error by 11.4% compared to state-of-the-art unsupervised
methods and also outperforms many weakly-supervised methods that use side
information on Human3.6M. Code will be available at
https://github.com/sjtuxcx/ITES. Comment: Accepted in AAAI 202
Beyond von Neumann: weakly programmable processor arrays and their programming
The age of parallelism is here. For sustainable software development on massively parallel architectures, the von Neumann model needs to be replaced by one with native support for parallelism. We suggest a data flow model for signal processing applications. This will make it possible to reuse software implementations for different targets and future platform generations. We also outline our development tool flow for compiling CAL, a data flow language, to parallel architectures, and present our processor array, which can be configured to handle massively parallel computations. We demonstrate its power by implementing part of a software radio receiver.
Energy Efficient MIMO Channel Pre-processor Using a Low Complexity On-Line Update Scheme
This paper presents a low-complexity, energy-efficient channel pre-processing update scheme targeting the emerging 3GPP Long Term Evolution Advanced (LTE-A) downlink. Upon channel matrix renewals, the number of explicit QR decompositions (QRD) and channel matrix inversions is reduced, since only the upper triangular matrices R and R^-1 are updated, based on an on-line update decision mechanism. The proposed channel pre-processing updater has been designed as a dedicated unit in a 65 nm CMOS technology, resulting in a core area of 0.242 mm^2 (equivalent gate count of 116K). Running at a 330 MHz clock, each QRD or R^-1 update consumes 4 or 2 times less energy, respectively, than one exact state-of-the-art QRD in the open literature.
Energy Efficient SQRD Processor for LTE-A using a Group-sort Update Scheme
This paper presents an energy-efficient sorted QR decomposition (SQRD) processor for 3GPP LTE-Advanced (LTE-A) systems. The processor adopts a hybrid decomposition scheme to reduce computational complexity and provides a wide range of performance-complexity trade-offs. Based on the energy distribution of spatial channels, it switches between the brute-force SQRD and a low-complexity group-sort QR-update strategy, which is proposed in this work to effectively utilize the LTE-A pilot pattern. As a proof of concept, a run-time reconfigurable vector processor is developed to efficiently implement this adaptive-switching QR decomposition algorithm. In a 65 nm CMOS technology, the proposed SQRD processor occupies a 0.71 mm^2 core area and has a throughput of up to 100 MQRD/s. Compared to the brute-force approach, an energy reduction of 5 to 33% is achieved.
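For readers unfamiliar with sorted QR decomposition, the following is a minimal sketch of one common textbook formulation: modified Gram-Schmidt that, at each step, picks the remaining column with the smallest residual norm. It is not taken from the paper and does not model the group-sort update or the hardware; the function name `sorted_qr`, the real-valued example matrix, and the assumption of full column rank are all illustrative choices.

```python
import math

def sorted_qr(H):
    """Sorted QR via modified Gram-Schmidt (illustrative sketch).

    Returns (Q, R, perm) such that column c of Q @ R equals column
    perm[c] of H. Assumes H is real-valued with full column rank.
    """
    m, n = len(H), len(H[0])
    cols = [[H[r][c] for r in range(m)] for c in range(n)]  # column-major copy
    R = [[0.0] * n for _ in range(n)]
    perm = list(range(n))
    for i in range(n):
        # Choose the remaining column with the smallest residual norm.
        k = min(range(i, n), key=lambda j: sum(x * x for x in cols[j]))
        cols[i], cols[k] = cols[k], cols[i]
        perm[i], perm[k] = perm[k], perm[i]
        for row in R:  # keep the already computed part of R consistent
            row[i], row[k] = row[k], row[i]
        # Normalise the chosen column.
        R[i][i] = math.sqrt(sum(x * x for x in cols[i]))
        cols[i] = [x / R[i][i] for x in cols[i]]
        # Remove its component from every remaining column.
        for j in range(i + 1, n):
            R[i][j] = sum(a * b for a, b in zip(cols[i], cols[j]))
            cols[j] = [b - R[i][j] * a for a, b in zip(cols[i], cols[j])]
    Q = [[cols[c][r] for c in range(n)] for r in range(m)]  # back to row-major
    return Q, R, perm

# Hypothetical 3x3 channel matrix, for illustration only.
H_example = [[2.0, 1.0, 0.5],
             [0.0, 3.0, 1.0],
             [1.0, 0.0, 0.2]]
Q_example, R_example, perm_example = sorted_qr(H_example)
print("column order:", perm_example)
```

The sorting step is what distinguishes this from a plain QR decomposition: the weakest remaining column is processed first, which is the property MIMO detectors typically exploit.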
Scaling Laws of RoPE-based Extrapolation
The extrapolation capability of Large Language Models (LLMs) based on Rotary
Position Embedding is currently a topic of considerable interest. The
mainstream approach to addressing extrapolation with LLMs involves modifying
RoPE by replacing 10000, the rotary base of $\theta_n = 10000^{-2n/d}$ in the
original RoPE, with a larger value and providing longer fine-tuning text. In
this work, we first observe that fine-tuning a RoPE-based LLM with either a
smaller or larger base in pre-training context length could significantly
enhance its extrapolation performance. After that, we propose
Scaling Laws of RoPE-based Extrapolation, a unified framework
from the periodic perspective, to describe the relationship between the
extrapolation performance and base value as well as tuning context length. In
this process, we also explain the origin of the RoPE-based extrapolation issue
by critical dimension for extrapolation. Besides these
observations and analyses, we achieve extrapolation up to 1 million context
length within only 16K training length on LLaMA2 7B and 13B. Comment: 26 pages, 12 figures, Accepted by ICLR 202
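To make the role of the rotary base concrete, here is a small sketch that computes the standard RoPE frequencies theta_n = base^(-2n/d) and counts how many rotary pairs complete a full period within a given training length. It only illustrates the periodic view mentioned in the abstract and is not the paper's code; the head dimension of 128, the training length of 4096, and the function names are assumptions chosen for the example.

```python
import math

def rope_frequencies(head_dim, base=10000.0):
    """Return theta_n = base ** (-2n / d) for n = 0 .. d/2 - 1."""
    return [base ** (-2.0 * n / head_dim) for n in range(head_dim // 2)]

def full_period_pairs(head_dim, base, train_len):
    """Count rotary pairs whose period 2*pi / theta_n fits inside train_len."""
    return sum(
        1
        for theta in rope_frequencies(head_dim, base)
        if 2.0 * math.pi / theta <= train_len
    )

HEAD_DIM = 128    # assumed per-head dimension (illustrative)
TRAIN_LEN = 4096  # assumed training context length (illustrative)

for base in (10000.0, 1000000.0):
    pairs = full_period_pairs(HEAD_DIM, base, TRAIN_LEN)
    print(f"base={base:.0f}: {pairs} of {HEAD_DIM // 2} pairs complete a full period")
```

A larger base stretches the periods, so fewer pairs see a complete rotation during training; this is the kind of dependence on base and context length that the abstract refers to.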
Enhancing Small Medical Learners with Privacy-preserving Contextual Prompting
Large language models (LLMs) demonstrate remarkable medical expertise, but
data privacy concerns impede their direct use in healthcare environments.
Although offering improved data privacy protection, domain-specific small
language models (SLMs) often underperform LLMs, emphasizing the need for
methods that reduce this performance gap while alleviating privacy concerns. In
this paper, we present a simple yet effective method that harnesses LLMs'
medical proficiency to boost SLM performance in medical tasks under
privacy-restricted scenarios. Specifically, we mitigate patient privacy issues
by extracting keywords from medical data and prompting the LLM to generate a
medical knowledge-intensive context by simulating clinicians' thought
processes. This context serves as additional input for SLMs, augmenting their
decision-making capabilities. Our method significantly enhances performance in
both few-shot and full training settings across three medical
knowledge-intensive tasks, achieving up to a 22.57% increase in absolute
accuracy compared to SLM fine-tuning without context, and sets new
state-of-the-art results in two medical tasks within privacy-restricted
scenarios. Further out-of-domain testing and experiments in two general domain
datasets showcase its generalizability and broad applicability. Our code can be
found at https://github.com/XZhang97666/PrivacyBoost-SLM
Effect of acupuncture on BDNF signaling pathways in several nervous system diseases
In recent years, the understanding of the mechanisms of acupuncture in the treatment of neurological disorders has deepened, and considerable progress has been made in basic and clinical research on acupuncture, but the relationship between acupuncture treatment mechanisms and brain-derived neurotrophic factor (BDNF) has not yet been elucidated. A wealth of evidence has shown that acupuncture exhibits a dual regulatory function of activating or inhibiting different BDNF pathways. This review focuses on recent research advances on the effect of acupuncture on BDNF and downstream signaling pathways in several neurological disorders. Firstly, the signaling pathways of BDNF and its function in regulating plasticity are outlined. Furthermore, this review explicitly discusses the regulation of BDNF by acupuncture in several nervous system diseases, including neuropathic pain, Parkinson’s disease, cerebral ischemia, depression, spinal cord injury, and other diseases. The underlying mechanisms of BDNF regulation by acupuncture are also discussed. This review aims to improve the theoretical system of the mechanism of acupuncture action through further elucidation of the mechanism of acupuncture modulation of BDNF in the treatment of neurological diseases and to provide evidence to support the wide application of acupuncture in clinical practice.
