111 research outputs found

    Structure and function of Zika virus NS5 protein: perspectives for drug design.

    Zika virus (ZIKV) belongs to the Flaviviridae family of positive-sense single-stranded RNA viruses. Its recent outbreak and association with human diseases (e.g. neurological disorders) have raised global health concerns and created an urgent need for a therapeutic strategy against ZIKV infection. However, there is currently no approved antiviral against ZIKV. Here we present a comprehensive overview of recent progress in the structure-function investigation of the ZIKV NS5 protein, the largest non-structural protein of ZIKV, which is responsible for replication of the viral genome, RNA capping and suppression of host interferon responses. Structural comparison of the N-terminal methyltransferase domain and C-terminal RNA-dependent RNA polymerase domain of ZIKV NS5 with their counterparts from related viruses provides mechanistic insights into ZIKV NS5-mediated RNA replication and identifies residues critical for its enzymatic activities. Finally, a collection of recently identified small-molecule inhibitors against ZIKV NS5 or its closely related flavivirus homologues is also discussed.

    Model Tells You Where to Merge: Adaptive KV Cache Merging for LLMs on Long-Context Tasks

    How to serve Large Language Models (LLMs) efficiently has become a pressing issue because of the high computational cost of autoregressive generation. To mitigate this cost, LLMs often employ a KV cache to improve generation speed. Although the KV cache improves computational efficiency, its storage requirements are substantial, particularly in long-context scenarios, leading to significant memory consumption. Existing KV cache eviction methods often degrade the performance of LLMs in long-context scenarios because of the information loss introduced by eviction. In this paper, we propose a novel KV cache merging approach, called KVMerger, to achieve adaptive KV cache compression for long-context tasks without significant performance degradation under constrained memory budgets. Our approach is inspired by the intriguing observation that key states exhibit high similarity at the token level within a single sequence. To facilitate merging, we develop an effective yet straightforward merging set identification algorithm to identify suitable KV states for merging. This algorithm leads to a second observation, namely that KV cache sparsity, from a similarity perspective, is independent of the dataset and remains persistent at the model level. Subsequently, we propose a Gaussian kernel weighted merging algorithm to selectively merge all states within each merging set. We conduct extensive experiments to demonstrate the effectiveness of KVMerger for long-context tasks under constrained memory budgets, applying it to models including Llama2-7B-chat and Llama2-13B-chat. Using the LongBench and ZeroSCROLLS benchmarks, we compare our method with other KV cache compression techniques, including H2O and CaM, showing that our method achieves superior performance across tasks with both 50% and 35% KV cache budgets.
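
The merging idea described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the grouping rule (consecutive tokens whose adjacent key states exceed a cosine-similarity threshold), the choice of the first state in each set as the pivot, and the names `merging_sets`, `gaussian_weights`, `compress`, `threshold` and `sigma` are all assumptions made for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return dot / (na * nb)


def merging_sets(keys, threshold=0.9):
    """Group consecutive token indices whose adjacent keys are similar.

    Assumption: a new set starts whenever the similarity between
    neighbouring key states drops below `threshold`.
    """
    if not keys:
        return []
    sets, current = [], [0]
    for i in range(1, len(keys)):
        if cosine(keys[i - 1], keys[i]) >= threshold:
            current.append(i)
        else:
            sets.append(current)
            current = [i]
    sets.append(current)
    return sets


def gaussian_weights(keys, pivot=0, sigma=1.0):
    """Normalized Gaussian kernel weights by squared distance to the pivot key."""
    p = keys[pivot]
    raw = [
        math.exp(-sum((x - y) ** 2 for x, y in zip(k, p)) / (2 * sigma ** 2))
        for k in keys
    ]
    total = sum(raw)  # at least 1.0, because the pivot's own weight is exp(0)
    return [w / total for w in raw]


def weighted_mean(vectors, weights):
    """Weighted average of equally sized vectors."""
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


def compress(keys, values, threshold=0.9, sigma=1.0):
    """Replace each merging set by one merged key and one merged value."""
    merged_k, merged_v = [], []
    for idx in merging_sets(keys, threshold):
        set_keys = [keys[i] for i in idx]
        w = gaussian_weights(set_keys, pivot=0, sigma=sigma)  # hypothetical pivot choice
        merged_k.append(weighted_mean(set_keys, w))
        merged_v.append(weighted_mean([values[i] for i in idx], w))
    return merged_k, merged_v
```

In this sketch the same weights are applied to keys and values so that each merged pair stays consistent; a single-token set is returned unchanged.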

    MM-Instruct: Generated Visual Instructions for Large Multimodal Model Alignment

    This paper introduces MM-Instruct, a large-scale dataset of diverse and high-quality visual instruction data designed to enhance the instruction-following capabilities of large multimodal models (LMMs). While existing visual instruction datasets often focus on question-answering, they struggle to generalize to broader application scenarios such as creative writing, summarization, or image analysis. To address these limitations, we propose a novel approach to constructing MM-Instruct that leverages the strong instruction-following capabilities of existing LLMs to generate novel visual instruction data from large-scale but conventional image captioning datasets. MM-Instruct first leverages ChatGPT to automatically generate diverse instructions from a small set of seed instructions through augmentation and summarization. It then matches these instructions with images and uses an open-source large language model (LLM) to generate coherent answers to the instruction-image pairs. The LLM is grounded by the detailed text descriptions of images throughout the answer generation process to guarantee the alignment of the instruction data. Moreover, we introduce a benchmark based on the generated instruction data to evaluate the instruction-following capabilities of existing LMMs. We demonstrate the effectiveness of MM-Instruct by training a LLaVA-1.5 model on the generated data, denoted as LLaVA-Instruct, which exhibits significant improvements in instruction-following capabilities compared to LLaVA-1.5 models. The MM-Instruct dataset, benchmark, and pre-trained models are available at https://github.com/jihaonew/MM-Instruct.

    A novel bio-inspired caterpillar fungus (Ophiocordyceps sinensis) optimizer for SOFC parameter identification via GRNN

    Accurate parameter identification is crucial for the optimal control and performance assessment of solid oxide fuel cells (SOFCs) because their models are highly nonlinear. To address this, this study develops a novel caterpillar fungus optimizer (CFO) for SOFC parameter identification, coupled with a generalized regression neural network (GRNN) for data preprocessing. The proposed CFO is characterized by powerful search capabilities and strategic operators designed to overcome the challenge of local optima. For comprehensive validation, twenty-three standard benchmark functions are used for analysis, demonstrating the effectiveness of CFO in finding the optimal solution and its proficiency in escaping local optima. For SOFC parameter identification, GRNN is first employed to filter out noise from the experimental data. The refined data are then passed to CFO, alongside four other competitive algorithms, to identify unknown SOFC parameters. In this work, two widely studied SOFC models, i.e., the electrochemical model (ECM) and the simple electrochemical model (SECM), are adopted for validation under MATLAB and SimuNPS. The simulation results demonstrate that CFO, after data preprocessing, can identify the optimal parameters with robustness, speed, and accuracy. For instance, it achieves a maximum improvement in identification accuracy of 94.41% and 94.10% for ECM and SECM, respectively.
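
The GRNN denoising step mentioned in this abstract can be sketched as follows. A GRNN prediction is a Gaussian-kernel-weighted average of the training targets, so evaluating it at the training inputs yields a smoothed curve. The abstract gives no network settings, so the one-dimensional input, the spread `sigma`, and the function names `grnn_predict` and `grnn_denoise` are assumptions made for illustration.

```python
import math


def grnn_predict(x_train, y_train, x_query, sigma=0.5):
    """GRNN output at x_query: kernel-weighted average of training targets."""
    weights = [
        math.exp(-((x_query - xi) ** 2) / (2 * sigma ** 2)) for xi in x_train
    ]
    total = sum(weights)
    if total == 0.0:
        # All kernels underflowed (query far from every sample):
        # fall back to the target of the nearest training point.
        nearest = min(range(len(x_train)), key=lambda i: abs(x_query - x_train[i]))
        return y_train[nearest]
    return sum(w * y for w, y in zip(weights, y_train)) / total


def grnn_denoise(x, y, sigma=0.5):
    """Smooth noisy samples by predicting at each measured input."""
    return [grnn_predict(x, y, xi, sigma) for xi in x]
```

A larger `sigma` averages over more neighbouring samples and smooths more strongly; a very small `sigma` reproduces the noisy data almost unchanged.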

    6G autonomous radio access network empowered by artificial intelligence and network digital twin

    The sixth-generation (6G) mobile network implements the social vision of digital twins and ubiquitous intelligence. Unlike the fifth-generation (5G) mobile network, which focuses only on communications, 6G mobile networks must natively support new capabilities such as sensing, computing, artificial intelligence (AI), big data, and security while facilitating Everything as a Service. Although 5G mobile network deployment has demonstrated that network automation and intelligence can simplify network operation and maintenance (O&M), the addition of external functionalities has resulted in low service efficiency and high operational costs. In this study, a technology framework for a 6G autonomous radio access network (RAN) is proposed to achieve high-level network autonomy that embraces the design of native cloud, native AI, and network digital twin (NDT). First, a service-based architecture is proposed to re-architect the protocol stack of RAN, which flexibly orchestrates the services and functions on demand and customizes them into cloud-native services. Second, a native AI framework is structured to provide AI support for the diverse use cases of network O&M by orchestrating the communications, AI models, data, and computing power demanded by AI use cases. Third, a digital twin network is developed as a virtual environment for the training, pre-validation, and tuning of AI algorithms and neural networks, avoiding possible unexpected losses in network O&M caused by AI applications. The combination of native AI and NDT can facilitate network autonomy by building closed-loop management and optimization for RAN.