HyperCLOVA X Technical Report
We introduce HyperCLOVA X, a family of large language models (LLMs) tailored
to the Korean language and culture, along with competitive capabilities in
English, math, and coding. HyperCLOVA X was trained on a balanced mix of
Korean, English, and code data, followed by instruction-tuning with
high-quality human-annotated datasets while abiding by strict safety guidelines
reflecting our commitment to responsible AI. The model is evaluated across
various benchmarks, including comprehensive reasoning, knowledge, commonsense,
factuality, coding, math, chatting, instruction-following, and harmlessness, in
both Korean and English. HyperCLOVA X exhibits strong reasoning capabilities in
Korean backed by a deep understanding of the language and cultural nuances.
Further analysis of the inherent bilingual nature and its extension to
multilingualism highlights the model's cross-lingual proficiency and strong
generalization ability to untargeted languages, including machine translation
between several language pairs and cross-lingual inference tasks. We believe
that HyperCLOVA X can provide helpful guidance for regions or countries in
developing their sovereign LLMs.
Comment: 44 pages; updated authors list and fixed author name
Laboratory information management system for COVID-19 non-clinical efficacy trial data
Background: As the number of large-scale studies in which multiple organizations produce data has steadily increased, an integrated system built on a common interoperable format is needed. In response to the coronavirus disease 2019 (COVID-19) pandemic, a number of global efforts are underway to develop vaccines and therapeutics. As a result, COVID-19 data are proliferating rapidly, and interoperability is in high demand among the multiple institutions participating simultaneously in COVID-19 pandemic research.
Results: In this study, a laboratory information management system (LIMS) approach has been adopted to systematically manage various COVID-19 non-clinical trial data, including mortality, clinical signs, body weight, body temperature, organ weights, viral titer (viral replication and viral RNA), and multiorgan histopathology, from multiple institutions through a web interface. The main aim of the implemented system is to integrate, standardize, and organize data collected from laboratories in multiple institutes for COVID-19 non-clinical efficacy testing. Six animal biosafety level 3 institutions demonstrated the feasibility of our system. Substantial benefits were shown by maximizing collaborative high-quality non-clinical research.
Conclusions: This LIMS platform can be used for future outbreaks, leading to accelerated medical product development through the systematic management of extensive data from non-clinical animal studies.
This research was supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (2020M3A9I2109027 and 2021M3H9A1030260).
metaSafer: A Technique to Detect Heap Metadata Corruption in WebAssembly
WebAssembly (Wasm), a technology enabling efficient native code execution in web browsers, has seen a significant rise in adoption as a popular compilation target. This has led to the emergence of lightweight web services powered by Wasm, characterized by their small binary size and reduced data transfer overhead, thanks to the inherent efficiency of Wasm. Despite their lightweight nature, these services can deliver powerful features such as image/video processing, AI, and graphical applications that surpass the capabilities of JavaScript. To keep web services lightweight and enhance the overall web experience, Wasm has been extensively optimized. However, these optimizations have raised concerns about memory safety, leading to memory-related vulnerabilities. Wasm’s characteristic memory structure, linear memory, has vulnerabilities that offer attackers various attack vectors. In particular, attackers can modify the metadata that holds memory structure information. Attackers can exploit heap memory overflow in Wasm applications, allowing them to target arbitrary memory addresses, modify data, or execute arbitrary code. Such overflows can corrupt memory metadata, resulting in incorrect memory behavior. While research in recent decades has mitigated memory-related weaknesses in languages such as C and C++ and architectures such as x86, directly applying security solutions designed for other domains to Wasm is not practical. Consequently, allocators in Wasm remain vulnerable to issues like heap overflow and metadata corruption. Thus, there is a pressing need for tailored memory safety techniques and solutions that accommodate Wasm’s architecture-agnostic and linear memory structures. In this paper, we propose metaSafer as a solution. By shadowing metadata from Wasm linear memory to JavaScript virtual machine memory and conducting metadata verification, metaSafer effectively blocks attack attempts and vectors.
Notably, our solution achieves fast memory shadowing and validation while maintaining a small code size. Through various verification processes, we measured the performance and code size of metaSafer and confirmed that it is a software-only security solution with no additional hardware requirements. metaSafer demonstrates robust metadata protection for Wasm applications with an acceptable performance overhead of up to 8% in SQLite speed tests and Polybench benchmarks.
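The shadow-and-verify idea described in this abstract can be illustrated with a toy model. The following is a minimal sketch under stated assumptions, not metaSafer's implementation: `ToyHeap`, its 4-byte size header, the bump allocator, and the `MetadataCorruption` exception are all hypothetical names invented for illustration. A `bytearray` stands in for Wasm linear memory and a Python dictionary stands in for memory held on the JavaScript virtual machine side.

```python
import struct

HEADER_SIZE = 4  # hypothetical toy header: one little-endian u32 payload size


class MetadataCorruption(Exception):
    """Raised when in-memory metadata disagrees with its shadow copy."""


class ToyHeap:
    """Bump allocator whose chunk headers are shadowed outside the heap."""

    def __init__(self, size):
        self.memory = bytearray(size)  # stand-in for Wasm linear memory
        self.shadow = {}               # stand-in for host-side (JS VM) memory
        self.top = 0                   # next free offset

    def malloc(self, n):
        header_at = self.top
        if header_at + HEADER_SIZE + n > len(self.memory):
            raise MemoryError("toy heap exhausted")
        # Write the header into linear memory and copy it to the shadow store.
        struct.pack_into("<I", self.memory, header_at, n)
        self.shadow[header_at] = n
        self.top = header_at + HEADER_SIZE + n
        return header_at + HEADER_SIZE  # pointer to the payload

    def write(self, ptr, data):
        # Deliberately unchecked, like a raw memcpy, so overflows are possible.
        self.memory[ptr:ptr + len(data)] = data

    def verify(self, ptr):
        header_at = ptr - HEADER_SIZE
        (stored,) = struct.unpack_from("<I", self.memory, header_at)
        expected = self.shadow.get(header_at)
        if expected is None or stored != expected:
            raise MetadataCorruption(f"header at offset {header_at} was modified")

    def free(self, ptr):
        # Verify the header against its shadow before trusting it.
        self.verify(ptr)
        del self.shadow[ptr - HEADER_SIZE]


if __name__ == "__main__":
    heap = ToyHeap(64)
    a = heap.malloc(8)
    b = heap.malloc(8)
    heap.write(a, b"A" * 12)  # overflows chunk a into chunk b's header
    try:
        heap.free(b)
    except MetadataCorruption as err:
        print("blocked:", err)
```

In this toy, the attacker-controlled overflow can only reach the copy of the header inside linear memory, so the mismatch with the shadow copy is detected before the corrupted size is used.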
WasDom: An Efficient Write Protection for Wasm JITed Code With ARM Domain
WebAssembly (Wasm) is a binary instruction format designed to run web applications efficiently and securely across different browser engines, including Chrome’s V8, Firefox’s SpiderMonkey, and Safari’s JavaScriptCore. While Wasm’s Just-In-Time (JIT) compilation offers significant performance benefits by converting Wasm code into machine code (JITed code), it introduces security vulnerabilities by violating the W^X (Write XOR Execute) policy. Conventional methods for safeguarding JITed code, such as Intel MPK, are tied to specific hardware and are ineffective in mobile environments. Consequently, an ARM-compatible solution is needed. This paper proposes WasDom, an effective write protection mechanism for Wasm JITed code on the ARM architecture. Leveraging ARM’s domain-based memory management, WasDom employs a randomized domain allocation strategy to permit multiple cores to access JITed code securely. The prototype of WasDom was implemented in the V8 runtime and demonstrated a performance overhead of less than 11% while providing strong write protection. The system manages memory permissions dynamically through the Domain Access Control Register (DACR), ensuring that memory regions are writable during JIT compilation and executable during runtime, thus enforcing a strong W^X policy.
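For readers unfamiliar with the DACR, the sketch below shows how its bit layout works on ARMv7-A: the 32-bit register holds sixteen 2-bit fields, one per domain, where 0b00 means no access, 0b01 means client (accesses are checked against page permissions), and 0b11 means manager (accesses are not checked). The helper names `set_domain` and `get_domain`, the choice of domain 3, and the idea of switching a JIT code domain to manager only while compiling are assumptions made for illustration; the abstract does not describe WasDom's actual switching scheme. On real hardware the register is written with a privileged CP15 instruction, which this pure-Python model does not attempt to reproduce.

```python
NO_ACCESS = 0b00  # any access to the domain faults
CLIENT = 0b01     # accesses are checked against page-table permissions
MANAGER = 0b11    # accesses are not checked against permissions


def set_domain(dacr, domain, access):
    """Return a new 32-bit DACR value with one domain's 2-bit field replaced."""
    if not 0 <= domain < 16:
        raise ValueError("ARMv7-A defines domains 0..15")
    if access not in (NO_ACCESS, CLIENT, MANAGER):
        raise ValueError("0b10 is a reserved encoding")
    shift = 2 * domain
    cleared = dacr & ~(0b11 << shift) & 0xFFFFFFFF
    return cleared | (access << shift)


def get_domain(dacr, domain):
    """Read the 2-bit access field of one domain."""
    return (dacr >> (2 * domain)) & 0b11


if __name__ == "__main__":
    JIT_DOMAIN = 3       # hypothetical domain assigned to JITed code pages
    dacr = 0x55555555    # every domain starts as client
    # Assumed scheme: lift permission checks only while the JIT writes code.
    dacr = set_domain(dacr, JIT_DOMAIN, MANAGER)
    print(hex(dacr))
    # Return to client so read-only, executable page permissions apply again.
    dacr = set_domain(dacr, JIT_DOMAIN, CLIENT)
    print(hex(dacr))
```

Because only one register field changes, a permission switch of this kind avoids rewriting page-table entries, which is the general appeal of domain-based schemes.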
A Study on the Direction of Research for Pharmacopuncture through the Analysis on the Current Status of Chinese Herbal Injections
SHUNIT: Style Harmonization for Unpaired Image-to-Image Translation
We propose a novel solution for unpaired image-to-image (I2I) translation. To translate complex images with a wide range of objects to a different domain, recent approaches often use object annotations to perform per-class source-to-target style mapping. However, there is still room for improvement in I2I translation. An object in each class consists of multiple components, and the sub-object components all have different characteristics. For example, a car in the CAR class consists of a car body, tires, windows, head and tail lamps, and so on, and these should be handled separately for realistic I2I translation. The simplest solution would be to use more detailed annotations that include sub-object components rather than simple object annotations, but this is not feasible. The key idea of this paper is to bypass sub-object component annotations by leveraging the original style of the input image, because the original style carries information about the characteristics of the sub-object components. Specifically, for each pixel, we use not only the per-class style gap between the source and target domains but also the pixel’s original style to determine the pixel’s target style. To this end, we present Style Harmonization for unpaired I2I translation (SHUNIT). SHUNIT generates a new style by harmonizing the target domain style retrieved from a class memory with the original source image style. Instead of direct source-to-target style mapping, we aim to harmonize the source and target styles. We validate our method with extensive experiments and achieve state-of-the-art performance on the latest benchmark sets. The source code is available online: https://github.com/bluejangbaljang/SHUNIT
OsWRKY114 Inhibits ABA-Induced Susceptibility to Xanthomonas oryzae pv. oryzae in Rice
The phytohormone abscisic acid (ABA) regulates various aspects of plant growth, development, and stress responses. ABA suppresses innate immunity to Xanthomonas oryzae pv. oryzae (Xoo) in rice (Oryza sativa), but the identity of the underlying regulator is unknown. In this study, we revealed that OsWRKY114 is involved in the ABA response during Xoo infection. ABA-induced susceptibility to Xoo was reduced in OsWRKY114-overexpressing rice plants. OsWRKY114 attenuated the negative effect of ABA on salicylic acid-dependent immunity. Furthermore, OsWRKY114 decreased the transcript levels of ABA-associated genes involved in ABA response and biosynthesis. Moreover, the endogenous ABA level was lower in OsWRKY114-overexpressing plants than in the wild-type plants after Xoo inoculation. Taken together, our results suggest that OsWRKY114 is a negative regulator of ABA that confers susceptibility to Xoo in rice.
