93 research outputs found

    MIMIR: A Streamlined Platform for Personalized Agent Tuning in Domain Expertise

    Recently, large language models (LLMs) have evolved into interactive agents, proficient in planning, tool use, and task execution across a wide variety of tasks. However, without specific agent tuning, open-source models like LLaMA currently struggle to match the efficiency of GPT-4, particularly given the scarcity of agent-tuning datasets for fine-tuning. In response, we introduce Mimir: a streamlined platform offering a customizable pipeline that enables users to leverage both private knowledge and publicly available, legally compliant datasets at scale for personalized agent tuning. Additionally, Mimir supports the generation of general instruction-tuning datasets from the same input. This dual capability ensures that language agents developed through the platform possess both specific agent abilities and general competencies. Mimir integrates these features into a cohesive end-to-end platform, facilitating everything from the uploading of personalized files to one-click agent fine-tuning.

    BP-Im2col: Implicit Im2col Supporting AI Backpropagation on Systolic Arrays

    State-of-the-art systolic array-based accelerators adopt the traditional im2col algorithm to accelerate the inference of convolutional layers. However, traditional im2col cannot efficiently support AI backpropagation. Backpropagation in convolutional layers involves performing transposed convolution and dilated convolution, which usually introduces many zero-spaces into the feature map or kernel. The zero-space data reorganization interferes with the continuity of training and incurs additional, non-negligible overhead in terms of off- and on-chip storage, access, and performance. Since countermeasures for backpropagation are rarely proposed, we propose BP-im2col, a novel im2col algorithm for AI backpropagation, and implement it in RTL on a TPU-like accelerator. Experiments on a TPU-like accelerator indicate that BP-im2col reduces the backpropagation runtime by 34.9% on average, and reduces the bandwidth of off-chip memory and on-chip buffers by at least 22.7% and 70.6% respectively, over a baseline accelerator adopting the traditional im2col. It further reduces the additional storage overhead in the backpropagation process by at least 74.78%. Comment: Accepted at ICCD 2022, the 40th IEEE International Conference on Computer Design.

    Unveiling the Spectrum of Data Contamination in Language Models: A Survey from Detection to Remediation

    Data contamination has garnered increased attention in the era of large language models (LLMs) due to the reliance on extensive internet-derived training corpora. The issue of training corpus overlap with evaluation benchmarks, referred to as contamination, has been the focus of significant recent research. This body of work aims to identify contamination, understand its impacts, and explore mitigation strategies from diverse perspectives. However, comprehensive studies that provide a clear pathway from foundational concepts to advanced insights are lacking in this nascent field. Therefore, we present a comprehensive survey in the field of data contamination, laying out the key issues, methodologies, and findings to date, and highlighting areas in need of further research and development. In particular, we begin by examining the effects of data contamination across various stages and forms. We then provide a detailed analysis of current contamination detection methods, categorizing them to highlight their focus, assumptions, strengths, and limitations. We also discuss mitigation strategies, offering a clear guide for future research. This survey serves as a succinct overview of the most recent advancements in data contamination research, providing a straightforward guide for the benefit of future research endeavors. ACL 2024 Camera-Ready Version.

    New genetic loci link adipose and insulin biology to body fat distribution.

    Body fat distribution is a heritable trait and a well-established predictor of adverse metabolic outcomes, independent of overall adiposity. To increase our understanding of the genetic basis of body fat distribution and its molecular links to cardiometabolic traits, here we conduct genome-wide association meta-analyses of traits related to waist and hip circumferences in up to 224,459 individuals. We identify 49 loci (33 new) associated with waist-to-hip ratio adjusted for body mass index (BMI), and an additional 19 loci newly associated with related waist and hip circumference measures (P < 5 × 10⁻⁸). In total, 20 of the 49 waist-to-hip ratio adjusted for BMI loci show significant sexual dimorphism, 19 of which display a stronger effect in women. The identified loci were enriched for genes expressed in adipose tissue and for putative regulatory elements in adipocytes. Pathway analyses implicated adipogenesis, angiogenesis, transcriptional regulation and insulin resistance as processes affecting fat distribution, providing insight into potential pathophysiological mechanisms.

    Prevalence and patterns of pre-competition weight loss practices in Chinese Amateur boxers

    This study investigated the weight loss (WL) practices of Chinese amateur boxers using the Rapid Weight Loss Questionnaire (RWLQ). A total of 701 (563 males, 138 females) boxers participated in the study and were categorized by sex, age group, and competitive level. Sixty-seven percent of boxers purposefully engaged in WL practices before competition. The average habitual WL was 6.0% (5.8% for juniors and 6.3% for seniors) of body mass (BM), and the average highest WL was 9.5% (9.1% for juniors and 10.1% for seniors) of BM. Most participants (69% of juniors and 84% of seniors) allocated 15+ days for WL before competition. No significant differences in habitual WL%, highest WL%, and rapid weight loss score (RWLS) were found between age groups or competitive levels (all p>0.05). However, males’ highest WL% and RWLS were significantly higher than females’ (p<0.001, p=0.002, respectively). International boxers began WL later than local boxers (15.5 vs. 14.3 years, p=0.012). National boxers began WL later than provincial and local boxers (15.6 vs. 14.8 years, p<0.001; 15.6 vs. 14.3 years, p<0.001). Increased exercise and training in plastic suits were the most frequently used WL methods. Coaches were identified as the most influential people concerning boxers’ WL practices, surpassing doctors or nutritionists. This study found that some WL practices among Chinese boxers differ from those in other sports and countries. Although the prevalence of WL among junior boxers was relatively low, the magnitude of WL was high, warranting more attention in both academic and practical fields.

    TRUSTLLM: Trustworthiness in Large Language Models

    Large language models (LLMs) have gained considerable attention for their excellent natural language processing capabilities. Nonetheless, these LLMs present many challenges, particularly in the realm of trustworthiness. This paper introduces TRUSTLLM, a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, an established benchmark, evaluation and analysis of trustworthiness for mainstream LLMs, and discussion of open challenges and future directions. Specifically, we first propose a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, we further establish a benchmark across six dimensions including truthfulness, safety, fairness, robustness, privacy, and machine ethics. We then present a study evaluating 16 mainstream LLMs in TRUSTLLM, consisting of over 30 datasets. First, our findings show that, in general, trustworthiness and capability (i.e., functional effectiveness) are positively related. Second, our observations reveal that proprietary LLMs generally outperform most open-source counterparts in terms of trustworthiness, raising concerns about the potential risks of widely accessible open-source LLMs. However, a few open-source LLMs come very close to proprietary ones, suggesting that open-source models can achieve high levels of trustworthiness without additional mechanisms such as a moderator, offering valuable insights for developers in this field. Third, it is important to note that some LLMs may be overly calibrated towards exhibiting trustworthiness, to the extent that they compromise their utility by mistakenly treating benign prompts as harmful and consequently not responding. Besides these observations, we've uncovered key insights into the multifaceted trustworthiness in LLMs. We emphasize the importance of ensuring transparency not only in the models themselves but also in the technologies that underpin trustworthiness. We advocate that establishing an AI alliance between industry, academia, and the open-source community to foster collaboration is imperative to advance the trustworthiness of LLMs. Our dataset, code, and toolkit will be available at https://github.com/HowieHwong/TrustLLM and the leaderboard is released at https://trustllmbenchmark.github.io/TrustLLM-Website/.
