
    TRUSTLLM:Trustworthiness in Large Language Models

    Large language models (LLMs) have gained considerable attention for their excellent natural language processing capabilities. Nonetheless, these LLMs present many challenges, particularly in the realm of trustworthiness. This paper introduces TRUSTLLM, a comprehensive study of trustworthiness in LLMs, including principles for different dimensions of trustworthiness, an established benchmark, an evaluation and analysis of trustworthiness for mainstream LLMs, and a discussion of open challenges and future directions. Specifically, we first propose a set of principles for trustworthy LLMs that span eight different dimensions. Based on these principles, we further establish a benchmark across six dimensions: truthfulness, safety, fairness, robustness, privacy, and machine ethics. We then present a study evaluating 16 mainstream LLMs on TRUSTLLM, covering over 30 datasets. Our findings show, first, that trustworthiness and capability (i.e., functional effectiveness) are generally positively correlated. Second, our observations reveal that proprietary LLMs generally outperform most open-source counterparts in terms of trustworthiness, raising concerns about the potential risks of widely accessible open-source LLMs. However, a few open-source LLMs come very close to proprietary ones, suggesting that open-source models can achieve high levels of trustworthiness without additional mechanisms such as a moderator, offering valuable insights for developers in this field. Third, some LLMs may be over-calibrated towards exhibiting trustworthiness, to the extent that they compromise their utility by mistakenly treating benign prompts as harmful and consequently refusing to respond. Beyond these observations, we uncover key insights into the multifaceted nature of trustworthiness in LLMs. We emphasize the importance of ensuring transparency not only in the models themselves but also in the technologies that underpin trustworthiness. We advocate establishing an AI alliance between industry, academia, and the open-source community to foster collaboration and advance the trustworthiness of LLMs. Our dataset, code, and toolkit will be available at https://github.com/HowieHwong/TrustLLM and the leaderboard is released at https://trustllmbenchmark.github.io/TrustLLM-Website/.
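
    The abstract above describes scoring models across several trustworthiness dimensions, each backed by multiple datasets. The following minimal sketch is purely illustrative and not the TRUSTLLM toolkit's actual API: the dimension names follow the abstract, while the dataset names and scores are hypothetical. It shows how per-dataset results might be aggregated into per-dimension and overall scores.

# Illustrative sketch only: aggregating hypothetical per-dataset scores into
# per-dimension trustworthiness scores, in the spirit of the benchmark above.
from statistics import mean

# Hypothetical per-dataset results for one model (scores in [0, 1], higher is better).
results = {
    "truthfulness":   {"truthfulness_ds_1": 0.71, "truthfulness_ds_2": 0.64},
    "safety":         {"jailbreak_probe": 0.88, "toxicity_probe": 0.93},
    "fairness":       {"stereotype_probe": 0.77},
    "robustness":     {"adversarial_instructions": 0.69},
    "privacy":        {"pii_leakage_probe": 0.90},
    "machine_ethics": {"moral_scenarios": 0.73},
}

# Average within each dimension, then report a simple overall mean.
dimension_scores = {dim: mean(scores.values()) for dim, scores in results.items()}
overall = mean(dimension_scores.values())

for dim, score in dimension_scores.items():
    print(f"{dim:>15}: {score:.2f}")
print(f"{'overall':>15}: {overall:.2f}")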

    Genic evidence that gnetophytes are sister to all other seed plants

    Gnetophytes, comprising three relict genera, Gnetum, Welwitschia and Ephedra, are a morphologically diverse and enigmatic assemblage among seed plants. Despite recent progress in phylogenomic analyses and insights from the recently decoded Gnetum genome, the relationship between gnetophytes and other seed plant lineages remains one of the outstanding unresolved questions in plant science. Here, we show that phylogenetic analyses of nuclear genes support the hypothesis that places gnetophytes as sister to all other extant seed plants, so this hypothesis should not be ruled out on the basis of nuclear-gene phylogenetic inference. However, this extraordinarily difficult phylogenetic problem may never be resolved by gene-tree-based phylogenetic inference, which is subject to artifacts of gene selection. Hence, we adopted a novel approach, comparing gene divergence among different lineages, to resolve the conflict by showing that gnetophytes never gained a set of genes that the most recent common ancestor (MRCA) of the other seed plants did. This distinct pattern of gene evolution cannot be explained by random gene loss, as in other seed plants, but is better interpreted as an early divergence of gnetophytes from the rest of the seed plants. With such a placement, the gymnosperms are paraphyletic, and there are three distinct groups of living seed plants: gnetophytes, non-gnetophyte gymnosperms, and angiosperms.
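
    The gene-divergence argument above rests on distinguishing genes a lineage never gained from genes it lost after gaining them. The sketch below is purely illustrative, with hypothetical lineage and gene-family data that are not from the paper; it shows the kind of presence/absence comparison implied, namely finding gene families shared by every other seed-plant lineage but absent from gnetophytes.

# Illustrative sketch only: find gene families present in every non-focal
# seed-plant lineage (a proxy for families gained by their MRCA) but absent
# from the focal lineage. Gene-family identifiers and presence data are hypothetical.
lineage_families = {
    "gnetophytes": {"F1", "F2", "F5"},
    "conifers":    {"F1", "F2", "F3", "F4", "F5"},
    "cycads":      {"F1", "F2", "F3", "F4"},
    "ginkgo":      {"F1", "F3", "F4", "F5"},
    "angiosperms": {"F1", "F2", "F3", "F4", "F5"},
}

focal = "gnetophytes"
others = [fams for name, fams in lineage_families.items() if name != focal]

# Families shared by all other lineages.
shared_by_others = set.intersection(*others)

# Families the focal lineage apparently never gained under this reading.
missing_in_focal = shared_by_others - lineage_families[focal]
print(f"Shared by all other lineages but absent in {focal}: {sorted(missing_in_focal)}")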