Search CORE

62 research outputs found

Leveraging Instance Features for Label Aggregation in Programmatic Weak Supervision

Author: Ratner Alexander
Song Linxin
Zhang Jieyu
Publication date: 09/10/2022

Programmatic Weak Supervision (PWS) has emerged as a widespread paradigm to synthesize training labels efficiently. The core component of PWS is the label model, which infers true labels by aggregating the outputs of multiple noisy supervision sources abstracted as labeling functions (LFs). Existing statistical label models typically rely only on the outputs of LFs, ignoring the instance features when modeling the underlying generative process. In this paper, we attempt to incorporate the instance features into a statistical label model via the proposed FABLE. In particular, it is built on a mixture of Bayesian label models, each corresponding to a global pattern of correlation, and the coefficients of the mixture components are predicted by a Gaussian Process classifier based on instance features. We adopt an auxiliary variable-based variational inference algorithm to tackle the non-conjugate issue between the Gaussian Process and Bayesian label models. Extensive empirical comparison on eleven benchmark datasets shows that FABLE achieves the highest average performance compared with nine baselines. Comment: 16 pages

arXiv.org e-Print Archive

Adaptive Ranking-based Sample Selection for Weakly Supervised Class-imbalanced Text Classification

Author: Goto Masayuki
Song Linxin
Yang Tianxiang
Zhang Jieyu
Publication date: 07/10/2022

To obtain a large number of training labels inexpensively, researchers have recently adopted the weak supervision (WS) paradigm, which leverages labeling rules to synthesize training labels rather than using individual annotations to achieve competitive results for natural language processing (NLP) tasks. However, data imbalance is often overlooked in applying the WS paradigm, despite being a common issue in a variety of NLP tasks. To address this challenge, we propose Adaptive Ranking-based Sample Selection (ARS2), a model-agnostic framework to alleviate the data imbalance issue in the WS paradigm. Specifically, it calculates a probabilistic margin score based on the output of the current model to measure and rank the cleanliness of each data point. Then, the ranked data are sampled based on both class-wise and rule-aware ranking. In particular, the two sampling strategies correspond to our motivations: (1) to train the model with balanced data batches to reduce the data imbalance issue and (2) to exploit the expertise of each labeling rule for collecting clean samples. Experiments on four text classification datasets with four different imbalance ratios show that ARS2 outperformed the state-of-the-art imbalanced learning and WS methods, leading to a 2%-57.8% improvement on their F1-score.

arXiv.org e-Print Archive
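
To make the margin-based ranking described in the ARS2 abstract above more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the margin definition used (probability of the weak label minus the highest probability among the other classes) is one common choice and an assumption here, and the names `margin_scores` and `classwise_top_k` are hypothetical. The rule-aware ranking mentioned in the abstract is not covered.

```python
def margin_scores(probs, weak_labels):
    """Score each sample by how strongly the model agrees with its weak label.

    Assumed definition: p(weak label) minus the best probability among the
    remaining classes. Higher means the weak label looks cleaner.
    """
    scores = []
    for p, y in zip(probs, weak_labels):
        best_other = max(q for c, q in enumerate(p) if c != y)
        scores.append(p[y] - best_other)
    return scores


def classwise_top_k(scores, weak_labels, k):
    """Keep the k highest-scoring sample indices per class (balanced batch)."""
    by_class = {}
    for i, y in enumerate(weak_labels):
        by_class.setdefault(y, []).append(i)
    return {
        y: sorted(members, key=lambda i: -scores[i])[:k]
        for y, members in by_class.items()
    }


# Toy model outputs for five samples and two classes (illustrative values).
probs = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.45, 0.55], [0.7, 0.3]]
weak_labels = [0, 0, 1, 1, 1]

scores = margin_scores(probs, weak_labels)
batch = classwise_top_k(scores, weak_labels, k=1)
print(batch)
```

Selecting the same number of samples from each class is what keeps the batch balanced even though class 1 has more weakly labeled samples than class 0 in this toy data.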

Better Explain Transformers by Illuminating Important Information

Author: Cui Yan
Lecue Freddy
Li Irene
Luo Ao
Song Linxin
Publication date: 26/01/2024

Transformer-based models excel in various natural language processing (NLP) tasks, attracting countless efforts to explain their inner workings. Prior methods explain Transformers by focusing on the raw gradient and attention as token attribution scores, where non-relevant information is often considered during explanation computation, resulting in confusing results. In this work, we propose highlighting the important information and eliminating irrelevant information by a refined information flow on top of the layer-wise relevance propagation (LRP) method. Specifically, we consider identifying syntactic and positional heads as important attention heads and focus on the relevance obtained from these important heads. Experimental results demonstrate that irrelevant information does distort output attribution scores and should therefore be masked during explanation computation. Compared to eight baselines on both classification and question-answering datasets, our method consistently outperforms them, with over 3% to 33% improvement on explanation metrics. Our anonymous code repository is available at: https://github.com/LinxinS97/Mask-LR

arXiv.org e-Print Archive

NLPBench: Evaluating Large Language Models on Solving NLP Problems

Author: Cheng Lechao
Li Irene
Song Linxin
Zhang Jieyu
Zhou Pengyuan
Zhou Tianyi
Publication date: 19/10/2023

Recent developments in large language models (LLMs) have shown promise in enhancing the capabilities of natural language processing (NLP). Despite these successes, there remains a dearth of research dedicated to the NLP problem-solving abilities of LLMs. To fill the gap in this area, we present a unique benchmarking dataset, NLPBench, comprising 378 college-level NLP questions spanning various NLP topics sourced from Yale University's prior final exams. NLPBench includes questions with context, in which multiple sub-questions share the same public information, and diverse question types, including multiple choice, short answer, and math. Our evaluation, centered on LLMs such as GPT-3.5/4, PaLM-2, and LLAMA-2, incorporates advanced prompting strategies like chain-of-thought (CoT) and tree-of-thought (ToT). Our study reveals that the effectiveness of the advanced prompting strategies can be inconsistent, occasionally damaging LLM performance, especially in smaller models like LLAMA-2 (13b). Furthermore, our manual assessment revealed specific shortcomings in LLMs' scientific problem-solving skills, with weaknesses in logical decomposition and reasoning notably affecting results.

arXiv.org e-Print Archive

SCP: Spherical-Coordinate-based Learned Point Cloud Compression

Author: Goto Masayuki
Katto Jiro
Luo Ao
Nonaka Keisuke
Song Linxin
Sun Heming
Unno Kyohei
Publication date: 08/02/2024

In recent years, the task of learned point cloud compression has gained prominence. An important type of point cloud, the spinning LiDAR point cloud, is generated by spinning LiDAR on vehicles. This process results in numerous circular shapes and azimuthal angle invariance features within the point clouds. However, these two features have been largely overlooked by previous methodologies. In this paper, we introduce a model-agnostic method called Spherical-Coordinate-based learned Point cloud compression (SCP), designed to fully leverage these features. Additionally, we propose a multi-level Octree for SCP to mitigate the reconstruction error for distant areas within the Spherical-coordinate-based Octree. SCP exhibits excellent universality, making it applicable to various learned point cloud compression techniques. Experimental results demonstrate that SCP surpasses previous state-of-the-art methods by up to 29.14% in point-to-point PSNR BD-Rate.

arXiv.org e-Print Archive
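
As a rough illustration of why spherical coordinates suit the circular structures mentioned in the SCP abstract above, the sketch below converts Cartesian points to (radius, azimuth, elevation). It is only a sketch under stated assumptions: the abstract does not give the paper's exact coordinate convention or its Octree construction, so the convention here (azimuth measured in the x-y plane, elevation measured from that plane) and the helper name `to_spherical` are hypothetical.

```python
import math


def to_spherical(x, y, z):
    """Convert a Cartesian point to (radius, azimuth, elevation) in radians.

    Assumed convention: azimuth is the angle in the x-y plane from the x axis,
    elevation is the angle above (positive) or below (negative) that plane.
    """
    r = math.sqrt(x * x + y * y + z * z)
    azimuth = math.atan2(y, x)
    elevation = math.atan2(z, math.hypot(x, y))
    return r, azimuth, elevation


# A synthetic "ring" such as one laser beam sweeping flat ground:
# horizontal distance 10, height -2, sampled at four azimuth angles.
angles = (0.0, 0.5, 1.0, 1.5)
ring = [(10 * math.cos(a), 10 * math.sin(a), -2.0) for a in angles]

spherical = [to_spherical(*p) for p in ring]
for r, az, el in spherical:
    print(f"r={r:.4f} azimuth={az:.4f} elevation={el:.4f}")
```

In Cartesian space the ring is a curve spread over x and y, while in this spherical representation radius and elevation stay constant and only the azimuth changes, which is the kind of regularity a coordinate-aware codec can exploit.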

Au@h-Al2O3 Analogic Yolk–Shell Nanocatalyst for Highly Selective Synthesis of Biomass-Derived D-xylonic Acid via Regulation of Structure Effects

Author: Li Xuehui
Liu Zewei
Ma Jiliang
Peng Xinwen
Song Junlong
Sun Runcang
Xi Hongxia
Xiao Dequan
Zhong Linxin
Publication venue: Digital Commons @ New Haven
Publication date: 03/10/2018

Selective oxidation of biomass-based monosaccharides into value-added sugar acids is highly desired, but only limited success has been achieved in producing D-xylonic acid. Herein, we report an efficient catalyst system, viz., Au nanoparticles anchored on the inner walls of hollow Al2O3 nanospheres (Au@h-Al2O3), which could catalyze the selective oxidation of D-xylose into D-xylonic acid under base-free conditions. The mesoporous Al2O3 shell as the adsorbent first adsorbed D-xylose. Then, the interface of Au nanoparticles and Al2O3 as active sites spontaneously dissociated O2, and the exposed Au nanoparticle surface as the catalytic site drove the transformation. With this catalyst system, the valuable D-xylonic acid was produced with excellent yields in the aerobic oxidation of D-xylose. Extensive investigation showed that Au@h-Al2O3 is an efficient catalyst with high stability and recyclability.

Digital Commons @ New Haven

LARE: Latent Augmentation using Regional Embedding with Vision-Language Model

Author: Goto Masayuki
Ishii Tatsuya
Sakurai Kosuke
Shimizu Ryotaro
Song Linxin
Publication date: 19/09/2024

In recent years, considerable research has been conducted on vision-language models that handle both image and text data; these models are being applied to diverse downstream tasks, such as image-related chat, image recognition by instruction, and answering visual questions. Vision-language models (VLMs), such as Contrastive Language-Image Pre-training (CLIP), are also high-performance image classifiers that are being developed into domain adaptation methods that can utilize language information to extend into unseen domains. However, because these VLMs embed images as a single point in a unified embedding space, there is room for improvement in the classification accuracy. Therefore, in this study, we propose Latent Augmentation using Regional Embedding (LARE), which embeds the image as a region in the unified embedding space learned by the VLM. By sampling the augmented image embeddings from within this latent region, LARE enables data augmentation to various unseen domains, not just to specific unseen domains. LARE achieves robust image classification both in and out of domain by using augmented image embeddings to fine-tune VLMs. We demonstrate that LARE outperforms previous fine-tuning models in terms of image classification accuracy on three benchmarks. We also demonstrate that LARE is a more robust and general model that is valid under multiple conditions, such as unseen domains, small amounts of data, and imbalanced data. 10 pages, 4 figures

arXiv.org e-Print Archive

Sex Differences in Frequency, Severity, and Distribution of Cerebral Microbleeds

Author: Ambler Gareth
Arsava Ethem Murat
Ben Assayag Einor
Bornstein Natan M
Chabriat Hugues
Coutts Shelagh B
El-Koussy Marwan
Eppinger Sebastian
Fandler-Höfler Simon
Fischer Urs
Gattringer Thomas
Hallevi Hen
Hennerici Michael
Hernandez Maria Valdes
Horstmann Solveig
Inamura Shigeru
Jäger Hans R
Kappelle L Jaap
Kim Young Dae
Kneihsl Markus
Koga Masatoshi
Lam Bonnie Yin Ka
Lee Keon-Joo
Legrand Laurence
Lemmens Robin
Li Linxin
Lim Jae-Sung
Lip Gregory Y H
Lovelock Caroline
Lyrer Philippe
Mak Henry Ka Fung
Makin Stephen
Martínez-Domeño Alejandro
Mas Jean-Louis
Microbleeds International Collaborative Network
Molad Jeremy
Nash Philip
Nishihara Masashi
Polymeris Alexandros
Prats-Sanchez Luis
Purrucker Jan
Salman Rustam Al-Shahi
Seiffge David J
Shiozawa Masayuki
Song Tae-Jin
Tanaka Jun
Tanriverdi Zeynep
Uysal Ender
Wagner Benjamin
Wong Adrian
Wong Yuen Kwun
Yoshifuji Kazuhisa
Publication date: 01/10/2024

Peer reviewed

Maastricht University Research Portal

Aberdeen University Research

EUR Research Repository

Sex Differences in Frequency, Severity, and Distribution of Cerebral Microbleeds

Author: Abrigo Jill
Ambler Gareth
Arsava Ethem Murat
Ay Hakan
Bae Hee-Joon
Barbato Carmen
Ben Assayag Einor
Best Jonathan
Bordet Régis
Bornstein Natan M.
Bos Daniel
Browning Simone
Calvet David
Chabriat Hugues
Chappell Francesca
Chen Christopher
Christ Nicolas
Chu Winnie
Coutts Shelagh B.
De Leeuw Frank-Erik
Delmaire Christine
El-Koussy Marwan
Engelter Stefan T.
Enzinger Christian
Eppinger Sebastian
Fandler-Höfler Simon
Fazekas Franz
Fischer Urs
Fluri Felix
Gattringer Thomas
Guevarra Anne Cristine
Gunkel Sarah
Hallevi Hen
Hara Hideo
Hayden Derek
Hennerici Michael
Heo Ji Hoe
Hernandez Maria Valdes
Hilal Saima
Horstmann Solveig
Imaizumi Toshio
Inamura Shigeru
Jouvent Eric
Jung Simon
Jäger Hans R.
Kandiah Nagaendran
Kappelle L. Jaap
Karayiannis Christopher
Kelly Peter J.
Kim Young Dae
Kneihsl Markus
Koga Masatoshi
Kooi M. Eline
Kwa Vincent I. H.
Köhler Sebastian
Lam Bonnie Yin Ka
Lau Kui Kai
Lee Keon-Joo
Legrand Laurence
Lemmens Robin
Leung Thomas
Li Linxin
Lim Jae-Sung
Lip Gregory Y. H.
Lou Min
Lovelock Caroline
Lyrer Philippe
Maaijwee Noortje
Mak Henry Ka Fung
Makin Stephen
Marti-Fabregas Joan
Martínez-Domeño Alejandro
Mas Jean-Louis
Mendyk Anne-Marie
Mess Werner H.
Mok Vincent
Molad Jeremy
Nash Philip
Nishihara Masashi
Orken Dilek Necioglu
Peters Nils
Phan Thanh
Polymeris Alexandros
Prats-Sanchez Luis
Purrucker Jan
Robert Caroline
Rothwell Peter M.
Salman Rustam Al-Shahi
Seiffge David J.
Shiozawa Masayuki
Simister Robert
Smith Eric E.
Song Tae-Jin
Soo Yannie
Srikanth Velandai
Staals Julie
Tanaka Jun
Tanriverdi Zeynep
Thijs Vincent
Toyoda Kazunori
Tuladhar Anil M.
Uysal Ender
Van Oostenbrugge Robert
Veltkamp Roland
Wagner Benjamin
Wardlaw Joanna
Werring David J.
Williams David J.
Wilson Duncan
Wong Adrian
Wong Yuen Kwun
Xu Chao
Yakushiji Yusuke
Yoshifuji Kazuhisa
Zhou Ying
Publication date: 15/10/2024

Importance: Cerebral small vessel disease (SVD) is associated with various cerebrovascular outcomes, but data on sex differences in SVD are scarce. Objective: To investigate whether the frequency, severity, and distribution of cerebral microbleeds (CMB), other SVD markers on magnetic resonance imaging (MRI), and outcomes differ by sex. Design, Setting, and Participants: This cohort study used pooled individual patient data from the Microbleeds International Collaborative Network, including patients from 38 prospective cohort studies in 18 countries between 2000 and 2018, with clinical follow-up of at least 3 months (up to 5 years). Participants included patients with acute ischemic stroke or transient ischemic attack with available brain MRI. Data were analyzed from April to December 2023. Main Outcomes and Measures: Outcomes of interest were presence of CMB, lacunes, and severe white matter hyperintensities determined on MRI. Additionally, mortality, recurrent ischemic stroke, and intracranial hemorrhage during follow-up were assessed. Multivariable random-effects logistic regression models, Cox regression, and competing risk regression models were used to investigate sex differences in individual SVD markers, risk of recurrent cerebrovascular events, and death. Results: A total of 20 314 patients (mean [SD] age, 70.1 [12.7] years; 11 721 [57.7%] male) were included, of whom 5649 (27.8%) had CMB. CMB were more frequent in male patients, and this was consistent throughout different age groups, locations, and in multivariable models (female vs male adjusted odds ratio [aOR], 0.86; 95% CI, 0.80-0.92; P < .001). Female patients had fewer lacunes (aOR, 0.82; 95% CI, 0.74-0.90; P < .001) but a higher prevalence of severe white matter hyperintensities (aOR, 1.10; 95% CI, 1.01-1.20; P = .04) compared with male patients. A total of 2419 patients (11.9%) died during a median (IQR) follow-up of 1.4 (0.7-2.5) years. CMB presence was associated with a higher risk of mortality in female patients (hazard ratio, 1.15; 95% CI, 1.02-1.31), but not male patients (hazard ratio, 0.95; 95% CI, 0.84-1.07) (P for interaction = .01). A total of 1113 patients (5.5%) had recurrent ischemic stroke, and 189 patients (0.9%) had recurrent intracranial hemorrhage, with no sex differences. Conclusions and Relevance: This cohort study using pooled individual patient data found varying frequencies of individual SVD markers between female and male patients, indicating potential pathophysiological differences in manifestation and severity of SVD. Further research addressing differences in pathomechanisms and outcomes of SVD between female and male patients is required.

Edinburgh Research Explorer

VBN (Videnbasen) Aalborg Universitets forskningsportal

Sex Differences in Frequency, Severity, and Distribution of Cerebral Microbleeds

Author: Al-Shahi Salman Rustam
Ambler Gareth
Arsava Ethem Murat
Ben Assayag Einor
Bornstein Natan M
Bos Daniel
Chabriat Hugues
Coutts Shelagh B
El-Koussy Marwan
Eppinger Sebastian
Fandler-Höfler Simon
Fischer Urs
Hallevi Hen
Hennerici Michael
Hernandez Maria Valdes
Horstmann Solveig
Inamura Shigeru
Jäger Hans R
Kappelle L Jaap
Kim Young Dae
Kneihsl Markus
Koga Masatoshi
Lam Bonnie Yin Ka
Lee Keon-Joo
Legrand Laurence
Lemmens Robin
Li Linxin
Lim Jae-Sung
Lip Gregory Y H
Lovelock Caroline
Lyrer Philippe
Mak Henry Ka Fung
Martínez-Domeño Alejandro
Mas Jean-Louis
Microbleeds International Collaborative Network
Molad Jeremy
Nash Philip
Nishihara Masashi
Polymeris Alexandros
Prats-Sanchez Luis
Purrucker Jan
Seiffge David J
Shiozawa Masayuki
Song Tae-Jin
Tanaka Jun
Tanriverdi Zeynep
Uysal Ender
Wagner Benjamin
Wong Adrian
Wong Yuen Kwun
Yoshifuji Kazuhisa
Publication date: 01/10/2024

Importance: Cerebral small vessel disease (SVD) is associated with various cerebrovascular outcomes, but data on sex differences in SVD are scarce. Objective: To investigate whether the frequency, severity, and distribution of cerebral microbleeds (CMB), other SVD markers on magnetic resonance imaging (MRI), and outcomes differ by sex. Design, Setting, and Participants: This cohort study used pooled individual patient data from the Microbleeds International Collaborative Network, including patients from 38 prospective cohort studies in 18 countries between 2000 and 2018, with clinical follow-up of at least 3 months (up to 5 years). Participants included patients with acute ischemic stroke or transient ischemic attack with available brain MRI. Data were analyzed from April to December 2023. Main Outcomes and Measures: Outcomes of interest were presence of CMB, lacunes, and severe white matter hyperintensities determined on MRI. Additionally, mortality, recurrent ischemic stroke, and intracranial hemorrhage during follow-up were assessed. Multivariable random-effects logistic regression models, Cox regression, and competing risk regression models were used to investigate sex differences in individual SVD markers, risk of recurrent cerebrovascular events, and death. Results: A total of 20314 patients (mean [SD] age, 70.1 [12.7] years; 11721 [57.7%] male) were included, of whom 5649 (27.8%) had CMB. CMB were more frequent in male patients, and this was consistent throughout different age groups, locations, and in multivariable models (female vs male adjusted odds ratio [aOR], 0.86; 95% CI, 0.80-0.92; P < .001). Female patients had fewer lacunes (aOR, 0.82; 95% CI, 0.74-0.90; P < .001) but a higher prevalence of severe white matter hyperintensities (aOR, 1.10; 95% CI, 1.01-1.20; P = .04) compared with male patients. A total of 2419 patients (11.9%) died during a median (IQR) follow-up of 1.4 (0.7-2.5) years. CMB presence was associated with a higher risk of mortality in female patients (hazard ratio, 1.15; 95% CI, 1.02-1.31), but not male patients (hazard ratio, 0.95; 95% CI, 0.84-1.07) (P for interaction = .01). A total of 1113 patients (5.5%) had recurrent ischemic stroke, and 189 patients (0.9%) had recurrent intracranial hemorrhage, with no sex differences. Conclusions and Relevance: This cohort study using pooled individual patient data found varying frequencies of individual SVD markers between female and male patients, indicating potential pathophysiological differences in manifestation and severity of SVD. Further research addressing differences in pathomechanisms and outcomes of SVD between female and male patients is required.

Utrecht University Repository