
    ChatGPT Versus Modest Large Language Models: An Extensive Study on Benefits and Drawbacks for Conversational Search

    Large Language Models (LLMs) are effective at modeling the syntactic and semantic content of text, making them a strong choice for conversational query rewriting. While previous approaches relied on custom NLP models that require significant engineering effort, ours is straightforward and conceptually simpler: it improves effectiveness over the current state of the art while also attending to cost and efficiency. We explore the use of pre-trained LLMs fine-tuned to generate high-quality user query rewrites, aiming to reduce computational costs while maintaining or improving retrieval effectiveness. As a first contribution, we study various prompting approaches, including zero-, one-, and few-shot methods, with ChatGPT (i.e., gpt-3.5-turbo). We observe an increase in rewrite quality that leads to improved retrieval. We then fine-tune smaller open LLMs on the query rewriting task. Our results show that the fine-tuned models, including the smallest with 780 million parameters, outperform gpt-3.5-turbo in the retrieval phase. To fine-tune the selected models, we use the QReCC dataset, which is specifically designed for query rewriting. For evaluation, we use the TREC CAsT datasets to assess the retrieval effectiveness of the rewrites produced by both gpt-3.5-turbo and our fine-tuned models. Our findings show that fine-tuning LLMs on conversational query rewriting datasets can be more effective than relying on generic instruction-tuned models or traditional query reformulation techniques.
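
    A minimal Python sketch of the few-shot prompting setup described in the abstract. The example conversation, the prompt wording, and the llm_complete callback are illustrative assumptions, not the paper's actual prompts or API calls.

        # Hypothetical sketch of few-shot conversational query rewriting.
        # `llm_complete` stands in for any chat-completion client (e.g. gpt-3.5-turbo
        # or a fine-tuned open LLM); it is not a real library call.

        FEW_SHOT_EXAMPLES = [
            ("Q: Who wrote Dune?\nQ: When was it published?",
             "When was the novel Dune by Frank Herbert first published?"),
        ]

        def build_prompt(history: list, current_query: str) -> str:
            """Assemble a few-shot prompt: instruction, worked examples, then the live turn."""
            parts = ["Rewrite the last question so that it is self-contained, "
                     "resolving every reference to the conversation history."]
            for context, rewrite in FEW_SHOT_EXAMPLES:
                parts.append(f"Conversation:\n{context}\nRewrite: {rewrite}")
            parts.append("Conversation:\n" + "\n".join(history + [current_query]) + "\nRewrite:")
            return "\n\n".join(parts)

        def rewrite_query(history: list, current_query: str, llm_complete) -> str:
            """Ask the model for a decontextualized rewrite of the current query."""
            return llm_complete(build_prompt(history, current_query)).strip()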

    On Improving Efficiency/Effectiveness trade-offs with Neural Network Compression

    Deep Neural Networks (DNNs) deliver state-of-the-art performance in various fields at the price of huge computational requirements. In this thesis, we propose three solutions to reduce the computational requirements of DNNs in Learning to Rank (LtR), Image Classification, and multi-term Dense Retrieval (DR). LtR is the field of machine learning concerned with ranking candidate documents in a search engine. We propose a methodology to train efficient and effective neural networks for LtR by employing pruning and cross-modal knowledge distillation. Furthermore, we develop analytic time predictors that estimate the execution time of sparse and dense neural networks, thus easing the design of neural models that match the desired time requirements. In Image Classification, we propose Automatic Prune Binarization (APB), a novel compression framework that enriches the expressiveness of binary networks with a few full-precision weights. Moreover, we design two innovative matrix multiplication algorithms for extremely low-bit configurations, based on highly efficient bitwise and logical CPU instructions. In multi-term DR, we propose two contributions, working with uncompressed and compressed vector representations, respectively. The former exploits the merging of query terms and document terms to speed up the search phase while jointly reducing the memory footprint. The latter introduces Product Quantization during the document scoring phase and presents a highly efficient filtering step implemented using bit vectors.
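
    As a rough illustration of the APB idea above, binarizing most weights while retaining a small fraction at full precision, here is a NumPy sketch. The magnitude-based threshold and the function name apb_compress are assumptions for illustration, not the thesis's actual training procedure.

        import numpy as np

        def apb_compress(weights: np.ndarray, full_precision_fraction: float = 0.01) -> np.ndarray:
            """Toy APB-style split: binarize most weights, keep the largest-magnitude
            fraction at full precision. The real framework learns this partition during
            training; here the split is a simple magnitude threshold."""
            flat = np.abs(weights).ravel()
            k = max(1, int(full_precision_fraction * flat.size))
            threshold = np.partition(flat, -k)[-k]           # magnitude cut-off for full precision
            keep = np.abs(weights) >= threshold              # the few full-precision weights
            alpha = np.abs(weights[~keep]).mean()            # scaling factor for the binary part
            binary_part = alpha * np.sign(weights) * ~keep   # {-alpha, 0, +alpha} outside `keep`
            return binary_part + weights * keep              # dequantized approximation

        W = np.random.randn(256, 256).astype(np.float32)
        W_hat = apb_compress(W)
        print("relative reconstruction error:", np.linalg.norm(W - W_hat) / np.linalg.norm(W))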

    Pairing Clustered Inverted Indexes with κ-NN Graphs for Fast Approximate Retrieval over Learned Sparse Representations

    Learned sparse representations form an effective and interpretable class of embeddings for text retrieval. While exact top-k retrieval over such embeddings faces efficiency challenges, a recent algorithm called Seismic has enabled remarkably fast, highly accurate approximate retrieval. Seismic statically prunes inverted lists, organizes each list into geometrically cohesive blocks, and augments each block with a summary vector. At query time, each inverted list associated with a query term is traversed one block at a time in an arbitrary order, with the inner product between the query and the summary determining whether a block must be evaluated. When a block is deemed promising, its documents are fully evaluated with a forward index. Seismic is one to two orders of magnitude faster than state-of-the-art inverted index-based solutions and significantly outperforms the winning graph-based submissions to the BigANN 2023 Challenge. In this work, we speed up Seismic further by introducing two innovations in its query processing subroutine. First, we traverse blocks in order of importance, rather than arbitrarily. Second, we take the list of documents retrieved by Seismic and expand it to include the neighbors of each document, using an offline k-regular nearest neighbor graph; the expanded list is then ranked to produce the final top-k set. Experiments on two public datasets show that our extension, named SeismicWave, can reach almost-exact accuracy levels and is up to 2.2x faster than Seismic.
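
    A minimal sketch of the neighbor-expansion step described above. The dict-based k-NN graph and the score callback are assumed data structures for illustration; this is not the SeismicWave implementation.

        def expand_and_rerank(seed_docs, knn_graph, score, k):
            """Expand an approximate top-k list with each document's precomputed neighbors,
            then rescore the union and keep the best k.
            knn_graph maps doc_id -> list of neighbor doc_ids (offline k-regular graph);
            score maps doc_id -> query-document score (e.g. an inner product against a forward index)."""
            candidates = set(seed_docs)
            for doc in seed_docs:
                candidates.update(knn_graph.get(doc, ()))
            return sorted(candidates, key=score, reverse=True)[:k]

        # Example with a tiny toy graph and toy scores.
        graph = {1: [4, 5], 2: [5, 6], 3: [1, 2]}
        scores = {d: 1.0 / d for d in range(1, 7)}
        print(expand_and_rerank([1, 2, 3], graph, scores.get, k=3))   # -> [1, 2, 3]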

    Efficient Inverted Indexes for Approximate Retrieval over Learned Sparse Representations

    Learned sparse representations form an attractive class of contextual embeddings for text retrieval because they are effective models of relevance and are interpretable by design. Despite their apparent compatibility with inverted indexes, however, retrieval over sparse embeddings remains challenging, due to the distributional differences between learned embeddings and term frequency-based lexical models of relevance such as BM25. Recognizing this challenge, a great deal of research has gone into, among other things, designing retrieval algorithms tailored to the properties of learned sparse representations, including approximate retrieval systems. In fact, this task featured prominently in the latest BigANN Challenge at NeurIPS 2023, where approximate algorithms were evaluated on a large benchmark dataset by throughput and recall. In this work, we propose a novel organization of the inverted index that enables fast yet effective approximate retrieval over learned sparse embeddings. Our approach organizes inverted lists into geometrically cohesive blocks, each equipped with a summary vector. During query processing, we use the summaries to quickly determine whether a block must be evaluated. As we show experimentally, single-threaded query processing with our method, Seismic, reaches sub-millisecond per-query latency on various sparse embeddings of the MS MARCO dataset while maintaining high recall. Our results indicate that Seismic is one to two orders of magnitude faster than state-of-the-art inverted index-based solutions and further outperforms the winning (graph-based) submissions to the BigANN Challenge by a significant margin.
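
    A simplified sketch of the block-skipping idea described above: inverted lists are stored as blocks, each with a summary vector, and a block's documents are scored only when the query-summary inner product clears a threshold. The data layout and pruning rule shown here are assumptions for illustration, not Seismic's actual implementation.

        import numpy as np

        def search_with_block_skipping(query_vec, inverted_lists, forward_index, threshold, k):
            """Approximate top-k retrieval over a blocked inverted index.
            inverted_lists[term] is a list of (summary_vector, doc_ids) blocks;
            forward_index[doc_id] is the document embedding as a dense NumPy vector."""
            scores = {}
            for term in np.flatnonzero(query_vec):               # visit only the query's terms
                for summary, doc_ids in inverted_lists.get(int(term), []):
                    if np.dot(query_vec, summary) < threshold:   # summary says: skip this block
                        continue
                    for doc in doc_ids:                          # promising block: full evaluation
                        scores[doc] = float(np.dot(query_vec, forward_index[doc]))
            return sorted(scores, key=scores.get, reverse=True)[:k]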

    Distilled Neural Networks for Efficient Learning to Rank

