35 research outputs found

    Benchmarking GPT-4 on Algorithmic Problems: A Systematic Evaluation of Prompting Strategies

    Large Language Models (LLMs) have revolutionized the field of Natural Language Processing thanks to their ability to reuse knowledge acquired from massive text corpora on a wide variety of downstream tasks, with minimal (if any) tuning steps. At the same time, it has been repeatedly shown that LLMs lack systematic generalization, that is, the ability to extrapolate learned statistical regularities outside the training distribution. In this work, we offer a systematic benchmarking of GPT-4, one of the most advanced LLMs available, on three algorithmic tasks whose difficulty can be controlled with two parameters. We compare the performance of GPT-4 with that of its predecessor (GPT-3.5) and with a variant of the Transformer-Encoder architecture recently introduced to solve similar tasks, the Neural Data Router. We find that the deployment of advanced prompting techniques allows GPT-4 to reach superior accuracy on all tasks, demonstrating that state-of-the-art LLMs constitute a very strong baseline even in challenging tasks that require systematic generalization.
    Comment: Accepted at LREC-COLING 2024. Added acknowledgement.

    Assessing the Emergent Symbolic Reasoning Abilities of Llama Large Language Models

    Large Language Models (LLMs) achieve impressive performance in a wide range of tasks, even though they are often trained with the sole objective of chatting fluently with users. Among other skills, LLMs show emergent abilities in mathematical reasoning benchmarks, which can be elicited with appropriate prompting methods. In this work, we systematically investigate the capabilities and limitations of popular open-source LLMs on different symbolic reasoning tasks. We evaluate three models of the Llama 2 family on two datasets that require solving mathematical formulas of varying degrees of difficulty. We test a generalist LLM (Llama 2 Chat) as well as two fine-tuned versions of Llama 2 (MAmmoTH and MetaMath) specifically designed to tackle mathematical problems. We observe that both increasing the scale of the model and fine-tuning it on relevant tasks lead to significant performance gains. Furthermore, using fine-grained evaluation measures, we find that such performance gains are mostly observed with mathematical formulas of low complexity, which nevertheless often remain challenging even for the largest fine-tuned models.
    Comment: Accepted at 33rd International Conference on Artificial Neural Networks (ICANN24)

    Addressing multiple facets of bias and uncertainty in continental scale biodiversity databases

    The availability of biodiversity databases is expanding at unprecedented rates. Nevertheless, species occurrence data can be intrinsically biased and contain uncertainties that impact the accuracy and reliability of biodiversity estimates. In this study, we developed a reproducible framework to assess three dimensions of bias (taxonomic, spatial, and temporal) as well as the temporal uncertainty associated with data collections. As a case study, we used the European vegetation plot data from sPlotOpen, an open-access database. The metrics proposed for estimating bias include completeness of the species richness for taxonomic bias, the Nearest Neighbor Index for spatial bias, and Pielou's index for temporal bias. Additionally, we introduced a new method based on a negative exponential curve to model the temporal decay in biodiversity data, aiming to quantify temporal uncertainty. Finally, we assessed the sampling bias considering the influence of various spatial variables (i.e., road density, human population count, Natura 2000 network and topographic roughness). We discovered that the facets of bias and the temporal uncertainty varied throughout Europe, as did the roles played by spatial variables in determining biases. sPlotOpen showed a clustered distribution of the vegetation plots, and an uneven distribution in sampling completeness, year of sampling and temporal uncertainty. The facets of bias were significantly explained mainly by the presence of the Natura 2000 network and marginally by the human population count. These results suggest that employing an efficient procedure to examine biases and uncertainties in data collections can enhance data quality and provide more reliable biodiversity estimates.
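
As an illustration of one metric named in this abstract, Pielou's index is the Shannon entropy of a set of proportions divided by the logarithm of the number of categories, J = H / ln(S). The sketch below is only a minimal reading of that standard formula: treating each sampling year as a category, the sample `years` list, and the function name `pielou_evenness` are assumptions made here, not details taken from the study.

```python
import math
from collections import Counter


def pielou_evenness(counts):
    """Return Pielou's J = H / ln(S) for a collection of category counts.

    H is the Shannon entropy (natural log) of the proportions and S is
    the number of non-empty categories.
    """
    counts = [c for c in counts if c > 0]
    s = len(counts)
    if s < 2:
        # J is undefined for a single category; returning 0.0 is a
        # convention chosen for this sketch.
        return 0.0
    total = sum(counts)
    h = -sum((c / total) * math.log(c / total) for c in counts)
    return h / math.log(s)


# Hypothetical sampling years for the plots in one grid cell.
years = [1990, 1990, 1991, 2005, 2005, 2005, 2019]
j = pielou_evenness(Counter(years).values())
```

Values close to 1 indicate that records are spread evenly across the sampled years, while values close to 0 indicate that most records come from a few years.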

    Post-glacial determinants of regional species pools in alpine grasslands

    [Aim] Alpine habitats support unique biodiversity confined to high-elevation areas in the current interglacial. Plant diversity in these habitats may respond to area, environment, connectivity and isolation, yet these factors have rarely been evaluated in concert. Here we investigate major determinants of regional species pools in alpine grasslands, and the responses of their constituent species groups. [Location] European mountains below 50° N. [Time period] Between 1928 and 2019. [Major taxa studied] Vascular plants. [Methods] We compiled species pools from alpine grasslands in 23 regions, including 794 alpine species and 2,094 non-alpines. We used species–area relationships to test the influence of the extent of alpine areas on regional richness, and mixed-effects models to compare the effects of 12 spatial and environmental predictors. Variation in species composition was addressed by generalized dissimilarity models and by a coefficient of dispersal direction to assess historical links among regions. [Results] Pool sizes were partially explained by current alpine areas, but the other predictors largely contributed to regional differences. The number of alpine species was influenced by area, calcareous bedrock, topographic heterogeneity and regional isolation, while non-alpines responded better to connectivity and climate. Regional dissimilarity of alpine species was explained by isolation and precipitation, but non-alpines only responded to isolation. Past dispersal routes were correlated with latitude, with alpine species showing stronger connections among regions. [Main conclusions] Besides area effects, edaphic, topographic and spatio-temporal determinants are important to understand the organization of regional species pools in alpine habitats. The number of alpine species is especially linked to refugia and isolation, but their composition is explained by past dispersal and post-glacial environmental filtering, while non-alpines are generally influenced by regional floras. New research on the dynamics of alpine biodiversity should contextualize the determinants of regional species pools and the responses of species with different ecological profiles.
    The authors thank Daniela Gaspar for support in GIS analyses. B.J.-A. thanks the Marie Curie Clarín-COFUND program of the Principality of Asturias-EU (ACB17-26), the regional grant IDI/2018/000151, and the Spanish Research Agency grant AEI/10.13039/501100011033. J.V.R.-D. was supported by the ACA17-02FP7 Marie Curie COFUND-Clarín grant. G.P.M. was funded by US National Science Foundation award 1853665. C.M. was funded by grant no. 19-28491 of the Czech Science Foundation. Peer reviewed

    Additive manufacturing and characterisation of ceramic insulators for FEBIAD ISOL ion sources

    No full text
    Embargoed until 2026-10-16.
    The FEBIAD-ISOL ion source is a complex assembly consisting of two main parts: the cathode and the anode. These parts must be maintained electrically insulated at very high temperatures (about 1500°C). The fabrication of the two electrodes via additive manufacturing is currently being investigated by the INFN facilities. This approach would allow for easier assembly and enhanced properties of the ion source. In this context, the insulating elements would require custom shapes that could also be obtained via additive manufacturing, provided that the resulting parts maintain adequate thermal and electric properties. In this thesis work, zirconia and alumina insulators compatible with the current ion source were produced by additive manufacturing (fused filament fabrication) using commercial feedstock. The printed components underwent debinding and sintering treatments. Their physicochemical properties, before and after treatment, were characterised using x-ray diffraction, scanning electron microscopy, porosimetry, and micro-hardness testing. Additionally, their thermal stability and electrical resistivity at high temperatures were tested using a custom experimental setup designed specifically for this study.

    Neural Networks for Sequential Data: A Pre-training Approach based on Hidden Markov Models

    No full text
    In the last few years, research has highlighted the critical role of unsupervised pre-training strategies to improve the performance of artificial neural networks. However, the scope of existing pre-training methods is limited to static data, whereas many learning tasks require dealing with temporal information. We propose a novel approach to pre-training sequential neural networks that exploits a simpler, first-order Hidden Markov Model to generate an approximate distribution of the original dataset. The learned distribution is used to generate a smoothed dataset that is used for pre-training. In this way, it is possible to drive the connection weights into a better region of the parameter space, where subsequent fine-tuning on the original dataset can be more effective. This novel pre-training approach is model-independent and can be readily applied to different network architectures. The benefits of the proposed method, both in terms of accuracy and training times, are demonstrated on a prediction task using four datasets of polyphonic music. The flexibility of the proposed strategy is shown by applying it to two different recurrent neural network architectures, and we also empirically investigate the impact of different hyperparameters on the performance of the proposed pre-training strategy.
    Pasa, Luca; Testolin, Alberto; Sperduti, Alessandro

    A HMM-based pre-training approach for sequential data

    No full text
    Much recent research has highlighted the critical role of unsupervised pre-training to improve the performance of neural network models. However, extensions of those architectures to the temporal domain introduce additional issues, which often prevent good performance from being obtained in a reasonable time. We propose a novel approach to pre-train sequential neural networks in which a simpler, approximate distribution generated by a linear model is first used to drive the weights into a better region of the parameter space. After this smooth distribution has been learned, the network is fine-tuned on the more complex real dataset. The benefits of the proposed method are demonstrated on a prediction task using two datasets of polyphonic music, and the general validity of this strategy is shown by applying it to two different recurrent neural network architectures.

    A Neural Rewriting System to Solve Algorithmic Problems

    Modern neural network architectures still struggle to learn algorithmic procedures that require systematically applying compositional rules to solve out-of-distribution problem instances. In this work, we focus on formula simplification problems, a class of synthetic benchmarks used to study the systematic generalization capabilities of neural architectures. We propose a modular architecture designed to learn a general procedure for solving nested mathematical formulas by relying only on a minimal set of training examples. Inspired by rewriting systems, a classic framework in symbolic artificial intelligence, we include in the architecture three specialized and interacting modules: the Selector, trained to identify solvable sub-expressions; the Solver, mapping sub-expressions to their values; and the Combiner, replacing sub-expressions in the original formula with the solution provided by the Solver. We benchmark our system against the Neural Data Router, a recent model specialized for systematic generalization, and a state-of-the-art large language model (GPT-4) probed with advanced prompting strategies. We demonstrate that our approach achieves a higher degree of out-of-distribution generalization compared to these alternative approaches on three different types of formula simplification problems, and we discuss its limitations by analyzing its failures.
    Comment: Updated version (v2) accepted at the 27th European Conference on Artificial Intelligence (ECAI 24)
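
The select, solve and combine loop described in this abstract can be sketched with a classic, purely symbolic rewriting system. This is not the authors' architecture: in the paper the three modules are learned, whereas here each one is a hand-written stand-in. The formula syntax (fully parenthesised integer arithmetic with `+`, `-`, `*`) and the function names `select`, `solve`, `combine` and `simplify` are assumptions made for this example.

```python
import re

# A "solvable" sub-expression: a parenthesised binary operation whose
# operands are already plain integers, e.g. "(3*-2)".
LEAF = re.compile(r"\((-?\d+)([+\-*])(-?\d+)\)")


def select(formula):
    """Selector stand-in: find the first solvable sub-expression, if any."""
    return LEAF.search(formula)


def solve(match):
    """Solver stand-in: map a selected sub-expression to its value."""
    a, op, b = int(match.group(1)), match.group(2), int(match.group(3))
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    return a * b


def combine(formula, match, value):
    """Combiner stand-in: splice the value back into the formula."""
    return formula[:match.start()] + str(value) + formula[match.end():]


def simplify(formula):
    """Rewrite the formula until no solvable sub-expression remains."""
    while (m := select(formula)) is not None:
        formula = combine(formula, m, solve(m))
    return formula
```

Each pass through the loop shortens the formula by one sub-expression, so nesting depth only affects the number of iterations, not the rules applied at each step.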

    Stereotactic body radiation therapy for a new lung cancer arising after pneumonectomy: dosimetric evaluation and pulmonary toxicity

    No full text
    To evaluate the tolerance of stereotactic body radiation therapy (SBRT) for the treatment of secondary lung tumours in patients who underwent previous pneumonectomy.