740 research outputs found

    Multi-fidelity classification using Gaussian processes: accelerating the prediction of large-scale computational models

    Full text link
    Machine learning techniques typically rely on large datasets to create accurate classifiers. However, there are situations when data is scarce and expensive to acquire. This is the case of studies that rely on state-of-the-art computational models which typically take days to run, thus hindering the potential of machine learning tools. In this work, we present a novel classifier that takes advantage of lower fidelity models and inexpensive approximations to predict the binary output of expensive computer simulations. We postulate an autoregressive model between the different levels of fidelity with Gaussian process priors. We adopt a fully Bayesian treatment for the hyper-parameters and use Markov Chain Mont Carlo samplers. We take advantage of the probabilistic nature of the classifier to implement active learning strategies. We also introduce a sparse approximation to enhance the ability of themulti-fidelity classifier to handle large datasets. We test these multi-fidelity classifiers against their single-fidelity counterpart with synthetic data, showing a median computational cost reduction of 23% for a target accuracy of 90%. In an application to cardiac electrophysiology, the multi-fidelity classifier achieves an F1 score, the harmonic mean of precision and recall, of 99.6% compared to 74.1% of a single-fidelity classifier when both are trained with 50 samples. In general, our results show that the multi-fidelity classifiers outperform their single-fidelity counterpart in terms of accuracy in all cases. We envision that this new tool will enable researchers to study classification problems that would otherwise be prohibitively expensive. Source code is available at https://github.com/fsahli/MFclass

    Multifidelity Information Fusion Algorithms for High-Dimensional Systems and Massive Data sets

    Get PDF
    We develop a framework for multifidelity information fusion and predictive inference in high-dimensional input spaces and in the presence of massive data sets. Hence, we tackle simultaneously the “big N" problem for big data and the curse of dimensionality in multivariate parametric problems. The proposed methodology establishes a new paradigm for constructing response surfaces of high-dimensional stochastic dynamical systems, simultaneously accounting for multifidelity in physical models as well as multifidelity in probability space. Scaling to high dimensions is achieved by data-driven dimensionality reduction techniques based on hierarchical functional decompositions and a graph-theoretic approach for encoding custom autocorrelation structure in Gaussian process priors. Multifidelity information fusion is facilitated through stochastic autoregressive schemes and frequency-domain machine learning algorithms that scale linearly with the data. Taking together these new developments leads to linear complexity algorithms as demonstrated in benchmark problems involving deterministic and stochastic fields in up to 10⁵ input dimensions and 10⁵ training points on a standard desktop computer

    Digital computer synthesis of admittance matrices of N+1 nodes

    Get PDF
    "March, 1967.
    corecore