
    Online Unsupervised Multi-view Feature Selection

    In the era of big data, it is becoming common to have data with multiple modalities or coming from multiple sources, known as "multi-view data". Because multi-view data are usually unlabeled and come from high-dimensional spaces (such as language vocabularies), unsupervised multi-view feature selection is crucial to many applications. However, it is nontrivial due to the following challenges. First, there may be too many instances, or the feature dimensionality may be too large, so the data may not fit in memory: how can useful features be selected with limited memory space? Second, how can features be selected from streaming data while handling concept drift? Third, how can the consistent and complementary information from different views be leveraged to improve feature selection when the data are too big or arrive as streams? To the best of our knowledge, none of the previous works can solve all of these challenges simultaneously. In this paper, we propose an Online unsupervised Multi-View Feature Selection method, OMVFS, which deals with large-scale/streaming multi-view data in an online fashion. OMVFS embeds unsupervised feature selection into a clustering algorithm via NMF with sparse learning. It further incorporates graph regularization to preserve the local structure information and help select discriminative features. Instead of storing all the historical data, OMVFS processes the multi-view data chunk by chunk and aggregates all the necessary information into several small matrices. By using a buffering technique, the proposed OMVFS can reduce the computational and storage cost while taking advantage of the structure information. Furthermore, OMVFS can capture concept drift in the data streams. Extensive experiments on four real-world datasets show the effectiveness and efficiency of the proposed OMVFS method. More importantly, OMVFS is about 100 times faster than the off-line methods.
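The chunk-by-chunk aggregation described in this abstract can be illustrated with a generic online NMF. The sketch below is not the OMVFS algorithm: it handles a single view only and omits the sparsity term, graph regularization, buffering, and concept-drift handling. The function names `process_chunk` and `rank_features`, the accumulators `A` and `B`, and the row-norm scoring rule are all assumptions made for illustration.

```python
import numpy as np

def process_chunk(X, U, A, B, n_inner=50, eps=1e-9):
    """Fold one non-negative chunk X (samples x features) into the model.

    Model (assumed for this sketch): X ~= V @ U.T, where V (samples x k)
    holds soft cluster memberships for the chunk and U (features x k)
    holds feature loadings shared by all chunks. A and B summarize the
    history, so earlier chunks never need to be stored.
    """
    k = U.shape[1]
    rng = np.random.default_rng(0)
    V = rng.random((X.shape[0], k))
    for _ in range(n_inner):
        # Multiplicative update for V with U held fixed.
        V *= (X @ U) / (V @ (U.T @ U) + eps)
    # Aggregate the chunk into two small matrices (updated in place).
    A += X.T @ V          # features x k
    B += V.T @ V          # k x k
    for _ in range(n_inner):
        # Multiplicative update for U using only the aggregates.
        U *= A / (U @ B + eps)
    return U, A, B

def rank_features(U):
    """Rank features by the L2 norm of their row in U, largest first."""
    return np.argsort(-np.linalg.norm(U, axis=1))
```

A caller would initialize `U` with random non-negative values and `A`, `B` with zeros, then call `process_chunk` once per arriving chunk. Memory use depends on the feature count and `k`, not on the number of samples seen so far.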

    MGCN: Semi-supervised Classification in Multi-layer Graphs with Graph Convolutional Networks

    Graph embedding is an important approach for graph analysis tasks such as node classification and link prediction. The goal of graph embedding is to find a low-dimensional representation of graph nodes that preserves the graph information. Recent methods like the Graph Convolutional Network (GCN) try to consider node attributes (if available) in addition to node relations, and learn node embeddings for unsupervised and semi-supervised tasks on graphs. On the other hand, multi-layer graph analysis has received attention recently. However, the existing methods for multi-layer graph embedding cannot incorporate all available information (like node attributes). Moreover, most of them consider either the type of nodes or the type of edges, and they do not treat within-layer and between-layer edges differently. In this paper, we propose a method called MGCN that utilizes the GCN for multi-layer graphs. MGCN embeds the nodes of multi-layer graphs using both within-layer and between-layer relations and node attributes. We evaluate our method on the semi-supervised node classification task. Experimental results demonstrate the superiority of the proposed method over other multi-layer and single-layer competitors and also show the positive effect of using cross-layer edges.
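For readers unfamiliar with the GCN building block mentioned in this abstract, here is a minimal sketch of the standard single-graph propagation rule, ReLU(D^-1/2 (A + I) D^-1/2 H W). It is not the MGCN method: the abstract does not say how within-layer and between-layer edges are combined, so that part is not shown. The function name `gcn_layer` and the toy graph in the usage note are illustrative assumptions.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step on a single undirected graph.

    A: (n x n) adjacency matrix, H: (n x f) node features,
    W: (f x h) weight matrix. Returns the (n x h) hidden representation.
    """
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)                     # degrees including self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # symmetric normalization
    return np.maximum(A_norm @ H @ W, 0.0)    # ReLU activation
```

On a three-node path graph with identity features and identity weights, the output is simply the normalized adjacency matrix, which makes the neighbor-averaging effect easy to inspect.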

    Entailment Tree Explanations via Iterative Retrieval-Generation Reasoner

    Large language models have achieved high performance on various question answering (QA) benchmarks, but the explainability of their output remains elusive. Structured explanations, called entailment trees, were recently suggested as a way to explain and inspect a QA system's answer. In order to better generate such entailment trees, we propose an architecture called Iterative Retrieval-Generation Reasoner (IRGR). Our model is able to explain a given hypothesis by systematically generating a step-by-step explanation from textual premises. The IRGR model iteratively searches for suitable premises, constructing a single entailment step at a time. Contrary to previous approaches, our method combines generation steps and retrieval of premises, allowing the model to leverage intermediate conclusions, and mitigating the input size limit of baseline encoder-decoder models. We conduct experiments using the EntailmentBank dataset, where we outperform existing benchmarks on premise retrieval and entailment tree generation, with around 300% gain in overall correctness. Comment: published in NAACL 202

    MnO2 prepared by hydrothermal method and electrochemical performance as anode for lithium-ion battery

    Two α-MnO2 crystals with caddice-clew-like and urchin-like morphologies are prepared by the hydrothermal method, and their structure and electrochemical performance are characterized by scanning electron microscopy (SEM), X-ray diffraction (XRD), galvanostatic cell cycling, cyclic voltammetry, and electrochemical impedance spectroscopy (EIS). The morphology of the MnO2 prepared under acidic conditions is urchin-like, while that of the MnO2 prepared under neutral conditions is caddice-clew-like. The identical crystalline phase of the MnO2 crystals is essential for evaluating the relationship between electrochemical performance and morphology for lithium-ion battery applications. In this study, urchin-like α-MnO2 crystals with a compact structure have better electrochemical performance due to their higher specific capacity and lower impedance. We find that the relationship between electrochemical performance and morphology is different when the MnO2 material is used as an electrochemical supercapacitor or as the anode of a lithium-ion battery. For lithium-ion battery applications, the urchin-like MnO2 material has better electrochemical performance.

    Cloning and characterization of a selenium-independent glutathione peroxidase (HC29) from adult Haemonchus contortus

    The complete coding sequence of Haemonchus (H.) contortus HC29 cDNA was generated by rapid amplification of cDNA ends in combination with PCR using primers targeting the 5'- and 3'-ends of the partial mRNA sequence. The cloned HC29 cDNA was shown to be 1,113 bp in size with an open reading frame of 507 bp, encoding a protein of 168 amino acids with a calculated molecular mass of 18.9 kDa. Amino acid sequence analysis revealed that the cloned HC29 cDNA contained the conserved catalytic triad and dimer interface of selenium-independent glutathione peroxidase (GPX). Alignment of the predicted amino acid sequences demonstrated that the protein shared 44.7-80.4% similarity with GPX homologues in the thioredoxin-like family. Phylogenetic analysis revealed close evolutionary proximity of the GPX sequence to the counterpart sequences. These results suggest that HC29 cDNA is a GPX, a member of the thioredoxin-like family. Alignment of the nucleic acid and amino acid sequences of HC29 with those of the reported selenium-independent GPX of H. contortus showed that HC29 contained different types of spliced leader sequences as well as dimer interface sites, although the active sites of both were identical. Enzymatic analysis of recombinant prokaryotic HC29 protein showed activity for the hydrolysis of H2O2. These findings indicate that HC29 is a selenium-independent GPX of H. contortus.
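The figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below assumes that the 507 bp open reading frame includes the stop codon, which the abstract does not state explicitly. Both helper names are hypothetical.

```python
def orf_to_residues(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of the given length in bp."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon terminates translation and encodes no residue.
    return codons - 1 if includes_stop else codons

def mean_residue_mass_da(mass_kda, residues):
    """Average mass per residue in daltons."""
    return mass_kda * 1000.0 / residues

residues = orf_to_residues(507)                      # 169 codons, 168 residues
mean_mass = mean_residue_mass_da(18.9, residues)     # about 112.5 Da per residue
```

Under that assumption the 168-residue count is consistent with the ORF length, and the implied mean residue mass is close to the roughly 110 Da usually taken as a rule of thumb for proteins.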

    Nuclear mass table in deformed relativistic Hartree-Bogoliubov theory in continuum, II: Even-$Z$ nuclei

    The mass table in the deformed relativistic Hartree-Bogoliubov theory in continuum (DRHBc) with the PC-PK1 density functional has been established for even-$Z$ nuclei with $8 \le Z \le 120$, extended from the previous work for even-even nuclei [Zhang et al. (DRHBc Mass Table Collaboration), At. Data Nucl. Data Tables 144, 101488 (2022)]. The calculated binding energies, two-nucleon and one-neutron separation energies, root-mean-square (rms) radii of neutron, proton, matter, and charge distributions, quadrupole deformations, and neutron and proton Fermi surfaces are tabulated and compared with available experimental data. A total of 4829 even-$Z$ nuclei are predicted to be bound, with an rms deviation of 1.477 MeV from the 1244 mass data. Good agreement with the available experimental odd-even mass differences, $\alpha$ decay energies, and charge radii is also achieved. The description accuracy for nuclear masses and nucleon separation energies as well as the prediction for drip lines is compared with the results obtained from other relativistic and nonrelativistic density functionals. The comparison shows that the DRHBc theory with PC-PK1 provides an excellent microscopic description for the masses of even-$Z$ nuclei. The systematics of the nucleon separation energies, odd-even mass differences, pairing energies, two-nucleon gaps, $\alpha$ decay energies, rms radii, quadrupole deformations, potential energy curves, neutron density distributions, and neutron mean-field potentials are discussed. Comment: 394 pages, 17 figures, 2 tables, published in Atomic Data and Nuclear Data Tables, data file in the TXT form is available for download under "Ancillary files
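For reference, the quantities named in this abstract are usually defined as follows. These are the standard textbook definitions, written on the assumption that the paper follows them. The symbols $N_{\mathrm{data}}$, $M^{\mathrm{cal}}$, $M^{\mathrm{exp}}$, and $E_b$ (binding energy, taken as positive) are notation chosen here, not quoted from the paper.

```latex
% rms deviation between calculated and experimental masses
\sigma_{\mathrm{rms}}
  = \sqrt{\frac{1}{N_{\mathrm{data}}}
      \sum_{i=1}^{N_{\mathrm{data}}}
      \left( M_i^{\mathrm{cal}} - M_i^{\mathrm{exp}} \right)^2 },
  \qquad N_{\mathrm{data}} = 1244

% separation energies in terms of the binding energy E_b(Z, N) > 0
S_{n}(Z, N)  = E_b(Z, N) - E_b(Z, N-1)
S_{2n}(Z, N) = E_b(Z, N) - E_b(Z, N-2)
S_{2p}(Z, N) = E_b(Z, N) - E_b(Z-2, N)
```

With these definitions, the quoted value of 1.477 MeV corresponds to $\sigma_{\mathrm{rms}}$ evaluated over the 1244 nuclei with measured masses.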