29 research outputs found

    A multiclass classification method for microarray data based on iterative extension error-correcting output codes

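    Microarray technology makes it possible to assay large numbers of genes quickly, and there is an urgent need to use it to improve disease diagnosis. Research on microarray data analysis has therefore grown rapidly, with multiclass classification especially prominent. However, because microarray data have many features and few samples, traditional statistical learning methods classify them poorly. To address the multiclass problem under these conditions, an iterative extension error correct output coding (IE-ECOC) algorithm is proposed. On several feature subsets, and guided by feature-related data complexity measures, a binary-tree-based coding method generates a pool of columns, and a column selection strategy is proposed to construct the coding matrix; the matrix is then extended according to iterative validation results. Classification experiments on cancer gene microarrays show that IE-ECOC is well suited to data with many features and few samples and can produce better results than some classical ECOC algorithms; the experiments also confirm the effectiveness of the IE-ECOC algorithm. National Natural Science Foundation of China (61502402, 61772023); Natural Science Foundation of Fujian Province (2016J01320, 2015J05129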
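
The abstract above relies on the general ECOC idea: each class is a row (codeword) of a coding matrix, each column is one binary problem, and a sample is assigned to the class whose codeword is closest to the binary classifiers' outputs. The following is a minimal sketch of that generic decoding step and of appending a column to a matrix. It is not the IE-ECOC algorithm itself: the toy matrix, the plain Hamming decoding, and the helper names `hamming_distance`, `ecoc_decode`, and `extend_matrix` are illustrative assumptions, and the paper's column pool, column selection strategy, and validation-driven extension rule are not reproduced here.

```python
from typing import Dict, List, Sequence


def hamming_distance(codeword: Sequence[int], outputs: Sequence[int]) -> int:
    """Count the positions where the binary outputs disagree with a codeword."""
    return sum(1 for c, o in zip(codeword, outputs) if c != o)


def ecoc_decode(coding_matrix: Dict[str, List[int]], outputs: Sequence[int]) -> str:
    """Return the class whose codeword (matrix row) is closest to the outputs."""
    return min(
        coding_matrix,
        key=lambda label: hamming_distance(coding_matrix[label], outputs),
    )


def extend_matrix(
    coding_matrix: Dict[str, List[int]], column: Dict[str, int]
) -> Dict[str, List[int]]:
    """Append one column (one more binary problem) to every row; returns a new matrix."""
    return {label: row + [column[label]] for label, row in coding_matrix.items()}


# Toy 3-class matrix (assumed for illustration): entries are +1 / -1.
matrix = {"A": [1, 1, -1], "B": [1, -1, 1], "C": [-1, 1, 1]}
print(ecoc_decode(matrix, [1, -1, 1]))  # outputs match row "B" exactly

# Lengthen every codeword by one hypothetical extra column.
longer = extend_matrix(matrix, {"A": 1, "B": -1, "C": -1})
print(ecoc_decode(longer, [1, 1, -1, 1]))  # outputs match the extended row "A"
```

Longer codewords can increase the distance between rows, which is what lets a decoder tolerate some wrong binary outputs; which columns to add is the part the paper's iterative validation decides.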

    The application of feature selection methods to analyze the tissue microarray data

    Conference Name: 4th International Workshop on Advanced Computational Intelligence, IWACI 2011. Conference Address: Wuhan, Hubei, China. Time: October 19, 2011 - October 21, 2011. In this paper, two feature selection methods, a binary genetic algorithm (GA) and sequential floating forward selection (SFFS), were used to analyze a tissue microarray dataset. The tissue microarray materials in our experiments covered 15 tumor-related genes in histologically normal tissues adjacent to clinical tumors and in different tumors. The data were arranged into three datasets, and all data were collected by the Affiliated Zhongshan Hospital of Xiamen University. For each dataset, we used three different classifiers to obtain the area under the receiver operating characteristic (ROC) curve (AUC). The experimental results showed that both feature selection methods lead to reliable and accurate results and can be used to discover connections between genes and cancers. © 2011 IEEE
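
The evaluation above is reported as the AUC of the ROC curve. As a hedged illustration of what that number measures, the sketch below computes AUC through its rank interpretation: the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as one half. The function name `auc` and the toy scores are assumptions for illustration and are not data from the paper.

```python
from typing import Sequence


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC as the probability that a positive sample outranks a negative one."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0   # correctly ordered pair
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores for tumor (positive) and normal (negative) samples.
print(auc([0.9, 0.4], [0.5, 0.1]))  # 3 of 4 pairs are ordered correctly
```

This pairwise form is quadratic in the number of samples, which is acceptable for small datasets; sorting-based formulas give the same value faster.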

    A genetic programming-based approach to the classification of multiclass microarray datasets

    National Science Foundation of China [30570368, 30700161, 60772130, 60805021]; National Basic Research Program of China [2007CB311002]; National High Technology Research and Development Program of China [2007AA01Z167, 2006AA02Z309]; Guide Project of Innov... Motivation: Feature selection approaches have been widely applied to deal with the small sample size problem in the analysis of microarray datasets. For the multiclass problem, existing methods are based on the idea of selecting a single gene subset to distinguish all classes. However, it can be more effective to split a multiclass problem into a set of two-class problems and solve each with its own classification system. Results: We propose a genetic programming (GP)-based approach to analyze multiclass microarray datasets. Unlike in traditional GP, each individual proposed in this article consists of a set of small-scale ensembles, named sub-ensembles (SEs). Each SE consists of a set of trees. In application, a multiclass problem is divided into a set of two-class problems, each of which is first tackled by an SE. The SEs tackling the respective two-class problems are then combined to construct a GP individual, so each individual can deal with a multiclass problem directly. Effective methods are proposed to solve the problems arising in the fusion of SEs, and a greedy algorithm is designed to keep diversity high within SEs. The approach is tested on five datasets. The results show that the proposed method effectively implements the feature selection and classification tasks
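
To make the structure of such an individual concrete, here is a minimal sketch of one way to combine sub-ensembles into a multiclass predictor. It assumes a one-vs-rest split (one sub-ensemble per class) and fusion by the fraction of positive votes; the abstract does not state how the two-class problems are formed or how the SEs are fused, so both choices are assumptions. Plain Python functions stand in for evolved GP trees, and the names `Tree`, `support`, and `predict` are hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# Stand-in for an evolved GP tree: maps a sample to True ("is this class") or False.
Tree = Callable[[Sequence[float]], bool]


def support(trees: List[Tree], sample: Sequence[float]) -> float:
    """Fraction of trees in one sub-ensemble that vote for its class."""
    return sum(1 for tree in trees if tree(sample)) / len(trees)


def predict(individual: Dict[str, List[Tree]], sample: Sequence[float]) -> str:
    """Pick the class whose sub-ensemble gives the strongest support."""
    return max(individual, key=lambda label: support(individual[label], sample))


# One individual = one sub-ensemble per class (toy rules over two gene values).
individual: Dict[str, List[Tree]] = {
    "A": [lambda x: x[0] > 1.0, lambda x: x[0] > x[1]],
    "B": [lambda x: x[1] > 1.0, lambda x: x[1] > x[0]],
    "C": [lambda x: x[0] < 0.5 and x[1] < 0.5, lambda x: x[0] + x[1] < 1.0],
}

print(predict(individual, (2.0, 0.1)))
```

Because each tree reads only a few inputs, the genes that appear in the surviving trees act as the selected features, which is how a GP classifier can perform feature selection and classification together.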

    Microarray data classification based on evolutionary multiple classifier system

    Conference Name: 2011 3rd International Conference on Mechanical and Electronics Engineering, ICMEE 2011. Conference Address: Hefei, China. Time: September 23, 2011 - September 25, 2011. Hefei University of Technology. Designing an evolutionary multiple classifier system (MCS) is a relatively new research area. In this paper, we propose a genetic algorithm (GA) based MCS for microarray data classification. We first construct a feature pool with different feature selection methods, and then apply a multi-objective GA to carry out ensemble feature selection, so that a set of base classifiers is available when the GA stops. We use all the nondominated individuals in the last generation to build an ensemble system, and we test both this ensemble method and a method that applies a classifier selection process to choose suitable classifiers from all the individuals in the last generation. The experimental results show that the proposed ensemble method is robust and can lead to promising results. © (2012) Trans Tech Publications, Switzerland
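
Selecting the nondominated individuals of the last generation can be sketched as a Pareto filter. The example below assumes two objectives that are both maximized, labeled here as accuracy and diversity; the abstract does not name the GA's objectives, so these labels, the toy population, and the helper names `dominates` and `nondominated` are assumptions.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """a dominates b if a is no worse on every objective and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def nondominated(population: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the individuals that no other individual dominates (the Pareto front)."""
    return [p for p in population if not any(dominates(q, p) for q in population)]


# Hypothetical (accuracy, diversity) scores of last-generation individuals.
last_generation = [(0.90, 0.10), (0.80, 0.30), (0.70, 0.20), (0.90, 0.05)]
print(nondominated(last_generation))
```

Here `(0.70, 0.20)` is dominated by `(0.80, 0.30)` and `(0.90, 0.05)` by `(0.90, 0.10)`, so only the two remaining individuals would feed the ensemble.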

    Cancer classification using ensemble of error correcting output codes

    Conference Name: 10th International Conference on Intelligent Computing, ICIC 2014. Conference Address: Taiyuan, China. Time: August 3, 2014 - August 6, 2014. IEEE Computational Intelligence Society; International Neural Network Society; National Science Foundation of China. We address the microarray-based cancer classification problem using a newly proposed ensemble of Error Correcting Output Codes (E-ECOC) method. To the best of our knowledge, this is the first time that an ECOC-based ensemble has been applied to microarray dataset classification. Different feature subsets are generated from the datasets as inputs for several problem-dependent ECOC coding methods, so as to produce diverse ECOC coding matrices. Then, the mutual difference degree among the coding matrices is calculated as an indicator for selecting the coding matrices with maximum difference. Local difference maximum selection (L-DMS) and global difference maximum selection (G-DMS) are the strategies for picking coding matrices based on the same or different ECOC algorithms. The experiments show that the E-ECOC algorithm outperforms the individual ECOC methods and effectively solves the microarray classification problem. © 2014 Springer International Publishing Switzerland

    An ensemble of SVM classifiers based on gene pairs

    National Science Foundation of China [61100106]; Natural Science Foundation of Fujian Province of China [2010J05137]; Fundamental Research Funds for the Central Universities [2010121038]. In this paper, a genetic algorithm (GA) based ensemble support vector machine (SVM) classifier built on gene pairs (GA-ESP) is proposed. The SVMs (the base classifiers of the ensemble system) are trained on different informative gene pairs, which are selected by the top scoring pair (TSP) criterion. Each pair projects the original microarray expression data onto a 2-D space. Extensive permutation of gene pairs may reveal more useful information and potentially lead to an ensemble classifier with satisfactory accuracy and interpretability. A GA is further applied to select an optimized combination of base classifiers. The effectiveness of the GA-ESP classifier is evaluated on both binary-class and multi-class datasets. (C) 2013 Elsevier Ltd. All rights reserved
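
The TSP criterion mentioned above ranks a gene pair (i, j) by how differently the ordering "gene i is expressed below gene j" occurs in the two classes. The sketch below shows the standard two-class form of that score and a simple ranking of all pairs. The toy expression values and the function names `tsp_score` and `top_pairs` are assumptions, and the SVM training and GA-based selection steps of GA-ESP are not shown.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def tsp_score(samples: Sequence[Sequence[float]], labels: Sequence[int], i: int, j: int) -> float:
    """|P(x_i < x_j | class a) - P(x_i < x_j | class b)| for a two-class dataset."""
    class_a, class_b = sorted(set(labels))  # assumes exactly two classes

    def freq(cls: int) -> float:
        rows = [s for s, y in zip(samples, labels) if y == cls]
        return sum(1 for r in rows if r[i] < r[j]) / len(rows)

    return abs(freq(class_a) - freq(class_b))


def top_pairs(samples: Sequence[Sequence[float]], labels: Sequence[int], k: int) -> List[Tuple[int, int]]:
    """Return the k gene pairs with the highest TSP score."""
    pairs = combinations(range(len(samples[0])), 2)
    ranked = sorted(pairs, key=lambda p: tsp_score(samples, labels, p[0], p[1]), reverse=True)
    return ranked[:k]


# Toy data: 4 samples x 3 genes, two classes (0 and 1).
samples = [[1, 2, 5], [2, 3, 1], [3, 1, 2], [4, 2, 6]]
labels = [0, 0, 1, 1]
print(top_pairs(samples, labels, 2))
```

In this toy data gene 0 is always below gene 1 in class 0 and never in class 1, so the pair (0, 1) reaches the maximum score of 1.0. Each selected pair then defines the 2-D space in which one base classifier would be trained.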

    Ensemble component selection for improving ICA based microarray data prediction models

    Independent component analysis (ICA) has been widely used to tackle the microarray dataset classification problem, but one problem remains unsolved: the independent component (IC) sets may not be reproducible across different ICA transformations. Inspired by the idea of ensemble feature selection, we design an ICA-based ensemble learning system to fully exploit the differences among IC sets. In this system, several IC sets are first generated by different ICA transformations. A multi-objective genetic algorithm (MOGA) is designed to select different biologically significant IC subsets from these IC sets, which are then used to build base classifiers. Three schemes are used to fuse these base classifiers. The first fusion scheme combines all individuals in the final generation of the MOGA. In addition, during the evolution, a global-recording technique records the best IC subsets of each IC set in a global-recording list. The IC subsets in the list are then used to build base classifiers, which implements the second fusion scheme. Furthermore, by pruning about half of the less accurate base classifiers obtained by the second scheme, a compact and more accurate ensemble system is built, which is regarded as the third fusion scheme. Three microarray datasets are used to test the ensemble systems, and the results demonstrate that these ensemble schemes can further improve the performance of the ICA-based classification model, and that the third fusion scheme leads to the most accurate ensemble system with the smallest ensemble size. (C) 2009 Elsevier Ltd. All rights reserved
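
The third fusion scheme, pruning roughly the less accurate half of the base classifiers before fusing the rest, can be sketched as follows. The ranking by a single validation accuracy, the majority-vote fusion rule, and the names `prune_half` and `majority_vote` are assumptions; the abstract does not specify how accuracy is estimated or how the remaining classifiers are combined.

```python
from collections import Counter
from typing import List, Sequence


def prune_half(classifiers: Sequence[str], accuracies: Sequence[float]) -> List[str]:
    """Keep the more accurate half (rounded up), most accurate first."""
    order = sorted(range(len(classifiers)), key=lambda i: accuracies[i], reverse=True)
    keep = order[: (len(order) + 1) // 2]
    return [classifiers[i] for i in keep]


def majority_vote(predictions: Sequence[str]) -> str:
    """Fuse the kept classifiers' predictions by taking the most common label."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical base classifiers (by name) with validation accuracies.
names = ["c1", "c2", "c3", "c4"]
accs = [0.70, 0.90, 0.60, 0.80]
print(prune_half(names, accs))
print(majority_vote(["tumor", "normal", "tumor"]))
```

Dropping weak members shrinks the ensemble and removes votes that are more often wrong, which is consistent with the abstract's report that the pruned system is both smaller and more accurate.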

    Effect of SO2 mass fraction on the corrosion behavior of high-strength steel E690 in a polluted marine atmosphere

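    Using wet-dry cyclic corrosion tests combined with electrochemical measurements, rust-layer cross-section analysis, and X-ray diffraction (XRD) of the corrosion products, the effect of SO2 mass fraction on the corrosion behavior of E690 steel in a simulated polluted marine atmosphere was studied. The results show that SO2 in the marine atmosphere changes the electrochemical mechanism of the marine atmospheric corrosion of E690 steel: the anodic branch of the polarization curve changes from weak passivation to active dissolution, and the cathodic branch changes from control by oxygen diffusion to joint control by oxygen diffusion and hydrogen evolution, which greatly accelerates both the anodic and cathodic electrochemical reactions. At the same time, SO2 markedly promotes the formation of α-FeOOH and the enrichment of the alloying elements Ni and Cr in the inner rust layer, which greatly promotes densification of the rust layer, so that the uniform corrosion rate gradually decreases while the growth of pits at the bottom of the rust layer is promoted. As the NaHSO3 concentration in the simulated solution increases, the average corrosion rate of E690 steel over 60 d gradually increases, and then decreases to some extent when the NaHSO3 concentration reaches 0.03 mol/L; meanwhile, the pits at the bottom of the rust layer grow markedly as the NaHSO3 concentration increases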

    The design of evolutionary multiple classifier system for the classification of microarray data

    Conference Name: 8th International Symposium on Neural Networks, ISNN 2011. Conference Address: Guilin, China. Time: May 29, 2011 - June 1, 2011. Designing an evolutionary multiple classifier system (MCS) is a relatively new research area. In this paper, we propose a genetic algorithm (GA) based MCS for microarray data classification. In detail, we first construct a feature pool with different feature selection methods, and then apply a multi-objective GA to carry out ensemble feature selection so as to generate a set of classifiers. We then construct an ensemble system from the individuals in the last generation in two ways: using only the nondominated individuals, or using all the individuals together with a classifier selection process based on another GA. We test the two proposed ensemble methods on two microarray datasets, and the experimental results show that both methods are robust and can lead to promising results. © 2011 Springer-Verlag