18 research outputs found

    K-means clustering adopting rbf-kernel

    Clustering has received significant attention from the machine learning community in the last few years as one of the fundamental research areas in data mining. Among the vast range of clustering algorithms, K-means is one of the most popular. In this research we extend the K-means algorithm by adding the well-known radial basis function (rbf) kernel and find better performance than the classical K-means algorithm. A critical issue for the rbf kernel is how to select a unique parameter for the optimum clustering task. This chapter provides a statistics-based solution to this issue. The best parameter is selected on the basis of prior information about the data using the Maximum Likelihood (ML) method and the Nelder-Mead (N-M) simplex method. A rule-based meta-learning approach is then proposed for automatic rbf kernel parameter selection. We consider 112 supervised data sets and measure their statistical characteristics using basic statistics, central tendency measures and an entropy-based approach. We split these data characteristics using the well-known decision tree approach to generate the rules. Finally, we use the generated rules to select the unique parameter value for the rbf kernel and then adopt it in the K-means algorithm. The experiment is demonstrated on 112 problems with 10-fold cross-validation. The proposed algorithm can solve any clustering task very quickly with optimum performance.
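
To make the idea of K-means with an rbf kernel concrete, here is a minimal pure-Python sketch of kernel K-means. It is an illustration only, not the authors' implementation: the kernel form exp(-gamma * ||x - y||^2), the names `rbf_kernel_matrix`, `kernel_kmeans`, `gamma` and `init_labels`, and the toy points are all assumptions, and `gamma` is picked by hand rather than by the ML, Nelder-Mead or rule-based selection described in the abstract.

```python
import math

def rbf_kernel_matrix(points, gamma):
    """K[i][j] = exp(-gamma * ||x_i - x_j||^2) (assumed parameterisation)."""
    n = len(points)
    return [[math.exp(-gamma * sum((a - b) ** 2
                                   for a, b in zip(points[i], points[j])))
             for j in range(n)] for i in range(n)]

def kernel_kmeans(points, k, gamma, init_labels, max_iter=100):
    """Assign each point to the cluster whose feature-space mean is nearest.

    The squared distance to a cluster mean is computed from kernel values only:
    K_ii - 2/|C| * sum_j K_ij + 1/|C|^2 * sum_{j,l} K_jl, with j, l in C.
    """
    K = rbf_kernel_matrix(points, gamma)
    n = len(points)
    labels = list(init_labels)
    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Third term of the distance; it is the same for every point.
        within = [sum(K[j][l] for j in m for l in m) / len(m) ** 2 if m else 0.0
                  for m in members]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:  # skip empty clusters
                    continue
                d = K[i][i] - 2.0 * sum(K[i][j] for j in m) / len(m) + within[c]
                if d < best_d:
                    best_c, best_d = c, d
            new_labels.append(best_c)
        if new_labels == labels:  # converged
            break
        labels = new_labels
    return labels

# Toy data (made up for illustration): two well-separated groups,
# started from a deliberately mixed-up assignment.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels = kernel_kmeans(points, k=2, gamma=0.5, init_labels=[0, 1, 0, 1, 0, 1])
```

The result depends strongly on `gamma`, which is the parameter-selection problem the abstract addresses.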

    Unique classifier selection approach for bagging algorithm

    Bagging is a popular method that improves the classification accuracy for any learning algorithm. Feeding classifiers to the Bagging algorithm by trial and error is a regular practice for classification tasks in the machine learning community. In this research we propose a rule-based method that uses statistical information for unique classifier selection. The generated rules are verified on 113 classification problems with a cross-validation approach. This makes Bagging a computationally faster algorithm and provides a unique solution for classifier selection.

    Spam classification using adaptive boosting algorithm

    Spam is no doubt a new and growing threat to the Internet and its end users. This paper investigates current approaches for blocking spam and proposes a new spam classification method using the adaptive boosting algorithm. An experiment is carried out to evaluate the results of spam filtering. We find that the adaptive boosting algorithm is an effective approach to the spam problem. We also find that the default method in WEKA, DecisionStump, is not actually the best associated algorithm for filtering spam. After comparing DecisionStump, J48, and NaiveBayes, we conclude that J48 is the most suitable associated algorithm for filtering spam, with a high true positive rate, low false positive rate and low computation time.
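
As a rough illustration of what boosting a decision stump involves, the sketch below implements an AdaBoost.M1-style loop in pure Python for two classes labelled +1 and -1. It is not WEKA's code and uses no spam data: the function names (`fit_stump`, `adaboost_m1`, `predict`) and the six-point toy data set are assumptions made up for this example.

```python
import math

def fit_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best one-split rule."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[f] > thr else -sign for row in X]
                err = sum(wi for wi, p, t in zip(w, preds, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, f, thr, sign)
    return best

def stump_predict(stump, row):
    _, f, thr, sign = stump
    return sign if row[f] > thr else -sign

def adaboost_m1(X, y, rounds):
    """Boost decision stumps; returns a list of (vote_weight, stump) pairs."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        if stump[0] >= 0.5:  # no better than chance: stop
            break
        err = max(stump[0], 1e-10)  # guard against division by zero
        beta = err / (1.0 - err)
        ensemble.append((math.log(1.0 / beta), stump))
        # Shrink the weights of correctly classified examples, then renormalise.
        w = [wi * beta if stump_predict(stump, row) == t else wi
             for wi, row, t in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score > 0 else -1

# Toy one-feature data (made up): no single threshold separates the classes.
X = [[1], [2], [3], [4], [5], [6]]
y = [1, 1, -1, -1, 1, 1]
ensemble = adaboost_m1(X, y, rounds=3)
predictions = [predict(ensemble, row) for row in X]
```

Replacing `fit_stump` with a stronger base learner is the kind of substitution the abstract compares when it evaluates DecisionStump, J48 and NaiveBayes.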

    Optimal classifier selection for adaptive boosting algorithm

    Boosting is a general approach for improving classifier performance. In this research we investigate classifier selection with the latest Boosting algorithm, Adaptive Boosting M1 (AdaBoostM1). Feeding classifiers to the AdaBoostM1 algorithm by trial and error is a regular practice for classification tasks in the research community. We provide a statistical information-based rule method for optimal classifier selection with the AdaBoostM1 algorithm. The classification performance is ranked based on the confusion matrix outcome. The solution is also verified on a wide range of benchmark classification problems.

    A novel classifier selection approach for adaptive boosting algorithms

    Boosting is a general approach for improving classifier performance. In this research we investigate classifier selection with the latest Boosting algorithm, AdaBoostM1. Feeding classifiers to the AdaBoostM1 algorithm by trial and error is a regular practice for classification tasks in the research community. We provide a novel statistical information-based rule method for unique classifier selection with the AdaBoostM1 algorithm. The solution is also verified on a wide range of benchmark classification problems.

    On optimal degree selection for polynomial kernel with support vector machines: Theoretical and empirical investigations

    The key challenge in kernel-based learning algorithms is the choice of an appropriate kernel and its optimal parameters. Selecting the optimal degree of a polynomial kernel is critical to ensure good generalisation of the resulting support vector machine model. In this paper we propose Bayesian and Laplace approximation methods to estimate the polynomial degree. A rule-based meta-learning approach is then proposed for automatic selection of the polynomial kernel and its optimal degree. The new approach is constructed and tested on 112 datasets of different sizes, covering binary as well as multi-class classification problems. An extensive computational evaluation of these methods is conducted, and rules are generated to determine when these approximation methods are appropriate.

    Above the trust and security in cloud computing: a notion towards innovation

    While the nascent Cloud Computing paradigm, supported by virtualization, brings new edges (advantages), it lacks proper security and trust mechanisms. These edges include on-demand scalability and seemingly infinite resource provisioning in a ‘pay-as-you-go’ manner, for anything from a single information owner (abbreviated as INO from now onwards) to multiple corporate INOs. While outsourcing information to cloud storage controlled by a cloud service provider (abbreviated as CSP from now onwards) relieves an information owner of immediate oversight and management needs, the significant issue of how the information owner retains control of that information still needs to be solved. This paper examines Cloud Computing security issues and aims to explore and establish a secure channel for the INO to communicate with the CSP while maintaining trust and confidentiality. To this end, the paper analyzes different protocols and proposes the one commensurate with the required security properties, such as information or data confidentiality, in a Cloud Computing Environment (CCE). To the best of our knowledge, we are the first to derive a secure protocol by successively eliminating the latent pitfalls that hamper the confidentiality and integrity of information exchanged between the INO and the CSP. In addition, our derived protocol is compared conceptually with SSL in terms of workflow-related activities along a secure trusted path for information confidentiality.

    Identification of typical load profiles using K-means clustering algorithm

    A typical load profile (TLP) describes the hourly values of electricity consumption on a daily basis and is associated with a certain consumer category under specific operating conditions. TLPs can be defined for residential, small industrial, commercial or services consumers, for the warm and cold seasons, and for weekdays and weekends. In this paper, the daily load curves of a residential feeder are grouped using the K-Means clustering algorithm. The paper further explores the relationship between load profiles and seasonal periods to identify season types. The paper also obtains truncated discrete Fourier transform coefficients for the load curves to reduce the dimensionality of the clustering problem. Applying K-Means clustering to the discrete Fourier coefficients yields clusters identical to those obtained from the original load curves. © 2014 IEEE
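
The dimensionality-reduction step can be sketched as follows: each 24-value daily curve is replaced by the real and imaginary parts of its first few DFT coefficients, and K-means is run on those features. This is a minimal pure-Python illustration under stated assumptions, not the paper's procedure: the function names, the choice of three coefficients, the initial centroids and the synthetic curves (a "morning peak" and an "evening peak" shape) are all invented for the example.

```python
import cmath
import math

def truncated_dft(curve, n_coeffs):
    """Real and imaginary parts of the first n_coeffs DFT coefficients."""
    n = len(curve)
    feats = []
    for k in range(n_coeffs):
        c = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(curve)) / n
        feats.extend([c.real, c.imag])
    return feats

def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(vectors, init_centroids, max_iter=100):
    """Plain Lloyd iterations from the given starting centroids."""
    centroids = [list(c) for c in init_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centroids)),
                          key=lambda c: sq_dist(v, centroids[c]))
                      for v in vectors]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(len(centroids)):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def synthetic_curve(peak_hour, scale):
    """Hypothetical hourly load shape with a single daily peak."""
    return [scale * (1 + math.cos(2 * math.pi * (h - peak_hour) / 24))
            for h in range(24)]

# Two morning-peak and two evening-peak curves (synthetic, for illustration).
curves = [synthetic_curve(8, 1.0), synthetic_curve(8, 1.1),
          synthetic_curve(20, 1.0), synthetic_curve(20, 1.1)]

features = [truncated_dft(c, n_coeffs=3) for c in curves]  # 24 values -> 6
dft_labels = kmeans(features, [features[0], features[-1]])
raw_labels = kmeans(curves, [curves[0], curves[-1]])
```

On these synthetic curves the two runs can be compared directly through `dft_labels` and `raw_labels`; whether the clusters agree on real feeder data depends on how many coefficients are retained.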