
    How I won the "Chess Ratings - Elo vs the Rest of the World" Competition

    This article describes in detail the rating system that won the Kaggle competition "Chess Ratings: Elo vs the rest of the world". The competition provided a historical dataset of chess game outcomes and aimed to discover whether novel approaches could predict the outcomes of future games more accurately than the well-known Elo rating system. The winning rating system, called Elo++ in the rest of the article, builds upon the Elo rating system. Like Elo, Elo++ uses a single rating per player and predicts the outcome of a game by applying a logistic curve to the difference in the players' ratings. The major component of Elo++ is a regularization technique that avoids overfitting these ratings. The dataset of chess games and outcomes is relatively small, and one has to be careful not to draw "too many conclusions" from the limited data. Many approaches tested in the competition showed signs of such overfitting. The leaderboard was dominated by attempts that did a very good job on a small test dataset but could not generalize well to the private hold-out dataset. The Elo++ regularization takes into account the number of games per player, the recency of these games, and the ratings of the opponents. Finally, Elo++ employs a stochastic gradient descent scheme for training the ratings and uses only two global parameters (white's advantage and a regularization constant) that are optimized using cross-validation.
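
The ideas named in this abstract (one rating per player, a logistic curve over the rating difference, a white-advantage term, and regularized stochastic gradient descent) can be illustrated with a minimal sketch. It is not the Elo++ algorithm itself: the function names `predict` and `train`, the squared-error loss, the learning rate, and the plain shrinkage of ratings toward zero are all assumptions made for illustration. The abstract says the actual regularization also depends on the number of games per player, their recency, and the opponents' ratings, none of which is modelled here.

```python
import math
import random


def predict(r_white, r_black, white_adv):
    """Expected score for White from a logistic curve over the rating difference."""
    return 1.0 / (1.0 + math.exp(-(r_white - r_black + white_adv)))


def train(games, n_players, white_adv=0.1, reg=0.1, lr=0.1, epochs=50, seed=0):
    """Fit one rating per player with SGD.

    games: list of (white_id, black_id, score), score in {0, 0.5, 1} for White.
    reg:   strength of the (assumed) shrinkage of every rating toward zero.
    """
    rng = random.Random(seed)
    ratings = [0.0] * n_players
    games = list(games)
    for _ in range(epochs):
        rng.shuffle(games)  # stochastic: visit the games in random order
        for w, b, s in games:
            p = predict(ratings[w], ratings[b], white_adv)
            # Gradient of 0.5 * (p - s)^2 with respect to White's rating.
            g = (p - s) * p * (1.0 - p)
            ratings[w] -= lr * (g + reg * ratings[w])
            ratings[b] -= lr * (-g + reg * ratings[b])
    return ratings


if __name__ == "__main__":
    # Player 0 beats player 1 with both colours.
    toy_games = [(0, 1, 1.0), (1, 0, 0.0)] * 10
    print(train(toy_games, n_players=2))
```

In this sketch `white_adv` and `reg` play the role of the two global parameters mentioned in the abstract; in practice they would be chosen by cross-validation rather than fixed by hand.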

    Energy Savings in EAF Steelmaking by Process Simulation and Data-Science Modeling on the Reproduced Results

    The Electric-Arc-Furnace (EAF)-based process route in modern steelmaking for the production of plates and special-quality bars requires a series of stations for secondary metallurgy treatment (Ladle-Furnace, and potentially Vacuum-Degasser) before the final casting of slabs and blooms in the corresponding continuous casting machines. Since every steel grade has its own melting characteristics, the melting (liquidus) temperature generally differs per grade and plays an important role in the final casting temperature, which has to exceed the melting temperature by an amount called superheat. The superheat is adjusted at the ladle-furnace (LF) station by the operator, who decides mostly on personal experience; however, since the ladle has to pass through downstream processes, the liquid steel loses temperature not only due to the duration of the processes until casting but also due to the ladle refractory history. Simulation software was developed to reproduce the phenomena that occur in a meltshop and influence downstream superheats. Data science models were deployed to check the potential of controlling casting temperatures by adjusting liquid-steel exit temperatures at the LF.

    A Numerical Solution Model for the Heat Transfer in Octagonal Billets

    In the quest for high-quality steel products, the need for cast billets with minimum surface and internal defects is of paramount importance. On the other hand, productivity is required to be as high as possible in order to reduce production cost. Different billet shapes have been applied, with emphasis upon square, rectangular, and circular cross-sections. It is obvious that the best billet shape for minimizing surface and subsurface defects is the circular one. Nevertheless, this shape creates some problems with respect to handling and safety. One recent attempt is to produce regular octagonal-shaped billets, which approach the circular shape while being easier to handle. In this study, a numerical solution for the heat transfer during solidification in the continuous casting of octagonal billets has been carried out. The developed model deploys an implicit scheme in order to solve the differential equations of heat transfer under the appropriate boundary conditions in a section of an octagonal billet, assuming fully axisymmetric cooling of the bloom. The geometry of the octagonal billet plays an interesting role in the development of the heat transfer analysis. Based upon fundamental principles, a computer program has been developed for this purpose. Finally, results from the numerical solution are presented and discussed.
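
To show what an implicit scheme for heat transfer looks like in its simplest form, the sketch below takes one backward-Euler step of the one-dimensional heat equation and solves the resulting tridiagonal system with the Thomas algorithm. This is a deliberately reduced stand-in, not the model of the paper: it assumes a 1D slab, constant thermal diffusivity `alpha`, fixed end temperatures, and no latent heat of solidification, whereas the abstract describes a section of an octagonal billet. The function name `implicit_heat_step` is hypothetical.

```python
def implicit_heat_step(T, alpha, dx, dt):
    """One backward-Euler step of dT/dt = alpha * d2T/dx2 on a 1D grid.

    The two end nodes keep their temperatures (Dirichlet boundaries).
    Interior node i satisfies:
        -r*T_new[i-1] + (1 + 2r)*T_new[i] - r*T_new[i+1] = T_old[i]
    with r = alpha * dt / dx**2.
    """
    n = len(T)
    r = alpha * dt / dx ** 2
    sub = [0.0] + [-r] * (n - 2) + [0.0]             # sub-diagonal
    diag = [1.0] + [1.0 + 2.0 * r] * (n - 2) + [1.0]  # main diagonal
    sup = [0.0] + [-r] * (n - 2) + [0.0]             # super-diagonal
    rhs = [float(t) for t in T]

    # Thomas algorithm: forward elimination ...
    for i in range(1, n):
        m = sub[i] / diag[i - 1]
        diag[i] -= m * sup[i - 1]
        rhs[i] -= m * rhs[i - 1]
    # ... followed by back substitution.
    new_T = [0.0] * n
    new_T[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        new_T[i] = (rhs[i] - sup[i] * new_T[i + 1]) / diag[i]
    return new_T


if __name__ == "__main__":
    # A hot spot in the middle of a cold bar spreads out over one step.
    print(implicit_heat_step([0.0, 0.0, 100.0, 0.0, 0.0], alpha=1.0, dx=1.0, dt=0.5))
```

The appeal of the implicit form is that the step stays stable for any positive `dt`, at the cost of solving a linear system per step.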

    Overlapping Community Detection in Graphs with Attention Networks

    National Technical University of Athens -- Master's Thesis. Interdisciplinary-Interdepartmental Postgraduate Studies Programme (D.P.M.S.) "Data Science and Machine Learning"

    Dwarf: A Complete System for Analyzing High-Dimensional Data Sets

    The need for data analysis in different industries, including telecommunications, retail, manufacturing and financial services, has generated a flurry of research, highly sophisticated methods and commercial products. However, all of the current attempts are haunted by the so-called "high-dimensionality curse": the complexity in space and time increases exponentially with the number of analysis "dimensions". This means that all existing approaches are limited to coarse levels of analysis and/or to approximate answers with reduced precision. As the need for detailed analysis keeps increasing, along with the volume and the detail of the data that is stored, these approaches are very quickly rendered unusable. I have developed a unique method for efficiently performing analysis that is not affected by the high dimensionality of data and scales only polynomially, and almost linearly, with the dimensions without sacrificing any accuracy in the returned results. I have implemented a complete system (called "Dwarf") and performed an extensive experimental evaluation that demonstrated tremendous improvements over existing methods for all aspects of performing analysis: initial computation, storing, querying and updating. I have extended my research to the "data-streaming" model, where updates are performed on-line, which complicates any concurrent analysis but has a very high impact on applications such as security, network management/monitoring, router traffic control and sensor networks. I have devised streaming algorithms that provide complex statistics within user-specified relative-error bounds over a data stream. I introduced the class of "distinct implicated statistics", which is much more general than the established class of "distinct count" statistics. The latter has proved invaluable in applications such as analyzing and monitoring the distinct count of species in a population, or even in query optimization. The "distinct implicated statistics" class provides invaluable information about the correlations in the stream and is necessary for applications such as security. My algorithms are designed to use bounded amounts of memory and processing, so that they can even be implemented in hardware for resource-limited environments such as network routers or sensors, and also to work in "noisy" environments, where some data may be flawed either implicitly due to the extraction process or explicitly

    The Dwarf Data Cube Eliminates the High Dimensionality Curse

    The data cube operator encapsulates all possible groupings of a data set and has proved to be an invaluable tool in analyzing vast amounts of data. However, its apparent exponential complexity has significantly limited its applicability to low-dimensional datasets. Recently the idea of the dwarf data cube model was introduced, and showed that high-dimensional "dwarf data cubes" are orders of magnitude smaller in size than the original data cubes even when they calculate and store every possible aggregation with 100% precision. In this paper we present a surprising analytical result proving that the size of dwarf cubes grows polynomially with the dimensionality of the data set and, therefore, a full data cube at 100% precision is not inherently cursed by high dimensionality. This striking result of polynomial complexity reformulates the context of cube management and redefines most of the problems associated with data-warehousing and On-Line Analytical Processing. We also develop an efficient algorithm for estimating the size of dwarf data cubes before actually computing them. Finally, we complement our analytical approach with an experimental evaluation using real and synthetic data sets, and demonstrate our results. UMIACS-TR-2003-12
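
The "apparent exponential complexity" mentioned above comes from the fact that a data cube over d dimensions contains one group-by for every subset of those dimensions, that is, 2^d groupings. The sketch below computes a naive full cube with a SUM aggregate to make that count concrete. It illustrates only the plain data cube operator, not the Dwarf structure; the function name `data_cube` and the toy sales table are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def data_cube(rows, dims, measure):
    """Naive full data cube: SUM(measure) for every subset of dims.

    Returns {grouping: {key_tuple: total}}, where grouping is a tuple of
    dimension names and key_tuple holds the matching dimension values.
    """
    cube = {}
    for k in range(len(dims) + 1):
        for grouping in combinations(dims, k):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in grouping)
                totals[key] += row[measure]
            cube[grouping] = dict(totals)
    return cube


if __name__ == "__main__":
    sales = [
        {"store": "S1", "product": "P1", "year": 2002, "amount": 10},
        {"store": "S1", "product": "P2", "year": 2002, "amount": 5},
        {"store": "S2", "product": "P1", "year": 2003, "amount": 7},
    ]
    cube = data_cube(sales, dims=("store", "product", "year"), measure="amount")
    print(len(cube))            # number of groupings for 3 dimensions
    print(cube[("store",)])     # totals per store
    print(cube[()])             # grand total
```

Each extra dimension doubles the number of groupings in this naive form, which is the growth that the abstract says the dwarf representation avoids.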

    Cholesteatoma of the external ear canal: etiological factors, symptoms and clinical findings in a series of 48 cases

    BACKGROUND: To evaluate symptoms, clinical findings, and etiological factors in external ear canal cholesteatoma (EECC). METHOD: Retrospective evaluation of clinical records of all consecutive patients with EECC in the period 1979 to 2005 in a tertiary referral centre. Main outcome measures were incidence rates, classification according to causes, symptoms, extensions in the ear canal including adjacent structures, and possible etiological factors. RESULTS: Forty-five patients were identified with 48 EECC. Overall incidence rate was 0.30 cases per year per 100,000 inhabitants. Twenty-five cases were primary, while 23 cases were secondary: postoperative (n = 9), postinflammatory (n = 5), postirradiatory (n = 7), and posttraumatic (n = 2). Primary EECC showed a right/left ratio of 12/13 and presented with otalgia (n = 15), itching (n = 5), occlusion (n = 4), hearing loss (n = 3), fullness (n = 2), and otorrhea (n = 1). Similar symptoms were found in secondary EECC, but less pronounced. In total the temporomandibular joint was exposed in 11 cases, while the mastoid and middle ear were invaded in six and three cases, respectively. In one primary case the facial nerve was exposed and in a posttraumatic case the atticus and antrum were invaded. In primary EECC 48% of cases reported mechanical trauma. CONCLUSION: EECC is a rare condition with inconsistent and silent symptoms, whereas the extent of destruction may be pronounced. Otalgia was the predominant symptom and often related to extension into nearby structures. Whereas the aetiology of secondary EECC can be explained, the origin of primary EECC remains uncertain; smoking and minor trauma of the ear canal may predispose.