An algorithm for identifying interpretable simple structures and its application in healthcare
Many complex machine learning algorithms achieve state-of-the-art performance, but their lack of interpretability limits their applicability in domains where transparency is critical. On the other hand, simpler, interpretable models such as decision trees, linear regression, and linear classifiers often struggle to capture complex patterns in real-world data, offering vague or inadequate explanations. In high-stakes domains like precision healthcare, complexity usually arises from intricate feature interactions that may cancel each other out within specific subsets of the population. These regions form what we refer to as “simple structures.”
This dissertation proposes an ensemble-based approach that leverages simple, interpretable models on localized subsets of the data, referred to as simple structures. By identifying and modeling these structures, our method reduces the impact of complex feature interactions, resulting in models that are both interpretable and competitive in predictive performance. We introduce a bottom-up algorithm that discovers these simple structures via a recursive neighborhood search, approximating local regions where feature interactions are minimal or canceling, and yielding more natural decision boundaries than traditional global models. We use synthetic data to demonstrate the algorithm's robustness and show that the resulting decision boundaries align closely with ground truth.
We demonstrate the practical value of this method in precision healthcare. While personalized medicine aims to tailor decisions at the individual level, our method offers a coarser—but interpretable—alternative by tailoring models to subgroups of similar individuals. In applications such as heart failure (HF) diagnosis and COVID-19 mortality prediction, our ensemble of simple models consistently outperforms global simple models. It achieves predictive performance comparable to complex models like XGBoost and neural networks—while maintaining the same level of interpretability. In the context of cardiovascular disease (CVD) risk prediction, the ensemble improves risk stratification and highlights distinct clinical patterns within patient subgroups.
More importantly, our findings reveal that the patterns and feature interactions captured within simple structures differ markedly from those learned by global models—both in magnitude and in nature. This allows for identifying subpopulations with distinct risk profiles, enabling more nuanced and targeted clinical interventions. These subgroup-specific models often capture local patterns obscured in global analyses, improving performance without sacrificing transparency. Importantly, these insights are derived using simple, routinely collected electronic health record (EHR) data without requiring complex genetic, omics, or imaging inputs. This makes the approach highly practical and broadly applicable.
In summary, this dissertation presents a novel, bottom-up framework for identifying and modeling simple structures in complex clinical datasets. By balancing accuracy and interpretability using only standard EHR features, this work contributes a robust and scalable tool for advancing explainable, personalized, and clinically actionable machine learning in healthcare.
Learning Ensembles of Interpretable Simple Structure
Decision-making in complex systems often relies on machine learning models, yet highly accurate models such as XGBoost and neural networks can obscure the reasoning behind their predictions. In operations research applications, understanding how a decision is made is often as crucial as the decision itself. Traditional interpretable models, such as decision trees and logistic regression, provide transparency but may struggle with datasets containing intricate feature interactions. However, complexity in decision-making stems from interactions that are only relevant within certain subsets of data. Within these subsets, feature interactions may be simplified, forming simple structures where simple interpretable models can perform effectively. We propose a bottom-up simple structure-identifying algorithm that partitions data into interpretable subgroups known as simple structures, where feature interactions are minimized, allowing simple models to be trained within each subgroup. We demonstrate the robustness of the algorithm on synthetic data and show that the decision boundaries derived from simple structures are more interpretable and aligned with the intuition of the domain than those learned from a global model. By improving both explainability and predictive accuracy, our approach provides a principled framework for decision support in applications where model transparency is essential. This is a preprint from Arwade, Gaurav, and Sigurdur Olafsson. "Learning Ensembles of Interpretable Simple Structure." arXiv preprint arXiv:2502.19602 (2025). doi: https://doi.org/10.48550/arXiv.2502.19602
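The core idea in the abstract above, that simple models fitted on well-chosen subsets can succeed where one global simple model fails, can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract does not describe the neighborhood search, so the split at x = 0 is supplied by hand, and the data (y = |x|) and all function names are hypothetical.

```python
from statistics import fmean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = fmean(xs), fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def mse(xs, ys, slope, intercept):
    """Mean squared error of a fitted line on the given points."""
    return fmean([(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)])

def compare_global_vs_local():
    # Hypothetical data: y = |x|, which no single straight line fits well.
    xs = [float(x) for x in range(-5, 6)]
    ys = [abs(x) for x in xs]

    # One global linear model over all points.
    global_err = mse(xs, ys, *fit_line(xs, ys))

    # Two local linear models on hand-picked subsets (assumed partition).
    left = [(x, y) for x, y in zip(xs, ys) if x < 0]
    right = [(x, y) for x, y in zip(xs, ys) if x >= 0]
    squared_error_sum = 0.0
    for part in (left, right):
        px = [p[0] for p in part]
        py = [p[1] for p in part]
        squared_error_sum += mse(px, py, *fit_line(px, py)) * len(part)
    local_err = squared_error_sum / len(xs)
    return global_err, local_err
```

Each local model here is still a single straight line, so it stays as easy to read as the global one; the hard part, which this sketch skips, is finding the subsets automatically.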
Towards Quantifying Beneficial System Effects in Cold-Formed Steel Wood-Sheathed Floor Diaphragms
Cold-formed steel wood-sheathed floor diaphragm system behavior is analyzed from a system reliability perspective. Floor systems consisting of oriented strand board (OSB), cold-formed steel (CFS) joists, tracks and screw fasteners are modeled using shell and spring elements in ABAQUS (Dassault Systèmes). The models consider typical seismic demand loads, with careful treatment of light steel framing diaphragm boundary conditions and OSB sheathing kinematics, i.e., two sheets pulling apart or bearing against each other at an ultimate limit state, consistent with existing experimental results. The finite element results are used to build surrogate mathematical idealizations (series, parallel-brittle and parallel-ductile) for the critical system components. System reliability and reliability sensitivity, defined as the derivative of system reliability with respect to component reliability, are studied for these idealizations. These results represent mathematical upper and lower bounds to real system behavior, and are being used in ongoing research to codify beneficial diaphragm system effects.
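The series and parallel idealizations and the reliability sensitivity mentioned in the abstract above can be sketched with the textbook formulas. The sketch assumes statistically independent components, which the abstract does not state, and it does not separate brittle from ductile parallel behavior, since that distinction needs a load-redistribution model. All function names are hypothetical.

```python
from math import prod

def series_reliability(r):
    """Series system: survives only if every component survives."""
    return prod(r)

def parallel_reliability(r):
    """Parallel system: fails only if every component fails."""
    return 1.0 - prod(1.0 - ri for ri in r)

def series_sensitivity(r, i):
    """dR_sys/dR_i for a series system: product of the other reliabilities."""
    return prod(rj for j, rj in enumerate(r) if j != i)

def parallel_sensitivity(r, i):
    """dR_sys/dR_i for a parallel system: product of the other failure probabilities."""
    return prod(1.0 - rj for j, rj in enumerate(r) if j != i)
```

Under the independence assumption, a series system is most sensitive to a component when the other components are reliable, while a parallel system is most sensitive when the other components are likely to fail.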
Incorporating cold-formed steel member and system design into the undergraduate curriculum
Cold-formed steel design, in an ideal scenario, deserves an entire advanced undergraduate or graduate level course. However, this is not practical in many institutions, where a program of study can only include a few courses in hot-rolled steel design due to teaching capacity and ever-expanding program requirements. Thus, instructors with expertise in cold-formed steel and repetitively-framed systems are forced to infuse it into other curricula, or simply not teach it at all. The pervasiveness of repetitively-framed structural systems worldwide motivates not only teaching the fundamentals of member behavior, but also system behavior, to prepare undergraduates for their careers as practicing engineers. This paper highlights efforts at the University of Massachusetts Amherst to do this in two courses: a second course in steel design (CFS members), and a course on structural systems (repetitive and light framed systems). Modularized lesson plans are presented, along with in-class active learning activities, examples of student work, and feedback from students in each of the courses. This paper aims to enable effective modular cold-formed steel instruction, leading to significant learning in thin-walled member behavior and repetitively-framed system behavior. The authors gratefully acknowledge our students, who, with their enthusiasm, made teaching this material a joyful experience.
Validation of a 37-year Metocean Hindcast along the U.S. Atlantic Coast
A numerical model is implemented using Mike 21 to estimate metocean conditions to evaluate hurricane risk for the 22 proposed wind energy areas along the U.S. Atlantic coast. A metocean hindcast study is conducted using this model for the period between 1979 and 2015 when atmospheric conditions are available as part of the Climate Forecast System Reanalysis (CFSR) study. These atmospheric conditions are used as input to the Mike 21 model, and the model results are compared with measurements of wind speed and significant wave height from five offshore buoys and of water level from three onshore stations. The predictions match the measurements reasonably well. The model is then applied to generate maps of wind speeds and wave heights with a 50-year return period, based on annual maxima of wind and wave.
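A 50-year return level can be estimated from a series of annual maxima along the following lines. The abstract above does not say which extreme-value distribution or fitting method was used, so the Gumbel distribution and the method-of-moments fit below are assumptions, and the function names are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def gumbel_fit_moments(annual_maxima):
    """Method-of-moments Gumbel fit: returns (location, scale)."""
    mean = statistics.fmean(annual_maxima)
    std = statistics.stdev(annual_maxima)
    scale = std * math.sqrt(6.0) / math.pi
    loc = mean - EULER_GAMMA * scale
    return loc, scale

def return_level(loc, scale, period_years):
    """Value exceeded on average once every `period_years` years."""
    p_non_exceed = 1.0 - 1.0 / period_years
    return loc - scale * math.log(-math.log(p_non_exceed))
```

With 37 hindcast years (1979 to 2015), the 50-year level is an extrapolation slightly beyond the record length, so the choice of distribution matters more than it would for a 10-year level.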
Oral Schwannoma—An Unusual Oral Presentation: Case Report and Literature Review
Schwannoma, or neurilemmoma, is a benign, slow-growing, usually solitary and encapsulated tumor originating from the Schwann cells of the nerve sheath. Intraoral schwannoma accounts for 1% of cases in the head and neck region and commonly involves the tongue. Most of the earlier reports in the literature have described schwannomas that occurred in the tongue. In this article, we report a case of schwannoma involving an unusual site, the mandibular labial vestibule, in a young patient. The lesion was completely excised, with no reported complication over a follow-up of 15 months.
Design optimization of offshore wind jacket piles by assessing support structure orientation relative to metocean conditions
The orientation of a three-legged offshore wind jacket structure in 60 m water depth, supporting the IEA 15 MW reference turbine, has been assessed for optimizing the jacket pile design. A reference site off the coast of Massachusetts was considered, including site-specific metocean conditions and realistically plausible geotechnical conditions. Soil–structure interaction was modeled using three-dimensional finite-element (FE) ground–structure simulations to obtain equivalent mudline springs, which were subsequently used in nonlinear elastic simulations, considering aerodynamic and hydrodynamic loading of extreme sea states in the time domain. Jacket pile loads were found to be sensitive to the maximum 50-year wave direction, as opposed to the wind direction, indicating that the jacket orientation should be considered relative to the dominant wave direction. The results further demonstrated that the jacket orientation has a substantial impact on the overall jacket pile mass and maximum pile embedment depth and therefore represents an important opportunity for project cost and risk reductions. Finally, this research highlights the importance of detailed knowledge of the full global model behavior (both turbine and foundation) for capturing this optimization potential, particularly due to the influence of wind–wave misalignment on pile loads. Close collaboration between the turbine supplier and foundation designer, at the appropriate design stages, is essential.
Towards the Design of Cold-formed Steel Foam Sandwich Columns
In this paper a design method for the compressive capacity of sandwich panels comprised of steel face sheets and foamed steel cores is derived and verified. Foamed steel, literally steel with internal voids, provides the potential to mitigate many local stability issues through increasing the effective width-to-thickness of the component for the same amount of material. Winter’s classical effective width expression was generalized to the case of steel foam sandwich panels. The provided analytical expressions are verified with finite element simulations employing brick elements that explicitly model the steel face sheets and steel foam cores. The closed-form design expressions are employed to conduct parametric studies of steel foam sandwich panels with various face sheet and steel foamed core configurations. The studies show the significant strength improvements possible with steel foam sandwich panels when compared with plain steel sheet/plate.
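For orientation, Winter's classical effective width expression for a solid steel plate, the starting point the abstract above refers to, can be sketched as below. This is only the classical solid-plate form; the sandwich-panel generalization derived in the paper is not reproduced here. The function names and the default plate buckling coefficient k = 4 (a simply supported stiffened element) are assumptions made for illustration.

```python
import math

def plate_buckling_stress(E, nu, t, b, k=4.0):
    """Elastic critical buckling stress of a plate of width b and thickness t."""
    return k * math.pi ** 2 * E / (12.0 * (1.0 - nu ** 2)) * (t / b) ** 2

def winter_effective_width(b, f_y, f_cr):
    """Winter's effective width for a solid plate under uniform compression."""
    lam = math.sqrt(f_y / f_cr)  # plate slenderness
    if lam <= 0.673:
        return b  # stocky plate: fully effective
    return b * (1.0 - 0.22 / lam) / lam
```

Because the critical stress grows with (t/b)², spreading the same material over a thicker foamed section lowers the slenderness and raises the effective width, which is consistent with the motivation given in the abstract.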
Analysis of roof live loads in industrial buildings
In design, structural engineers must have a clear understanding of live loads, both qualitatively and statistically. For decades, multiple studies have been published that address live loads on floors in various occupancies such as offices and residences. However, survey data or probabilistic live load models for industrial building roofs are difficult to find. There are recommendations in major standards used in the modern world that give design live load values for roofs based on the accessibility of the rooftops. On the other hand, engineers may not understand the origin of these values. Comparison is made between current U.S. standards for roof live loads and standards used in other parts of the world. To ensure that the most accurate live load assessment is implemented in the design, our understanding of live loads should be updated on a regular basis. Furthermore, in the United States, the current roof live load design value is 0.96 kN/m² (20 psf), which is much greater than the values recommended by European, Australian, and Chinese standards. As a result, determining the source of live load on industrial building roofs is essential. To cover the gap in the literature, this article gives survey methodology and probabilistic studies related to design live load value on roofs. The sensitivity of existing probabilistic models to mean, variance, and time duration was also investigated. This work is part of the research project Roof Live Load Models for Metal Buildings, which is sponsored by the Metal Building Manufacturers Association (MBMA) and the Steel Deck Institute (SDI). The authors would like to thank Dr. Zhanjie Li, Associate Professor at SUNY Polytechnic, for his help translating the Chinese standards.
