3,570 research outputs found

Geoadditive hazard regression for interval censored survival times

Author: Kneib Thomas
Publication date: 01/01/2005

The Cox proportional hazards model is the most commonly used method for analyzing the impact of covariates on continuous survival times. In its classical form, the Cox model was introduced in the setting of right-censored observations. However, in practice other sampling schemes are frequently encountered, and therefore extensions allowing for interval and left censoring or left truncation are clearly desired. Furthermore, many applications require a more flexible modeling of covariate information than the usual linear predictor. For example, effects of continuous covariates are likely to be nonlinear, or spatial information may need to be included appropriately. Further extensions should allow for time-varying effects of covariates or covariates that are themselves time-varying. Such models relax the assumption of proportional hazards. We propose a regression model for the hazard rate that combines and extends the above-mentioned features on the basis of a unifying Bayesian model formulation. Nonlinear and time-varying effects as well as the baseline hazard rate are modeled by penalized splines. Spatial effects can be included based on either Markov random fields or stationary Gaussian random fields. The model allows for arbitrary combinations of left, right and interval censoring as well as left truncation. Estimation is based on a reparameterisation of the model as a variance components mixed model. The variance parameters corresponding to inverse smoothing parameters can then be estimated based on an approximate marginal likelihood approach. As an application we present an analysis of childhood mortality in Nigeria, where the interval censoring framework also makes it possible to deal with the problem of heaped survival times caused by memory effects. In a simulation study we investigate the effect of ignoring the impact of interval censored observations.

CiteSeerX

Open Access LMU ( Ludwig-Maximilians-Univ. München)

EconStor (ZBW Kiel)

Connecting Cluster Substructure in Galaxy Cluster Cores at z=0.2 With Cluster Assembly Histories

Author: Graham P. Smith
James E. Taylor
Publication venue: 'University of Chicago Press'
Publication date: 01/01/2008

We use semi-analytic models of structure formation to interpret gravitational lensing measurements of substructure in galaxy cluster cores (R<=250kpc/h) at z=0.2. The dynamic range of the lensing-based substructure fraction measurements is well matched to the theoretical predictions, both spanning f_sub~0.05-0.65. The structure formation model predicts that f_sub is correlated with cluster assembly history. We use simple fitting formulae to parameterize the predicted correlations: Delta_90 = tau_90 + alpha_90 * log(f_sub) and Delta_50 = tau_50 + alpha_50 * log(f_sub), where Delta_90 and Delta_50 are the predicted lookback times from z=0.2 to when each theoretical cluster had acquired 90% and 50% respectively of the mass it had at z=0.2. The best-fit parameter values are: alpha_90 = (-1.34+/-0.79)Gyr, tau_90 = (0.31+/-0.56)Gyr and alpha_50 = (-2.77+/-1.66)Gyr, tau_50 = (0.99+/-1.18)Gyr. Therefore (i) observed clusters with f_sub<~0.1 (e.g. A383, A1835) are interpreted, on average, to have formed at z>~0.8 and to have suffered <=10% mass growth since z~0.4, and (ii) observed clusters with f_sub>~0.4 (e.g. A68, A773) are interpreted as, on average, forming since z~0.4 and suffering >10% mass growth in the ~500Myr preceding z=0.2, i.e. since z=0.25. In summary, observational measurements of f_sub can be combined with structure formation models to estimate the age and assembly history of observed clusters. The ability to approximately "age-date" clusters in this way has numerous applications to the large cluster samples that are becoming available. Comment: Accepted by ApJL, 4 pages, 2 figures
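
The fitting formulae quoted in this abstract can be evaluated directly. The following is a minimal sketch, not the authors' code: it uses only the central best-fit values from the abstract and ignores the quoted uncertainties. The function name `lookback_time` and the `FIT` table are hypothetical, and the choice of base-10 logarithm is an assumption, since the abstract does not state the base of `log`.

```python
import math

# Best-fit values quoted in the abstract, in Gyr: (central value, uncertainty).
FIT = {
    "90": {"tau": (0.31, 0.56), "alpha": (-1.34, 0.79)},
    "50": {"tau": (0.99, 1.18), "alpha": (-2.77, 1.66)},
}


def lookback_time(f_sub, which="90"):
    """Central-value lookback time Delta (Gyr) from z=0.2.

    Evaluates Delta = tau + alpha * log(f_sub) for the 90% or 50% mass threshold.
    ASSUMPTION: log is taken as log10; the abstract does not specify the base.
    """
    if not 0.0 < f_sub <= 1.0:
        raise ValueError("f_sub must be a fraction in (0, 1]")
    tau = FIT[which]["tau"][0]
    alpha = FIT[which]["alpha"][0]
    return tau + alpha * math.log10(f_sub)


if __name__ == "__main__":
    # Range of substructure fractions mentioned in the abstract.
    for f_sub in (0.05, 0.1, 0.4, 0.65):
        print(f_sub, lookback_time(f_sub, "90"), lookback_time(f_sub, "50"))
```

Under the base-10 assumption, f_sub = 0.1 gives a central Delta_90 of about 1.65 Gyr. Because both alpha values are negative, lower substructure fractions map to longer lookback times, in line with the abstract's reading that low-f_sub clusters formed earlier.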

arXiv.org e-Print Archive

Crossref

University of Birmingham Research Portal

Caltech Authors

Bayesian Semiparametric Multi-State Models

Author: Hennerfeind Andrea
Kneib Thomas
Publication date: 01/01/2006

Multi-state models provide a unified framework for the description of the evolution of discrete phenomena in continuous time. One particular example is the class of Markov processes, which can be characterised by a set of time-constant transition intensities between the states. In this paper, we will extend such parametric approaches to semiparametric models with flexible transition intensities based on Bayesian versions of penalised splines. The transition intensities will be modelled as smooth functions of time and can further be related to parametric as well as nonparametric covariate effects. Covariates with time-varying effects and frailty terms can be included in addition. Inference will be conducted either in a fully Bayesian manner, using Markov chain Monte Carlo simulation techniques, or in an empirical Bayes manner, based on a mixed model representation. A counting process representation of semiparametric multi-state models provides the likelihood formula and also forms the basis for model validation via martingale residual processes. As an application, we will consider human sleep data with a discrete set of sleep states such as REM and Non-REM phases. In this case, simple parametric approaches are inappropriate since the dynamics underlying human sleep vary strongly throughout the night and individual-specific variation has to be accounted for using covariate information and frailty terms.
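
To illustrate the parametric starting point named in this abstract, a Markov process with time-constant transition intensities, the sketch below simulates one path of such a process. It is not the semiparametric model the paper proposes, where intensities vary smoothly over time. The state names, the rates in `INTENSITIES`, and the function `simulate_path` are all hypothetical and chosen only for illustration.

```python
import random

# HYPOTHETICAL time-constant transition intensities (per hour) between sleep states.
# These values are illustrative only, not estimates from the paper.
INTENSITIES = {
    "awake": {"non_rem": 2.0},
    "non_rem": {"rem": 0.8, "awake": 0.2},
    "rem": {"non_rem": 1.5, "awake": 0.3},
}


def simulate_path(start, horizon, rng):
    """Simulate a homogeneous continuous-time Markov process up to `horizon`.

    Returns a list of (entry_time, state) pairs, beginning with (0.0, start).
    """
    t, state = 0.0, start
    path = [(t, state)]
    while True:
        rates = INTENSITIES[state]
        total = sum(rates.values())
        # The sojourn time in the current state is exponential with rate `total`.
        t += rng.expovariate(total)
        if t >= horizon:
            return path
        # The next state is drawn with probability proportional to its intensity.
        targets = list(rates)
        state = rng.choices(targets, weights=[rates[s] for s in targets])[0]
        path.append((t, state))


if __name__ == "__main__":
    for entry_time, state in simulate_path("awake", 8.0, random.Random(1)):
        print(round(entry_time, 3), state)
```

With constant intensities, the sojourn-time distribution in each state is the same at every point in the night, which is the limitation the abstract raises for sleep data.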

Open Access LMU ( Ludwig-Maximilians-Univ. München)

EconStor (ZBW Kiel)

High-dimensional Structured Additive Regression Models: Bayesian Regularisation, Smoothing and Predictive Performance

Author: Fahrmeir Ludwig
Kneib Thomas
Konrath Susanne
Publication date: 23/01/2009

Data structures in modern applications frequently combine the necessity of flexible regression techniques such as nonlinear and spatial effects with high-dimensional covariate vectors. While estimation of the former is typically achieved by supplementing the likelihood with a suitable smoothness penalty, the latter are usually assigned shrinkage penalties that enforce sparse models. In this paper, we consider a Bayesian unifying perspective, where conditionally Gaussian priors can be assigned to all types of regression effects. Suitable hyperprior assumptions on the variances of the Gaussian distributions then induce the desired smoothness or sparseness properties. As a major advantage, general Markov chain Monte Carlo simulation algorithms can be developed that allow for the joint estimation of smooth and spatial effects and regularised coefficient vectors. Two applications demonstrate the usefulness of the proposed procedure: a geoadditive regression model for data from the Munich rental guide and an additive probit model for the prediction of consumer credit defaults. In both cases, high-dimensional vectors of categorical covariates will be included in the regression models. The predictive ability of the resulting high-dimensional structured additive regression models compared to expert models will be of particular relevance and will be evaluated on cross-validation test data.

Open Access LMU ( Ludwig-Maximilians-Univ. München)

A General Approach for the Analysis of Habitat Selection

Author: Knauer Felix
Kneib Thomas
Küchenhoff Helmut
Publication date: 01/01/2007

Investigating habitat selection of animals aims at the detection of preferred and avoided habitat types as well as at the identification of covariates influencing the choice of certain habitat types. The final goal of such analyses is an improvement of the conservation of animals. Usually, habitat selection by larger animals is assessed by radio-tracking or visual observation studies, where the chosen habitat is determined for a number of animals at a set of time points. Hence the resulting data often have the following structure: a categorical variable indicating the habitat type selected by an animal at a specific time point is repeatedly observed and is to be explained by covariates. These may describe properties of the habitat types currently available and/or properties of the animal. In this paper, we present a general approach for the analysis of such data in a categorical regression setup. The proposed model generalises and improves upon several of the approaches previously discussed in the literature and in particular allows one to account for changing habitat availability due to the movement of animals within the observation area. It incorporates both habitat- and animal-specific covariates, and includes individual-specific random effects in order to account for correlations introduced by the repeated measurements on single animals. The methodology is implemented in a freely available software package. We demonstrate the general applicability and the capabilities of the proposed approach in two case studies: the analysis of a songbird in South America and a study on brown bears in Central Europe.

Open Access LMU ( Ludwig-Maximilians-Univ. München)

BayesX: Analysing Bayesian structured additive regression models

Author: Brezger Andreas
Kneib Thomas
Lang S.
Publication date: 01/01/2003

There has been much recent interest in Bayesian inference for generalized additive and related models. The increasing popularity of Bayesian methods for these and other model classes is mainly caused by the introduction of Markov chain Monte Carlo (MCMC) simulation techniques which allow the estimation of very complex and realistic models. This paper describes the capabilities of the public domain software BayesX for estimating complex regression models with a structured additive predictor. The program extends the capabilities of existing software for semiparametric regression. Many model classes well known from the literature are special cases of the models supported by BayesX. Examples are Generalized Additive (Mixed) Models, Dynamic Models, Varying Coefficient Models, Geoadditive Models, Geographically Weighted Regression and models for space-time regression. BayesX supports the most common distributions for the response variable. For univariate responses these are Gaussian, Binomial, Poisson, Gamma and Negative Binomial. For multicategorical responses, both multinomial logit and probit models for unordered categories of the response as well as cumulative threshold models for ordered categories may be estimated. Moreover, BayesX allows the estimation of complex continuous time survival and hazard rate models.

Open Access LMU ( Ludwig-Maximilians-Univ. München)

EconStor (ZBW Kiel)

Penalized additive regression for space-time data: a Bayesian perspective

Author: Fahrmeir Ludwig
Kneib Thomas
Lang S.
Publication date: 01/01/2003

We propose extensions of penalized spline generalized additive models for analysing space-time regression data and study them from a Bayesian perspective. Non-linear effects of continuous covariates and time trends are modelled through Bayesian versions of penalized splines, while correlated spatial effects follow a Markov random field prior. This allows all functions and effects to be treated within a unified general framework by assigning appropriate priors with different forms and degrees of smoothness. Inference can be performed either with full (FB) or empirical Bayes (EB) posterior analysis. FB inference using MCMC techniques is a slight extension of our own previous work. For EB inference, a computationally efficient solution is developed on the basis of a generalized linear mixed model representation. The second approach can be viewed as posterior mode estimation and is closely related to penalized likelihood estimation in a frequentist setting. Variance components, corresponding to smoothing parameters, are then estimated by using marginal likelihood. We carefully compare both inferential procedures in simulation studies and illustrate them through real data applications. The methodology is available in the open domain statistical package BayesX and as an S-plus/R function.

CiteSeerX

Open Access LMU ( Ludwig-Maximilians-Univ. München)

Variable Selection and Model Choice in Geoadditive Regression Models

Author: Hothorn Torsten
Kneib Thomas
Tutz Gerhard
Publication date: 01/01/2007

Model choice and variable selection are issues of major concern in practical regression analyses. We propose a boosting procedure that facilitates both tasks in a class of complex geoadditive regression models comprising spatial effects, nonparametric effects of continuous covariates, interaction surfaces, random effects, and varying coefficient terms. The major modelling components are penalized splines and their bivariate tensor product extensions. All smooth model terms are represented as the sum of a parametric component and a remaining smooth component with one degree of freedom to obtain a fair comparison between all model terms. A generic representation of the geoadditive model allows us to devise a general boosting algorithm that implements automatic model choice and variable selection. We demonstrate the versatility of our approach with two examples: a geoadditive Poisson regression model for species counts in habitat suitability analyses and a geoadditive logit model for the analysis of forest health.

Open Access LMU ( Ludwig-Maximilians-Univ. München)

Gradient boosting in Markov-switching generalized additive models for location, scale and shape

Author: Adam Timo
Kneib Thomas
Mayr Andreas
Publication date: 06/10/2017

We propose a novel class of flexible latent-state time series regression models which we call Markov-switching generalized additive models for location, scale and shape. In contrast to conventional Markov-switching regression models, the presented methodology allows us to model different state-dependent parameters of the response distribution - not only the mean, but also variance, skewness and kurtosis parameters - as potentially smooth functions of a given set of explanatory variables. In addition, the set of possible distributions that can be specified for the response is not limited to the exponential family but additionally includes, for instance, a variety of Box-Cox-transformed, zero-inflated and mixture distributions. We propose an estimation approach based on the EM algorithm, where we use the gradient boosting framework to prevent overfitting while simultaneously performing variable selection. The feasibility of the suggested approach is assessed in simulation experiments and illustrated in a real-data setting, where we model the conditional distribution of the daily average price of energy in Spain over time

arXiv.org e-Print Archive

GRO.publications (Univ. Göttingen)