The impact of imprecisely measured covariates on estimating gene-environment interactions
BACKGROUND
The effects of measurement error in epidemiological exposures and confounders on estimated effects of exposure are well described, but the effects on estimates of gene-environment interactions have received rather less attention. In particular, the effects of confounder measurement error on gene-environment interactions are unknown.
METHODS
We investigate these effects using simulated data and illustrate our results with a practical example in nutrition epidemiology.
RESULTS
We show that the interaction regression coefficient is unchanged by confounder measurement error under certain conditions, but biased by exposure measurement error. We also confirm that confounder measurement error can lead to estimated effects of exposure biased either towards or away from the null, depending on the correlation structure, with associated effects on type II errors.
CONCLUSION
Whilst measurement error in confounders does not lead to bias in interaction coefficients, it may still lead to bias in the estimated effects of exposure. There may still be cost implications for epidemiological studies that need to calibrate all error-prone covariates against a valid reference, in addition to the exposure, to reduce the effects of confounder measurement error
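The pattern described in the results can be illustrated with a small simulation. This is a minimal sketch, not the authors' simulation design: the data-generating model, effect sizes, error variances and the helper `interaction_coef` are all illustrative assumptions. It assumes a genotype that is independent of both exposure and confounder, and classical (additive, independent) measurement error.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical data-generating model; every number here is an assumption.
g = rng.binomial(1, 0.3, n).astype(float)      # genotype, independent of x and c
c = rng.normal(size=n)                         # true confounder
x = 0.6 * c + rng.normal(scale=0.8, size=n)    # true exposure, correlated with c
true_interaction = 0.3
y = 1.0 + 0.5 * g + 0.4 * x + true_interaction * g * x + 0.5 * c + rng.normal(size=n)


def interaction_coef(x_obs, c_obs):
    """Fit y ~ 1 + g + x + g*x + c by least squares; return the g*x coefficient."""
    design = np.column_stack([np.ones(n), g, x_obs, g * x_obs, c_obs])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[3]


# Classical measurement error added to one covariate at a time.
c_err = c + rng.normal(size=n)
x_err = x + rng.normal(size=n)

b_no_error = interaction_coef(x, c)
b_confounder_error = interaction_coef(x, c_err)
b_exposure_error = interaction_coef(x_err, c)

print(f"no error:          {b_no_error:.3f}")
print(f"confounder error:  {b_confounder_error:.3f}")
print(f"exposure error:    {b_exposure_error:.3f}")
```

Under these assumptions the interaction estimate stays close to its true value when only the confounder is mismeasured and is attenuated towards the null when the exposure is mismeasured. The independence of genotype from the other covariates is one example of the kind of condition the abstract alludes to; the paper itself should be consulted for the exact conditions.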
Growth, current size and the role of the 'reversal paradox' in the foetal origins of adult disease: an illustration using vector geometry
BACKGROUND
Numerous studies have reported inverse associations between birth weight and a range of diseases in later life. These have led to the development of the 'foetal origins of adult disease hypothesis'. However, many such studies have only been able to demonstrate a statistically significant association between birth weight and disease in later life by adjusting for current size. This has been interpreted as evidence that the impact of low birth weight on subsequent disease is somehow dependent on subsequent weight gain, and has led to a broadening of the hypothesis into the 'developmental origins of health and disease'. Unfortunately, much of the epidemiological evidence used for both of these interpretations is prone to a statistical artefact known as the 'reversal paradox'. The aim of this paper is to illustrate why, using vector geometry.
MATERIALS AND METHODS
This paper introduces the key concepts of vector geometry as applied to multiple regression analysis. This approach is then used to illustrate the similar statistical problems encountered when adjusting for current size or growth when exploring the association between birth weight and disease in later life.
RESULTS
Geometrically, the three covariates – birth size, growth, and current size – span only a 2-dimensional space. Regressing disease in later life (i.e. the outcome variable) on any two of these covariates equates to projecting the disease variable onto the plane spanned by the three covariate vectors. The three possible regression models – where any two covariates are considered – are therefore equivalent and yield exactly the same model fit (R²).
CONCLUSION
Vector geometry illustrates why it is impossible to differentiate between the effects of growth and the effects of current size in studies exploring the relationship between size at birth and subsequent disease. For similar reasons, it is impossible to differentiate between the effects of growth and the effects of birth weight. Assessing the 'independent' impact of growth on later disease by adjusting for either birth weight or current size is therefore illusory
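The equivalence of the three two-covariate models can be checked numerically. The following is a minimal sketch on simulated data, assuming growth is defined as current size minus birth size; the variable names, units and effect sizes are hypothetical and are not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Hypothetical measurements; growth is an exact linear combination of the other two.
birth = rng.normal(3.4, 0.5, n)                           # birth weight (kg)
current = 70 + 4 * (birth - 3.4) + rng.normal(0, 10, n)   # adult weight (kg)
growth = current - birth
outcome = 120 + 0.3 * current - 1.0 * birth + rng.normal(0, 8, n)


def fit(*covariates):
    """Least-squares fit with intercept; returns (coefficients, R squared)."""
    design = np.column_stack([np.ones(n), *covariates])
    beta, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    resid = outcome - design @ beta
    r2 = 1 - resid @ resid / np.sum((outcome - outcome.mean()) ** 2)
    return beta, r2


# The three covariate vectors span only a plane.
rank = np.linalg.matrix_rank(np.column_stack([birth, growth, current]))

beta_bc, r2_birth_current = fit(birth, current)
beta_bg, r2_birth_growth = fit(birth, growth)
beta_gc, r2_growth_current = fit(growth, current)

print("rank of covariate matrix:", rank)
print("R squared:", r2_birth_current, r2_birth_growth, r2_growth_current)
print("birth coefficient adjusting for current size:", beta_bc[1])
print("birth coefficient adjusting for growth:      ", beta_bg[1])
```

The three fits share one R² because each pair of covariates spans the same plane, yet the coefficient attached to birth weight changes with the choice of adjustment variable. This is why the coefficients cannot be read as separate 'independent' effects.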
Joint disease mapping using six cancers in the Yorkshire region of England
OBJECTIVES:
The aims of this study were to model jointly the incidence rates of six smoking related cancers in the Yorkshire region of England, to explore the patterns of spatial correlation amongst them, and to estimate the relative weight of smoking and other shared risk factors for the relevant disease sites, both before and after adjustment for socioeconomic background (SEB).
METHODS:
Data on the incidence of oesophagus, stomach, pancreas, lung, kidney, and bladder cancers between 1983 and 2003 were extracted from the Northern & Yorkshire Cancer Registry database for the 532 electoral wards in the Yorkshire region. Using postcode of residence, each case was assigned an area-based measure of SEB using the Townsend index. Standardised incidence ratios (SIRs) were calculated for each cancer site and their correlations investigated. The joint analysis of the spatial variation in incidence used a Bayesian shared-component model. Three components were included to represent differences in smoking (for all six sites), bodyweight/obesity (for oesophagus, pancreas and kidney cancers) and diet/alcohol consumption (for oesophagus and stomach cancers).
RESULTS:
The incidence of cancers of the oesophagus, pancreas, kidney, and bladder was relatively evenly distributed across the region. The incidence of stomach and lung cancers was more clustered around the urban areas in the south of the region, and these two cancers were significantly associated with higher levels of area deprivation. The incidence of lung cancer was most impacted by adjustment for SEB, with the rural/urban split becoming less apparent. The component representing smoking had a larger effect on cancer incidence in the eastern part of the region. The effects of the other two components were small and disappeared after adjustment for SEB.
CONCLUSIONS:
This study demonstrates the feasibility of joint disease modelling using data from six cancer sites. Incidence estimates are more precise than those obtained without smoothing. This methodology may be an important tool to help authorities evaluate healthcare system performance and the impact of policies
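For readers unfamiliar with standardised incidence ratios, the calculation for a single ward and cancer site can be sketched as follows. The abstract does not state how the SIRs were standardised, so this example assumes indirect standardisation by age band; all rates, person-years and case counts are hypothetical.

```python
import numpy as np

# Hypothetical reference incidence rates (cases per person-year) by age band.
reference_rates = np.array([0.0001, 0.0005, 0.002])

# Hypothetical person-years at risk in one ward, for the same age bands.
ward_person_years = np.array([20_000.0, 15_000.0, 5_000.0])

# Hypothetical number of cases registered in that ward.
observed_cases = 25

# Expected cases if the ward experienced the reference rates in every age band.
expected_cases = float(np.sum(reference_rates * ward_person_years))

# SIR above 1 means more cases were observed than expected.
sir = observed_cases / expected_cases

print(f"expected cases: {expected_cases:.1f}, SIR: {sir:.2f}")
```

Raw SIRs of this kind are unstable in wards with small expected counts, which is one reason the abstract reports that smoothed estimates from the joint model are more precise.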
A Model for the Analysis of Caries Occurrence in Primary Molar Tooth Surfaces
Recently methods of caries quantification in the primary dentition have moved away from summary ‘whole mouth’ measures at the individual level to methods based on generalised linear modelling (GLM) approaches or survival analysis approaches. However, GLM approaches based on logistic transformation fail to take into account the time-dependent process of tooth/surface survival to caries. There may also be practical difficulties associated with casting parametric survival-based approaches in a complex multilevel hierarchy and the selection of an optimal survival distribution, while non-parametric survival methods are not generally suitable for the assessment of supplementary information recorded on study participants. In the current investigation, a hybrid semi-parametric approach comprising elements of survival-based and GLM methodologies suitable for modelling of caries occurrence within fixed time periods is assessed, using an illustrative multilevel data set of caries occurrence in primary molars from a cohort study, with clustering of data assumed to occur at surface and tooth levels. Inferences of parameter significance were found to be consistent with previous parametric survival-based analyses of the same data set, with gender, socio-economic status, fluoridation status, tooth location, surface type and fluoridation status-surface type interaction significantly associated with caries occurrence. The appropriateness of the hierarchical structure facilitated by the hybrid approach was also confirmed. Hence the hybrid approach is proposed as a more appropriate alternative to primary caries modelling than non-parametric survival methods or other GLM-based models, and as a practical alternative to more rigorous survival-based methods unlikely to be fully accessible to most researchers
Advanced Modelling Strategies: Challenges and pitfalls in robust causal inference with observational data
Advanced Modelling Strategies: Challenges and pitfalls in robust causal inference with observational data summarises the lecture notes prepared for a four-day workshop sponsored by the Society for Social Medicine and hosted by the Leeds Institute for Data Analytics (LIDA) at the University of Leeds on 17th-20th July 2017
Placental blood transfusion in newborn babies reaches a plateau after 140 s: Further analysis of longitudinal survey of weight change
Objective: With the introduction of active management of the third stage of labour in the 1960s, it became usual practice to clamp and cut the umbilical cord immediately following birth. The timing of this cord clamping is controversial, as blood may beneficially be transferred to the baby if clamping of the cord is delayed slightly. There is no agreement, however, on how long the delay should be before clamping the cord. This study aimed to establish when blood ceased to flow in the umbilical cord, in order to determine how long to delay clamping of the umbilical cord following delivery of the term newborn to maximise placental transfusion. Methods: This observational study, set in a hospital labour ward, collected longitudinal weight measurements. A total of 26 mothers at term and their singleton babies participated in the study. In this reanalysis, the velocity of weight change over the first minutes of life was estimated using functional data analysis. Results: We found that the flow velocity in the umbilical cord was on average 0 at 125 s after placing the baby on the scales, which was typically 140 s after birth. Conclusions: To maximise placental transfusion, cord clamping should be delayed for at least 140 s following birth of the baby
The impact of the Calman–Hine report on the processes and outcomes of care for Yorkshire's colorectal cancer patients
The 1995 Calman–Hine plan outlined radical reform of the UK's cancer services with the aim of improving outcomes and reducing inequalities in NHS cancer care. Its main recommendation was to concentrate care into the hands of site-specialist, multi-disciplinary teams. This study aimed to determine if the implementation of Calman–Hine cancer teams was associated with improved processes and outcomes of care for colorectal cancer patients. The design included a longitudinal survey of 13 colorectal cancer teams in Yorkshire and a retrospective study of population-based data collected by the Northern and Yorkshire Cancer Registry and Information Service. The population was all colorectal cancer patients diagnosed and treated in Yorkshire between 1995 and 2000. The main outcome measures were: variations in the use of anterior resection and preoperative radiotherapy in rectal cancer, chemotherapy in Dukes stage C and D patients, and five-year survival. Using multilevel models, these outcomes were assessed in relation to measures of the extent of Calman–Hine implementation throughout the study period, namely: (i) each team's degree of adherence to the Manual of Cancer Service Standards (which outlines the specification of the ‘ideal’ colorectal cancer team) and (ii) the extent of site specialisation of each team's surgeons. Variation was observed in the extent to which the colorectal cancer teams in Yorkshire had conformed to the Calman–Hine recommendations. An increase in surgical site specialisation was associated with increased use of preoperative radiotherapy (OR=1.43, 95% CI=1.04–1.98, P<0.04) and anterior resection (OR=1.43, 95% CI=1.16–1.76, P<0.01) in rectal cancer patients. Increases in adherence to the Manual of Cancer Service Standards were associated with improved five-year survival after adjustment for the casemix factors of age, stage of disease, socioeconomic status and year of diagnosis, especially for colon cancer (HR=0.97, 95% CI=0.94–0.99, P<0.01).
There was a similar trend of improved survival in relation to increased surgical site specialisation for rectal cancer, although the effect was not statistically significant (HR=0.93, 95% CI=0.84–1.03, P=0.15). In conclusion, the extent of implementation of the Calman–Hine report has been variable and its recommendations are associated with improvements in processes and outcomes of care for colorectal cancer patients
Challenges in modelling the random structure correctly in growth mixture models and the impact this has on model mixtures
Lifecourse trajectories of clinical or anthropological attributes are useful for identifying how our early-life experiences influence later-life morbidity and mortality. Researchers often use growth mixture models (GMMs) to estimate such phenomena. It is common to place constraints on the random part of the GMM to improve parsimony or to aid convergence, but this can lead to an autoregressive structure that distorts the nature of the mixtures and subsequent model interpretation. This is especially true if changes in the outcome within individuals are gradual compared with the magnitude of differences between individuals. This is not widely appreciated, nor is its impact well understood. Using repeat measures of body mass index (BMI) for 1528 US adolescents, we estimated GMMs that required variance-covariance constraints to attain convergence. We contrasted constrained models with and without an autocorrelation structure to assess the impact this had on the ideal number of latent classes, their size and composition. We also contrasted model options using simulations. When the GMM variance-covariance structure was constrained, a within-class autocorrelation structure emerged. When not modelled explicitly, this led to poorer model fit and models that differed substantially in the ideal number of latent classes, as well as class size and composition. Failure to carefully consider the random structure of data within a GMM framework may lead to erroneous model inferences, especially for outcomes with greater within-person than between-person homogeneity, such as BMI. It is crucial to reflect on the underlying data generation processes when building such models
Cardiovascular disease in a cohort exposed to the 1940-45 Channel Islands occupation
BACKGROUND
To clarify the nature of the relationship between food deprivation/undernutrition during pre- and postnatal development and cardiovascular disease (CVD) in later life, this study examined the relationship between birth weight (as a marker of prenatal nutrition) and the incidence of hospital admissions for CVD from 1997–2005 amongst 873 Guernsey islanders (born in 1923–1937), 225 of whom had been exposed to food deprivation as children, adolescents or young adults (i.e. postnatal undernutrition) during the 1940–45 German occupation of the Channel Islands, and 648 of whom had left or been evacuated from the islands before the occupation began.
METHODS
Three sets of Cox regression models were used to investigate (A) the relationship between birth weight and CVD, (B) the relationship between postnatal exposure to the occupation and CVD and (C) any interaction between birth weight, postnatal exposure to the occupation and CVD. These models also tested for any interactions between birth weight and sex, and postnatal exposure to the occupation and parish of residence at birth (as a marker of parish residence during the occupation and related variation in the severity of food deprivation).
RESULTS
The first set of models (A) found no relationship between birth weight and CVD even after adjustment for potential confounders (hazard ratio (HR) per kg increase in birth weight: 1.12; 95% confidence intervals (CI): 0.70 – 1.78), and there was no significant interaction between birth weight and sex (p = 0.60). The second set of models (B) found a significant relationship between postnatal exposure to the occupation and CVD after adjustment for potential confounders (HR for exposed vs. unexposed group: 2.52; 95% CI: 1.54 – 4.13), as well as a significant interaction between postnatal exposure to the occupation and parish of residence at birth (p = 0.01), such that those born in urban parishes (where food deprivation was worst) had a greater HR for CVD than those born in rural parishes. The third model (C) found no interaction between birth weight and exposure to the occupation (p = 0.43).
CONCLUSION
These findings suggest that the levels of postnatal undernutrition experienced by children, adolescents and young adults exposed to food deprivation during the 1940–45 occupation of the Channel Islands were a more important determinant of CVD in later life than the levels of prenatal undernutrition experienced in utero prior to the occupation
Robust causal inference using directed acyclic graphs: the R package ‘dagitty’
Directed acyclic graphs (DAGs), which offer systematic representations of causal relationships, have become an established framework for the analysis of causal inference in epidemiology, often being used to determine covariate adjustment sets for minimizing confounding bias. DAGitty is a popular web application for drawing and analysing DAGs. Here we introduce the R package ‘dagitty’, which provides access to all of the capabilities of the DAGitty web application within the R platform for statistical computing, and also offers several new functions. We describe how the R package ‘dagitty’ can be used to: evaluate whether a DAG is consistent with the dataset it is intended to represent; enumerate ‘statistically equivalent’ but causally different DAGs; and identify exposure-outcome adjustment sets that are valid for causally different but statistically equivalent DAGs. This functionality enables epidemiologists to detect causal misspecifications in DAGs and make robust inferences that remain valid for a range of different DAGs. The R package ‘dagitty’ is available through the Comprehensive R Archive Network (CRAN) at [https://cran.r-project.org/web/packages/dagitty/]. The source code is available on GitHub at [https://github.com/jtextor/dagitty]. The web application ‘DAGitty’ is free software, licensed under the GNU General Public License (GPL) version 2 and is available at [http://dagitty.net/]
