
    Between analysis and transformation: technology, methodology and evaluation on the SPLICE project

    This paper concerns the ways in which technological change may entail methodological development in e-learning research. Our argument centres on evaluation in e-learning and on how technology can contribute to consensus-building about the value of project outcomes and to the identification of the mechanisms behind those outcomes. We argue that a critical approach to the methodology of evaluation which harnesses technology in this way is vital to agile and effective policy and strategy-making as institutions grapple with the challenges of transformation in a rapidly changing educational and technological environment. We identify Pawson and Tilley's 'Realistic Evaluation', with its focus on mechanisms and multiple stakeholder perspectives, as an appropriate methodological approach for this purpose, and we report on its use within a JISC-funded project on social software, SPLICE (Social Practices, Learning and Interoperability in Connected Environments). The project created new tools to assist the identification of mechanisms responsible for change to personal and institutional technological practice. These tools included collaborative mind-mapping and focused questioning, and tools for the animated modelling of complex mechanisms. By using these tools, large numbers of project stakeholders could engage in a process in which they were encouraged to articulate and share their theories and ideas about why project outcomes occurred. Through the technology, this process led to the identification and agreement of common mechanisms which had explanatory power for all stakeholders. In conclusion, we argue that SPLICE has shown the potential of technologically-mediated Realistic Evaluation. Given the technologies we now have, a methodology based on the mass accumulation of stakeholder theories and ideas about mechanisms is feasible. Furthermore, the summative outcomes of such a process are rich in explanatory and predictive power, and are therefore useful to the immediate and strategic problems of the sector. Finally, we argue that as well as generating better explanations for phenomena, the evaluation process can itself become transformative for stakeholders.

    A New Partitioning Around Medoids Algorithm

    Kaufman & Rousseeuw (1990) proposed the clustering algorithm Partitioning Around Medoids (PAM), which maps a distance matrix into a specified number of clusters. A particularly nice property is that PAM allows clustering with respect to any specified distance metric. In addition, the medoids are robust representations of the cluster centers, which is particularly important in the common context that many elements do not belong well to any cluster. Based on our experience in clustering gene expression data, we have noticed that PAM does have problems recognizing relatively small clusters in situations where good partitions around medoids clearly exist. In this note, we propose to partition around medoids by maximizing the criterion "Average Silhouette" defined by Kaufman & Rousseeuw. We also propose a fast-to-compute approximation of "Average Silhouette". We implement these two new partitioning-around-medoids algorithms and illustrate their performance relative to existing partitioning methods in simulations.
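    As a concrete illustration of the criterion described above, the following sketch (an illustrative Python implementation, not the authors' code) selects medoids over a precomputed distance matrix by greedily accepting swaps that increase the average silhouette width; the function names and the brute-force swap search are assumptions made for clarity.

```python
# Illustrative sketch (not the authors' implementation): greedy medoid swaps
# that maximize the average silhouette width over a precomputed distance matrix.
import numpy as np

def average_silhouette(D, labels):
    """Mean silhouette width of a clustering, given a distance matrix D."""
    n = D.shape[0]
    sil = np.zeros(n)
    for i in range(n):
        own = labels == labels[i]
        own[i] = False
        if not own.any():          # singleton cluster: silhouette taken as 0
            continue
        a = D[i, own].mean()
        b = min(D[i, labels == c].mean() for c in np.unique(labels) if c != labels[i])
        sil[i] = (b - a) / max(a, b) if max(a, b) > 0 else 0.0
    return sil.mean()

def pam_max_silhouette(D, k, n_iter=20, seed=0):
    """Partition around medoids by greedily swapping medoids to increase the
    average silhouette (a sketch of the criterion, not the exact algorithm)."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    medoids = rng.choice(n, size=k, replace=False)
    labels = np.argmin(D[:, medoids], axis=1)
    best = average_silhouette(D, labels)
    for _ in range(n_iter):
        improved = False
        for m in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids.copy()
                trial[m] = cand
                trial_labels = np.argmin(D[:, trial], axis=1)
                score = average_silhouette(D, trial_labels)
                if score > best:
                    best, medoids, labels, improved = score, trial, trial_labels, True
        if not improved:
            break
    return medoids, labels, best
```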

    Multiple Testing. Part II. Step-Down Procedures for Control of the Family-Wise Error Rate

    The present article proposes two step-down multiple testing procedures for asymptotic control of the family-wise error rate (FWER): the first procedure is based on maxima of test statistics (step-down maxT), while the second relies on minima of unadjusted p-values (step-down minP). A key feature of our approach is the test statistics null distribution (rather than the data generating null distribution) used to derive cut-offs (i.e., rejection regions) for these test statistics and the resulting adjusted p-values. For general null hypotheses, corresponding to submodels for the data generating distribution, we identify an asymptotic domination condition for a null distribution under which the step-down maxT and minP procedures asymptotically control the Type I error rate, for arbitrary data generating distributions, without the need for conditions such as subset pivotality. Inspired by this general characterization of a null distribution, we then propose as an explicit null distribution the asymptotic distribution of the vector of null-value shifted and scaled test statistics. Step-down procedures based on consistent estimators of the null distribution are shown to also provide asymptotic control of the Type I error rate. A general bootstrap algorithm is supplied to conveniently obtain consistent estimators of the null distribution.
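    The step-down maxT adjustment described above can be sketched as follows, assuming a vector of observed test statistics and a matrix of draws from an estimated joint null distribution (e.g., the bootstrap estimator mentioned in the abstract); the variable names t_obs and t_null are illustrative assumptions.

```python
# Illustrative sketch of step-down maxT adjusted p-values: `t_obs` holds the
# observed test statistics and `t_null` (B x m) holds draws from an estimated
# joint null distribution; both names are assumptions, not the paper's notation.
import numpy as np

def stepdown_maxt_adjp(t_obs, t_null):
    t_obs = np.abs(np.asarray(t_obs, dtype=float))
    t_null = np.abs(np.asarray(t_null, dtype=float))
    m = t_obs.size
    order = np.argsort(-t_obs)                 # hypotheses from most to least significant
    null_sorted = t_null[:, order]
    # successive maxima over the remaining (less significant) hypotheses
    succ_max = np.maximum.accumulate(null_sorted[:, ::-1], axis=1)[:, ::-1]
    adjp_sorted = (succ_max >= t_obs[order]).mean(axis=0)
    adjp_sorted = np.maximum.accumulate(adjp_sorted)   # enforce monotonicity
    adjp = np.empty(m)
    adjp[order] = adjp_sorted
    return adjp
```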

    Resampling-based Multiple Testing: Asymptotic Control of Type I Error and Applications to Gene Expression Data

    We define a general statistical framework for multiple hypothesis testing and show that the correct null distribution for the test statistics is obtained by projecting the true distribution of the test statistics onto the space of mean-zero distributions. For common choices of test statistics (based on an asymptotically linear parameter estimator), this distribution is asymptotically multivariate normal with mean zero and the covariance of the vector influence curve for the parameter estimator. This test statistic null distribution can be estimated by applying the non-parametric or parametric bootstrap to correctly centered test statistics. We prove that this bootstrap estimated null distribution provides asymptotic control of most type I error rates. We show that obtaining a test statistic null distribution from a data null distribution (e.g., by projecting the data generating distribution onto the space of all distributions satisfying the complete null) only provides the correct test statistic null distribution if the covariance of the vector influence curve is the same under the data null distribution as under the true data distribution. This condition is a weak version of the subset pivotality condition. We show that our multiple testing methodology controlling type I error is equivalent to constructing an error-specific confidence region for the true parameter and checking if it contains the hypothesized value. We also study the two-sample problem and show that the permutation distribution produces an asymptotically correct null distribution if (i) the sample sizes are equal or (ii) the populations have the same covariance structure. We include a discussion of the application of multiple testing to gene expression data, where the dimension typically far exceeds the sample size. An analysis of a cancer gene expression data set illustrates the methodology.
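    A minimal sketch of the centered-bootstrap idea, for the simple case of one-sample t-statistics: each bootstrap statistic is recentred at the observed estimate so that the resampled statistics approximate the mean-zero test statistic null distribution. This is an illustration under simplifying assumptions, not the paper's general estimator, and the function name is hypothetical.

```python
# Illustrative sketch (an assumption, not the paper's exact estimator): a
# nonparametric bootstrap null distribution for one-sample t-statistics,
# obtained by centering each bootstrap statistic at the observed estimate.
import numpy as np

def bootstrap_tstat_null(X, B=1000, seed=0):
    """X: n x m data matrix. Returns (t_obs, t_null), with t_null of shape B x m."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    mean_hat = X.mean(axis=0)
    sd_hat = X.std(axis=0, ddof=1)
    t_obs = np.sqrt(n) * mean_hat / sd_hat          # testing mu_j = 0 for each variable
    t_null = np.empty((B, m))
    for b in range(B):
        Xb = X[rng.integers(0, n, size=n)]          # resample subjects with replacement
        t_null[b] = np.sqrt(n) * (Xb.mean(axis=0) - mean_hat) / Xb.std(axis=0, ddof=1)
    return t_obs, t_null

# Single-step maxT cutoff at level alpha from the joint null distribution:
# c_alpha = np.quantile(np.abs(t_null).max(axis=1), 1 - alpha); reject |t_obs| > c_alpha.
```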

    Supervised Distance Matrices: Theory and Applications to Genomics

    We propose a new approach to studying the relationship between a very high dimensional random variable and an outcome. Our method is based on a novel concept, the supervised distance matrix, which quantifies pairwise similarity between variables based on their association with the outcome. A supervised distance matrix is derived in two stages. The first stage involves a transformation based on a particular model for association. In particular, one might regress the outcome on each variable and then use the residuals or the influence curve from each regression as a data transformation. In the second stage, a choice of distance measure is used to compute all pairwise distances between variables in this transformed data. When the outcome is right-censored, we show that the supervised distance matrix can be consistently estimated using inverse probability of censoring weighted (IPCW) estimators based on the mean and covariance of the transformed data. The proposed methodology is illustrated with examples of gene expression data analysis with a survival outcome. This approach is widely applicable in genomics and other fields where high-dimensional data is collected on each subject.
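    A minimal sketch of the two-stage construction for an uncensored, continuous outcome (the IPCW extension for right-censored outcomes is not shown): residuals from variable-by-variable regressions serve as the transformed data, and one minus the pairwise correlation of these residual vectors serves as the distance. The function name and the choice of simple linear regression are assumptions for illustration.

```python
# Illustrative sketch of a supervised distance matrix for an uncensored outcome
# (residual-based transformation, correlation distance; IPCW omitted).
import numpy as np

def supervised_distance_matrix(X, y):
    """X: n x m matrix of variables, y: length-n outcome.
    Stage 1: residuals of a simple linear regression of y on each variable.
    Stage 2: pairwise correlation distance between the residual vectors."""
    n, m = X.shape
    Z = np.empty((n, m))
    for j in range(m):
        slope, intercept = np.polyfit(X[:, j], y, 1)
        Z[:, j] = y - (intercept + slope * X[:, j])   # residuals as transformed data
    C = np.corrcoef(Z, rowvar=False)                  # m x m correlation of residuals
    return 1.0 - C                                    # distance = 1 - correlation
```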

    High-order curvilinear hybrid mesh generation for CFD simulations

    We describe a semi-structured method for the generation of high-order hybrid meshes suited for the simulation of high Reynolds number flows. This is achieved through the use of highly stretched elements in the viscous boundary layers near the wall surfaces. CADfix is used to first repair any possible defects in the CAD geometry and then generate a medial object based decomposition of the domain that wraps the wall boundaries with partitions suitable for the generation of either prismatic or hexahedral elements. The latter is a novel and distinctive feature of the method that makes it possible to obtain well-shaped hexahedral meshes at corners or junctions in the boundary layer. The medial object approach allows greater control over the "thickness" of the boundary-layer mesh than is generally achievable with advancing-layer techniques. CADfix subsequently generates a hybrid straight-sided mesh of prismatic and hexahedral elements in the near-field region modelling the boundary layer, and tetrahedral elements in the far-field region covering the rest of the domain. The mesh in the near-field region provides a framework that facilitates the generation, via an isoparametric technique, of layers of highly stretched elements with a distribution of points in the direction normal to the wall tailored to efficiently and accurately capture the flow in the boundary layer. The final step is the generation of a high-order mesh using NekMesh, a high-order mesh generator within the Nektar++ framework. NekMesh uses the CADfix API as a geometry engine that handles all the geometrical queries to the CAD geometry required during the high-order mesh generation process. We will describe the methodology in some detail using a simple geometry, a NACA wing tip, for illustrative purposes. Finally, we will present two examples of application to reasonably complex geometries proposed by NASA as CFD validation cases. Comment: Pre-print accepted to the 2018 AIAA Aerospace Sciences Meeting.
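    The isoparametric splitting of the boundary-layer elements relies on a wall-normal point distribution clustered toward the wall. The sketch below shows one common choice, a geometric progression of layer heights; it is only an illustration of the kind of distribution involved, not the specific distribution used by the CADfix/NekMesh pipeline, and the function and parameter names are assumptions.

```python
# Illustrative sketch only: a geometric-progression spacing of points normal to
# the wall, a common way of clustering boundary-layer points near the surface.
import numpy as np

def boundary_layer_points(first_height, growth, n_layers):
    """Wall-normal coordinates of layer interfaces, starting at the wall (0),
    with a first cell of size `first_height` and each subsequent cell
    `growth` times thicker than the previous one."""
    heights = first_height * growth ** np.arange(n_layers)
    return np.concatenate(([0.0], np.cumsum(heights)))

# Example: 20 layers, first cell height 1e-5, growth rate 1.2.
y = boundary_layer_points(1e-5, 1.2, 20)
```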

    Effective risk governance for environmental policy making: a knowledge management perspective

    Effective risk management within environmental policy making requires knowledge of natural, economic and social systems to be integrated; knowledge characterised by complexity, uncertainty and ambiguity. We describe a case study in a UK central government department exploring how risk governance supports and hinders this challenging integration of knowledge. Forty-five semi-structured interviews were completed over a two-year period. We found that lateral knowledge transfer between teams working on different policy areas was widely viewed as a key source of knowledge. However, the process of lateral knowledge transfer was predominantly informal and unsupported by risk governance structures. We argue that this made decision quality vulnerable to a loss of knowledge through staff turnover and through time and resource pressures. Our conclusion is that the predominant form of risk governance framework, with its focus on centralised decision-making and vertical knowledge transfer, is insufficient to support risk-based environmental policy making. We discuss how risk governance can better support environmental policy makers through systematic knowledge management practices.

    Multiple Testing Procedures: R multtest Package and Applications to Genomics

    The Bioconductor R package multtest implements widely applicable resampling-based single-step and stepwise multiple testing procedures (MTPs) for controlling a broad class of Type I error rates, in testing problems involving general data generating distributions (with arbitrary dependence structures among variables), null hypotheses, and test statistics. The current version of multtest provides MTPs for tests concerning means, differences in means, and regression parameters in linear and Cox proportional hazards models. Procedures are provided to control Type I error rates defined as tail probabilities for arbitrary functions of the numbers of false positives and rejected hypotheses. These error rates include tail probabilities for the number of false positives (generalized family-wise error rate, gFWER) and for the proportion of false positives among the rejected hypotheses (TPPFP). Single-step and step-down common-cut-off (maxT) and common-quantile (minP) procedures, which take into account the joint distribution of the test statistics, are proposed to control the family-wise error rate (FWER), or chance of at least one Type I error. In addition, augmentation multiple testing procedures are provided to control the gFWER and TPPFP, based on any initial FWER-controlling procedure. The results of a multiple testing procedure can be summarized using rejection regions for the test statistics, confidence regions for the parameters of interest, or adjusted p-values. A key ingredient of our proposed MTPs is the test statistics null distribution (and estimator thereof) used to derive rejection regions and corresponding confidence regions and adjusted p-values. Both bootstrap and permutation estimators of the test statistics null distribution are available. The S4 class/method object-oriented programming approach was adopted to summarize the results of an MTP. The modular design of multtest allows interested users to readily extend the package's functionality. Typical testing scenarios are illustrated by applying various MTPs implemented in multtest to the Acute Lymphoblastic Leukemia (ALL) dataset of Chiaretti et al. (2004), with the aim of identifying genes whose expression measures are associated with (possibly censored) biological and clinical outcomes.
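    The augmentation idea can be sketched as follows (an illustration of the procedure's logic, not the multtest API): starting from the rejections of an FWER-controlling procedure, ordered by significance, a gFWER(k)-controlling procedure additionally rejects the next k most significant hypotheses, and a TPPFP(q)-controlling procedure adds as many hypotheses as possible while keeping the added fraction of rejections at most q. The function and argument names are assumptions.

```python
# Illustrative sketch of the augmentation idea (not the multtest API): given
# FWER-adjusted p-values, enlarge the rejection set to control gFWER(k) or TPPFP(q).
import numpy as np

def augment_rejections(fwer_adjp, alpha=0.05, k=0, q=None):
    order = np.argsort(fwer_adjp)                  # most to least significant
    r0 = int((fwer_adjp <= alpha).sum())           # rejections of the FWER procedure
    if q is None:
        extra = k                                  # gFWER(k): add the next k hypotheses
    else:
        extra = int(np.floor(q * r0 / (1.0 - q)))  # TPPFP(q): keep added/total <= q
    n_reject = min(len(fwer_adjp), r0 + extra)
    return order[:n_reject]                        # indices of rejected hypotheses
```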

    Multiple Testing Procedures and Applications to Genomics

    This chapter proposes widely applicable resampling-based single-step and stepwise multiple testing procedures (MTPs) for controlling a broad class of Type I error rates, in testing problems involving general data generating distributions (with arbitrary dependence structures among variables), null hypotheses, and test statistics (Dudoit and van der Laan, 2005; Dudoit et al., 2004a,b; van der Laan et al., 2004a,b; Pollard and van der Laan, 2004; Pollard et al., 2005). Procedures are provided to control Type I error rates defined as tail probabilities for arbitrary functions of the numbers of Type I errors, V_n, and rejected hypotheses, R_n. These error rates include: the generalized family-wise error rate, gFWER(k) = Pr(V_n > k), or chance of at least (k+1) false positives (the special case k=0 corresponds to the usual family-wise error rate, FWER), and tail probabilities for the proportion of false positives among the rejected hypotheses, TPPFP(q) = Pr(V_n/R_n > q). Single-step and step-down common-cut-off (maxT) and common-quantile (minP) procedures, which take into account the joint distribution of the test statistics, are proposed to control the FWER. In addition, augmentation multiple testing procedures are provided to control the gFWER and TPPFP, based on any initial FWER-controlling procedure. The results of a multiple testing procedure can be summarized using rejection regions for the test statistics, confidence regions for the parameters of interest, or adjusted p-values. A key ingredient of our proposed MTPs is the test statistics null distribution (and consistent bootstrap estimator thereof) used to derive rejection regions and corresponding confidence regions and adjusted p-values. This chapter illustrates an implementation in SAS (Version 9) of the bootstrap-based single-step maxT procedure and of the gFWER- and TPPFP-controlling augmentation procedures. These multiple testing procedures are applied to an HIV-1 sequence dataset to identify codon positions associated with viral replication capacity.
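    The single-step maxT procedure can be sketched as follows (an illustrative Python version, not the SAS implementation described in the chapter): each adjusted p-value is the proportion of resampled null statistic vectors whose maximum absolute value is at least the observed absolute statistic. The variable names t_obs and t_null are assumptions.

```python
# Illustrative sketch (not the SAS code): single-step maxT adjusted p-values
# from a B x m matrix `t_null` of draws from the estimated joint null
# distribution and the vector `t_obs` of observed test statistics.
import numpy as np

def single_step_maxt_adjp(t_obs, t_null):
    max_null = np.abs(t_null).max(axis=1)            # max |T*| over hypotheses, per draw
    return (max_null[:, None] >= np.abs(t_obs)[None, :]).mean(axis=0)
```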