
    Boolean algebras and Lubell functions

    Let $2^{[n]}$ denote the power set of $[n] := \{1, 2, \ldots, n\}$. A collection $\mathcal{B} \subset 2^{[n]}$ forms a $d$-dimensional Boolean algebra if there exist pairwise disjoint sets $X_0, X_1, \ldots, X_d \subseteq [n]$, all non-empty with perhaps the exception of $X_0$, so that $\mathcal{B} = \{X_0 \cup \bigcup_{i \in I} X_i \colon I \subseteq [d]\}$. Let $b(n,d)$ be the maximum cardinality of a family $\mathcal{F} \subset 2^{[n]}$ that does not contain a $d$-dimensional Boolean algebra. Gunderson, Rödl, and Sidorenko proved that $b(n,d) \leq c_d n^{-1/2^d} \cdot 2^n$, where $c_d = 10^d 2^{-2^{1-d}} d^{d-2^{-d}}$. In this paper, we use the Lubell function as a new measurement for large families instead of cardinality. The Lubell value of a family of sets $\mathcal{F}$ with $\mathcal{F} \subseteq 2^{[n]}$ is defined by $h_n(\mathcal{F}) := \sum_{F \in \mathcal{F}} 1/\binom{n}{|F|}$. We prove the following Turán-type theorem. If $\mathcal{F} \subseteq 2^{[n]}$ contains no $d$-dimensional Boolean algebra, then $h_n(\mathcal{F}) \leq 2(n+1)^{1-2^{1-d}}$ for sufficiently large $n$. This result implies $b(n,d) \leq C n^{-1/2^d} \cdot 2^n$, where $C$ is an absolute constant independent of $n$ and $d$. As a consequence, we improve several Ramsey-type bounds on Boolean algebras. We also prove a canonical Ramsey theorem for Boolean algebras. Comment: 10 pages
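
The Lubell value defined in the abstract above can be computed directly for small families. The following is a minimal sketch, not taken from the paper; the helper names `lubell_value` and `power_set` are hypothetical and chosen only for illustration. It uses exact rational arithmetic so the sums are not affected by floating-point rounding.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def lubell_value(family, n):
    """Return h_n(F): the sum over F in the family of 1 / C(n, |F|).

    `family` is an iterable of subsets of {1, ..., n} (illustrative helper).
    """
    return sum(Fraction(1, comb(n, len(s))) for s in family)


def power_set(n):
    """All subsets of {1, ..., n}, as frozensets (illustrative helper)."""
    ground = range(1, n + 1)
    return [frozenset(c) for k in range(n + 1) for c in combinations(ground, k)]


n = 4
full = power_set(n)
# Each of the n + 1 levels contributes C(n, k) * 1/C(n, k) = 1.
print(lubell_value(full, n))    # 5

# A single level (here all 2-element subsets) has Lubell value 1.
middle = [s for s in full if len(s) == 2]
print(lubell_value(middle, n))  # 1
```

Since the full power set has Lubell value $n+1$, the quantity $h_n(\mathcal{F})$ always lies between $0$ and $n+1$, which gives a sense of scale for the bound $2(n+1)^{1-2^{1-d}}$ quoted in the abstract.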

    Data-based stochastic model reduction for the Kuramoto--Sivashinsky equation

    The problem of constructing data-based, predictive, reduced models for the Kuramoto-Sivashinsky equation is considered, under circumstances where one has observation data only for a small subset of the dynamical variables. Accurate prediction is achieved by developing a discrete-time stochastic reduced system, based on a NARMAX (Nonlinear Autoregressive Moving Average with eXogenous input) representation. The practical issue, with the NARMAX representation as with any other, is to identify an efficient structure, i.e., one with a small number of terms and coefficients. This is accomplished here by estimating coefficients for an approximate inertial form. The broader significance of the results is discussed. Comment: 23 pages, 7 figures
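
To make the NARMAX acronym in the abstract above concrete, here is a toy one-variable simulation. It is only a sketch of the general model class and is not the reduced model from the paper: the function name `simulate_narmax`, the cubic nonlinearity, and all coefficient values are assumptions made for illustration.

```python
import random


def simulate_narmax(u, a=0.5, b=-0.1, c=0.3, d=0.2, sigma=0.1, seed=0):
    """Simulate a toy scalar NARMAX recursion (all coefficients are assumed).

    x_n = a*x_{n-1} + b*x_{n-1}**3   (nonlinear autoregressive part)
          + d*u_{n-1}                (exogenous input)
          + xi_n + c*xi_{n-1}        (moving average of Gaussian noise)
    """
    rng = random.Random(seed)
    x_prev, xi_prev = 0.0, 0.0
    path = []
    for u_prev in u:
        xi = rng.gauss(0.0, sigma)
        x = a * x_prev + b * x_prev**3 + d * u_prev + xi + c * xi_prev
        path.append(x)
        x_prev, xi_prev = x, xi
    return path


# Fifty steps with no exogenous forcing; the seed makes the run repeatable.
path = simulate_narmax([0.0] * 50)
print(len(path))
```

The "efficient structure" question raised in the abstract corresponds, in this toy setting, to deciding which terms (here the linear, cubic, input, and noise-lag terms) to keep and how to estimate their coefficients from data.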

    Characteristics and Fertility Status of Soils and Minesoils in Selected Areas of Usibelli Coal Mine, Healy, Alaska

    Alaska has been proven to contain not only bountiful oil and gas reserves, but also vast coal fields occurring from the southcentral coastline to the Interior and the Arctic zone to the north. Because of concerns for stable sources of energy, particularly by the energy-short, industrial nations of the Orient, more exploration and stripmining for coal can be expected in the near future. Therefore, it is important to know the consequences of large-area soil disturbances in the subarctic and how the effects of man's reclamation efforts and natural processes combine in reestablishing the vegetative community. The culmination or synthesis of these processes is soil development, which is of great importance in successful stripmine reclamation. The Usibelli Coal Mine Company in the Healy coal field, located in Interior Alaska, commenced stripmining in 1943. Its operation has been continuous, moving from area to area, for the last 40 years. Stripmining requires the excavation of overburden and its subsequent redeposition; the Healy operation has therefore exposed minespoils from different strata on various topography. In 1972, the Usibelli Coal Mine Company initiated a reclamation program and, over the ensuing 10 years, has seeded and fertilized over 2000 acres. Nevertheless, there remain barren areas and areas undergoing natural revegetation. Additionally, experimental trials in seeding and fertilization were started in 1980. Large areas of intact native plant communities adjoin the mined areas. The company property provides opportunities to study the processes of soil formation under different sets of conditions.
The objectives of this study were (1) to characterize the soils on the mine lease area for baseline data, (2) to characterize the minesoils with various histories, (3) to study the process of soil formation under different sets of conditions, and (4) to evaluate the nutrient levels of both soils and minesoils to form a basis for establishing soil-handling requirements to promote reclamation practices. This study was supported by funds from the U.S. Department of Energy (AM06-76RL02229) and the U.S. Department of Agriculture Hatch project. Our appreciation to Drs. W.M. Mitchell, G.A. Mitchell, and F. Wooding of the Agricultural and Forestry Experiment Station, and Mr. J.P. Moore of the USDA Soil Conservation Service, for reviewing the manuscript and offering many useful suggestions. Our appreciation also to Dr. Milton A. Wiltse of the Division of Geological and Geophysical Surveys, Department of Natural Resources, for access to the X-ray diffractometer and technical advice. Special thanks to the Usibelli Coal Mine Inc. for logistic and technical assistance in carrying out this study.

    How to Host a Data Competition: Statistical Advice for Design and Analysis of a Data Competition

    Data competitions rely on real-time leaderboards to rank competitor entries and stimulate algorithm improvement. While such competitions have become quite popular and prevalent, particularly in supervised learning formats, their implementations by the host are highly variable. Without careful planning, a supervised learning competition is vulnerable to overfitting, where the winning solutions are so closely tuned to the particular set of provided data that they cannot generalize to the underlying problem of interest to the host. Based on our experience, this paper outlines some important considerations for strategically designing relevant and informative data sets to maximize the learning outcome from hosting a competition. It also describes a post-competition analysis that enables robust and efficient assessment of the strengths and weaknesses of solutions from different competitors, as well as greater understanding of the regions of the input space that are well-solved. The post-competition analysis, which complements the leaderboard, uses exploratory data analysis and generalized linear models (GLMs). The GLMs not only expand the range of results we can explore, but also provide more detailed analysis of individual sub-questions, including similarities and differences between algorithms across different types of scenarios, universally easy or hard regions of the input space, and different learning objectives. When coupled with a strategically planned data generation approach, the methods provide richer and more informative summaries to enhance the interpretation of results beyond just the rankings on the leaderboard. The methods are illustrated with a recently completed competition to evaluate algorithms capable of detecting, identifying, and locating radioactive materials in an urban environment. Comment: 36 pages

    Pantheon 1.0, a manually verified dataset of globally famous biographies

    We present the Pantheon 1.0 dataset: a manually verified dataset of individuals that have transcended linguistic, temporal, and geographic boundaries. The Pantheon 1.0 dataset includes the 11,341 biographies present in more than 25 languages in Wikipedia and is enriched with: (i) manually verified demographic information (place and date of birth, gender), (ii) a taxonomy of occupations classifying each biography at three levels of aggregation, and (iii) two measures of global popularity: the number of languages in which a biography is present in Wikipedia (L), and the Historical Popularity Index (HPI), a metric that combines information on L, time since birth, and page-views (2008-2013). We compare the Pantheon 1.0 dataset to data from the 2003 book, Human Accomplishments, and also to external measures of accomplishment in individual games and sports: Tennis, Swimming, Car Racing, and Chess. In all of these cases we find that measures of popularity (L and HPI) correlate highly with individual accomplishment, suggesting that measures of global popularity proxy the historical impact of individuals. Comment: Scientific Data 2:15007