Boolean algebras and Lubell functions

Author: Johnston Travis
Lu Linyuan
Milans Kevin G.
Publication date: 11/07/2013

Let $2^{[n]}$ denote the power set of $[n] := \{1, 2, \ldots, n\}$. A collection $\mathcal{B} \subset 2^{[n]}$ forms a $d$-dimensional Boolean algebra if there exist pairwise disjoint sets $X_0, X_1, \ldots, X_d \subseteq [n]$, all non-empty with perhaps the exception of $X_0$, so that $\mathcal{B} = \{X_0 \cup \bigcup_{i \in I} X_i \colon I \subseteq [d]\}$. Let $b(n,d)$ be the maximum cardinality of a family $\mathcal{F} \subset 2^{[n]}$ that does not contain a $d$-dimensional Boolean algebra. Gunderson, Rödl, and Sidorenko proved that $b(n,d) \leq c_d n^{-1/2^d} \cdot 2^n$, where $c_d = 10^d 2^{-2^{1-d}} d^{d-2^{-d}}$. In this paper, we use the Lubell function as a new measurement for large families instead of cardinality. The Lubell value of a family of sets $\mathcal{F}$ with $\mathcal{F} \subseteq 2^{[n]}$ is defined by $h_n(\mathcal{F}) := \sum_{F \in \mathcal{F}} 1/\binom{n}{|F|}$. We prove the following Turán-type theorem. If $\mathcal{F} \subseteq 2^{[n]}$ contains no $d$-dimensional Boolean algebra, then $h_n(\mathcal{F}) \leq 2(n+1)^{1-2^{1-d}}$ for sufficiently large $n$. This result implies $b(n,d) \leq C n^{-1/2^d} \cdot 2^n$, where $C$ is an absolute constant independent of $n$ and $d$. As a consequence, we improve several Ramsey-type bounds on Boolean algebras. We also prove a canonical Ramsey theorem for Boolean algebras. Comment: 10 pages
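The two definitions in this abstract are easy to check on small cases. The following is a minimal sketch, not code from the paper: the function names `lubell_value` and `boolean_algebra` are hypothetical, and sets are represented as Python `frozenset`s over $\{1, \ldots, n\}$. It computes $h_n(\mathcal{F})$ exactly with rational arithmetic and builds the Boolean algebra generated by given disjoint sets $X_0, X_1, \ldots, X_d$.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def lubell_value(family, n):
    """Return h_n(F) = sum over F in family of 1 / C(n, |F|), as an exact rational."""
    return sum(Fraction(1, comb(n, len(s))) for s in family)


def boolean_algebra(x0, blocks):
    """Return all sets x0 ∪ ⋃_{i in I} blocks[i], for every index subset I.

    Assumes (as in the definition) that x0 and the blocks are pairwise
    disjoint and that every block is non-empty; this is not checked here.
    """
    d = len(blocks)
    members = set()
    for r in range(d + 1):
        for index_subset in combinations(range(d), r):
            s = set(x0)
            for i in index_subset:
                s |= set(blocks[i])
            members.add(frozenset(s))
    return members


n = 4
# The full power set: each of the n + 1 levels contributes exactly 1.
power_set = [frozenset(c) for r in range(n + 1)
             for c in combinations(range(1, n + 1), r)]
full_value = lubell_value(power_set, n)

# A 2-dimensional Boolean algebra with X_0 empty, X_1 = {1}, X_2 = {2, 3}.
algebra = boolean_algebra(set(), [{1}, {2, 3}])
algebra_value = lubell_value(algebra, n)
```

Here `full_value` equals $n + 1 = 5$, the maximum possible Lubell value, and `algebra` has $2^d = 4$ members with `algebra_value` equal to $1 + 1/4 + 1/6 + 1/4 = 5/3$.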

arXiv.org e-Print Archive

CiteSeerX

Data-based stochastic model reduction for the Kuramoto-Sivashinsky equation

Author: Chorin Alexandre J.
Lin Kevin
Lu Fei
Publication date: 09/08/2016

The problem of constructing data-based, predictive, reduced models for the Kuramoto-Sivashinsky equation is considered, under circumstances where one has observation data only for a small subset of the dynamical variables. Accurate prediction is achieved by developing a discrete-time stochastic reduced system, based on a NARMAX (Nonlinear Autoregressive Moving Average with eXogenous input) representation. The practical issue, with the NARMAX representation as with any other, is to identify an efficient structure, i.e., one with a small number of terms and coefficients. This is accomplished here by estimating coefficients for an approximate inertial form. The broader significance of the results is discussed. Comment: 23 pages, 7 figures

arXiv.org e-Print Archive

Crossref

The University of Arizona

eScholarship - University of California

Characteristics and Fertility Status of Soils and Minesoils in Selected Areas of Usibelli Coal Mine, Healy, Alaska

Author: Kaija Kevin J.
Ping Chien-Lu
Publication venue: School of Agriculture and Land Resources Management, Agricultural and Forestry Experiment Station
Publication date: 01/12/1989

Alaska has been proven to contain not only bountiful oil and gas reserves, but also vast coal fields occurring from the southcentral coastline to the Interior and the Arctic zone to the north. Because of concerns for stable sources of energy, particularly by the energy-short, industrial nations of the Orient, more exploration and stripmining for coal can be expected in the near future. Therefore, it is important to know the consequences of large-area soil disturbances in the subarctic and how the effects of man's reclamation efforts and natural processes combine in reestablishing a vegetative community. The culmination or synthesis of these processes is soil development and is of great importance in successful stripmine reclamation. The Usibelli Coal Mine Company in the Healy coal field, located in Interior Alaska, commenced stripmining in 1943. Its operation has been continuous, moving from area to area, for the last 40 years. Stripmining requires the excavation of overburden and subsequent redeposition; therefore, the Healy operation has exposed minespoils from different strata on various topography. In 1972, the Usibelli Coal Mine Company initiated a reclamation program and, over the ensuing 10 years, has seeded and fertilized over 2000 acres. Nevertheless, there remain barren areas and areas undergoing natural revegetation. Additionally, experimental trials in seeding and fertilization were started in 1980. Large areas of intact native plant communities adjoin the mined areas. The company property provides opportunities to study the processes of soil formation under different sets of conditions.
The objectives of this study were to (1) characterize the soils on the mine lease area for baseline data, (2) characterize the minesoils with various histories, (3) study the process of soil formation under different sets of conditions, and (4) evaluate the nutrient levels of both soils and minesoils to form a basis for establishing soil-handling requirements to promote reclamation practices. This study was supported by funds from the U.S. Department of Energy (AM06-76RL02229) and the U.S. Department of Agriculture Hatch project. Our appreciation to Drs. W.M. Mitchell, G.A. Mitchell, and F. Wooding of the Agricultural and Forestry Experiment Station, and Mr. J.P. Moore of the USDA Soil Conservation Service, for reviewing the manuscript and offering many useful suggestions. Our appreciation also to Dr. Milton A. Wiltse of the Division of Geological and Geophysical Surveys, Department of Natural Resources, for access to the X-ray diffractometer and technical advice. Special thanks to the Usibelli Coal Mine Inc. for logistic and technical assistance in carrying out this study.

ScholarWorks@UA

How to Host a Data Competition: Statistical Advice for Design and Analysis of a Data Competition

Author: Anderson-Cook Christine M.
Fugate Michael L.
Lu Lu
Myers Kary L.
Pawley Norma
Quinlan Kevin R.
Publication venue: 'Wiley'
Publication date: 01/01/2019

Data competitions rely on real-time leaderboards to rank competitor entries and stimulate algorithm improvement. While such competitions have become quite popular and prevalent, particularly in supervised learning formats, their implementations by the host are highly variable. Without careful planning, a supervised learning competition is vulnerable to overfitting, where the winning solutions are so closely tuned to the particular set of provided data that they cannot generalize to the underlying problem of interest to the host. Based on our experience, this paper outlines some important considerations for strategically designing relevant and informative data sets to maximize the learning outcome from hosting a competition. It also describes a post-competition analysis that enables robust and efficient assessment of the strengths and weaknesses of solutions from different competitors, as well as greater understanding of the regions of the input space that are well-solved. The post-competition analysis, which complements the leaderboard, uses exploratory data analysis and generalized linear models (GLMs). The GLMs not only expand the range of results we can explore, they also provide more detailed analysis of individual sub-questions, including similarities and differences between algorithms across different types of scenarios, universally easy or hard regions of the input space, and different learning objectives. When coupled with a strategically planned data generation approach, the methods provide richer and more informative summaries to enhance the interpretation of results beyond just the rankings on the leaderboard. The methods are illustrated with a recently completed competition to evaluate algorithms capable of detecting, identifying, and locating radioactive materials in an urban environment. Comment: 36 pages

arXiv.org e-Print Archive

Crossref

USFSP Digital Archive

Digital Commons @ University of South Florida

Pantheon 1.0, a manually verified dataset of globally famous biographies

Author: Hidalgo César A.
Hu Kevin
Lu Tiffany
Ronen Shahar
Yu Amy Zhao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/01/2016

We present the Pantheon 1.0 dataset: a manually verified dataset of individuals that have transcended linguistic, temporal, and geographic boundaries. The Pantheon 1.0 dataset includes the 11,341 biographies present in more than 25 languages in Wikipedia and is enriched with: (i) manually verified demographic information (place and date of birth, gender), (ii) a taxonomy of occupations classifying each biography at three levels of aggregation, and (iii) two measures of global popularity, including the number of languages in which a biography is present in Wikipedia (L), and the Historical Popularity Index (HPI), a metric that combines information on L, time since birth, and page-views (2008-2013). We compare the Pantheon 1.0 dataset to data from the 2003 book, Human Accomplishments, and also to external measures of accomplishment in individual games and sports: Tennis, Swimming, Car Racing, and Chess. In all of these cases we find that measures of popularity (L and HPI) correlate highly with individual accomplishment, suggesting that measures of global popularity proxy the historical impact of individuals. Comment: Scientific Data 2:15007

arXiv.org e-Print Archive

Crossref

PubMed Central