Sentiment Analysis on New York Times Articles Data
Sentiment Analysis on New York Times Coverage Data
Departmental Affiliation: Data Science/ Political Science
College of Arts and Sciences
The extant political science literature examines media coverage of immigration and assesses the effect of that coverage on partisanship in the United States. Immigration is believed to be a unique factor that induces large-scale changes in partisanship based on race and ethnicity. The negative tone of media coverage pushes non-Latino Whites into the Republican Party, while Latinos trend toward the Democratic Party. The aim of this project is to examine New York Times data in order to identify how much immigration, specifically Latino immigration, is covered in newspaper outlets, and to determine the overall tone of these stories.
In this research, we seek to determine whether individual articles take a positive, neutral or negative stance. We achieve this using a dictionary-based approach, meaning we look at individual words to assess whether each has a positive, neutral or negative connotation. We train our data using publicly accessible sentiment dictionaries such as VADER (Valence Aware Dictionary and sEntiment Reasoner). However, this task can be difficult because the sentiment of certain words is dynamic and may be positive or negative depending on the context of the article. To resolve this issue, we use reliability measures to ensure that high-frequency words are assigned to the correct category: negative, neutral, or positive.
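As a minimal sketch of the dictionary-based scoring described above, the snippet below averages per-word valence scores and maps the mean to a label. The tiny lexicon, the `score_article` helper, and the 0.05 thresholds are illustrative assumptions; they are not the VADER dictionary or the project's actual pipeline.

```python
import re

# Toy valence lexicon with made-up scores (assumption); a real analysis
# would load a published sentiment dictionary instead.
LEXICON = {
    "crisis": -0.8, "illegal": -0.6, "threat": -0.7,
    "welcome": 0.6, "contribute": 0.5, "hope": 0.7,
}

def score_article(text, lexicon=LEXICON, threshold=0.05):
    """Return (mean valence, label) using only the words found in the lexicon."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = [lexicon[w] for w in words if w in lexicon]
    mean = sum(hits) / len(hits) if hits else 0.0
    if mean > threshold:
        label = "positive"
    elif mean < -threshold:
        label = "negative"
    else:
        label = "neutral"
    return mean, label

print(score_article("Officials warn of a border crisis and a growing threat."))
print(score_article("Communities welcome newcomers who contribute to local business."))
```

Because each word is scored out of context, a word such as "illegal" always counts as negative here, which is the limitation the reliability checks are meant to address.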
Information about the Author(s):
Faculty Sponsor(s): Professor Gregg B. Johnson and Professor Karl Schmitt
Student Contact: Gabriel Carvajal – [email protected]
Strategic Planning: Implications and Applications for Line Managers
Strategic planning is the key to producing a realistic, attractive rate of growth and a respectable return on investment. The author analyzes the steps in the planning process and looks at the environmental and cultural values which influence the strategic planner in his/her work.
Questionnaire Construction
Questionnaires used in survey research can elicit excellent data for analysis for any part of the industry. The author discusses how to design questions, construct the survey, and watch for errors in conducting the research so that the results secured advance scientific inquiry.
The evolution and star formation of dwarf galaxies in the Fornax Cluster
We present the results of a spectroscopic survey of 675 bright (16.5<Bj<18)
galaxies in a 6 degree field centred on the Fornax cluster with the FLAIR-II
spectrograph on the UK Schmidt Telescope. We measured redshifts for 516
galaxies of which 108 were members of the Fornax Cluster. Nine of these are new
cluster members previously misidentified as background galaxies. The cluster
dynamics show that the dwarf galaxies are still falling into the cluster
whereas the giants are virialised. Our spectral data reveal a higher rate of
star formation among the dwarf galaxies than suggested by morphological
classification: 35 per cent have H-alpha emission indicative of star formation
but only 19 per cent were morphologically classified as late-types. The
distribution of scale sizes is consistent with evolutionary processes which
transform late-type dwarfs to early-type dwarfs. The fraction of dwarfs with
active star formation drops rapidly towards the cluster centre. The
star-forming dwarfs are concentrated in the outer regions of the cluster, the
most extreme in an infalling subcluster. We estimate gas depletion time scales
for 5 dwarfs with detected HI emission: these are long (of order 10 Gyr),
indicating that active gas removal must be involved if they are transformed
into gas-poor dwarfs as they fall further into the cluster. In agreement with
our previous results, we find no compact dwarf elliptical (M32-like) galaxies
in the Fornax Cluster.
Comment: To appear in Monthly Notices of the Royal Astronomical Society
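A gas depletion time scale is conventionally taken as the gas mass divided by the current star formation rate. The abstract does not state the exact prescription used, so the sketch below assumes that simple ratio, and the input values are hypothetical rather than measurements from the survey.

```python
def depletion_time_gyr(gas_mass_msun, sfr_msun_per_yr):
    """Gas depletion time in Gyr, assuming t_dep = M_gas / SFR."""
    if sfr_msun_per_yr <= 0:
        raise ValueError("star formation rate must be positive")
    return gas_mass_msun / sfr_msun_per_yr / 1e9

# Hypothetical dwarf: 1e8 solar masses of HI forming stars at 0.01 solar masses per year.
print(depletion_time_gyr(1e8, 0.01))
```

With these assumed inputs the ratio is of order 10 Gyr, which illustrates why passive consumption alone would be too slow to produce gas-poor dwarfs.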
USING A MULTIPLE PRODUCT AND MULTIPLE INPUT APPROACH TO DAIRY PROFIT MAXIMIZATION: A SIMULATION USING OPERATIONS RESEARCH METHODS
Dairy producers generally take a single output/multiple input approach when making production decisions. Under component pricing, with large variance in individual component prices, a multiple output/multiple input approach maximizes profits. This paper applied our approach to the individual farm milk production decision.
Livestock Production/Industries, Productivity Analysis
The Impact of Fathers' Job Loss during the Recession of the 1980s on their Children's Educational Attainment and Labour Market Outcomes
Effects of Incomplete Ionization on Beta - Ga2O3 Power Devices: Unintentional Donor with Energy 110 meV
Understanding the origin of unintentional doping in Ga2O3 is key to
increasing breakdown voltages of Ga2O3 based power devices. Therefore,
transport and capacitance spectroscopy studies have been performed to better
understand the origin of unintentional doping in Ga2O3. Previously unobserved
unintentional donors in commercially available (-201) Ga2O3 substrates have
been electrically characterized via temperature dependent Hall effect
measurements up to 1000 K and found to have a donor energy of 110 meV. The
existence of the unintentional donor is confirmed by temperature dependent
admittance spectroscopy, with an activation energy of 131 meV determined via
that technique, in agreement with Hall effect measurements. With the
concentration of this donor determined to be in the mid to high 10^16 cm^-3
range, elimination of this donor from the drift layer of Ga2O3 power
electronics devices will be key to pushing the limits of device performance.
Indeed, analytical assessment of the specific on-resistance (Ronsp) and
breakdown voltage of Schottky diodes containing the 110 meV donor indicates
that incomplete ionization increases Ronsp and decreases breakdown voltage as
compared to Ga2O3 Schottky diodes containing only the shallow donor. The
reduced performance due to incomplete ionization occurs in addition to the
usual tradeoff between Ronsp and breakdown voltage. To achieve 10 kV operation
in Ga2O3 Schottky diode devices, analysis indicates that the concentration of
110 meV donors must be reduced below 5x10^14 cm^-3 to limit the increase in
Ronsp to one percent.
Comment: 23 pages, 8 figures
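To illustrate what incomplete ionization means for a 110 meV donor, the sketch below solves the textbook charge-neutrality condition for a single, uncompensated donor level in a non-degenerate semiconductor. This is not the paper's analysis: the effective mass (0.28 m0), the degeneracy factor (2), the 30 meV comparison level, and the neglect of compensation and of the coexisting shallow donor are all assumptions made for illustration.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
H = 6.62607015e-34      # Planck constant, J s
M0 = 9.1093837015e-31   # electron rest mass, kg
Q = 1.602176634e-19     # joules per electronvolt

def effective_dos_cm3(temp_k, m_eff=0.28):
    """Conduction-band effective density of states in cm^-3 (m_eff is assumed)."""
    n_c_m3 = 2.0 * (2.0 * math.pi * m_eff * M0 * K_B * temp_k / H**2) ** 1.5
    return n_c_m3 * 1e-6

def ionized_fraction(n_d_cm3, e_d_ev, temp_k, g=2.0, m_eff=0.28):
    """Fraction of donors ionized, from n = N_D / (1 + g * (n / N_c) * exp(E_D / kT)).

    Rearranged, this is the quadratic a*n^2 + n - N_D = 0
    with a = (g / N_c) * exp(E_D / kT).
    """
    a = (g / effective_dos_cm3(temp_k, m_eff)) * math.exp(e_d_ev * Q / (K_B * temp_k))
    n = (-1.0 + math.sqrt(1.0 + 4.0 * a * n_d_cm3)) / (2.0 * a)
    return n / n_d_cm3

# Compare the 110 meV donor with a hypothetical 30 meV shallow donor at 300 K.
for energy_ev in (0.110, 0.030):
    print(energy_ev, ionized_fraction(5e16, energy_ev, 300.0))
```

Under these assumptions the deeper level is only partly ionized at room temperature, so it contributes fewer free carriers than its concentration suggests.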
Spatial Variability in Column CO2 Inferred from High Resolution GEOS-5 Global Model Simulations: Implications for Remote Sensing and Inversions
Column CO2 observations from current and future remote sensing missions represent a major advancement in our understanding of the carbon cycle and are expected to help constrain source and sink distributions. However, data assimilation and inversion methods are challenged by the difference in scale of models and observations. OCO-2 footprints represent an area of several square kilometers while NASA's future ASCENDS lidar mission is likely to have an even smaller footprint. In contrast, the grid cells of models used in global inversions are typically hundreds of kilometers wide and often cover areas that include combinations of land, ocean and coastal areas and areas of significant topographic, land cover, and population density variations. To improve understanding of scales of atmospheric CO2 variability and representativeness of satellite observations, we will present results from a global, 10-km simulation of meteorology and atmospheric CO2 distributions performed using NASA's GEOS-5 general circulation model. This resolution, typical of mesoscale atmospheric models, represents an order of magnitude increase in resolution over typical global simulations of atmospheric composition, allowing new insight into small scale CO2 variations across a wide range of surface flux and meteorological conditions. The simulation includes high resolution flux datasets provided by NASA's Carbon Monitoring System Flux Pilot Project at half degree resolution that have been down-scaled to 10-km using remote sensing datasets. Probability distribution functions are calculated over larger areas more typical of global models (100-400 km) to characterize subgrid-scale variability in these models. Particular emphasis is placed on coastal regions and regions containing megacities and fires to evaluate the ability of coarse resolution models to represent these small scale features.
Additionally, model output is sampled using averaging kernels characteristic of OCO-2 and ASCENDS measurement concepts to create realistic pseudo-datasets. Pseudo-data are averaged over coarse model grid cell areas to better understand the ability of measurements to characterize CO2 distributions and spatial gradients on both short (daily to weekly) and long (monthly to seasonal) time scales.
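The idea of characterizing subgrid-scale variability can be sketched by tiling a fine-resolution field into coarse cells and summarizing the values inside each cell. The example below reports only the mean and standard deviation per coarse cell, not a full probability distribution function, and the 4 x 4 field of column CO2 values in ppm is synthetic.

```python
from statistics import mean, pstdev

def subgrid_stats(field, factor):
    """Tile a 2-D list into factor x factor blocks; return (mean, std) per block."""
    n_rows, n_cols = len(field), len(field[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    stats = []
    for i in range(0, n_rows, factor):
        row = []
        for j in range(0, n_cols, factor):
            block = [field[a][b]
                     for a in range(i, i + factor)
                     for b in range(j, j + factor)]
            row.append((mean(block), pstdev(block)))
        stats.append(row)
    return stats

# Synthetic fine-grid field (ppm); one fine cell carries a local enhancement,
# standing in for a point source such as a megacity or fire.
field = [
    [400.0, 400.0, 401.0, 401.0],
    [400.0, 400.0, 401.0, 405.0],
    [399.0, 399.0, 400.0, 400.0],
    [399.0, 399.0, 400.0, 400.0],
]
for row in subgrid_stats(field, 2):
    print(row)
```

A coarse cell that contains the enhancement has a nonzero spread, which is the information a coarse-resolution model cannot represent and a single small satellite footprint may or may not sample.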
Demographic estimation methods for plants with dormancy
Demographic studies in plants appear simple because, unlike animals, plants do not run away. Plant individuals can be marked with, e.g., plastic tags, but often the coordinates of an individual may be sufficient to identify it. Vascular plants in temperate latitudes have a pronounced seasonal life–cycle, so most plant demographers survey their study plots once a year, often during or shortly after flowering. Life–states are pervasive in plants, hence the results of a demographic study for an individual can be summarized in a familiar encounter history, such as 0VFVVF000. A zero means that an individual was not seen in a year and a letter denotes its state for years when it was seen aboveground. V and F here stand for vegetative and flowering states, respectively. Probabilities of survival and state transitions can then be obtained by mere counting.
Problems arise when there is an unobservable dormant state, i.e., when plants may stay belowground for one or more growing seasons. Encounter histories such as 0VF00F000 may then occur where the meaning of zeroes becomes ambiguous. A zero can either mean a dead or a dormant plant. Various ad hoc methods in wide use among plant ecologists have made strong assumptions about when a zero should be equated to a dormant individual. These methods have never been compared with each other. In our talk and in Kéry et al. (submitted), we show that these ad hoc estimators provide spurious estimates of survival and should not be used.
In contrast, if detection probabilities for aboveground plants are known or can be estimated, capture–recapture (CR) models can be used to estimate probabilities of survival and state–transitions and the fraction of the population that is dormant. We have used this approach in two studies of terrestrial orchids, Cleistes bifaria (Kéry et al., submitted) and Cypripedium reginae (Kéry & Gregg, submitted) in West Virginia, U.S.A. For Cleistes, our data comprised one population with a total of 620 marked ramets over 10 years, and for Cypripedium, two populations with 98 and 258 marked ramets over 11 years. We chose the ramet (= single stem or shoot) as the demographic unit of our study since there was no way of distinguishing among genets (genet = genetical individual, i.e., the "individual" that animal ecologists are mostly concerned with). This will introduce some non–independence into the data, which can nevertheless be dealt with easily by correcting variances for overdispersion. Using ramets instead of genets has the further advantage that individuals can be assigned to a state such as flowering or vegetative in an unambiguous manner. This is not possible when genets are the demographic units. In all three populations, auxiliary data were available to show that detection probability of aboveground plants was > 0.995.
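The counting approach can be sketched as follows, under the assumptions that there is no dormant state and that aboveground plants are always detected, so a zero after the first sighting means death. The helper names are hypothetical, and the example uses the encounter history from the text.

```python
from collections import Counter

def count_transitions(histories):
    """Tally (state this year, state next year) pairs for plants seen aboveground."""
    counts = Counter()
    for history in histories:
        for now, nxt in zip(history, history[1:]):
            if now in ("V", "F"):          # only live, observed plants are at risk
                counts[(now, nxt)] += 1
    return counts

def survival(counts, state):
    """Share of plants in `state` that were seen again the following year."""
    at_risk = sum(n for (s, _), n in counts.items() if s == state)
    if at_risk == 0:
        raise ValueError("no plants observed in state " + state)
    return (at_risk - counts[(state, "0")]) / at_risk

counts = count_transitions(["0VFVVF000"])
print(dict(counts))
print(survival(counts, "V"), survival(counts, "F"))
```

The leading zero is ignored because the plant had not yet been recorded, and only the first zero after a sighting is counted as a death.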
We fitted multistate models in program MARK by specifying three states (D, V, F), even though the dormant state D does not occur in the encounter histories. Detection probability is fixed at 1 for the vegetative (V) and the flowering state (F) and at zero for the dormant state (D). Rates of survival and of state transitions as well as slopes of covariate relationships can be estimated, and likelihood ratio tests (LRT) or AIC can be used to select among models. To estimate the fraction of the population in the unobservable dormant state, the encounter histories are collapsed to 0 (plant not observed aboveground) and 1 (plant observed aboveground). The Cormack–Jolly–Seber model without constraints on detection probability is used to estimate detection probability, the complement of which is the estimated fraction of the population in the dormant state.
Parameter identifiability is an important issue in multistate models. We used the Catchpole–Morgan–Freeman approach to determine which parameters are estimable in principle in our multistate models. Most of the 15 tested models were indeed estimable with the notable exception of the most general model, which has fully interactive state- and time-dependent survival and state transition rates. This model would become identifiable if at least some plants were excavated in years when they do not show up aboveground.
Our analyses of the three populations of Cleistes and Cypripedium yielded annual ramet survival rates ranging from 0.86 to 0.96. Estimates of the average fraction dormant ranged from 0.02 to 0.30, but with up to half a population in the dormant state in some years. Ultrastructural modeling enables interesting hypotheses to be tested about, for instance, the relationships of demographic rates with climatic covariates. Such covariate modeling makes the CR approach particularly interesting for evolutionary–ecological questions about, e.g., the adaptive significance of the dormant state.
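The sketch below shows, under simplifying assumptions, how a multistate model assigns a probability to an encounter history when a zero may mean either dormancy or death. It is a plain forward recursion with time-constant parameters and is not program MARK's implementation; all parameter values are hypothetical. The `collapse` helper illustrates the 0/1 recoding used for the Cormack–Jolly–Seber step.

```python
STATES = ("V", "F", "D", "dead")

def history_likelihood(history, survival, psi):
    """Probability of an encounter history, conditional on the first sighting.

    survival[s]: annual survival for live state s in {"V", "F", "D"}.
    psi[s][t]:   probability of moving from s to t, given survival.
    Detection is fixed at 1 for V and F and at 0 for D.
    """
    first = next(i for i, c in enumerate(history) if c != "0")
    alpha = {s: 0.0 for s in STATES}
    alpha[history[first]] = 1.0
    for obs in history[first + 1:]:
        new = {s: 0.0 for s in STATES}
        for s, mass in alpha.items():
            if s == "dead":
                new["dead"] += mass            # death is absorbing
                continue
            new["dead"] += mass * (1.0 - survival[s])
            for t, p in psi[s].items():
                new[t] += mass * survival[s] * p
        # A zero is compatible with dormant or dead; a letter fixes the state.
        allowed = ("D", "dead") if obs == "0" else (obs,)
        alpha = {s: (new[s] if s in allowed else 0.0) for s in STATES}
    return sum(alpha.values())

def collapse(history):
    """Recode a state history to 0 (not seen aboveground) and 1 (seen)."""
    return "".join("0" if c == "0" else "1" for c in history)

# Hypothetical parameter values, chosen only for illustration.
survival = {"V": 0.9, "F": 0.9, "D": 0.9}
psi = {
    "V": {"V": 0.6, "F": 0.3, "D": 0.1},
    "F": {"V": 0.5, "F": 0.4, "D": 0.1},
    "D": {"V": 0.5, "F": 0.2, "D": 0.3},
}
print(history_likelihood("V0V", survival, psi))
print(collapse("0VF00F000"))
```

A plant that reappears after a zero, as in V0V, can only have been dormant, which is how histories with gaps carry information about the dormant state.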
Previous and foreseeable future applications of CR in plant ecology
Since the paper by Alexander et al. (1997), it has become increasingly clear that CR models may be useful for demographic analysis of plant populations. In the future, we are likely to see increasing use of these methods that were originally developed for animal populations. Here is a summary of all previous applications that I have come across. I would be grateful if readers pointed out to me any titles that I may have missed.
If a reliable way to mark seeds can be devised, CR might indeed provide the analysis tool for tackling one of the ultimate frontiers in plant population ecology: the dynamics of the seed bank. Indeed, the first ever application of CR to plants that I have come across (Naylor, 1972) used a fluorescent dye to mark seeds and a Lincoln–Petersen-type estimator to estimate the seed bank size in an agricultural weed. The application of CR to plants with dormancy has been treated by Shefferson et al. (2001, 2003), Kéry et al. (submitted) and Kéry & Gregg (submitted). Population size and survival rates of plants whose aboveground states are easily overlooked have been estimated for an elusive prairie plant (Alexander et al., 1997; Slade et al., 2003) and for a tropical savannah tree (Lahoreau et al., 2003). For plot–based plant demographic studies, we have shown previously that (not surprisingly) different life–states may have different detection probabilities, and that this may seriously bias inference from population modelling (Kéry & Gregg, 2003).
It is somewhat astonishing that there still appear to be no applications of CR to the analysis of plant populations and communities. For instance, species richness, patch occupancy, population extinction rates, and species turnover in communities are all still based on adding up the raw data, even though the animal literature has plenty of papers showing more adequate ways of estimating these quantities (e.g., Boulinier et al., 1998; Nichols et al., 1998). I have submitted a note (Kéry, submitted) describing the use of the Cormack–Jolly–Seber model to estimate extinction probabilities for plant populations in a manner exactly analogous to patch occupancy models (MacKenzie et al., 2002, 2003). It is perhaps in plant community ecology where we will see most future applications of CR.
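For readers unfamiliar with it, the basic Lincoln–Petersen estimator divides the product of the number of marked individuals and the later sample size by the number of marked individuals recovered in that sample. The sketch below uses that basic form with hypothetical seed counts; the exact variant used in the seed bank study is not given here.

```python
def lincoln_petersen(marked, sampled, recaptured):
    """Basic Lincoln-Petersen population estimate: N = M * C / R."""
    if recaptured <= 0:
        raise ValueError("need at least one recaptured (marked) individual")
    if recaptured > min(marked, sampled):
        raise ValueError("recaptures cannot exceed marked or sampled counts")
    return marked * sampled / recaptured

# Hypothetical example: 200 dyed seeds mixed into the soil, 150 seeds later
# sampled, of which 30 carry the dye.
print(lincoln_petersen(200, 150, 30))
```

The estimate rests on the assumption that marked and unmarked seeds are equally likely to be sampled, which is why a reliable marking method matters.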
