GPU-TLS: an efficient runtime for speculative loop parallelization on GPUs
GPUs have recently emerged as an important parallel platform for general-purpose applications in both HPC and cloud environments. Because of their special execution model, developing programs for GPUs is difficult even with high-level languages such as CUDA and OpenCL. To ease the programming effort, some research has proposed automatically generating parallel GPU code with complex compile-time techniques. However, this approach can only parallelize loops that are entirely free of inter-iteration dependencies (i.e., DOALL loops). To exploit runtime parallelism that cannot be proven by static analysis, we propose GPU-TLS, a runtime system that speculatively parallelizes possibly-parallel loops in sequential programs on GPUs. GPU-TLS parallelizes a possibly-parallel loop by chopping it into smaller sub-loops, each of which is executed in parallel by a GPU kernel, speculating that no inter-iteration dependencies exist. After dependency checking, the buffered writes of iterations without mis-speculation are copied to the master memory, while iterations that encountered mis-speculation are re-executed. GPU-TLS addresses several key problems of speculative loop parallelization on GPUs: (1) The higher mis-speculation rate caused by the larger number of threads is reduced by three approaches: the loop-chopping parallelization approach, the deferred memory update scheme, and the intra-warp value forwarding method. (2) The larger overhead of dependency checking is reduced by a hybrid scheme: eager intra-warp dependency checking combined with lazy inter-warp dependency checking. (3) The bottleneck of serial commit is alleviated by a parallel commit scheme, which allows different iterations to enter the commit phase out of order but still guarantees sequential semantics.
Extensive evaluations using both microbenchmarks and real-life applications on two recent NVIDIA GPU cards show that speculative loop parallelization using GPU-TLS can achieve speedups ranging from 5 to 160 for sequential programs with possibly-parallel loops. © 2013 IEEE.
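The chop, speculate, check, then commit-or-re-execute cycle described in this abstract can be illustrated with a small CPU-side simulation. This is only a sketch under stated assumptions, not the GPU-TLS implementation: the names `run_speculative`, `body`, `load`, `store`, and `chunk_size` are hypothetical, the sub-loop iterations run one after another in Python instead of as GPU threads, the commit is done in iteration order (the abstract's parallel commit and intra-warp value forwarding are not modeled), and a mis-speculated iteration is simply re-run against the master memory.

```python
from typing import Callable, Dict, List, Set

# Hypothetical loop-body signature: body(i, load, store), where load(addr)
# reads one memory cell and store(addr, value) writes one.
LoopBody = Callable[[int, Callable[[int], int], Callable[[int, int], None]], None]


def run_speculative(body: LoopBody, n_iters: int, memory: List[int],
                    chunk_size: int) -> int:
    """Run iterations 0..n_iters-1 of `body`, updating `memory` in place.

    Returns how many iterations had to be re-executed.
    """
    reexecuted = 0
    for start in range(0, n_iters, chunk_size):
        stop = min(start + chunk_size, n_iters)
        snapshot = list(memory)  # state seen by every speculative iteration
        buffers: List[Dict[int, int]] = []
        read_sets: List[Set[int]] = []

        # Phase 1: speculative execution; writes go to a private buffer.
        for i in range(start, stop):
            writes: Dict[int, int] = {}
            reads: Set[int] = set()

            def load(addr, writes=writes, reads=reads):
                if addr in writes:      # an iteration sees its own writes
                    return writes[addr]
                reads.add(addr)         # record reads of shared state
                return snapshot[addr]

            def store(addr, value, writes=writes):
                writes[addr] = value

            body(i, load, store)
            buffers.append(writes)
            read_sets.append(reads)

        # Phase 2: dependency check and in-order commit.
        written: Set[int] = set()       # cells written by earlier iterations
        for k, i in enumerate(range(start, stop)):
            if read_sets[k] & written:
                # Mis-speculation: the iteration read a stale value, so it
                # is re-executed directly against the master memory.
                reexecuted += 1
                redo: Dict[int, int] = {}

                def load_master(addr):
                    return memory[addr]

                def store_master(addr, value, redo=redo):
                    memory[addr] = value
                    redo[addr] = value

                body(i, load_master, store_master)
                written.update(redo)
            else:
                # No conflict: copy the buffered writes to master memory.
                for addr, value in buffers[k].items():
                    memory[addr] = value
                written.update(buffers[k])
    return reexecuted


# Usage: a histogram-style loop whose index array (made-up data) has only
# occasional repeats, so most iterations commit without re-execution.
idx = [0, 1, 2, 0, 3, 4, 5, 5]


def histogram_body(i, load, store):
    store(idx[i], load(idx[i]) + 1)


counts = [0] * 6
redone = run_speculative(histogram_body, len(idx), counts, chunk_size=4)
print(counts, redone)  # [2, 1, 1, 1, 1, 2] 2
```

In this sketch a smaller `chunk_size` lowers the chance that two conflicting iterations land in the same sub-loop, at the cost of less parallel work per sub-loop, which mirrors the motivation the abstract gives for loop chopping.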
Towards payment-bound analysis in cloud systems with task-prediction errors
Conference Theme: Change we are leading
In modern cloud systems, optimizing the user service level based on virtual resources customized on demand is a critical issue. In this paper, we comprehensively analyze the payment bound under a cloud model with virtual machines (VMs), taking into account that a task's workload may be predicted with errors. The analysis is based on an optimized resource allocation algorithm with polynomial time complexity. We theoretically derive the upper bound of task payment for a given margin of workload prediction error. We also extend the payment-minimization algorithm to adapt to dynamic changes in host availability over time, and evaluate it in a real cluster environment with 56 VMs deployed. Experiments confirm the correctness of our theoretical inference and show that our payment-minimization solution keeps 95% of user payments below 1.15 times the theoretical ideal payment computed with hypothetically accurate information. The ratio for the remaining user payments can be limited to about 1.5 in the worst case.
An Overview of the Use of Neural Networks for Data Mining Tasks
In recent years, the area of data mining has experienced considerable demand for technologies that extract knowledge from large and complex data sources. There is substantial commercial interest, as well as research activity, in developing new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular, biologically inspired intelligent methodologies whose classification, prediction, and pattern recognition capabilities have been used successfully in many areas, including science, engineering, medicine, business, banking, and telecommunication. This paper highlights, from a data mining perspective, the implementation of NN using supervised and unsupervised learning for pattern recognition, classification, prediction, and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks.
Graphene for spintronics: giant Rashba splitting due to hybridization with Au
Graphene in spintronics has so far primarily meant spin current leads of high
performance because the intrinsic spin-orbit coupling of its pi-electrons is
very weak. If a large spin-orbit coupling could be created by a proximity
effect, the material could also form active elements of a spintronic device
such as the Das-Datta spin field-effect transistor; however, metal interfaces
often compromise the band dispersion of massless Dirac fermions. Our
measurements show that Au intercalation at the graphene-Ni interface creates a
giant spin-orbit splitting (~100 meV) in the graphene Dirac cone up to the
Fermi energy. Photoelectron spectroscopy reveals hybridization with Au-5d
states as the source for the giant spin-orbit splitting. An ab initio model of
the system shows a Rashba-split dispersion with the analytically predicted
gapless band topology around the Dirac point of graphene and indicates that a
sharp graphene-Au interface at equilibrium distance will account for only ~10
meV spin-orbit splitting. The ab initio calculations suggest an enhancement due
to Au atoms that get closer to the graphene and do not violate the sublattice
symmetry.
Comment: 16 pages (3 figures) + supplementary information 16 pages (14
figures)
SILAC-based proteomic quantification of chemoattractant-induced cytoskeleton dynamics on a second to minute timescale
Cytoskeletal dynamics during cell behaviours ranging from endocytosis and exocytosis to cell division and movement is controlled by a complex network of signalling pathways, the full details of which are as yet unresolved. Here we show that SILAC-based proteomic methods can be used to characterize the rapid chemoattractant-induced dynamic changes in the actin–myosin cytoskeleton and regulatory elements on a proteome-wide scale with a second to minute timescale resolution. This approach provides novel insights into the ensemble kinetics of key cytoskeletal constituents and the association of known and newly identified binding proteins. We validate the proteomic data by detailed microscopy-based analysis of in vivo translocation dynamics for key signalling factors. This rapid large-scale proteomic approach may be applied to other situations where highly dynamic changes in complex cellular compartments are expected to play a key role.
XRCC2 R188H (rs3218536), XRCC3 T241M (rs861539) and R243H (rs77381814) single nucleotide polymorphisms in cervical cancer risk
Human Papillomavirus (HPV) is the main cause of cervical cancer and its precursor lesions. Transformation may be induced by several mechanisms, including oncogene activation and genome instability. Individual differences in DNA damage recognition and repair have been hypothesized to influence cervical cancer risk. The aim of this study was to evaluate whether the double strand break repair gene polymorphisms XRCC2 R188H G>A (rs3218536), XRCC3 T241M C>T (rs861539) and R243H G>A (rs77381814) are associated with cervical cancer in Argentine women. A case-control study consisting of 322 samples (205 cases and 117 controls) was carried out. HPV DNA detection was performed by PCR and genotyping of positive samples by EIA (enzyme immunoassay). XRCC2 and XRCC3 polymorphisms were determined by pyrosequencing. The HPV-adjusted odds ratio (OR) of XRCC2 188 GG/AG genotypes was OR = 2.4 (CI = 1.1-4.9, p = 0.02) for cervical cancer. In contrast, there was no increased risk for cervical cancer with XRCC3 241 TT/CC genotypes (OR = 0.48; CI = 0.2-1; p = 0.1) or XRCC3 241 CT/CC (OR = 0.87; CI = 0.52-1.4; p = 0.6). Regarding XRCC3 R243H, the G allele was almost fixed in the population studied. In conclusion, although the sample size was modest, the present data indicate a statistical association between cervical cancer and XRCC2 R188H polymorphism. Future studies are needed to confirm these findings.
Affiliation: Perez, Luis Orlando. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico CONICET- La Plata. Instituto de Genética Veterinaria "Ing. Fernando Noel Dulout". Universidad Nacional de La Plata. Facultad de Ciencias Veterinarias. Instituto de Genética Veterinaria; Argentina
Affiliation: Crivaro, Andrea Natalia. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico CONICET- La Plata. Instituto de Genética Veterinaria "Ing. Fernando Noel Dulout". Universidad Nacional de La Plata. Facultad de Ciencias Veterinarias. Instituto de Genética Veterinaria; Argentina
Affiliation: Barbisan, Gisela. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico CONICET- La Plata. Instituto de Genética Veterinaria "Ing. Fernando Noel Dulout". Universidad Nacional de La Plata. Facultad de Ciencias Veterinarias. Instituto de Genética Veterinaria; Argentina
Affiliation: Poleri, Lucía Belén. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico CONICET- La Plata. Instituto de Genética Veterinaria "Ing. Fernando Noel Dulout". Universidad Nacional de La Plata. Facultad de Ciencias Veterinarias. Instituto de Genética Veterinaria; Argentina
Affiliation: Golijow, Carlos Daniel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico CONICET- La Plata. Instituto de Genética Veterinaria "Ing. Fernando Noel Dulout". Universidad Nacional de La Plata. Facultad de Ciencias Veterinarias. Instituto de Genética Veterinaria; Argentina
Gauged Flavor Group with Left-Right Symmetry
We construct an anomaly-free extension of the left-right symmetric model,
where the maximal flavor group is gauged and anomaly cancellation is guaranteed
by adding new vectorlike fermion states. We address the question of the lowest
allowed flavor symmetry scale consistent with data. Because of the mechanism
recently pointed out by Grinstein et al., tree-level flavor changing neutral
currents turn out to play a very weak constraining role. The same occurs, in
our model, for electroweak precision observables. The main constraint turns out
to come from WR-mediated flavor changing neutral current box diagrams,
primarily K - Kbar mixing. In the case where discrete parity symmetry is
present at the TeV scale, this constraint implies lower bounds on the mass of
vectorlike fermions and flavor bosons of 5 and 10 TeV respectively. However,
these limits are weakened under the condition that only SU(2)_R x U(1)_{B-L} is
restored at the TeV scale, but not parity. For example, assuming the SU(2)
gauge couplings in the ratio gR/gL approx 0.7 allows the above limits to go
down by half for both vectorlike fermions and flavor bosons. Our model provides
a framework for accommodating neutrino masses and, in the parity symmetric
case, provides a solution to the strong CP problem. The bound on the lepton
flavor gauging scale is somewhat stronger, because of Big Bang Nucleosynthesis
constraints. We argue, however, that the applicability of these constraints
depends on the mechanism at work for the generation of neutrino masses.
Comment: 1+23 pages, 1 table, 5 figures. v3: some more textual fixes (main
change: discussion of Lepton Flavor Violating observables rephrased). Matches
journal version
Testing and comparing two self-care-related instruments among older Chinese adults
Objectives: The study aimed to test and compare the reliability and validity, including sensitivity and specificity, of two self-care-related instruments, the Self-care Ability Scale for the Elderly (SASE) and the Appraisal of Self-care Agency Scale-Revised (ASAS-R), among older adults in the Chinese context. Methods: A cross-sectional design was used to conduct this study. The sample consisted of 1152 older adults. Data were collected by a questionnaire including the Chinese version of SASE (SASE-CHI), the Chinese version of ASAS-R (ASAS-R-CHI) and the Exercise of Self-Care Agency scale (ESCA). Homogeneity and stability, content, construct and concurrent validity, and sensitivity and specificity were assessed. Results: The Cronbach's alpha (α) of SASE-CHI was 0.89, the item-to-total correlations ranged from r = 0.15 to r = 0.81, and the test-retest correlation coefficient (intra-class correlation coefficient, ICC) was 0.99 (95% CI, 0.99–1.00; P<0.001). The Cronbach's α of ASAS-R-CHI was 0.78, the item-to-total correlations ranged from r = 0.20 to r = 0.65, and the test-retest ICC was 0.95 (95% CI, 0.92–0.96; P<0.001). The content validity index (CVI) of SASE-CHI and ASAS-R-CHI was 0.96 and 0.97, respectively. The findings of exploratory and confirmatory factor analyses (EFA and CFA) confirmed a good construct validity of SASE-CHI and ASAS-R-CHI. The Pearson's rank correlation coefficients, as a measure of concurrent validity, between the total scores of SASE-CHI and ESCA and of ASAS-R-CHI and ESCA were 0.65 (P<0.001) and 0.62 (P<0.001), respectively. Regarding ESCA as the criterion, the areas under the receiver operating characteristic (ROC) curve for the cut-points of SASE-CHI and ASAS-R-CHI were 0.93 (95% CI, 0.91–0.94) and 0.83 (95% CI, 0.80–0.86), respectively. Conclusion: There is no significant difference between the two instruments. Each has its own characteristics, but SASE-CHI is more suitable for older adults. The key point is that users can choose the most appropriate scale according to the specific situation.
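For readers unfamiliar with the homogeneity statistic reported in this abstract, the standard definition of Cronbach's alpha for k items is α = k/(k−1) · (1 − Σ item variances / variance of total scores). The sketch below computes it in Python; the function name `cronbach_alpha` and the score matrix are hypothetical illustrations and are not taken from the study's data or analysis code.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha; scores[r][j] is respondent r's score on item j."""
    k = len(scores[0])  # number of items in the scale
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Made-up example: 4 respondents answering a 3-item scale.
example = [
    [3, 4, 3],
    [2, 2, 3],
    [4, 5, 5],
    [1, 2, 1],
]
print(round(cronbach_alpha(example), 3))
```

Values closer to 1 indicate that the items vary together, which is the sense in which the abstract uses α as a measure of homogeneity.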
Neutrino Mass, Sneutrino Dark Matter and Signals of Lepton Flavor Violation in the MRSSM
We study the phenomenology of mixed-sneutrino dark matter in the Minimal
R-Symmetric Supersymmetric Standard Model (MRSSM). Mixed sneutrinos fit
naturally within the MRSSM, as the smallness (or absence) of neutrino Yukawa
couplings singles out sneutrino A-terms as the only ones not automatically
forbidden by R-symmetry. We perform a study of randomly generated sneutrino
mass matrices and find that (i) the measured value of the dark matter relic
abundance is well within the range of typical values obtained for the relic
abundance of the lightest sneutrino, (ii) with small lepton-number-violating
mass terms for the right-handed sneutrinos, random matrices satisfying the
relic abundance constraint have a decent probability of satisfying direct
detection constraints, and much of the remaining parameter space will be
probed by upcoming experiments, (iii) these lepton-number-violating mass terms
radiatively generate appropriately small Majorana neutrino
masses, with neutrino oscillation data favoring a mostly sterile lightest
sneutrino with a dominantly mu/tau-flavored active component, and (iv) a
sneutrino LSP with a significant mu component can lead to striking signals of
e-mu flavor violation in dilepton invariant-mass distributions at the LHC.
Comment: Revised collider analysis in Sec. 5 after fixing error in particle
spectrum, References added
Beyond the culture effect on credibility perception on microblogs
We investigated the credibility perception of tweet readers from the USA and from eight Arabic countries; our aim was to understand whether credibility was affected by country and/or by culture. Results from a crowd-sourcing experiment showed that a wide variety of factors affected credibility perception, including a tweet author's gender, profile image, username style, location, and social network overlap with the reader. We found that culture determines readers' credibility perception, but country has no effect. We discuss the implications of our findings for user interface design and social media systems.
