Influence of earthquake ground-motion duration on damage estimation - application to steel moment resisting frames
This paper presents an analytical study evaluating the influence of ground motion duration on structural damage of 3-story, 9-story, and 20-story SAC steel moment resisting frame buildings designed for downtown Seattle, WA, USA, using pre-Northridge codes. Two-dimensional nonlinear finite element models of the buildings are used to estimate the damage induced by the ground motions. A set of 44 ground motions is used to study the combined effect of spectral acceleration and ground motion significant duration on drift and damage measures. In addition, 10 spectrally equivalent short-duration shallow crustal ground motions and long-duration subduction zone records are selected to isolate the duration effect and assess its influence on the response. For each ground motion pair, incremental dynamic analyses are performed at a minimum of 20 intensity levels and response measures such as peak interstory drift ratio and energy dissipated are tracked. These response measures are combined into two damage metrics that account for the ductility and energy dissipation. Results indicate that the duration of the ground motion influences, above all, the combined damage measures, although some effect on drift-based response measures is also observed for larger levels of drift. These results indicate that because the current assessment methodologies do not capture the effects of ground motion duration, both performance-based and code-based assessment methodologies should be revised to consider damage measures that are sensitive to duration.
You Can’t Get There from Here: On Interpreting Learning Experiments
Artificial language learning experiments provide a unique opportunity to observe learning under controlled conditions. We cannot, however, observe what learning strategy participants use; we can only carefully design the language and observe the response. This poses an inference problem that I name the poverty of the experiment. I use computational learning models to address this inference problem, using data from an artificial grammar learning study (Saffran 2001) in which the authors conclude that participants learned hierarchical structure from distributional cues. Simulations show that learning hierarchical structure is not required to pass the tests administered in those experiments and that a heuristic learner is the best fit for the observed human performance. Artificial language learning experiments cannot in themselves provide evidence for a particular learning strategy; they must be paired with appropriate modeling work to confirm that an implementation of a proposed learning strategy actually produces the expected results.
Reconfigurations of Combinatorial Problems: Graph Colouring and Hamiltonian Cycle
We explore algorithmic aspects of two known combinatorial problems, Graph Colouring and Hamiltonian Cycle, by examining properties of their solution space. One can model the set of solutions of a combinatorial problem by the solution graph, where vertices are solutions of the problem and there is an edge between two vertices when the two corresponding solutions satisfy an adjacency reconfiguration rule. For example, we can define the reconfiguration rule for graph colouring to be that two solutions are adjacent when they differ in colour in exactly one vertex.
The exploration of the properties of the solution graph can give rise to interesting questions. The connectivity of the solution graph is the most prominent question in this research area. This is reasonable, since the main motivation for modelling combinatorial solutions as a graph is to be able to transform one into the other in a stepwise fashion, by following paths between solutions in the graph. Connectivity questions can be made binary, that is, expressed as decision problems which accept a 'yes' or 'no' answer. For example, given two specific solutions, is there a path between them? Is the graph of solutions connected?
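The reconfiguration rule and the two decision questions above can be illustrated with a small brute-force sketch. It is only an illustration for tiny graphs, not the method of the thesis: the function names (`proper_colourings`, `neighbours`, `reachable`, `path_exists`, `is_connected`) and the representation of a colouring as a tuple of integers are assumptions made here, and the search enumerates all colourings, so it scales exponentially.

```python
from collections import deque
from itertools import product


def is_proper(colouring, edges):
    """A colouring is proper when no edge joins two vertices of equal colour."""
    return all(colouring[u] != colouring[v] for u, v in edges)


def proper_colourings(n, edges, k):
    """Enumerate all proper k-colourings of a graph on vertices 0..n-1."""
    return [c for c in product(range(k), repeat=n) if is_proper(c, edges)]


def neighbours(colouring, edges, k):
    """Yield proper colourings that differ from `colouring` in exactly one vertex."""
    for v in range(len(colouring)):
        for colour in range(k):
            if colour != colouring[v]:
                candidate = colouring[:v] + (colour,) + colouring[v + 1:]
                if is_proper(candidate, edges):
                    yield candidate


def reachable(start, edges, k):
    """Breadth-first search over the solution graph, starting at `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours(current, edges, k):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def path_exists(start, target, edges, k):
    """Decision question 1: is there a path between two given solutions?"""
    return target in reachable(start, edges, k)


def is_connected(n, edges, k):
    """Decision question 2: is the whole solution graph connected?"""
    solutions = proper_colourings(n, edges, k)
    if not solutions:
        return True  # no solutions: treated as trivially connected in this sketch
    return len(reachable(solutions[0], edges, k)) == len(solutions)


if __name__ == "__main__":
    edges = [(0, 1)]  # a single edge on two vertices
    print(path_exists((0, 1), (1, 0), edges, k=2))
    print(path_exists((0, 1), (1, 0), edges, k=3))
```

On a single edge with two colours, the only two proper colourings differ in both vertices, so no single-vertex recolouring connects them; allowing a third colour creates a path between them.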
In this thesis, we first show that the diameter of the solution graph of vertex -colourings of k-colourable chordal and chordal bipartite graphs is , where and n is the number of vertices of . Then, we formulate a decision problem on the connectivity of the graph colouring solution graph, where we allow extra colours to be used in order to enforce a path between two colourings with no path between them. We give some results for general instances and we also explore what kind of graphs pose a challenge to determine the complexity of the problem for general instances. Finally, we give a linear algorithm which decides whether there is a path between two solutions of the Hamiltonian Cycle Problem for graphs of maximum degree five, thus providing insights towards the complexity classification of the decision problem.
Digital Linear Tape (DLT) technology and product family overview
The demand that began a couple of years ago for increased data storage capacity continues. Peripheral Strategies (a Santa Barbara, California, storage market research firm) projects the amount of data stored on the average enterprise network will grow by 50 percent to 100 percent per year. Furthermore, Peripheral Strategies says that a typical mid-range workstation system containing 30GB to 50GB of storage today will grow at the rate of 50 percent per year. Dan Friedlander, a Boulder, Colorado-based consultant specializing in PC-LAN backup, says, 'The average NetWare LAN is about 8GB, but there are many that have 30GB to 300GB...' The substantial growth of storage requirements has created various tape technologies that seek to satisfy the needs of today's and, especially, the next generation's systems and applications. There are five leading tape technologies in the market today: QIC (Quarter Inch Cartridge), IBM 3480/90, 8mm, DAT (Digital Audio Tape) and DLT (Digital Linear Tape). Product performance specifications and user needs have combined to classify these technologies into low-end, mid-range, and high-end systems applications. Although the manufacturers may try to position their products differently, product specifications and market requirements have determined that QIC and DAT are primarily low-end systems products while 8mm and DLT are competing for mid-range systems applications and the high-end systems space, where IBM compatibility is not required. The 3480/90 products seem to be used primarily in the IBM market, for interchangeability purposes. There are advantages and disadvantages for each of the tape technologies in the market today. We believe that DLT technology offers a significant number of very important features and specifications that make it extremely attractive for most current as well as emerging new applications, such as Hierarchical Storage Management (HSM).
This paper will demonstrate why we think that the DLT technology and family of DLT products will become the technology of choice for most new applications in the mid-range and high-end (non-IBM) markets.
ParaNames 1.0: Creating an Entity Name Corpus for 400+ Languages using Wikidata
We introduce ParaNames, a massively multilingual parallel name resource
consisting of 140 million names spanning over 400 languages. Names are provided
for 16.8 million entities, and each entity is mapped from a complex type
hierarchy to a standard type (PER/LOC/ORG). Using Wikidata as a source, we
create the largest resource of this type to date. We describe our approach to
filtering and standardizing the data to provide the best quality possible.
ParaNames is useful for multilingual language processing, both in defining
tasks for name translation/transliteration and as supplementary data for tasks
such as named entity recognition and linking. We demonstrate the usefulness of
ParaNames on two tasks. First, we perform canonical name translation between
English and 17 other languages. Second, we use it as a gazetteer for
multilingual named entity recognition, obtaining performance improvements on
all 10 languages evaluated. Comment: Accepted to LREC-COLING 2024. arXiv admin note: text overlap with
arXiv:2202.1403
The Effectiveness of Morphology-aware Segmentation in Low-Resource Neural Machine Translation
This paper evaluates the performance of several modern subword segmentation
methods in a low-resource neural machine translation setting. We compare
segmentations produced by applying BPE at the token or sentence level with
morphologically-based segmentations from LMVR and MORSEL. We evaluate
translation tasks between English and each of Nepali, Sinhala, and Kazakh, and
predict that using morphologically-based segmentation methods would lead to
better performance in this setting. However, comparing to BPE, we find that no
consistent and reliable differences emerge between the segmentation methods.
While morphologically-based methods outperform BPE in a few cases, what
performs best tends to vary across tasks, and the performance of segmentation
methods is often statistically indistinguishable. Comment: EACL 2021 Student Research Workshop
ParaNames: A Massively Multilingual Entity Name Corpus
This preprint describes work in progress on ParaNames, a multilingual
parallel name resource consisting of names for approximately 14 million
entities. The included names span over 400 languages, and almost all entities
are mapped to standardized entity types (PER/LOC/ORG). Using Wikidata as a
source, we create the largest resource of this type to date. We describe our
approach to filtering and standardizing the data to provide the best quality
possible. ParaNames is useful for multilingual language processing, both in
defining tasks for name translation/transliteration and as supplementary data
for tasks such as named entity recognition and linking. We demonstrate an
application of ParaNames by training a multilingual model for canonical name
translation to and from English. Our resource is released at
https://github.com/bltlab/paranames under a Creative Commons license (CC
BY 4.0).
B-glucosidase Mutation T352V Catalytic Efficiency and Thermal Stability
This study aimed to produce data to improve protein modeling software for B-glucosidase (BglB), a crucial enzyme in producing glucose from cellulose. It was hypothesized that BglB mutant T352V would demonstrate decreased catalytic efficiency and thermal stability compared to the wild type. The T352V mutation was first observed using Foldit Standalone modeling software. DNA sequencing and SDS-PAGE analysis confirmed mutation expression and purity. The kinetic assay indicated a decrease in catalytic efficiency in the T352V mutant. The thermostability assay showed no activity for the T352V mutant, suggesting an error occurred or the temperature range was too high.
LR-Sum: Summarization for Less-Resourced Languages
This preprint describes work in progress on LR-Sum, a new
permissively-licensed dataset created with the goal of enabling further
research in automatic summarization for less-resourced languages. LR-Sum
contains human-written summaries for 40 languages, many of which are
less-resourced. We describe our process for extracting and filtering the
dataset from the Multilingual Open Text corpus (Palen-Michel et al., 2022). The
source data is public domain newswire collected from Voice of America
websites, and LR-Sum is released under a Creative Commons license (CC BY 4.0),
making it one of the most openly-licensed multilingual summarization datasets.
We describe how we plan to use the data for modeling experiments and discuss
limitations of the dataset.
