"I'm" Lost in Translation: Pronoun Missteps in Crowdsourced Data Sets
As virtual assistants continue to be taken up globally, there is an
ever-greater need for these speech-based systems to communicate naturally in a
variety of languages. Crowdsourcing initiatives have focused on multilingual
translation of big, open data sets for use in natural language processing
(NLP). Yet, language translation is often not one-to-one, and biases can
trickle in. In this late-breaking work, we focus on the case of pronouns
translated between English and Japanese in the crowdsourced Tatoeba database.
We found that masculine pronoun biases were present overall, even though
plurality in language was accounted for in other ways. Importantly, we detected
biases in the translation process that reflect nuanced reactions to the
presence of feminine, neutral, and/or non-binary pronouns. We raise the issue
of translation bias for pronouns and offer a practical solution to embed
plurality in NLP data sets.
Comment: 6 page
Exoskeleton for the Mind: Exploring Strategies Against Misinformation with a Metacognitive Agent
Misinformation is a global problem on modern social media platforms, with few
solutions known to be effective. Social media platforms have offered tools to
raise awareness of information, but these are closed systems that have not been
empirically evaluated. Others have developed novel tools and strategies, but
most have been studied out of context using static stimuli, researcher prompts,
or low fidelity prototypes. We offer a new anti-misinformation agent grounded
in theories of metacognition that was evaluated within Twitter. We report on a
pilot study (n=17) and multi-part experimental study (n=57, n=49) where
participants experienced three versions of the agent, each deploying a
different strategy. We found that no single strategy was superior to the
control. We also confirmed the necessity of transparency and clarity about the
agent's underlying logic, as well as concerns about repeated exposure to
misinformation and lack of user engagement.
Comment: Pages 209-22
An investigation on the muscle synergy model into signal source and muscle type discrimination through hand motion estimation
"I'm" Lost in Translation: Crowdsourced Mistranslations of Pronouns in NLP Data Sets
Initiatives in natural language processing (NLP) for training natural conversation and communication between people and virtual assistants (VAs) are taking off. Many are gathering and offering large, crowdsourced data sets for this purpose. An ongoing challenge is that most are in English and geared towards the US context. In recognition of this, new efforts have been undertaken to ensure that a diversity of languages and cultural contexts is represented. Most are crowdsourced translation activities. However, critical analyses have raised concerns about biases in translation, notably around gender and word choice. One issue that remains under-explored is how pronouns are translated from language to language. Languages vary greatly in pronoun use. For example, in English, a small set of pronouns is commonly used, but in Japanese, pronouns are less common and arguably more varied. Translators need to make careful, case-by-case choices about how to translate pronouns, and also be aware of the cultural context of variable pronouns such as "they/them," which can be interpreted as a neutral plural pronoun or a singular non-binary pronoun. As a first step, we evaluated pronoun translations in four open NLP data sets that were translated from English to Japanese through crowdsourcing in two respects: (i) the relative number of pronouns linked to certain genders and (ii) whether or not direct translations of pronouns were faithfully preserved. We aim to raise awareness of sociocultural biases that may be embedded in translation work and offer a technical solution to account for pronoun diversity and context-sensitive variability.
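The two checks named in this abstract, (i) counting pronouns by associated gender and (ii) testing whether the gender category of a source pronoun survives translation, could be sketched roughly as follows. This is a minimal illustration, not the authors' method: the pronoun lexicons, the gender labels, the `audit` helper, and the three sample sentence pairs are all hypothetical and are not drawn from the data sets.

```python
import re
from collections import Counter

# Hypothetical, non-exhaustive lexicons (assumptions for illustration only).
EN_PRONOUNS = {
    "he": "masculine", "him": "masculine", "his": "masculine",
    "she": "feminine", "her": "feminine", "hers": "feminine",
    "they": "neutral", "them": "neutral", "their": "neutral",
}
# Labelling 彼ら as "neutral" is a simplifying assumption; real usage is more nuanced,
# and naive matching will also hit compounds such as 彼氏.
JA_PRONOUNS = {"彼女": "feminine", "彼ら": "neutral", "彼": "masculine"}

# Longest alternatives first, so 彼女 and 彼ら are not matched as bare 彼.
JA_PATTERN = re.compile("|".join(sorted(JA_PRONOUNS, key=len, reverse=True)))


def en_genders(sentence):
    """Count gender categories of English pronouns in a sentence."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return Counter(EN_PRONOUNS[t] for t in tokens if t in EN_PRONOUNS)


def ja_genders(sentence):
    """Count gender categories of Japanese pronouns in a sentence."""
    return Counter(JA_PRONOUNS[m] for m in JA_PATTERN.findall(sentence))


def audit(pairs):
    """Return per-language gender totals and the number of 'preserved' pairs.

    A pair counts as preserved when every gender category found in the
    English sentence also appears in the Japanese one. Pairs without any
    English pronoun are trivially preserved under this definition.
    """
    totals = {"en": Counter(), "ja": Counter()}
    preserved = 0
    for en, ja in pairs:
        e, j = en_genders(en), ja_genders(ja)
        totals["en"] += e
        totals["ja"] += j
        if set(e) <= set(j):
            preserved += 1
    return totals, preserved


# Made-up sentence pairs for demonstration; not taken from any data set.
PAIRS = [
    ("She is a doctor.", "彼女は医者です。"),
    ("He is a teacher.", "彼は先生です。"),
    ("They are a student.", "彼は学生です。"),
]

totals, preserved = audit(PAIRS)
print(totals["en"], totals["ja"], preserved)
```

On these three made-up pairs, the third is flagged as not preserved, because a neutral English pronoun is rendered with a masculine Japanese one. A real audit would need a much fuller lexicon and handling for Japanese sentences that drop the pronoun entirely.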
Extracting the factors influencing chlorophyll-a concentrations in the Nakdong River using a decision tree algorithm
Analysis of Causal Relationships for Nutrient Removal of Activated Sludge Process Based on Structural Equation Modeling Approaches
The removal process of activated sludge in sewage treatment plants is highly nonlinear, and removal performance has a complex causal relationship with environmental factors, influent load, and operating factors. In this study, structural equation modeling was used to identify how causal relationships are expressed in collected data. First, path modeling was attempted as a preliminary step in structural equation model (SEM) construction; the nutrient-removal mechanism could not be sufficiently represented as direct causal relationships between measured variables. However, the SEMs deduced for effluent total nitrogen (T-N) and total phosphorus (T-P) concentrations, accompanied by exploratory factor analysis to extract latent variables, formed a causal network that describes the direct or indirect effects of the latent factors and measured variables. This study therefore suggests that it is possible to construct an SEM explaining the nutrient-removal mechanism of the activated-sludge process with latent variables. Moreover, nonlinear features embedded in the mechanism could be represented by SEM, which is a model based on linearity, by including causal relations and variables that were not derived by path analysis. This attempt to model the direct and indirect causalities of the process could enhance understanding of the process and support decision making, such as changing operating conditions as required.
- …
