Do Neural Ranking Models Intensify Gender Bias?
Concerns regarding the footprint of societal biases in information retrieval
(IR) systems have been raised in several previous studies. In this work, we
examine various recent IR models from the perspective of the degree of gender
bias in their retrieval results. To this end, we first provide a bias
measurement framework which includes two metrics to quantify the degree of the
unbalanced presence of gender-related concepts in a given IR model's ranking
list. To examine IR models by means of the framework, we create a dataset of
non-gendered queries, selected by human annotators. Applying these queries to
the MS MARCO Passage retrieval collection, we then measure the gender bias of a
BM25 model and several recent neural ranking models. The results show that
while all models are strongly biased toward males, the neural models, and in
particular the ones based on contextualized embedding models, significantly
intensify gender bias. Our experiments also show an overall increase in the
gender bias of neural models when they exploit transfer learning, namely when
they use (already biased) pre-trained embeddings.

Comment: In Proceedings of ACM SIGIR 202
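The abstract does not spell out its two metrics, but a rank-discounted exposure score is one common way to quantify the unbalanced presence of gendered terms in a ranking list. A minimal sketch under that assumption — the word lists, the `rank_bias` name, and the 1/rank discount are all illustrative, not the paper's actual framework:

```python
from collections import Counter

# Hypothetical gender-concept lexicons; the paper's lists may differ.
MALE_TERMS = {"he", "him", "his", "man", "men", "male"}
FEMALE_TERMS = {"she", "her", "hers", "woman", "women", "female"}

def rank_bias(ranked_docs, cutoff=10):
    """Signed bias score for one ranking: positive means more exposure of
    male-associated terms, negative more female-associated terms.
    Each document's term counts are discounted by its rank position."""
    score = 0.0
    for rank, doc in enumerate(ranked_docs[:cutoff], start=1):
        counts = Counter(doc.lower().split())
        male = sum(counts[t] for t in MALE_TERMS)
        female = sum(counts[t] for t in FEMALE_TERMS)
        score += (male - female) / rank  # top-ranked documents weigh more
    return score
```

Averaging such a score over a set of non-gendered queries would then let one ranking model be compared against another, which is the kind of comparison the abstract describes.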
Data augmentation and semi-supervised learning for deep neural networks-based text classifier
User feedback is essential for understanding user needs. In this paper, we use free-text responses obtained from a survey on sleep-related issues to build a deep neural networks-based text classifier. However, training the deep neural networks model requires a large amount of labelled data. To reduce manual data labelling, we propose a method that combines data augmentation and pseudo-labelling: data augmentation is applied to the labelled data to increase the size of the initial training set, and the trained model is then used to annotate unlabelled data with pseudo-labels. The results show that the model with data augmentation achieves a macro-averaged F1 score of 65.2% using 4,300 training examples, whereas the model without data augmentation achieves a macro-averaged F1 score of 68.2% with around 14,000 training examples. Furthermore, with the addition of pseudo-labelling, the model achieves a macro-averaged F1 score of 62.7% using only 1,400 labelled training examples. In other words, the proposed method reduces the amount of labelled data needed for training while achieving relatively good performance.
Language Models for Image Captioning: The Quirks and What Works
Two recent approaches have achieved state-of-the-art results in image
captioning. The first uses a pipelined process where a set of candidate words
is generated by a convolutional neural network (CNN) trained on images, and
then a maximum entropy (ME) language model is used to arrange these words into
a coherent sentence. The second uses the penultimate activation layer of the
CNN as input to a recurrent neural network (RNN) that then generates the
caption sequence. In this paper, we compare the merits of these different
language modeling approaches for the first time by using the same
state-of-the-art CNN as input. We examine issues in the different approaches,
including linguistic irregularities, caption repetition, and data set overlap.
By combining key aspects of the ME and RNN methods, we achieve a new record
performance over previously published results on the benchmark COCO dataset.
However, the gains we see in BLEU do not translate to human judgments.

Comment: See http://research.microsoft.com/en-us/projects/image_captioning for project information.
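The closing point — that BLEU gains need not track human judgment — follows from BLEU being a surface n-gram overlap statistic. A minimal sketch of its core, modified unigram precision; the example strings and the unigram-only simplification are illustrative, not the paper's evaluation code:

```python
from collections import Counter

def bleu1_precision(candidate, reference):
    """Modified unigram precision, the core of BLEU-1: each candidate
    word is credited at most as many times as it appears in the
    reference, so repeated words cannot inflate the score."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    clipped = sum(min(n, ref_counts[w]) for w, n in cand_counts.items())
    return clipped / sum(cand_counts.values())
```

A caption can score well on such overlap while still reading as repetitive or linguistically odd to a human rater, which is consistent with the mismatch the abstract reports.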
