28 research outputs found

    Scalable Annotation of Fine-Grained Categories Without Experts

    We present a crowdsourcing workflow to collect image annotations for visually similar synthetic categories without requiring experts. In animals, there is a direct link between taxonomy and visual similarity: e.g. a collie (type of dog) looks more similar to other collies (e.g. smooth collie) than to a greyhound (another type of dog). However, in synthetic categories such as cars, objects with similar taxonomy can have very different appearance: e.g. a 2011 Ford F-150 Supercrew-HD looks the same as a 2011 Ford F-150 Supercrew-LL but very different from a 2011 Ford F-150 Supercrew-SVT. We introduce a graph-based crowdsourcing algorithm to automatically group visually indistinguishable objects together. Using our workflow, ~1,000 Amazon Mechanical Turk workers labeled 712,430 images, resulting in the largest fine-grained visual dataset reported to date, with 2,657 categories of cars annotated at 1/20th the cost of hiring experts. Comment: CHI 201
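The abstract does not spell out the graph-based algorithm, so the following is only a minimal sketch of one plausible reading: treat each category as a node, add an edge whenever workers judge two categories visually indistinguishable, and merge connected components with union-find. The function name `group_indistinguishable` and the input format are assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict


def group_indistinguishable(categories, same_pairs):
    """Group categories linked by 'looks the same' judgments.

    categories: iterable of category names (hypothetical input format).
    same_pairs: iterable of (a, b) pairs judged visually indistinguishable.
    Returns a sorted list of sorted groups (connected components).
    """
    parent = {c: c for c in categories}

    def find(x):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in same_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two components

    groups = defaultdict(list)
    for c in categories:
        groups[find(c)].append(c)
    return sorted(sorted(g) for g in groups.values())


if __name__ == "__main__":
    cars = [
        "2011 Ford F-150 Supercrew-HD",
        "2011 Ford F-150 Supercrew-LL",
        "2011 Ford F-150 Supercrew-SVT",
    ]
    judged_same = [("2011 Ford F-150 Supercrew-HD", "2011 Ford F-150 Supercrew-LL")]
    for group in group_indistinguishable(cars, judged_same):
        print(group)
```

With the F-150 example from the abstract, the HD and LL trims end up in one group and the SVT trim stays on its own.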

    Fine-Grained Car Detection for Visual Census Estimation

    Targeted socioeconomic policies require an accurate understanding of a country's demographic makeup. To that end, the United States spends more than 1 billion dollars a year gathering census data such as race, gender, education, occupation and unemployment rates. Compared to the traditional method of collecting surveys across many years, which is costly and labor intensive, data-driven machine learning approaches are cheaper and faster, with the potential to detect trends in close to real time. In this work, we leverage the ubiquity of Google Street View images and develop a computer vision pipeline to predict income, per capita carbon emission, crime rates and other city attributes from a single source of publicly available visual data. We first detect cars in 50 million images across 200 of the largest US cities and train a model to predict demographic attributes using the detected cars. To facilitate our work, we have collected the largest and most challenging fine-grained dataset reported to date, consisting of over 2600 classes of cars, composed of images from Google Street View and other web sources and classified by car experts to account for even the most subtle of visual differences. We use this data to construct the largest-scale fine-grained detection system reported to date. Our prediction results correlate well with ground truth income data (r=0.82), Massachusetts department of vehicle registration, and sources investigating crime rates, income segregation, per capita carbon emission, and other market research. Finally, we learn interesting relationships between cars and neighborhoods, allowing us to perform the first large-scale sociological analysis of cities using computer vision techniques. Comment: AAAI 201

    Using Deep Learning and Google Street View to Estimate the Demographic Makeup of the US

    The United States spends more than $1B each year on initiatives such as the American Community Survey (ACS), a labor-intensive door-to-door study that measures statistics relating to race, gender, education, occupation, unemployment, and other demographic factors. Although a comprehensive source of data, the lag between demographic changes and their appearance in the ACS can exceed half a decade. As digital imagery becomes ubiquitous and machine vision techniques improve, automated data analysis may provide a cheaper and faster alternative. Here, we present a method that determines socioeconomic trends from 50 million images of street scenes, gathered in 200 American cities by Google Street View cars. Using deep learning-based computer vision techniques, we determined the make, model, and year of all motor vehicles encountered in particular neighborhoods. Data from this census of motor vehicles, which enumerated 22M automobiles in total (8% of all automobiles in the US), was used to accurately estimate income, race, education, and voting patterns, with single-precinct resolution. (The average US precinct contains approximately 1000 people.) The resulting associations are surprisingly simple and powerful. For instance, if the number of sedans encountered during a 15-minute drive through a city is higher than the number of pickup trucks, the city is likely to vote for a Democrat during the next Presidential election (88% chance); otherwise, it is likely to vote Republican (82%). Our results suggest that automated systems for monitoring demographic trends may effectively complement labor-intensive approaches, with the potential to detect trends with fine spatial resolution, in close to real time. Comment: 41 pages including supplementary material. Under review at PNA
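The sedan-versus-pickup association quoted in the abstract amounts to a one-line decision rule. The sketch below encodes it directly. The function name `predict_vote` and the return format are illustrative assumptions, and the probabilities are the two figures stated in the abstract, not outputs of any fitted model.

```python
def predict_vote(sedans, pickups):
    """Apply the abstract's rule of thumb to counts from a 15-minute drive.

    Returns (predicted_party, probability_quoted_in_abstract).
    Ties fall under the abstract's "otherwise" branch.
    """
    if sedans > pickups:
        return ("Democrat", 0.88)
    return ("Republican", 0.82)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(predict_vote(sedans=120, pickups=45))
    print(predict_vote(sedans=30, pickups=80))
```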

    Model Cards for Model Reporting

    Trained machine learning models are increasingly used to perform high-impact tasks in areas such as law enforcement, medicine, education, and employment. In order to clarify the intended use cases of machine learning models and minimize their usage in contexts for which they are not well suited, we recommend that released models be accompanied by documentation detailing their performance characteristics. In this paper, we propose a framework that we call model cards, to encourage such transparent model reporting. Model cards are short documents accompanying trained machine learning models that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type) that are relevant to the intended application domains. Model cards also disclose the context in which models are intended to be used, details of the performance evaluation procedures, and other relevant information. While we focus primarily on human-centered machine learning models in the application fields of computer vision and natural language processing, this framework can be used to document any trained machine learning model. To solidify the concept, we provide cards for two supervised models: one trained to detect smiling faces in images, and one trained to detect toxic comments in text. We propose model cards as a step towards the responsible democratization of machine learning and related AI technology, increasing transparency into how well AI technology works. We hope this work encourages those releasing trained machine learning models to accompany model releases with similar detailed evaluation numbers and other relevant documentation.
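As a rough illustration of what such a document could hold, the sketch below models a card as a small data structure covering the three elements the abstract names: intended use, evaluation procedure, and evaluation broken down by group. The class name, field names, and the `worst_group` helper are hypothetical and do not reproduce the paper's template.

```python
from dataclasses import dataclass, field


@dataclass
class ModelCard:
    """Hypothetical minimal model card; field names are assumptions."""
    model_name: str
    intended_use: str
    evaluation_procedure: str
    # Maps a group label (possibly intersectional) to {metric_name: value}.
    metrics_by_group: dict = field(default_factory=dict)

    def worst_group(self, metric):
        """Return the group with the lowest value for `metric`, or None."""
        scored = {
            group: values[metric]
            for group, values in self.metrics_by_group.items()
            if metric in values
        }
        if not scored:
            return None
        return min(scored, key=scored.get)


if __name__ == "__main__":
    # Placeholder values for illustration, not results from the paper.
    card = ModelCard(
        model_name="smiling-detector",
        intended_use="Illustrative example only",
        evaluation_procedure="Held-out test set, per-group accuracy",
        metrics_by_group={
            "group A": {"accuracy": 0.91},
            "group B": {"accuracy": 0.87},
        },
    )
    print(card.worst_group("accuracy"))
```

Keeping per-group metrics in one mapping makes disparities easy to surface, which is the kind of transparency the abstract argues for.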

    Image Counterfactual Sensitivity Analysis for Detecting Unintended Bias

    Facial analysis models are increasingly used in applications that have serious impacts on people's lives, ranging from authentication to surveillance tracking. It is therefore critical to develop techniques that can reveal unintended biases in facial classifiers to help guide the ethical use of facial analysis technology. This work proposes a framework called image counterfactual sensitivity analysis, which we explore as a proof-of-concept in analyzing a smiling attribute classifier trained on faces of celebrities. The framework utilizes counterfactuals to examine how a classifier's prediction changes if a face characteristic slightly changes. We leverage recent advances in generative adversarial networks to build a realistic generative model of face images that affords controlled manipulation of specific image characteristics. We then introduce a set of metrics that measure the effect of manipulating a specific property on the output of the trained classifier. Empirically, we find several different factors of variation that affect the predictions of the smiling classifier. This proof-of-concept demonstrates potential ways generative models can be leveraged for fine-grained analysis of bias and fairness. Comment: Presented at CVPR 2019 Workshop on Fairness Accountability Transparency and Ethics in Computer Visio
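The abstract does not define its metrics, so the sketch below shows two generic quantities one might compute for this kind of analysis: the mean change in classifier score between original and counterfactual images, and the fraction of images whose thresholded prediction flips. The function name, the 0.5 threshold, and both metric definitions are assumptions for illustration, not the paper's metrics.

```python
def sensitivity_metrics(original_scores, counterfactual_scores, threshold=0.5):
    """Compare classifier scores before and after manipulating one attribute.

    Both inputs are equal-length sequences of scores in [0, 1], where item i
    of each sequence refers to the same underlying face image.
    """
    if not original_scores or len(original_scores) != len(counterfactual_scores):
        raise ValueError("score lists must be non-empty and of equal length")

    pairs = list(zip(original_scores, counterfactual_scores))
    n = len(pairs)

    # Average signed change in score caused by the manipulation.
    mean_shift = sum(cf - orig for orig, cf in pairs) / n

    # Share of images whose predicted label changes at the given threshold.
    flips = sum((orig >= threshold) != (cf >= threshold) for orig, cf in pairs)

    return {"mean_shift": mean_shift, "flip_rate": flips / n}


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(sensitivity_metrics([0.2, 0.6, 0.9], [0.4, 0.4, 0.8]))
```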

    Community Driven Approaches to Research in Technology & Society CCC Workshop Report

    Based on our workshop activities, we outlined three ways in which research can support community needs: (1) Mapping the ecosystem of both the players and ecosystem and harm landscapes, (2) Counter-Programming, which entails using the same surveillance tools that communities are subjected to in order to observe the entities doing the surveilling, effectively protecting people from surveillance, and conducting ethical data collection to measure the impact of these technologies, and (3) Engaging in positive visions and tools for empowerment so that technology can bring good instead of harm. In order to effectively collaborate on the aforementioned directions, we outlined seven important mechanisms for effective collaboration: (1) Never expect free labor of community members, (2) Ensure goals are aligned between all collaborators, (3) Elevate community members to leadership positions, (4) Understand no group is a monolith, (5) Establish a common language, (6) Discuss organization roles and goals of the project transparently from the start, and (7) Enable a recourse for harm. We recommend that anyone engaging in community-based research (1) starts with community-defined solutions, (2) provides alternatives to digital services/information collecting mechanisms, (3) prohibits harmful automated systems, (4) transparently states any system's impact, (5) minimizes and protects data, (6) proactively demonstrates a system is safe and beneficial prior to deployment, and (7) provides resources directly to community partners. Throughout the recommendation section of the report, we also provide specific recommendations for funding agencies, academic institutions, and individual researchers.

    Whose Side are Ethics Codes On? Power, Responsibility and the Social Good

    The moral authority of ethics codes stems from an assumption that they serve a unified society, yet this ignores the political aspects of any shared resource. The sociologist Howard S. Becker challenged researchers to clarify their power and responsibility in the classic essay: Whose Side Are We On? Building on Becker's hierarchy of credibility, we report on a critical discourse analysis of data ethics codes and emerging conceptualizations of beneficence, or the "social good", of data technology. The analysis revealed that ethics codes from corporations and professional associations conflated consumers with society and were largely silent on agency. Interviews with community organizers about social change in the digital era supplement the analysis, surfacing the limits of technical solutions to concerns of marginalized communities. Given evidence that highlights the gulf between the documents and lived experiences, we argue that ethics codes that elevate consumers may simultaneously subordinate the needs of vulnerable populations. Understanding contested digital resources is central to the emerging field of public interest technology. We introduce the concept of digital differential vulnerability to explain disproportionate exposures to harm within data technology and suggest recommendations for future ethics codes. Comment: Conference on Fairness, Accountability, and Transparency (FAT* '20), January 27-30, 2020, Barcelona, Spain. Correcte