Autonomous Bias Detection in Text-to-Image Models: Uncovering Biases through Image-to-Text Analysis and Confusion Matrix Visualization

Date

2023-08-25

Publisher

Saudi Digital Library

Abstract

This paper contributes to ongoing efforts in bias detection and mitigation in AI systems, offering a foundation for developing more inclusive and equitable technologies. Recent AI bias detection methods that rely on a human in the loop have shown limitations in scalability and reliability, and addressing biases in text-to-image models has become paramount for ensuring ethical AI practices. This paper introduces a novel approach to uncovering gender and ethnicity biases, as well as other kinds of bias such as stereotype, visual, and ambiguity-based biases. The approach compares the generative model's input text with the responses of image-to-text models (CLIP and BLIP-2) and uses a confusion matrix for detailed visualization; its final step is a textual assessment. Using a Stable Diffusion model, the approach involves two experiments: experiment 1 analyzes 10,000 images generated from gender-protected captions from CelebA, and experiment 2 probes biases in occupation profiling via 26,000 images generated from structured input text featuring professions and adjectives. In both experiments, disparities in gender attributes and occupation profiling are examined through the lens of the CLIP and BLIP-2 responses. The technique offers the potential to effectively mitigate biases by providing robust, autonomous bias detection, evaluated on two different datasets. Our approach uncovered evident biases toward male gender dominance and a noticeable prevalence of white ethnicity, along with other biases, within the Text-to-Image (TTI) model. These results support the effectiveness of our approach in uncovering such biases.

Description

Our novel technique is designed to unearth biases related to gender and ethnicity in text-to-image generative models. As Figure 1 shows, the approach uses the Text-to-Image (TTI) model Stable Diffusion as Model A. This model takes ambiguous caption prompts as input and generates images. These images are then given to Model B, which consists of the Image-to-Text (ITT) visual models CLIP and BLIP-2. Model B takes the images and generates text responses based on them. This yields two sets of text: the input to Model A and the output of Model B. To assess how well the two align, we use a confusion matrix, a tool normally used to evaluate the performance of a classification model. The matrix lets us visually compare the two sets of text and see how many values match, which indicates whether the models accurately capture the features. When the values on the main diagonal of the matrix are lower than the surrounding values, this indicates the possible presence of bias in the model, especially with respect to specific features. The confusion matrix is a robust choice because it is quantitative, flexible in capturing diverse biases, and scalable to different TTI generative models. By relying on standardised assessments, the technique makes bias detection more objective. It also circumvents the challenges of image-centric evaluation by focusing on textual content.
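
The comparison step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the label names, the toy data, and the helper functions `confusion_matrix` and `weak_diagonal_labels` are hypothetical, and "surrounding values" is interpreted here, as an assumption, to mean the other cells in the same row. Labels extracted from the Model A prompts form the rows, and labels extracted from the Model B responses form the columns.

```python
from collections import Counter


def confusion_matrix(expected, predicted, labels):
    """Rows: labels from the Model A prompts. Columns: labels from Model B responses."""
    counts = Counter(zip(expected, predicted))
    return [[counts[(row, col)] for col in labels] for row in labels]


def weak_diagonal_labels(matrix, labels):
    """Return labels whose diagonal count is below an off-diagonal count in the same row.

    Assumption: "surrounding values" is read as the other cells in the row.
    """
    flagged = []
    for i, row in enumerate(matrix):
        off_diagonal = [value for j, value in enumerate(row) if j != i]
        if off_diagonal and row[i] < max(off_diagonal):
            flagged.append(labels[i])
    return flagged


# Hypothetical toy data, not results from the paper.
labels = ["female", "male"]
prompt_labels = ["female", "female", "female", "male", "male", "male"]
response_labels = ["male", "male", "female", "male", "male", "male"]

matrix = confusion_matrix(prompt_labels, response_labels, labels)
flagged = weak_diagonal_labels(matrix, labels)

for label, row in zip(labels, matrix):
    print(label, row)
print("possible bias for:", flagged)
```

In this toy case the "female" row has a diagonal count lower than its off-diagonal count, so that label is flagged as a candidate for closer inspection. A real pipeline would first need a step that maps free-form captions to such labels.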

Keywords

Stable Diffusion, Bias Identification, Generative Models, Ambiguity Bias, Visual Bias, Stereotype Bias, Gender Bias, Ethnicity Bias, Text-to-Image, Image-to-Text, Automated Bias Detection

Copyright owned by the Saudi Digital Library (SDL) © 2024