• gmtom@lemmy.world
    link
    fedilink
    arrow-up
    137
    ·
    9 months ago

    Not sure if someone else has brought this up, but this is because these AI models are massively biased towards generating white people so as a lazy “fix” they randomly add race tags to your prompts to get more racially diverse results.

    • kromem@lemmy.world
      link
      fedilink
      English
      arrow-up
      28
      arrow-down
      2
      ·
      edit-2
      9 months ago

      Exactly. I wish people had a better understanding of what’s going on technically.

      It’s not that the model itself has these biases. It’s that the instructions given them are heavy handed in trying to correct for an inversely skewed representation bias.

      So the models are literally instructed things like “if generating a person, add a modifier to evenly represent various backgrounds like Black, South Asian…”

      Here you can see that modifier being reflected back when the prompt is shared before the image.

      It’s like an ethnicity AdLibs the model is being instructed to fill out whenever generating people.

    • Marcbmann@lemmy.world
      link
      fedilink
      arrow-up
      8
      arrow-down
      1
      ·
      9 months ago

      I mean, I don’t think it’s an easy thing to fix. How do you eliminate bias in the training data without eliminating a substantial percentage of your training data. Which would significantly hinder performance.

      • bamboo@lemmy.blahaj.zone
        link
        fedilink
        English
        arrow-up
        10
        ·
        9 months ago

        Rather than eliminating the some of the training data, you could add more training data to create an even balance.

        • kromem@lemmy.world
          link
          fedilink
          English
          arrow-up
          3
          ·
          9 months ago

          Indeed - there’s a very good argument for using synthetic data to introduce diversity as long as you can avoid model collapse.