• anachronist@midwest.social
    9 months ago

    Models don’t get bigger as you add more stuff.

    They will get less coherent and/or “forget” the earlier data if you don’t increase the parameter count along with the training set.

    There are two-gigabyte networks that have been trained on hundreds of millions of images.

    You can take a huge TIFF of an image, put it through JPEG with the quality cranked all the way down, and get a tiny file out the other side, which is still a recognizable derivative of the original. LLMs are an extremely lossy compression of their training set.

    • mindbleach@sh.itjust.works
      9 months ago

      “which is still a recognizable derivative of the original”

      Not in twelve bytes.
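
The "twelve bytes" figure can be sanity-checked with back-of-envelope arithmetic: divide the model size by the number of training images. This is only a rough sketch. It assumes "two-gigabyte" means 2 × 10^9 bytes, and the image counts are hypothetical samples of "hundreds of millions", not numbers from the thread.

```python
# Back-of-envelope: how many bytes of model weights exist per training image?
# Assumptions (not from the thread): a "two-gigabyte" model is 2 * 10**9 bytes,
# and the image counts below are hypothetical samples of "hundreds of millions".

MODEL_BYTES = 2 * 10**9


def bytes_per_image(model_bytes: int, num_images: int) -> float:
    """Average number of weight bytes available per training image."""
    return model_bytes / num_images


for num_images in (100_000_000, 170_000_000, 500_000_000):
    per_image = bytes_per_image(MODEL_BYTES, num_images)
    print(f"{num_images:>11,} images -> {per_image:5.1f} bytes per image")
```

Under these assumptions, a training set of roughly 170 million images leaves about twelve bytes of weights per image, and larger sets leave even less.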

      Deep models are a statistical distillation of a metric shitload of data. Smaller models with more training on more data don’t get worse, they get more abstract - and in adversarial uses they often kick big networks’ asses.