• 0 Posts
  • 99 Comments
Joined 1 year ago
Cake day: July 23rd, 2023

  • Toribor@corndog.social to Memes@lemmy.ml: Many such cases
    1 month ago

    This message brought to you by Jill Stein, who pops up every four years to grift money and accomplish nothing except cozying up to Putin. I guess there’s still the brain worm dead animal guy?

    Third parties in the US are unserious. I wish that weren’t the case, but that’s the reality.

  • I’ve been testing Ollama in Docker/WSL with the idea that, if I like it, I’ll eventually move my GPU into my home server and upgrade my gaming PC. When you run a model it has to load the whole thing into VRAM. I use the 8 GB models, so it takes 20-40 seconds to load the model; after that each response is really fast and the GPU hit is pretty small. After about five minutes (the default) it unloads the model to free up VRAM.

    Basically this means you either wait a bit for the model to warm up, or you extend that timeout so it stays warm longer. The catch is that I can’t really use my GPU for anything else while the LLM is loaded.
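    For anyone who wants the model to stay warm longer: the timeout is configurable per request. Here’s a rough sketch of how I’d do it against the API (assuming Ollama on its default port 11434; the model tag is just an example, and the keep_alive value is the part that matters):

    ```python
    # Minimal sketch: ask Ollama to keep the model in VRAM longer than the
    # default ~5 minutes by passing keep_alive with the request.
    # Assumes Ollama is listening on its default port (11434); "llama3:8b"
    # is only an example tag - swap in whatever model you actually pulled.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama3:8b",
            "prompt": "Why is the sky blue?",
            "stream": False,      # return one JSON object instead of a stream
            "keep_alive": "30m",  # hold the model in VRAM for 30 minutes
                                  # (a negative value keeps it loaded indefinitely)
        },
        timeout=120,
    )
    print(resp.json()["response"])
    ```

    I believe you can also set OLLAMA_KEEP_ALIVE as an environment variable on the container to change the default for every request, which is probably the easier route in Docker.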

    I haven’t tracked power usage, but aside from the VRAM requirement it doesn’t seem too resource-intensive. Then again, maybe I just haven’t done anything complex enough yet.