

I was using an Nvidia 3060 for a while, then had two in one box, then switched to a 3090.
The amount of VRAM is a big factor in getting decent performance. Getting it to not sound like a predictably repetitive bot, though, is a whole separate problem that's still kind of elusive.

My go-to for messing with chatbots is Kobold, which lets you split the work between multiple GPUs. I get the impression the actual processing is only done on one, but it lets you load larger models with the extra memory.
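For reference, here's roughly what that multi-GPU setup looks like as a KoboldCpp launch command. This is a sketch, not a definitive invocation: the flag names (`--usecublas`, `--gpulayers`, `--tensor_split`) match recent KoboldCpp builds, but check `--help` against your version, and the model filename is just a placeholder.

```shell
# Hypothetical launch: split a GGUF model across two CUDA GPUs.
# --gpulayers 99      offload (up to) all layers to GPU
# --tensor_split 1 1  divide the layers evenly between GPU 0 and GPU 1
python koboldcpp.py --usecublas --gpulayers 99 --tensor_split 1 1 \
    some-model.Q4_K_M.gguf
```

The split ratio doesn't have to be even; something like `--tensor_split 2 1` leans more layers onto the first card, which can help if the GPUs have different amounts of VRAM.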