

I’m surprised that you’re talking about models being CUDA-specific or AMD-specific. I’ve had a bunch of models running on my AMD-only PC, using Ollama, Lemonade, and LM Studio, through either ROCm or Vulkan. None of these models were billed as AMD-specific. I had to do some config tweaking for Ollama to use my graphics card, but that’s more because I have a weird in-between-generations card that also predates the LLM hype (6700 XT).
However, I did generally need to look for the GGUF versions of things - usually accounts like unsloth have them uploaded on Hugging Face barely a day or two after the original version gets posted.
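
For anyone wondering what the Ollama tweak might look like: I'm not spelling out my exact config here, but a common workaround for RDNA2 cards that ROCm doesn't officially list is to set `HSA_OVERRIDE_GFX_VERSION` so the runtime treats the card as a supported target. Rough sketch below, assuming the `10.3.0` override is the right one for your card; the helper name `rocm_override_env` is made up for illustration.

```python
import os


def rocm_override_env(base_env=None, gfx_version="10.3.0"):
    """Return a copy of the environment with the ROCm GFX override set.

    Assumption: "10.3.0" is the override commonly used for RDNA2 cards
    that are not on ROCm's official support list. Check what your own
    card needs before relying on this value.
    """
    # Copy so the caller's environment (or os.environ) is not mutated.
    env = dict(os.environ if base_env is None else base_env)
    env["HSA_OVERRIDE_GFX_VERSION"] = gfx_version
    return env
```

You'd then pass the result as `env=` when launching the server, e.g. `subprocess.run(["ollama", "serve"], env=rocm_override_env())`, or just export the variable in your shell or service file instead.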
















The textual expect/diff for UI screenshot testing is enough to make me want to try this out, let alone the promise of being able to use OCaml for everything.