Really impressed with what Antirez is doing over at https://github.com/antirez/ds4. My main gripe is that I'm running a 128GB Strix Halo machine with ROCm, and that's not a primary target for DS4. I think the endgame will be llama.cpp adding model support for DS4 (requested in https://github.com/ggml-org/llama.cpp/issues/22319), and then running that via lemonade's build of llama.cpp (https://github.com/lemonade-sdk/llamacpp-rocm/releases) with an Unsloth quant of DeepSeek V4 Flash (like https://huggingface.co/unsloth/DeepSeek-V4-Flash). Just waiting for progress on llama.cpp, since everything else should be ready.