17-20 GB total memory (RAM + VRAM) at 4-bit quantization; strongest Gemma 4 variant; llama.cpp supports both CPU and GPU inference.
llama.cpp
llml