5. 5.5-6 GB total memory at 4-bit quantization; 9-12 GB at 8-bit; dense model suited to laptops; supports text, image, and audio; llama.cpp supports both CPU and GPU inference
llama.cpp
llml
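
The memory figures above can be sanity-checked with simple arithmetic: weight memory is roughly parameter count times bits per weight, divided by 8 to get bytes. The sketch below is only a rough estimate under stated assumptions. The function names, the 8-billion parameter count, and the 1 GB overhead allowance (for context cache and runtime buffers) are hypothetical values chosen for illustration and are not taken from the notes. Real quantized files usually store per-block scales as well, so the effective bits per weight tend to be somewhat higher than the nominal 4 or 8.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def total_memory_gb(n_params: float, bits_per_weight: float,
                    overhead_gb: float = 1.0) -> float:
    """Weights plus a flat overhead allowance.

    overhead_gb is an assumed placeholder for context cache and
    runtime buffers; it is not a measured value.
    """
    return weight_memory_gb(n_params, bits_per_weight) + overhead_gb


if __name__ == "__main__":
    n_params = 8e9  # hypothetical parameter count, for illustration only
    for bits in (4, 8):
        weights = weight_memory_gb(n_params, bits)
        total = total_memory_gb(n_params, bits)
        print(f"{bits}-bit: weights ~{weights:.1f} GB, total ~{total:.1f} GB")
```

Plugging in a model's actual parameter count and the effective bits per weight of the chosen quantization should land near the ranges quoted above; a large gap usually points to context length or runtime buffers rather than the weights.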