4-bit AWQ
#1
by ArtemSultanov - opened
Hi, are there any plans to release this version in 4-bit AWQ format?
Team mradermacher really only provides GGUF quants, and @RichardErkhov is currently storage-limited by HuggingFace, so he won't be able to upload AWQ quants even if he decides to create them. One option is to run https://github.com/vllm-project/llm-compressor yourself to produce them; another is to just use 4-bit BnB quantization for now, which vLLM can apply on the fly to the original model. Rough sketches of both follow.
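A minimal llm-compressor sketch, assuming the AWQModifier recipe API and the generic calibration dataset from the repo's examples; the exact modifier arguments, scheme name, and `oneshot` import path can differ between releases, so check the AWQ examples in the repo. The model ID and output directory below are placeholders:

```python
# Hypothetical sketch: one-shot 4-bit AWQ quantization with llm-compressor.
# The AWQModifier arguments and oneshot signature are assumptions; verify
# against the examples in vllm-project/llm-compressor before running.
from llmcompressor import oneshot
from llmcompressor.modifiers.awq import AWQModifier

MODEL_ID = "your-org/your-model"        # placeholder: the model discussed here
OUTPUT_DIR = "your-model-AWQ-W4A16"     # placeholder output directory

recipe = [
    # 4-bit weight-only AWQ, keeping the LM head in full precision
    AWQModifier(targets=["Linear"], ignore=["lm_head"], scheme="W4A16_ASYM"),
]

oneshot(
    model=MODEL_ID,
    dataset="open_platypus",            # small calibration set used in the repo's examples
    recipe=recipe,
    output_dir=OUTPUT_DIR,
    max_seq_length=2048,
    num_calibration_samples=256,
)
```

The saved directory can then be served directly, e.g. `vllm serve your-model-AWQ-W4A16`.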
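And a sketch of the on-the-fly 4-bit BnB route, pointing vLLM at the original unquantized repo (model ID is a placeholder; this needs the bitsandbytes package installed):

```python
# Sketch: have vLLM quantize the original weights to 4-bit with bitsandbytes at load time.
from vllm import LLM, SamplingParams

llm = LLM(
    model="your-org/your-model",        # placeholder: the original, unquantized model repo
    quantization="bitsandbytes",
    load_format="bitsandbytes",
)

params = SamplingParams(temperature=0.7, max_tokens=128)
out = llm.generate(["Hello, how are you?"], params)
print(out[0].outputs[0].text)
```

The same thing works from the CLI with `vllm serve your-org/your-model --quantization bitsandbytes --load-format bitsandbytes`.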