4-bit AWQ
#1
by ArtemSultanov - opened
Hi, are there any plans to release this version in 4-bit AWQ format?
Team mradermacher really only provides GGUF quants, and @RichardErkhov is currently storage-limited by HuggingFace, so he won't be able to upload AWQ quants even if he decides to create them. One option is to run https://github.com/vllm-project/llm-compressor yourself to produce them; another is to just use 4-bit BnB quantization for now, which vLLM can apply on the fly to the original model. Rough sketches of both follow.
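A minimal llm-compressor sketch, assuming the AWQModifier recipe API and the generic calibration dataset from the repo's examples; the exact modifier arguments, scheme name, and `oneshot` import path can differ between releases, so check the AWQ examples in the repo. The model ID and output directory below are placeholders:

```python
# Hypothetical sketch: one-shot 4-bit AWQ quantization with llm-compressor.
# The AWQModifier arguments and oneshot signature are assumptions; verify
# against the examples in vllm-project/llm-compressor before running.
from llmcompressor import oneshot
from llmcompressor.modifiers.awq import AWQModifier

MODEL_ID = "your-org/your-model"        # placeholder: the model discussed here
OUTPUT_DIR = "your-model-AWQ-W4A16"     # placeholder output directory

recipe = [
    # 4-bit weight-only AWQ, keeping the LM head in full precision
    AWQModifier(targets=["Linear"], ignore=["lm_head"], scheme="W4A16_ASYM"),
]

oneshot(
    model=MODEL_ID,
    dataset="open_platypus",            # small calibration set used in the repo's examples
    recipe=recipe,
    output_dir=OUTPUT_DIR,
    max_seq_length=2048,
    num_calibration_samples=256,
)
```

The saved directory can then be served directly, e.g. `vllm serve your-model-AWQ-W4A16`.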
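And a sketch of the on-the-fly 4-bit BnB route, pointing vLLM at the original unquantized repo (model ID is a placeholder; this needs the bitsandbytes package installed):

```python
# Sketch: have vLLM quantize the original weights to 4-bit with bitsandbytes at load time.
from vllm import LLM, SamplingParams

llm = LLM(
    model="your-org/your-model",        # placeholder: the original, unquantized model repo
    quantization="bitsandbytes",
    load_format="bitsandbytes",
)

params = SamplingParams(temperature=0.7, max_tokens=128)
out = llm.generate(["Hello, how are you?"], params)
print(out[0].outputs[0].text)
```

The same thing works from the CLI with `vllm serve your-org/your-model --quantization bitsandbytes --load-format bitsandbytes`.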