AWQ: Activation-Aware Weight Quantization for On-Device LLM Compression and Acceleration - Awarded Best Paper at MLSys 2024.