AWQ: Activation-Aware Weight Quantization for On-Device LLM Compression and Acceleration
- Awarded Best Paper at MLSys 2024.
- Using an automated LLM workflow to assist with the design and refinement of a network algorithm.