2 docs tagged with "large language models"

AWQ: Activation-Aware Weight Quantization for On-Device LLM Compression and Acceleration

- Awarded Best Paper at MLSys 2024.

- Using an automated LLM workflow to help design and refine a network algorithm.