LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Publication
The 62nd Annual Meeting of the Association for Computational Linguistics (ACL), 2024
Iman Mirzadeh
Machine Learning Research Engineer

PhD candidate & graduate research assistant.