Publications

GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models
OpenELM: An Efficient Language Model Family with Open Training and Inference Framework
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
The 62nd Annual Meeting of the Association for Computational Linguistics (ACL), 2024
ReLU Strikes Back: Exploiting Activation Sparsity in Large Language Models
International Conference on Learning Representations (ICLR), 2024 [Oral]
Wide Neural Networks Forget Less Catastrophically
International Conference on Machine Learning (ICML), 2022
CL-Gym: Full-Featured PyTorch Library for Continual Learning
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2021
Linear Mode Connectivity in Multitask and Continual Learning
International Conference on Learning Representations (ICLR), 2021
Understanding the Role of Training Regimes in Continual Learning
Advances in Neural Information Processing Systems (NeurIPS), 2020
An abstract version was presented at the ICML 2020 Workshop on Continual Learning
Dropout as an Implicit Gating Mechanism For Continual Learning
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2020
Runner-up award at CVPR'20 Workshop on Continual Learning in Computer Vision
Optimal Policy for Deployment of Machine Learning Models on Energy-Bounded Systems
International Joint Conference on Artificial Intelligence (IJCAI), 2020
Improved Knowledge Distillation via Teacher Assistant
AAAI Conference on Artificial Intelligence (AAAI), 2020
LabelMerger: Learning Activities in Uncontrolled Environments
A scalable algorithm for improving aggregated noisy labels in the data collection process.