Optimization of LLMs
Workflow Architecture

The implementation of Particle Swarm Optimization (PSO) for weight extraction was initiated but could not be completed successfully.
- See the PSO Implementation Documentation for details.
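Although the PSO attempt was not completed, the core loop can be sketched. Everything below is illustrative, not taken from the actual implementation: the toy quadratic loss, the 2-dimensional weight vector, and the hyperparameters (`w`, `c1`, `c2`) are standard textbook choices.

```python
import random

def pso_minimize(loss, dim, n_particles=20, iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `loss` over a `dim`-dimensional vector with basic PSO."""
    rng = random.Random(seed)
    # Initialize particle positions in [-1, 1] and zero velocities.
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # personal bests
    pbest_val = [loss(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # global best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = loss(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Toy target: recover the weights [0.5, -0.25] from a quadratic loss.
target = [0.5, -0.25]
best, best_val = pso_minimize(
    lambda p: sum((a - b) ** 2 for a, b in zip(p, target)), dim=2)
```

The swarm converges on this toy problem in well under 100 iterations; scaling the same loop to real model weights is where the difficulty lies.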
Following a bottom-up approach, work began on quantizing the model:
- Quantized HuggingFace models using the Quanto library.
- Implemented an 8-bit modality-agnostic quantizer.
- Read the Quantization Documentation for more.
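A minimal sketch of symmetric absmax int8 quantization, the kind of scheme a modality-agnostic quantizer can use, since it operates on raw float arrays regardless of whether they come from a text, vision, or audio model. The NumPy implementation and function names here are illustrative and independent of the Quanto library:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric absmax quantization of a float tensor to int8."""
    amax = np.abs(x).max()
    scale = amax / 127.0 if amax > 0 else 1.0   # map absmax onto the int8 range
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

# Round-trip a random weight matrix and measure the quantization error.
w = np.random.default_rng(0).normal(size=(4, 4)).astype(np.float32)
q, s = quantize_int8(w)
err = np.abs(w - dequantize_int8(q, s)).max()
```

With symmetric absmax scaling, the round-trip error is bounded by half the scale, i.e. half of one quantization step.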
Implemented a character-wise GPT model to gain a deeper understanding of the GPT architecture:
- Read the GPT Implementation Documentation for more.
- Quantized the implemented GPT model.
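The character-wise setup can be illustrated by the tokenization step such a model trains on: the vocabulary is just the set of characters seen in the corpus. The corpus string and helper names below are made up for the example:

```python
# Minimal character-level tokenizer of the kind a character-wise GPT uses.
text = "hello gpt"
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}   # char -> token id
itos = {i: ch for ch, i in stoi.items()}       # token id -> char

def encode(s: str) -> list[int]:
    return [stoi[c] for c in s]

def decode(ids: list[int]) -> str:
    return "".join(itos[i] for i in ids)

ids = encode(text)
```

Encoding then decoding is lossless, and the vocabulary stays tiny (here 8 symbols), which is what makes a character-wise model a convenient sandbox for studying the GPT architecture itself.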
Working on fine-tuning the LLMs:
- Read the Fine-tuning Documentation for more.
- Getting started with Federated Learning.
- Deploying the models.
- Read the Deployment Documentation for more.
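The report does not name a fine-tuning method; one widely used parameter-efficient scheme, low-rank adaptation (LoRA), can be sketched as follows. All sizes, names, and initializations here are illustrative, not a description of the actual fine-tuning setup:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 8, 8, 2, 4   # illustrative sizes; r is the LoRA rank

W = rng.normal(size=(d_out, d_in))        # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01     # trainable down-projection
B = np.zeros((d_out, r))                  # trainable up-projection, zero-init

def forward(x):
    # LoRA adds a low-rank update, scaled by alpha / r, on top of frozen W.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# With B zero-initialized, the adapted model starts identical to the base model,
# and only A and B (far fewer parameters than W) are updated during training.
```

Only the low-rank factors are trained, which keeps the fine-tuning memory footprint small relative to updating the full weight matrix.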
Future Work
- Learn and implement federated learning.
- Explore the implementation of a 2-bit quantizer for edge devices.
- Quantize and fine-tune the LLaMA model.
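As a starting point for the planned 2-bit quantizer, a simple symmetric scheme with four signed levels might look like the sketch below; this is a napkin version, not a tested edge-device implementation:

```python
import numpy as np

def quantize_2bit(x: np.ndarray):
    """Map each value to one of 4 levels: signed codes {-2, -1, 0, 1}.

    Each weight needs only 2 bits of storage plus one shared float scale.
    """
    amax = np.abs(x).max()
    scale = amax / 2.0 if amax > 0 else 1.0
    q = np.clip(np.round(x / scale), -2, 1).astype(np.int8)  # codes fit in 2 bits
    return q, scale

def dequantize_2bit(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

# Round-trip a small weight vector through the 2-bit codes.
w = np.linspace(-1.0, 1.0, 8, dtype=np.float32)
q, s = quantize_2bit(w)
```

With only four levels the round-trip error is large (up to one full step), which is why practical low-bit schemes add refinements such as per-group scales; the sketch only shows the bit-packing budget.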