
Optimization of LLMs

Workflow Architecture


  1. The implementation of Particle Swarm Optimization (PSO) for weight extraction was initiated but not completed.

  2. Following a bottom-up approach, began work on quantizing the model.

    • Quantized HuggingFace models using the Quanto library.
    • Implemented an 8-bit modality-agnostic quantizer.
    • See the Quantization documentation for details.
  3. Implemented a character-level GPT model to gain a deeper understanding of the GPT architecture.

  4. Working on fine-tuning the LLMs.

  5. Getting started with federated learning.

  6. Deploying the models.
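Step 1 above (PSO) was only initiated; as a point of reference, the core update loop of a standard particle swarm optimizer can be sketched in NumPy. This is a generic sketch on a toy objective (the sphere function standing in for a weight-extraction loss), not the project's actual implementation; all names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

def pso(objective, dim, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal particle swarm optimizer: returns best position and fitness."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-5, 5, (n_particles, dim))   # particle positions
    vel = np.zeros((n_particles, dim))             # particle velocities
    pbest = pos.copy()                             # per-particle best positions
    pbest_fit = np.apply_along_axis(objective, 1, pos)
    g = pbest[pbest_fit.argmin()].copy()           # global best position
    g_fit = pbest_fit.min()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        # Velocity update: inertia + cognitive (pbest) + social (gbest) pulls.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = pos + vel
        fit = np.apply_along_axis(objective, 1, pos)
        improved = fit < pbest_fit
        pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
        if fit.min() < g_fit:
            g, g_fit = pos[fit.argmin()].copy(), fit.min()
    return g, g_fit

# Toy stand-in for a weight-extraction loss: minimize the sphere function.
best, best_fit = pso(lambda x: float(np.sum(x ** 2)), dim=4)
```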
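The "8-bit modality-agnostic quantizer" in step 2 can be illustrated by symmetric per-tensor int8 quantization: the same routine applies to any float tensor (weights, activations, or embeddings), which is what makes it modality agnostic. This is a minimal sketch of the underlying idea, not the Quanto library's API.

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor 8-bit quantization.

    Works on any float array regardless of modality: maps values to
    int8 in [-127, 127] with a single scale factor per tensor.
    """
    max_abs = np.abs(x).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    """Recover an approximate float tensor from int8 values and the scale."""
    return q.astype(np.float32) * scale

w = np.random.default_rng(0).normal(size=(64, 64)).astype(np.float32)
q, s = quantize_int8(w)
err = np.abs(dequantize_int8(q, s) - w).max()  # bounded by half a scale step
```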
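The character-level GPT from step 3 starts from a character vocabulary rather than a subword tokenizer. A minimal sketch of that encoding step, assuming a vocabulary built from the training corpus:

```python
# Build a character-level vocabulary from a (toy) training corpus.
text = "hello world"
chars = sorted(set(text))                     # unique characters, fixed order
stoi = {c: i for i, c in enumerate(chars)}    # char -> integer id
itos = {i: c for c, i in stoi.items()}        # integer id -> char

def encode(s):
    """Map a string to a list of character ids."""
    return [stoi[c] for c in s]

def decode(ids):
    """Map a list of character ids back to a string."""
    return "".join(itos[i] for i in ids)

ids = encode("hello")
```

The model then predicts the next character id at each position; encode/decode round-trips exactly, which subword tokenizers do not always guarantee.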

Future Work

  • Learn and implement federated learning.
  • Explore the implementation of a 2-bit quantizer for edge devices.
  • Quantize and fine-tune the LLaMA model.
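For the planned federated learning work, the canonical aggregation step is federated averaging (FedAvg): each client trains locally, and the server averages client parameters weighted by client dataset size. A minimal NumPy sketch of that aggregation, with hypothetical clients:

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Federated averaging: per-layer average of client parameters,
    weighted by each client's dataset size."""
    total = sum(client_sizes)
    n_layers = len(client_weights[0])
    return [
        sum(n / total * w[layer] for w, n in zip(client_weights, client_sizes))
        for layer in range(n_layers)
    ]

# Three hypothetical clients, each holding one "layer" of parameters.
clients = [[np.full(3, v)] for v in (1.0, 2.0, 3.0)]
sizes = [10, 10, 10]
avg = fedavg(clients, sizes)  # equal sizes reduce to a plain mean
```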
