Large Language Models (LLMs) are a class of machine learning models that have recently attracted a lot of attention. They are trained on large amounts of data and, once trained, can be used in many applications.
Although models from OpenAI and Google are available as online services, it is often desirable to have a model that can be used offline. This training shows you how to deploy and use such models locally.
Learning outcomes
When you complete this training, you will
- understand what LLMs are and how they are trained;
- be able to use a pre-trained LLM for text generation;
- be able to use Retrieval Augmented Generation (RAG) for question answering on your own data;
- understand how quantization works and how it can be used to reduce the size of a model;
- be able to fine-tune a pre-trained LLM for a specific task.
Schedule
Total duration: 6.5 hours.
Subject | Duration |
---|---|
preparation | 20 min. |
introduction and motivation | 10 min. |
neural networks: the basics | 15 min. |
large language models | 90 min. |
local deployment | 30 min. |
simple applications | 10 min. |
Retrieval Augmented Generation (RAG) | 50 min. |
quantization | 60 min. |
fine-tuning models | 60 min. |
wrap up | 10 min. |
Training materials
Slides are available in the GitHub repository, as well as example code and hands-on material.
Target audience
This training is for you if you need to deploy Large Language Models (LLMs) on your own infrastructure.
Prerequisites
You will need experience programming in Python. This is not a training that starts from scratch.
Familiarity with Linux or HPC environments is recommended.
Trainer(s)
- Geert Jan Bex (geertjan.bex@uhasselt.be)