Large Language Models (LLMs) are a class of machine learning models that have recently gained a lot of attention. These models are trained on large amounts of data and after training can be used in many applications.
Although models from OpenAI and Google can be used as services online, it is often desirable to have a model that can be used offline. This training will show you how to deploy and use such models locally.
Learning outcomes
When you complete this training you will
- understand what LLMs are and how they are trained;
- be able to use a pre-trained LLM for text generation;
- be able to use Retrieval Augmented Generation (RAG) for question answering on your own data;
- understand how quantization works and how it can be used to reduce the size of a model;
- be able to fine-tune a pre-trained LLM for a specific task.
Schedule
Total duration: 6.5 hours.
| Subject | Duration |
|---|---|
| preparation | 20 min. |
| introduction and motivation | 10 min. |
| neural networks: the basics | 15 min. |
| large language models | 90 min. |
| local deployment | 30 min. |
| simple applications | 10 min. |
| Retrieval Augmented Generation (RAG) | 50 min. |
| quantization | 60 min. |
| fine-tuning models | 60 min. |
| wrap up | 10 min. |
Training materials
Slides are available in the GitHub repository, as well as example code and hands-on material.
Target audience
This training is for you if you need to deploy Large Language Models (LLMs) on your own infrastructure.
Prerequisites
You will need experience programming in Python. This is not a training that starts from scratch.
Familiarity with Linux or HPC environments is recommended.
Quick self-assessment
If you can do most of the tasks below, you are likely ready for this training.
- run Python code in a script or notebook;
- install or activate a Python environment and import installed packages;
- use the command line to run commands and inspect output;
- explain at a high level what a machine-learning model does during inference;
- understand the difference between local files and online services;
- work with text files or small document collections as input data;
- log in to a remote Linux or HPC system if that is where the examples will run;
- make a small change to an example command or script and run it again.
If several of these items still feel difficult, the training will probably move too fast. In that case, it is better to first refresh basic Python, command-line use, and the local or remote environment you plan to use.
Software and access requirements
To follow along hands-on, you need
- a laptop or desktop with internet access;
- if you want to use an HPC system: an account on that system (e.g., VSC, CECI, …), a machine set up so you can connect to it, and compute credits if those are required to run jobs;
- if you want to use your own system: a Python environment that can run Jupyter Lab (note that most of the examples require a GPU);
- access to Google Colaboratory if you prefer not to install software.
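If you plan to use your own system, a quick check along the following lines may help you decide whether it is suitable. This is only a minimal sketch using the Python standard library: the function name `check_environment` is hypothetical, `jupyterlab` is assumed to be the importable package name for Jupyter Lab, and finding `nvidia-smi` on the `PATH` only suggests that an NVIDIA driver is installed, not that a GPU is actually usable for the examples.

```python
import importlib.util
import shutil
import sys


def check_environment(packages=("jupyterlab",)):
    """Return a dict summarising what is available on this system.

    `packages` lists top-level module names to look for; the default
    is an assumption about what the hands-on sessions need.
    """
    report = {
        "python_version": ".".join(str(n) for n in sys.version_info[:3]),
        # A hint only: nvidia-smi on PATH suggests an NVIDIA driver is present.
        "nvidia_smi_found": shutil.which("nvidia-smi") is not None,
    }
    for name in packages:
        # find_spec returns None when a top-level module cannot be found.
        report[f"has_{name}"] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `has_jupyterlab` or `nvidia_smi_found` is reported as `False`, using an HPC system or Google Colaboratory is probably the easier route.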
Level
- Introductory: 30 %
- Intermediate: 40 %
- Advanced: 30 %
Trainer(s)
- Geert Jan Bex (geertjan.bex@uhasselt.be)