Training-sessions

Overview of training sessions on (parallel) programming, scientific computing, data analysis and HPC workflows

Training sessions

This is a (non-exhaustive) list of training sessions offered on a regular basis by the Vlaams Supercomputing Centrum (VSC), and also available to research institutes and companies.

All of these training sessions can be tailored to the audience if required, and ad-hoc training sessions are possible as well.

For many sessions, a dedicated website has been created that provides detailed information, as well as a link to a GitHub repository that hosts the material specific for that training.

For those sessions that do not yet have their own dedicated website, the material can be found in the training-materials repository: https://github.com/gjbex/training-material

C++

C++ is an important language for scientific programming and HPC.

C++ for scientific programming (14 hours)

This training focuses on using C++ for scientific programming and on the way Bjarne Stroustrup advocates modern C++ in his book “A tour of C++”. Some C++17 features and third-party libraries will also be discussed.

For detailed information, see: https://gjbex.github.io/Scientific-C-plus-plus/

C++ software engineering

This training focuses on building applications and libraries in a professional way using best practices in software engineering.

Subjects:

Prerequisites:

Parallel programming with the STL and Threading Building Blocks (4 hours)

Out of the box, the STL contains many algorithms that can be run in parallel. In this training, you will learn how to do that, and what the performance benefits may be. Threading Building Blocks (TBB) is a C++ template library for developing shared memory applications. It integrates well with the C++ Standard Template Library (STL) and can be used with any modern C++ compiler. TBB is especially suited for nested parallelism, and pipelining when part of the application can use accelerator hardware.

Subjects:

Prerequisites:

Other training sessions useful for C++ programmers

Fortran

Fortran for programmers (8 hours)

Fortran is used a lot in the context of HPC. For many, it has a reputation of being an old and ugly programming language. However, modern Fortran is a far cry from the Fortran 77 that its detractors have in mind. Fortran 2003/2008 is a language that is well tailored towards scientific computing. This training introduces the language and its features.

For detailed information, see: https://gjbex.github.io/Fortran-for-programmers/

Other training sessions useful for Fortran programmers

C

C for programmers (12 hours)

C is still one of the more popular programming languages today. It is probably mainly used for systems programming, but there are also many applications for scientific computing that have been developed in this programming language.

Subjects:

Prerequisites:

Other training sessions useful for C programmers

Python training sessions

Python for programmers (4 hours)

Python is an all-round programming language that has applications in many domains. This training session introduces the programming language to participants who have programming experience with other programming languages such as R, MATLAB, C/C++ or Fortran.

Subjects:

Prerequisites:

Python systems programming (4 hours)

Python is a very versatile programming language that has a wide range of applications. This training concentrates on interaction with the operating system, the file system, other applications and the network. This is useful for systems programming, but also when you want to use Python as a coordination language for your workflows. This training introduces modules that are useful in that context.

For detailed information, see: https://gjbex.github.io/Python-for-systems-programming/
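As a rough illustration of the kind of task this session deals with, the sketch below uses only the standard library modules `pathlib`, `tempfile` and `subprocess` to work with the file system and to run another application. The helper names (`count_lines_per_file`, `run_command`, `demo`) and the file names are made up for this example and are not taken from the training material.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def count_lines_per_file(directory: Path, pattern: str = "*.txt") -> dict[str, int]:
    """Return the number of lines in each file in directory that matches pattern."""
    counts = {}
    for path in sorted(directory.glob(pattern)):
        with path.open() as handle:
            counts[path.name] = sum(1 for _ in handle)
    return counts


def run_command(args: list[str]) -> str:
    """Run an external command and return its standard output as text.

    check=True raises CalledProcessError if the command exits with an error.
    """
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def demo() -> tuple[dict[str, int], str]:
    """Create two hypothetical files in a temporary directory and inspect them."""
    with tempfile.TemporaryDirectory() as tmp:
        work_dir = Path(tmp)
        (work_dir / "a.txt").write_text("one\ntwo\n")
        (work_dir / "b.txt").write_text("three\n")
        counts = count_lines_per_file(work_dir)
    # Use the current Python interpreter as the "other application".
    output = run_command([sys.executable, "-c", "print('hello from a child process')"])
    return counts, output
```

Calling `demo()` returns the line counts per file together with the text printed by the child process. The same pattern applies when Python coordinates the steps of a workflow.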

Python software engineering (4 hours)

Quality of software matters, whether you share it with others or not. Software should be easy to install, easy to use, and well documented. This training will cover those aspects from the perspective of the Python ecosystem. However, it is also important that software is easy to maintain, so coding style matters, API-level documentation should be available, as well as a battery of tests to ensure the software’s integrity. Of course, good design is at least as important.

For detailed information, see: https://gjbex.github.io/Python-software-engineering/

Scientific Python (4 hours)

Python is a nice programming language for scientific programming. Many high quality libraries are available as building blocks in a wide variety of scientific domains. In this training we will concentrate on the core libraries, and give some examples of domain specific libraries.

For detailed information, see: https://gjbex.github.io/Scientific-Python/

Python for data science (4 hours)

Python is one of the dominant languages in data science. In this training we will cover a number of modules that are useful for data preparation, data analysis, visualization, and machine learning.

For detailed information, see: https://gjbex.github.io/Python-for-data-science/

Python for machine learning (4 hours)

Python is one of the dominant languages in the area of machine learning and AI. This training provides an introduction to machine learning methodology, including data preparation, as well as to a number of machine learning algorithms.

For detailed information, see: https://gjbex.github.io/Python-for-machine-learning/

Python for HPC (8 hours)

Although vanilla Python is fairly slow and hence not an obvious candidate for HPC, there are several options to significantly increase the efficiency of Python programs.

For detailed information, see: https://gjbex.github.io/Python-for-HPC/

Python on GPUs (4 hours)

Python is increasingly used for scientific computing and can be used to develop code that runs on GPGPUs. Additionally, a number of libraries that are commonly used in scientific computing, data science and machine learning can use GPGPUs to improve performance.

For detailed information, see: https://gjbex.github.io/Python-on-GPUs/

Python dashboards

Dashboards are a useful way to present data in an interactive way. They can be used for reporting, but also for data exploration, or even to create simple web applications. You can run dashboards locally, or deploy them via GitHub or GitLab, or even on a server in the cloud. Creating dashboards in Python is quite straightforward. In this training you will learn about several frameworks and their strengths and weaknesses.

For detailed information, see: https://gjbex.github.io/Python-dashboards/

Julia: the good, the bad and the ugly (4 hours)

Julia has gained quite some traction over the last couple of years, but is it for you? In this training you will see how it compares to MATLAB and Python, and what Julia’s strong and weak points are.

For detailed information, see: https://gjbex.github.io/Julia_good_bad_ugly/

Jupyter notebooks (2 hours)

Jupyter notebooks are a versatile tool for data exploration and exploratory programming in a wide variety of programming languages. In this session you will learn how to use Jupyter notebooks effectively.

Subjects:

Prerequisites:

Parallel programming

These courses have been developed by Rolf Rabenseifner (HLRS, Stuttgart, Germany) and the trainer has followed Rolf’s train-the-trainer program.

Parallel programming with MPI (14 hours)

MPI (Message Passing Interface) is the de-facto standard for distributed parallel programming of scientific applications. It specifies language bindings for C and Fortran. This training covers the MPI 3.1 standard, but also emphasizes potential pitfalls and best practices.

Subjects:

Prerequisites:

Parallel programming with OpenMP (7 hours)

OpenMP provides a very convenient programming model for scientific applications that run multiple threads. It is supported by C, C++ and Fortran compilers. Its main advantage is that it is mostly used through annotation of code, which makes it easy to pick low hanging fruit when starting to parallelize code. The training covers OpenMP 4.5, and emphasizes pitfalls and best practices.

Subjects:

Prerequisites:

Defensive programming and debugging (4 hours + 2 hours optional)

All code contains bugs, and finding and fixing them is boring. In this training, best practices are presented to reduce the number of bugs in your code. You will also learn about debuggers and debugging techniques to find bugs more efficiently.

Subjects:

Prerequisites:

Supplementary material:

Code optimization (4 hours)

For HPC applications, performance is a major concern. A thorough understanding of the relevant hardware and software components is required. In this training, you will learn about a number of these components, and how they influence the efficiency of your application. You will also learn how to profile your application to identify performance bottlenecks.

For detailed information, see: https://gjbex.github.io/Code-optimization/
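To give an impression of what profiling looks like in practice, here is a minimal sketch in Python using the standard library modules `cProfile` and `pstats`. The training itself may well use other languages and profilers. The functions `slow_sum_of_squares`, `fast_sum_of_squares` and `profile_call` are hypothetical and exist only for this illustration.

```python
import cProfile
import io
import pstats


def slow_sum_of_squares(n: int) -> int:
    """Sum i*i for i in 0..n-1 with an explicit loop (the 'bottleneck')."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def fast_sum_of_squares(n: int) -> int:
    """Same result via the closed form (n-1)*n*(2n-1)/6, without a loop."""
    return (n - 1) * n * (2 * n - 1) // 6


def profile_call(func, *args):
    """Run func(*args) under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the five entries with the largest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()
```

The report returned by `profile_call(slow_sum_of_squares, 1_000_000)` lists the functions that were called and the time spent in each, which is the starting point for deciding where optimization effort is worthwhile.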

Best practices in scientific programming

As a scientist, your core business is science, not software engineering. Nevertheless, a good understanding of best practices in software engineering is quite useful: it can help you work more efficiently and give your work more exposure. This training is flexible in that it can be tailored to any combination of C, C++, Fortran or Python, and to many levels of detail.

Subjects:

Prerequisites:

Supplementary material:

Linux bash shell (4 hours)

In HPC, most interaction with the systems is through the terminal and a shell. Working efficiently in such an environment saves a lot of time and frustration. In this training, you will learn the basics of bash, as well as some Linux OS concepts.

Subjects:

Prerequisites:

HPC introduction (3 hours)

Although working on High Performance Computing (HPC) infrastructures is not rocket science, there are still a number of things you need to be aware of to do so efficiently. Note that this session is specific for (potential) users of the infrastructure managed by the Vlaams Supercomputing Centrum (VSC).

Subjects:

Prerequisites:

Containers for HPC (3 hours)

Singularity can be viewed as the Docker for HPC, i.e., a Singularity image contains all the operating system/software components for your application and can be deployed and run on HPC infrastructure, as well as on your own laptop.

For detailed information, see: https://gjbex.github.io/Containers-for-HPC/

Worker and atools (3 hours)

Many scientific workloads are embarrassingly parallel, i.e., they can be broken down into many independent computations. Examples of such workloads are data preprocessing or postprocessing of many files, or parameter exploration where the same program is run for many different parameter instantiations. Two tools have been developed to help you minimize the hassle with such workloads. This training session will highlight the strengths of both tools, help you to pick the right one for your workflow, and use it efficiently.

For detailed information, see: https://gjbex.github.io/worker-and-atools/
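The idea of an embarrassingly parallel parameter exploration can be sketched in a few lines of Python. This is not how worker or atools operate (they distribute work on an HPC cluster); it only illustrates that each parameter combination is an independent computation. The function `simulate` and its parameters `temperature` and `pressure` are hypothetical stand-ins, and threads are used here purely to keep the example short.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def simulate(temperature: float, pressure: float) -> float:
    """Hypothetical stand-in for one independent computation."""
    return temperature * pressure


def explore(temperatures, pressures, max_workers: int = 4) -> dict:
    """Run simulate for every (temperature, pressure) combination.

    No task depends on the result of another, so they can run in any
    order and on any worker.
    """
    parameter_sets = list(product(temperatures, pressures))
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map preserves the order of parameter_sets.
        results = list(executor.map(lambda p: simulate(*p), parameter_sets))
    return dict(zip(parameter_sets, results))
```

For example, `explore([1.0, 2.0], [10.0, 20.0])` evaluates four independent combinations. For real CPU-bound work, each combination would typically be run as a separate process or job rather than as a thread.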

Version control with git (4 hours)

Version control is an essential part of the software development process and is crucial for scientific applications to help reproducibility. You will learn how to use a version control system (either git or SVN) to document the changes in your source code.

For detailed information, see: https://gjbex.github.io/Version-control-with-git/

Code of conduct

Please note that by participating in any training listed here, whether online, hybrid, or face-to-face, you implicitly agree to adhere to the code of conduct.