Python-for-data-science

Repository for participants of the "Python for data science" training

The Python programming language is increasingly popular. It is a versatile, general-purpose language that is accessible to novice programmers, and it is also increasingly used for data science applications. This training introduces modules that are useful in that context.

Learning outcomes

When you complete this training you will

Schedule

Total duration: 4 hours.

| Subject | Duration |
|---|---|
| introduction and motivation | 5 min. |
| pandas & seaborn | 105 min. |
| coffee break | 10 min. |
| text parsing with regular expressions | 40 min. |
| querying relational databases | 30 min. |
| web scraping | 10 min. |
| geographical information with geopandas | 30 min. |
| wrap up | 10 min. |

Training materials

Slides are available in the GitHub repository, as well as example code and hands-on material.

Target audience

This training is for you if you need to use Python for data analysis.

Prerequisites

You will need experience programming in Python; this training does not start from scratch. Familiarity with NumPy is not required, but would be beneficial.

If you plan to do Python programming in a Linux or HPC environment, you should be familiar with that environment as well.

Trainer(s)