Scientific Programming in Python

PHYS4038/MLiS and AS1/MPAGS

Course information

This postgraduate course is designed to give a general introduction to the Python programming language and its wider ecosystem, with a focus on the elements most important for data analysis and scientific research.

The course is aimed at students on the MSc Machine Learning in Science (MLiS) programme at the University of Nottingham (for which it is PHYS4038) as well as first-year PhD students at the University of Nottingham and a number of other institutions, via the Midland Physics Alliance Graduate School (MPAGS AS1). Others may be able to join, on request, either for credit or on a more informal (unassessed) basis. Teaching content will be delivered primarily asynchronously, through recorded videos and independent exercises. There will also be some synchronous sessions providing an opportunity for questions and discussion.

As students taking this module are expected to hold an undergraduate degree in a science subject, a limited level of prior programming experience (in any language) is assumed. Please note that, although the majority of the course will be useful for all science postgrads, there will be some subject-specific content, including an overview of key tools for astronomers.

Introduction

If you are an MPAGS student intending to take this course, please watch this introductory video. This was made last year, so the dates of synchronous sessions and coursework deadlines aren’t quite right: please see below for the correct information.

MSc students will be given an introduction to the course in their first synchronous session (see Moodle for details).

Joining the course

MSc students should enrol via the University of Nottingham module enrolment form, after discussing their options with their tutor. You can self-enrol on the module Moodle page.

PhD students should enrol for the module via the MPAGS module sign-up page. If you are enrolling late, please also email the convenor.

Any others wishing to join should email the convenor.

Lecturer contact

The course is taught by Dr Steven Bamford. Outside of synchronous teaching sessions he can be contacted via email.

MPAGS students can also ask questions via a Slack channel. MPAGS students will be added to this after enrolment.

Timetable and format

The main course runs for the whole of the Autumn term.

The course will be delivered in the form of weekly pre-recorded videos, for you to watch in your own time. MSc students can access the videos via the module Moodle page. MPAGS students will be added to a Slack channel and links to access these videos will be posted in that channel.

There will also be synchronous sessions, held at 10am every Wednesday, starting on 13th October, via Teams. The link to access the meeting will be shared via Moodle / Slack. These synchronous sessions are primarily an opportunity for you to ask questions about the material covered in the previous week of the course, your coursework, or anything relating to scientific computing or Python. Please come prepared with questions or the sessions will be short!

Before this, there is an MLiS-only introductory session at 10am on Wednesday 6th October.

If you have registered, but don’t receive a link to join the Slack channel by the start of the course, please get in touch.

The course will be assessed by the development of a complete Python program performing some scientific analysis of your choice (details below).

Topics

An outline of the topics is given below. This may be slightly altered and expanded as the course progresses.

Lecture slides

The slides accompanying each session are available below. Note that these may be updated as the course progresses.

Some examples given in the lectures can be found as Jupyter notebooks on GitHub.

Exercises

Suggested exercises for some of the sessions are available below.

The solutions to these exercises can be found on GitHub. You are strongly advised to avoid consulting the solutions until you have tried the problems yourself!

Assessment

To qualify for MSc or MPAGS credits, you will need to produce a Python program. Your program may address any scientific purpose you like, e.g. data analysis, simulation, modelling, experiment control, visualisation, etc. Ideally the program will do something that is relevant for your research projects, particularly in the case of MPAGS students.

In addition to producing the code itself, MSc students will also give a presentation on their ongoing development, and a short final report. See the course Moodle page for details.

MPAGS credits are awarded for reasonable engagement and an acceptable final code submission; intermediate submissions for feedback are encouraged, but optional. No report or presentation is required.

Code

Your program should (as a rough guide)…

Your code should be submitted in the form of a GitHub repository containing the source code (.py/.ipynb file), together with pdf/png files of the output plot(s). The repository should also contain a README file describing what the code does and how it should be run.

To create the GitHub repo for each submission you must use the links below. Further details regarding setting up your GitHub repository for submission will be given in the lecture videos.

There will be three submission deadlines (at 3pm on the respective dates):

Presentation

MSc students must also deliver a 5-minute presentation during the week of 22 Nov, describing their ongoing development. Details will be announced via Moodle.

Report

Along with their final code, MSc students must additionally submit a short (~3 sides of A4) report describing the purpose of the code, any key design decisions, the outputs, and scope for improvements. Details will be announced via Moodle.

Preliminaries

If you intend to use your own computer, then, before the course begins, you should ensure that you have a working Python interpreter available, ideally the Anaconda distribution (see below). For this course you are strongly recommended to use a recent version (3.7+). Linux and macOS typically come with Python already installed, but using these system versions is not recommended! Instead, install a version of Python specifically for your research, as described below. If you have any difficulties installing and running Python, please ask for help in a synchronous session or email to arrange a meeting.

You should use Python 3. While Python 2 is still available and in use, most projects have moved over to Python 3 by now. Note that Python 3 is not backward compatible with Python 2 due to a small number of significant changes, i.e. code that works with Python 2 will not necessarily work with Python 3. Some of the differences will be explained during the course.
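As a small illustration of the kind of change involved, the sketch below shows two well-known differences: integer division and the print function. It is only an example chosen here, not a list of what the course will cover.

```python
# Behaviour under Python 3; the comments note how Python 2 differs.

# Division of two integers returns a float in Python 3.
true_div = 3 / 2       # 1.5 (Python 2 would give 1)

# Floor division behaves the same way in both versions.
floor_div = 3 // 2     # 1

# print is a function in Python 3 (a statement in Python 2),
# so the parentheses are required.
print("3 / 2 =", true_div)
print("3 // 2 =", floor_div)
```

Code written for Python 2 that relies on `3 / 2` giving `1`, or that uses `print` without parentheses, therefore needs changing before it will run correctly under Python 3.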

Getting Python

You are highly recommended to install Python using the freely-available Anaconda distribution. This gives you the most convenient route to a standalone Python 3 installation with most (if not all) of the modules you need easily available.

If you are missing a Python module, you can usually get it with one of the following (in the order you should try them):

Beware that you can have multiple versions of conda and pip on one computer, each associated with a particular Python distribution (and perhaps a specific Anaconda environment). Make sure you are using the correct one!
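One way to check which installation you are actually using is to ask Python itself. The following is a minimal sketch using only the standard library; the function name describe_python is made up for this example, and the paths printed will depend entirely on your own setup.

```python
import shutil
import sys


def describe_python():
    """Collect basic facts about the interpreter running this script.

    describe_python is a hypothetical helper written for this example.
    """
    return {
        # Full path of the running Python interpreter.
        "executable": sys.executable,
        # Version of the running interpreter, e.g. "3.9.7".
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        # Root directory of this Python installation or environment.
        "prefix": sys.prefix,
        # First pip and conda found on the PATH (None if not found).
        # These may belong to a different installation!
        "pip_on_path": shutil.which("pip"),
        "conda_on_path": shutil.which("conda"),
    }


info = describe_python()
for key, value in info.items():
    print(f"{key}: {value}")

# The course recommends Python 3.7 or later.
meets_recommendation = sys.version_info >= (3, 7)
print("Python 3.7+:", meets_recommendation)
```

If the pip or conda found on your PATH does not sit under the same prefix as the interpreter, it probably belongs to a different installation. Running pip as `python -m pip install ...` avoids this ambiguity, because it uses the pip belonging to that particular interpreter.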

Using your operating system’s own version of Python, or installing Python in any other manner, is just asking for trouble.

Writing Python

While you can write Python in any text editor, you should choose one with support for Python code, i.e. providing automatic formatting, code highlighting, and ideally integrated documentation. We’ll cover some options at the start of the course.