EduXchange.EU: Machine Learning Operations

About this course

This course introduces students to tools and software development practices that help them organize, scale, deploy and monitor machine learning models in either a research or a production setting. It provides hands-on experience with a number of frameworks, both local and in the cloud, for working with large-scale machine learning pipelines.

Topics covered: proper coding environments, code organization, good coding practices, code and data version control, reproducible and containerized environments, reproducible experiment management, debugging tools, code profiling, large-scale collaborative experiment logging and monitoring, unit testing, continuous integration, continuous machine learning, cloud infrastructure, cloud-based machine learning, distributed data loading and training, optimization methods for inference, local and cloud-based deployment, and monitoring of deployed applications.

The course includes lectures, exercises and project work. The lectures are short and provide context for why each topic is important. The main focus is on exercises, with emphasis on practical tools and coding skills for machine learning in production. Finally, approximately 30% of the course is spent on project work in groups of 3-5 persons, where tools used throughout the course are applied to a self-chosen machine learning problem.

Learning outcomes

At the end of the course the learner will be able to:
• Organize code in an efficient way for easy maintainability and shareability
• Use version control systems to collaborate efficiently on code development and handle large amounts of data
• Create reproducible software environments and reproducible containerized applications and experiments
• Debug, profile, visualize and monitor multiple experiments to assess model performance
• Implement basic testing of software and apply continuous integration (CI) for automating code development
• Use cloud-based computing services to scale experiments and automate processes
• Deploy machine learning models, both locally and in the cloud, and monitor the lifecycle of the model after deployment
• Demonstrate how to scale data loading, training and inference of the machine learning pipeline using distributed frameworks and optimization strategies
• Conduct a research project in collaboration with fellow students using the frameworks taught in the course

Examination

Evaluation of exercises/reports. Graded pass/fail only.

Course requirements

General understanding of machine learning (datasets, probability, classifiers, overfitting etc.) and basic knowledge of deep learning (backpropagation, convolutional neural networks, auto-encoders etc.). Familiarity with coding in PyTorch.

Resources

https://skaftenicki.github.io/dtu_mlops/

Activities

The course includes lectures, exercises and project work. Approximately 30% of the course is spent on project work in groups of 3-5 persons, where tools used throughout the course are applied to a self-chosen machine learning problem.

Additional information

Institution location
Anker Engelunds Vej 1, Online

More info
Course page on the website of the Technical University of Denmark
Contact a coordinator
Nicki Skafte Detlefsen
Søren Hauberg

Credits
ECTS 5
Level
Master
Contact hours per week
0
Instructors
Nicki Skafte Detlefsen, Søren Hauberg
Mode of instruction
Online - at a specific time

If anything remains unclear, please check the FAQ of DTU (Denmark).
