About this course
This course introduces students to tools and software development practices that help them organize, scale, deploy and monitor machine learning models in either a research or a production setting. It provides hands-on experience with a number of frameworks, both local and in the cloud, for working with large-scale machine learning pipelines.
Topics covered: proper coding environments, code organization, good coding practices, code and data version control, reproducible and containerized environments, reproducible experiment management, debugging tools, code profiling, large-scale collaborative experiment logging and monitoring, unit testing, continuous integration, continuous machine learning, cloud infrastructure, cloud-based machine learning, distributed data loading and training, optimization methods for inference, local and cloud-based deployment, and monitoring of deployed applications.
The course includes lectures, exercises and project work. The lectures are short and provide context for why each topic is important. The main focus is on exercises, with emphasis on practical tools and coding skills for machine learning in production. Finally, approximately 30% of the course is spent on project work in groups of 3-5 persons, where the tools used throughout the course are applied to a self-chosen machine learning problem.
Learning outcomes
At the end of the course the learner will be able to:
- Organize code efficiently for easy maintainability and shareability
- Use version control systems to collaborate efficiently on code development and handle large amounts of data
- Create reproducible software environments and reproducible containerized applications and experiments
- Debug, profile, visualize and monitor multiple experiments to assess model performance
- Implement basic testing of software and apply continuous integration (CI) for automating code development
- Use cloud-based computing services to scale experiments and automate processes
- Deploy machine learning models, both locally and in the cloud, and monitor the lifecycle of the model after deployment
- Demonstrate how to scale data loading, training and inference of the machine learning pipeline using distributed frameworks and optimization strategies
- Conduct a research project in collaboration with fellow students using the frameworks taught in the course
Enrolment details
NB! This course is open for enrolment from May 14 – July 29, 2024. If no ‘Apply now’ information is shown on the right-hand side, please check the enrolment information for your institution under ‘Help’ – ‘How do I register’.
Examination
Evaluation of exercises/reports. Graded pass/fail only.
Course requirements
General understanding of machine learning (datasets, probability, classifiers, overfitting etc.) and basic knowledge of deep learning (backpropagation, convolutional neural networks, auto-encoders etc.). Familiarity with coding in PyTorch.
Resources
- https://skaftenicki.github.io/dtu_mlops/
Activities
The course includes lectures, exercises and project work. Approximately 30% of the course is spent on project work in groups of 3-5 persons, where the tools used throughout the course are applied to a self-chosen machine learning problem.
Additional information
- Institution location: Anker Engelunds Vej 1, Online
- More info: Course page on the website of the Technical University of Denmark
- Credits: ECTS 5
- Level: Master
- Contact hours per week: 0
- Instructors: Nicki Skafte Detlefsen, Søren Hauberg
- Mode of instruction: Online - at a specific time
Offering(s)
- Start date: 6 January 2025
- Ends: 24 January 2025
- Term: January 2025
- Location: Online
- Instruction language
Enrolment period closed