EduXchange.EU

Machine Learning Operations

02476
Computer Science and ICT, Data, AI

About this course

The course introduces students to tools and software development practices that help them organize, scale, deploy and monitor machine learning models in either a research or a production setting. It provides hands-on experience with a number of frameworks, both local and in the cloud, for working with large-scale machine learning pipelines.

Topics covered: proper coding environments, code organization, good coding practices, code and data version control, reproducible and containerized environments, reproducible experiment management, debugging tools, code profiling, large-scale collaborative experiment logging and monitoring, unit testing, continuous integration, continuous machine learning, cloud infrastructure, cloud-based machine learning, distributed data loading and training, optimization methods for inference, local and cloud-based deployment, and monitoring of deployed applications.

The course includes lectures, exercises and project work. The lectures are short and provide context for why each topic is important. The main focus is on exercises, with emphasis on practical tools and coding skills for machine learning in production. Finally, approximately 30% of the course is spent on project work in groups of 3-5 persons, where the tools used throughout the course are applied to a self-chosen machine learning problem.

Learning outcomes

At the end of the course the learner will be able to:
  • Organize code efficiently for easy maintainability and shareability
  • Use version control systems to collaborate efficiently on code development and to handle large amounts of data
  • Create reproducible software environments and reproducible containerized applications and experiments
  • Debug, profile, visualize and monitor multiple experiments to assess model performance
  • Implement basic software testing and apply continuous integration (CI) to automate code development
  • Use cloud-based computing services to scale experiments and automate processes
  • Deploy machine learning models, both locally and in the cloud, and monitor the lifecycle of a model after deployment
  • Demonstrate how to scale data loading, training and inference of the machine learning pipeline using distributed frameworks and optimization strategies
  • Conduct a research project in collaboration with fellow students using the frameworks taught in the course

Examination

Evaluation of exercises/reports. Graded pass/fail only.

Course requirements

General understanding of machine learning (datasets, probability, classifiers, overfitting, etc.) and basic knowledge of deep learning (backpropagation, convolutional neural networks, auto-encoders, etc.). Familiarity with coding in PyTorch.

Resources

  • https://skaftenicki.github.io/dtu_mlops/

Activities

The course includes lectures, exercises and project work. Approximately 30% of the course is spent on project work in groups of 3-5 persons, where the tools used throughout the course are applied to a self-chosen machine learning problem.

Additional information

  • Credits
    ECTS 5
  • Level
    Master
  • Contact hours per week
    0
  • Instructors
    Nicki Skafte Detlefsen, Søren Hauberg
  • Mode of instruction
    Online - at a specific time
If anything remains unclear, please check the FAQ of DTU (Denmark).

Offering(s)

  • Start date

    6 January 2025

    • Ends
      24 January 2025
    • Term
      January 2025
    • Location
      Online
    • Instruction language
Enrolment period closed
These offerings are valid for students of TUM (Germany)