Model Parallelism: Building and Deploying Large Neural Networks (MPBDLNN) – Outline

Detailed Course Outline

Introduction

  • Meet the instructor.
  • Create an account at courses.nvidia.com/join.

Introduction to Training of Large Models

  • Learn about the motivation for training large models and the key challenges involved.
  • Get an overview of the basic techniques and tools needed for large-scale training.
  • Get an introduction to distributed training and the Slurm job scheduler.
  • Train a GPT model using data parallelism.
  • Profile the training process and understand execution performance.

Model Parallelism: Advanced Topics

  • Increase the model size using a range of memory-saving techniques.
  • Get an introduction to tensor and pipeline parallelism.
  • Go beyond natural language processing and get an introduction to DeepSpeed.
  • Auto-tune model performance.
  • Learn about mixture-of-experts models.

Inference of Large Models

  • Understand the challenges of deploying large models.
  • Explore techniques for model reduction.
  • Learn how to use TensorRT-LLM.
  • Learn how to use Triton Inference Server.
  • Understand the process of deploying a GPT checkpoint to production.
  • See an example of prompt engineering.

Final Review

  • Review key learnings and answer questions.
  • Complete the assessment and earn a certificate.
  • Complete the workshop survey.