Name: Amazon SageMaker
Availability: InStock
Author: Amazon

Learning Objectives

Understand what SageMaker is and how it fits into AWS's AI product portfolio alongside Bedrock
Identify SageMaker's key components: Studio, JumpStart, Pipelines, and HyperPod
Evaluate when to use SageMaker versus Bedrock for ML workloads

What Is Amazon SageMaker?

Amazon SageMaker is AWS's fully managed platform for the entire machine learning lifecycle — from data preparation and labeling through model training, tuning, deployment, and monitoring. While Amazon Bedrock provides model-as-a-service (call an API, get a response), SageMaker is the platform for teams that need to build, train, and operate their own ML systems.

SageMaker has been available since 2017 and is one of the most widely used enterprise ML platforms globally. It provides the infrastructure, tools, and managed services that let ML teams focus on model development rather than infrastructure management.

💡Key Concept

Bedrock vs. SageMaker: Amazon Bedrock is for consuming foundation models via API — you send a prompt, get a response. SageMaker is for building ML systems — you train models on your data, deploy them on your infrastructure, and manage the full lifecycle. Many teams use both: Bedrock for LLM features, SageMaker for custom ML models.

Key Components

SageMaker Studio

A web-based IDE for ML development:

Jupyter notebooks with managed compute (no infrastructure setup)
Visual experiment tracking and model comparison
Integrated debugging and profiling tools
Collaboration features for ML teams

SageMaker JumpStart

A model hub and solution catalog:

Hundreds of pre-trained models (Llama, Mistral, Stable Diffusion, Hugging Face models)
One-click deployment for foundation models
Pre-built ML solutions for common use cases (fraud detection, demand forecasting, image classification)
Fine-tuning workflows for customizing models on your data

SageMaker Pipelines

MLOps automation:

Define ML workflows as code (data processing, training, evaluation, deployment)
Automated model retraining on new data
Model registry for versioning and approval workflows
Integration with CI/CD for production ML deployments

SageMaker HyperPod

Distributed training infrastructure:

Managed GPU/Trainium clusters for training large models
Automatic node health monitoring and replacement
Optimized for multi-node training of foundation models
Reduces the operational burden of managing training clusters

SageMaker Canvas

No-code ML for business analysts:

Visual interface for building ML models without writing code
Point-and-click data import, model training, and prediction generation
Supports tabular data (forecasting, classification, regression)
Connects to data in S3, Redshift, and other AWS data stores

Pricing

SageMaker uses pay-as-you-go pricing across multiple dimensions:

Plan	Price	Features
Studio Notebooks	Per hour (instance type)	Free tier: 250 hours (first 2 months)
Training	Per hour (GPU/CPU instances)	Spot instances available for up to 90% savings
Inference (endpoints)	Per hour (instance type)	Auto-scaling available Serverless option for variable traffic
JumpStart models	Per hour (hosting)	Model-dependent Some free to deploy
Canvas	Per session/hour	Included in some enterprise agreements

Studio NotebooksPer hour (instance type)

Free tier: 250 hours (first 2 months)

TrainingPer hour (GPU/CPU instances)

Spot instances available for up to 90% savings

Inference (endpoints)Per hour (instance type)

Auto-scaling available
Serverless option for variable traffic

JumpStart modelsPer hour (hosting)

Model-dependent
Some free to deploy

CanvasPer session/hour

Included in some enterprise agreements

SageMaker vs. Alternatives

Platform	Cloud	Strengths	Best For
Amazon SageMaker	AWS	Broadest feature set; JumpStart model hub; HyperPod; deep AWS integration	AWS-native teams; custom ML at scale
Google Vertex AI	GCP	Strong AutoML; Gemini integration; TPU access	Google Cloud teams; AutoML workflows
Azure ML Studio	Azure	Microsoft ecosystem; OpenAI integration; responsible AI tools	Azure/Microsoft teams
Hugging Face	Multi-cloud	Largest open model hub; community; simple inference API	Open-source model deployment; prototyping

Strengths

End-to-end ML platform — covers data prep, training, deployment, monitoring, and MLOps in one service
JumpStart model hub — hundreds of pre-trained models deployable with one click
HyperPod for large-scale training — managed clusters for training foundation models on GPU/Trainium
Canvas for no-code ML — accessible to business analysts without ML expertise
Deep AWS integration — native connections to S3, Redshift, Glue, Lambda, and other AWS services
Mature and battle-tested — available since 2017; used by thousands of enterprises in production

Limitations & Considerations

AWS lock-in — deeply integrated with AWS; migrating SageMaker workloads to another cloud is significant effort
Complexity — the breadth of features means a steep learning curve; many teams only use a fraction of capabilities
Cost management — pay-per-use across many dimensions (notebooks, training, inference, storage) can be difficult to predict
Not for simple LLM use cases — if you just need to call an LLM API, use Bedrock instead; SageMaker is for custom ML development
Overhead for small teams — the platform is designed for enterprise ML teams; solo developers may find it heavyweight

Key Takeaways

Amazon SageMaker is AWS's fully managed ML platform — covering the entire lifecycle from data preparation through model training, deployment, and monitoring
Distinct from Bedrock: SageMaker is for building and training custom ML systems; Bedrock is for consuming foundation models via API
JumpStart provides one-click access to hundreds of pre-trained models; HyperPod manages distributed training infrastructure; Canvas offers no-code ML for business users
Most compelling for enterprise ML teams on AWS; solo developers and simple LLM use cases are better served by Bedrock or Hugging Face

Amazon SageMaker

Audio & video lessons are paid features