
036: ML model serving with SageMaker

Amazon SageMaker is a Machine-Learning-as-a-Service (MLaaS) framework that focuses on ML model development and automation, for example model serving.


The daily mood

I realize that I have been looking a lot at Cloud Native technology but not so much at Cloud provider solutions, which are probably just as important if you do not want to re-invent the wheel at the application level.

As already discussed in my previous post about "ML model deployment", we have been using both AWS Databricks and Amazon SageMaker as part of our internal Data lake project. In my last post, I looked at MLflow, which is the part of Databricks most relevant for deployment. Today I am looking at Amazon SageMaker, which MLflow integrates with for model deployment via the SageMaker SDK, but which also offers slightly different tooling and a different approach for development and operations.
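
To make that MLflow-to-SageMaker hand-over concrete, here is a minimal sketch, assuming an MLflow 1.x installation, that the MLflow pyfunc serving image has already been pushed to ECR (mlflow sagemaker build-and-push-container), and placeholder values for the model URI, role ARN and region:

```python
import mlflow.sagemaker

# All values below are placeholders.
model_uri = "models:/my-model/Production"   # or a runs:/<run_id>/model URI
role_arn = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"

# MLflow creates the SageMaker model, endpoint configuration and endpoint for you,
# serving the model with its pyfunc container image from ECR.
mlflow.sagemaker.deploy(
    app_name="my-model-endpoint",        # becomes the SageMaker endpoint name
    model_uri=model_uri,
    execution_role_arn=role_arn,
    region_name="eu-west-1",
    instance_type="ml.m5.large",
    instance_count=1,
    mode="create",                       # "replace" updates an existing endpoint
)
```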



Why AWS for ML

AWS currently offers one of the largest sets of managed capabilities for Machine Learning. The offer consists of different individual services coming from both the eCommerce (Amazon) and cloud infrastructure (AWS) sides.

What's in Amazon SageMaker

Amazon SageMaker consists of the following features:
  • Preparation
    • SageMaker Ground Truth: Set up and manage labeling jobs for highly accurate training datasets using active learning and human supervision.
    • SageMaker Autopilot: Automatically inspect raw data and build, train, and tune candidate models (AutoML).
  • Development
    • SageMaker Studio: Create notebooks, build models from training jobs, train and tune.
    • SageMaker Samples: Reuse existing code samples from many frameworks and pre-built algorithms.
  • Deployment:
    • SageMaker SDK: Deploy and expose models via service endpoints that run on a hosted environment of your choice (ex. EC2, ECS, EKS), depending on model format and endpoint configuration. This is the SDK that MLflow builds on (see the sketch after this list).
    • SageMaker Kubeflow Pipelines (Preview): Automate lifecycle and integration for Kubernetes.
  • Monitoring:
    • SageMaker Model Monitor: Keep models accurate over time.
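
As an illustration of the deployment path through the SageMaker Python SDK, here is a hedged sketch that hosts an already serialized scikit-learn model from S3 behind an endpoint; the bucket, role ARN, inference.py entry point and endpoint name are placeholders:

```python
import sagemaker
from sagemaker.sklearn.model import SKLearnModel

# Sketch only: bucket, role ARN, entry point and endpoint name are placeholders.
session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"

# Wrap a model artifact that already sits in S3 (e.g. serialized from a notebook).
model = SKLearnModel(
    model_data="s3://my-bucket/models/model.tar.gz",
    role=role,
    entry_point="inference.py",       # loads the model and implements the prediction logic
    framework_version="0.23-1",
    sagemaker_session=session,
)

# SageMaker provisions the hosting instances and exposes an HTTPS endpoint.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
    endpoint_name="my-sklearn-endpoint",
)

print(predictor.predict([[0.1, 0.2, 0.3]]))
```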

Getting started

Obviously you need an AWS account with permission to use SageMaker. It is also available for free for 12 months as part of the free tier program (pricing details here).

SageMaker Notebooks are basically Jupyter servers running on dedicated virtual machines (EC2 instances) that you can size to your convenience and that are billed on an hourly consumption basis.
Once the SageMaker notebook instance is provisioned and started, you can open it from SageMaker Studio (fat-client) or directly via the Web-UI (browser-client) and create Jupyter Notebooks.
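
For reference, the same provisioning can also be scripted with boto3; a small sketch with a placeholder instance name and role ARN:

```python
import boto3

sm = boto3.client("sagemaker", region_name="eu-west-1")

# Placeholder name and role ARN; the instance type drives the hourly bill.
sm.create_notebook_instance(
    NotebookInstanceName="my-notebook",
    InstanceType="ml.t3.medium",
    RoleArn="arn:aws:iam::123456789012:role/SageMakerExecutionRole",
    VolumeSizeInGB=10,
)

# Check the provisioning status before opening the instance.
status = sm.describe_notebook_instance(NotebookInstanceName="my-notebook")["NotebookInstanceStatus"]
print(status)
```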

Jupyter Notebooks are tied to a SageMaker-supported kernel (ex. Sparkmagic). They allow you to write and locally execute IPython code cells, directly access your S3 buckets, reuse 40+ built-in algorithms to create and train your model, serialize it, and deploy it to an automatically generated endpoint that can route to multiple models (useful, for example, for A/B testing). Unfortunately, a distributed execution environment is not automatically provisioned and configured by SageMaker, so that's additional effort.
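
That multi-model routing is configured through production variants on the endpoint; a hedged boto3 sketch with made-up model names and traffic weights:

```python
import boto3

sm = boto3.client("sagemaker")

# Two already-created SageMaker models (e.g. the current and a candidate version).
# Names and weights are illustrative only.
sm.create_endpoint_config(
    EndpointConfigName="churn-ab-config",
    ProductionVariants=[
        {
            "VariantName": "model-a",
            "ModelName": "churn-model-v1",
            "InitialInstanceCount": 1,
            "InstanceType": "ml.m5.large",
            "InitialVariantWeight": 0.8,   # 80% of the traffic
        },
        {
            "VariantName": "model-b",
            "ModelName": "churn-model-v2",
            "InitialInstanceCount": 1,
            "InstanceType": "ml.m5.large",
            "InitialVariantWeight": 0.2,   # 20% of the traffic
        },
    ],
)

# One endpoint routes requests across both variants according to the weights.
sm.create_endpoint(EndpointName="churn-endpoint", EndpointConfigName="churn-ab-config")
```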

Remark on SQL analysis: Unlike Databricks, SageMaker doesn't offer any direct SQL interface for analysing data from the notebook.

In a first step, you can use the AWS Athena query editor for querying data from various raw, relational database and NoSQL sources (Beta features) and storing one of the following back into S3 (a scripted sketch follows the list):
  • SQL code statements
  • SQL queries that can execute on demand based on AWS Glue, for example using a crawler that automatically retrieves schema information
  • SQL results
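
This first step can also be scripted instead of going through the query editor; a hedged boto3 sketch with made-up database, table and bucket names:

```python
import boto3

athena = boto3.client("athena", region_name="eu-west-1")

# Run a query against a Glue-catalogued table; results land back in S3 as CSV.
response = athena.start_query_execution(
    QueryString="SELECT label, COUNT(*) AS n FROM events GROUP BY label",
    QueryExecutionContext={"Database": "datalake_raw"},   # Glue database (e.g. filled by a crawler)
    ResultConfiguration={"OutputLocation": "s3://my-bucket/athena-results/"},
)
print(response["QueryExecutionId"])
```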
In a second step, you may install the Python DB API 2.0 compliant client for Athena (PyAthena) from the notebook, and then integrate Athena operations into your Python code.
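
A minimal PyAthena sketch, assuming the staging bucket, region and table names below are placeholders:

```python
import pandas as pd
from pyathena import connect

# Bucket, region and table names are placeholders.
conn = connect(
    s3_staging_dir="s3://my-bucket/athena-results/",
    region_name="eu-west-1",
)

# PyAthena is DB API 2.0 compliant, so it plugs straight into pandas.
df = pd.read_sql("SELECT * FROM datalake_raw.events LIMIT 100", conn)
print(df.head())
```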

Remark on ML workflow orchestration: It is best practice to put all the pieces (training, inference, scoring) together using AWS Step Functions.
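
As a hedged illustration of such an orchestration, here is a boto3 sketch that registers a minimal two-step state machine using the native Step Functions SageMaker service integrations; all ARNs, image URIs and S3 paths are placeholders, and input data channels are omitted for brevity:

```python
import json
import boto3

sfn = boto3.client("stepfunctions")

# Minimal two-step pipeline: run a training job, then register the resulting model.
# The execution input is expected to carry a unique "job_name".
definition = {
    "StartAt": "TrainModel",
    "States": {
        "TrainModel": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
            "Parameters": {
                "TrainingJobName.$": "$.job_name",
                "RoleArn": "arn:aws:iam::123456789012:role/SageMakerExecutionRole",
                "AlgorithmSpecification": {
                    "TrainingImage": "<built-in-algorithm-image-uri>",
                    "TrainingInputMode": "File",
                },
                # Input data channels omitted for brevity.
                "OutputDataConfig": {"S3OutputPath": "s3://my-bucket/models/"},
                "ResourceConfig": {
                    "InstanceCount": 1,
                    "InstanceType": "ml.m5.large",
                    "VolumeSizeInGB": 10,
                },
                "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
            },
            "Next": "CreateModel",
        },
        "CreateModel": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sagemaker:createModel",
            "Parameters": {
                "ModelName.$": "$.TrainingJobName",
                "ExecutionRoleArn": "arn:aws:iam::123456789012:role/SageMakerExecutionRole",
                "PrimaryContainer": {
                    "Image": "<built-in-algorithm-image-uri>",
                    "ModelDataUrl.$": "$.ModelArtifacts.S3ModelArtifacts",
                },
            },
            "End": True,
        },
    },
}

sfn.create_state_machine(
    name="ml-training-pipeline",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/StepFunctionsExecutionRole",
)
```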


Take away

Amazon SageMaker offers a broad set of components for end-to-end use-case coverage and performance optimization. Databricks has nicer integration with the SQL (Delta Lake), governance (MLflow) and execution (Serverless Spark) layers, but lacks operability if you decide to monitor and serve your model via an endpoint. MLflow itself offers a very nice integration with SageMaker, whereas SageMaker doesn't seem to facilitate integration with any third-party source, execution cluster or downstream application. Some challenges remain: bringing the different building blocks together without increasing the architecture and network complexity, avoiding vendor lock-in, and finally controlling the monthly pay-as-you-go bill.

