Amazon SageMaker

What is Amazon SageMaker?

If you are new to the machine learning development world, you may not know what you need to know about Amazon SageMaker. Here are the top 5 things you should know about it:

  1. Amazon SageMaker is a cloud-based AI development platform
  2. You have to give the algorithm examples of right answers
  3. You can compile your machine learning algorithm for performance
  4. SageMaker Canvas is a "point-and-click" version of Amazon SageMaker
  5. SageMaker is a low-level tool, and there are more powerful tools in the AWS ML ecosystem

(If you need help with your machine learning project, don't hesitate to reach out.)

1. Amazon SageMaker is a cloud-based AI development platform

Amazon SageMaker is a cloud-based artificial intelligence (AI) development platform which provides a consolidated, build-to-train-to-production flow including:

  • Pre-built "Notebooks" - essentially the IDE (Integrated Development Environment) of the machine learning space
  • Built-in high performance algorithms
  • One-click training
  • Hyperparameter optimization
  • One-click deployment
  • Fully managed hosting with auto-scaling
[Image: components of Amazon SageMaker]
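As a rough illustration of what the managed training step in the list above involves, here is a minimal sketch that assembles the request for the SageMaker `CreateTrainingJob` API. The job name, role ARN, S3 paths, instance type and hyperparameters are all hypothetical placeholders, and the training image URI is left as a placeholder you would need to fill in for your region and algorithm.

```python
def build_training_job_request(job_name, image_uri, role_arn,
                               train_s3_uri, output_s3_uri, hyperparameters):
    """Assemble keyword arguments for the SageMaker CreateTrainingJob API."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,       # container holding the algorithm
            "TrainingInputMode": "File",      # copy the data to the instance first
        },
        "RoleArn": role_arn,                  # IAM role SageMaker assumes
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": train_s3_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",   # assumed instance size
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
        # The API expects every hyperparameter value as a string.
        "HyperParameters": {k: str(v) for k, v in hyperparameters.items()},
    }


# All values below are made up for illustration.
request = build_training_job_request(
    job_name="upsell-model-demo",
    image_uri="<training-image-uri>",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    train_s3_uri="s3://example-bucket/train/",
    output_s3_uri="s3://example-bucket/output/",
    hyperparameters={"max_depth": 5, "num_round": 100},
)
```

With credentials configured, the request could be submitted with `boto3.client("sagemaker").create_training_job(**request)`. SageMaker then provisions the instance, runs the job and writes the model artifact to the output path.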

If you're new to ML, the value of the components listed above may not register, but using them is a LOT easier than rolling your own environment from the ground up. All of that said, the greatest value in the AWS machine learning ecosystem can be drawn from the more advanced machine learning/deep learning services, some of which I cover in more detail below:

  • Amazon Rekognition - image/video object detection, facial recognition and more
  • Amazon Lex - voice and/or text chatbot which can decode language and identify intent
  • Amazon Personalize - product recommendations, product re-ranking, and customized direct marketing at scale
  • Amazon Polly - create automated spoken voice from text in over a dozen languages
  • Amazon Comprehend - extract insights from large amounts of written text like product reviews or document libraries
  • Amazon Translate - accurately convert text from one language to another in real time (can be used in combination with Lex & Polly for speech applications)
  • AWS DeepLens - a deep learning enabled video camera
  • Amazon Forecast - a machine learning time-series forecasting service

2. You have to give the algorithm examples of right answers

At the simplest level, machine learning is about putting data in and getting a prediction out. For example, based on a user's demographic information, what upsell product are they most likely to buy? The data that goes into the algorithm are called features, and you can have a lot of them in a complex model. The output or prediction is called the label, and there should only be one per "row" of data. In the training data, the label is the right answer: an example of what you want the algorithm to do. For this kind of (supervised) learning, data without a label isn't useful for training, because the algorithm doesn't know what it is trying to predict. The process of giving the data a label is a core part of training a machine learning algorithm. AWS facilitates this process with several tools:

  • Amazon SageMaker Ground Truth - allows you to identify raw data including images, text files, and videos, and add informative labels to create high-quality training datasets for your machine learning models.
  • Amazon Mechanical Turk - a crowdsourcing marketplace that allows outsourcing of labeling jobs to a distributed real human workforce.
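To make the feature/label distinction above concrete, here is a small sketch in plain Python. The rows, column names and products are invented for illustration. It separates rows that already carry a label (usable for training) from rows that still need one (candidates for a labeling job).

```python
# Hypothetical rows: demographic features plus the upsell product bought.
rows = [
    {"age": 34, "region": "west", "bought": "warranty"},
    {"age": 51, "region": "east", "bought": "premium_support"},
    {"age": 29, "region": "west", "bought": None},  # not labeled yet
]

LABEL = "bought"  # the one column the model should learn to predict


def split_features_and_labels(rows, label_key):
    """Return (features, labels) for labeled rows, plus the unlabeled rows."""
    features, labels, unlabeled = [], [], []
    for row in rows:
        # Everything except the label column is a feature.
        feature_values = {k: v for k, v in row.items() if k != label_key}
        if row.get(label_key) is None:
            unlabeled.append(feature_values)
        else:
            features.append(feature_values)
            labels.append(row[label_key])
    return features, labels, unlabeled


features, labels, unlabeled = split_features_and_labels(rows, LABEL)
print(len(features), "labeled rows;", len(unlabeled), "still need a label")
```

Only the first two rows can be used for training here. The third has features but no right answer attached yet.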

3. You can compile your machine learning algorithm for performance

Because many trained machine learning models need to operate in real time (100ms or less), performance is key. If you take even a few seconds to recommend a related product, you may miss your opportunity. SageMaker Neo optimizes (aka "compiles") trained machine learning models for use on cloud instances and edge devices (even IoT & mobile) to run up to 25x faster with no loss in accuracy. It currently supports models built with the following frameworks:

  • DarkNet
  • Keras
  • MXNet
  • PyTorch
  • TensorFlow
  • TensorFlow-Lite
  • ONNX
  • XGBoost
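For a sense of what a Neo compilation involves, here is a minimal sketch that builds the request for the SageMaker `CreateCompilationJob` API. The job name, role ARN, S3 locations, input shape and target device are assumptions chosen for illustration, not values from a real project.

```python
import json


def build_compilation_job_request(job_name, role_arn, model_s3_uri,
                                  output_s3_uri, framework, input_shape,
                                  target_device):
    """Assemble keyword arguments for the SageMaker CreateCompilationJob API."""
    return {
        "CompilationJobName": job_name,
        "RoleArn": role_arn,
        "InputConfig": {
            "S3Uri": model_s3_uri,                        # trained model artifact
            "DataInputConfig": json.dumps(input_shape),   # input shape as JSON text
            "Framework": framework,                       # e.g. "PYTORCH", "XGBOOST"
        },
        "OutputConfig": {
            "S3OutputLocation": output_s3_uri,
            "TargetDevice": target_device,                # hardware to optimize for
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 900},
    }


# Hypothetical example: an image model with a 1x3x224x224 input,
# compiled for an edge device.
request = build_compilation_job_request(
    job_name="demo-neo-compile",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    model_s3_uri="s3://example-bucket/model/model.tar.gz",
    output_s3_uri="s3://example-bucket/compiled/",
    framework="PYTORCH",
    input_shape={"input0": [1, 3, 224, 224]},
    target_device="jetson_nano",
)
```

With credentials in place, `boto3.client("sagemaker").create_compilation_job(**request)` would start the job. The key design point is that the model is compiled for one specific target, so each device family you deploy to needs its own compilation job.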

4. SageMaker Canvas is a "point-and-click" version of Amazon SageMaker

Announced in December 2021, SageMaker Canvas is a "visual point-and-click" version of Amazon SageMaker designed to allow Business Analysts to harness the power of machine learning. Because a lot of machine learning is actually data munging and visual analysis of data, it's possible that this will catch on, but we're still at the very early stages. In my experience, a lot of the machine learning space is still very much academic, requiring an understanding of some pretty nuanced vernacular. For example, precision, accuracy and recall each have a very specific meaning in the ML space, but that nuance is lost on people who haven't had to study statistics or who don't have experience navigating the AWS ecosystem. Here, for example, is the registration screen:

[Image: SageMaker Canvas registration screen]
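To show why precision, accuracy and recall are not interchangeable, here is a short worked example using the standard definitions. The counts are invented: imagine 1,000 customers, 20 of whom actually bought the upsell, and a model that flagged 10 customers as likely buyers.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision and recall from confusion-matrix counts.

    tp: predicted yes, actually yes    fp: predicted yes, actually no
    fn: predicted no, actually yes     tn: predicted no, actually no
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all predictions that were right
        "precision": tp / (tp + fp),     # share of "yes" predictions that were right
        "recall": tp / (tp + fn),        # share of actual "yes" cases that were found
    }


# Made-up counts: 8 buyers correctly flagged, 2 wrongly flagged,
# 12 buyers missed, 978 non-buyers correctly ignored.
metrics = classification_metrics(tp=8, fp=2, fn=12, tn=978)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this made-up case the model is 98.6% accurate and 80% precise, yet its recall is only 40%: it misses most of the actual buyers. That gap is exactly the kind of nuance that is easy to lose in a point-and-click tool.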

5. SageMaker is a low-level tool, and there are more powerful tools in the AWS ML ecosystem

Amazon SageMaker is a powerful, super sophisticated tool and a great place for people to start if they are interested in getting their machine learning certification, but it is a low-level, highly customizable service. Depending on your specific needs, it's likely that one of the higher-level deep learning/machine learning tools in the Amazon ecosystem will be a better fit. While this ecosystem is always growing, here are some of the more interesting machine learning services available:

  • Amazon Rekognition
  • Amazon Lex
  • Amazon Polly
  • Amazon Comprehend
  • Amazon Translate
  • AWS DeepLens

Amazon Rekognition

Technically a "deep learning" service, Amazon Rekognition is a machine learning image and video processing service that was initially trained on Amazon Prime Photos. It offers a lot of out-of-the-box functionality including:

  • Content moderation
  • Face detection and analysis
  • Face compare and search
  • Celebrity recognition
  • Text detection
  • Object detection
  • Custom labeling
  • Video segment detection
  • Personal Protective Equipment (PPE) Detection

All of this is accessible via an API in which the video is encrypted in transit and at rest, making it an amazing bolt-on service for expanding metadata information for visual libraries of content. The algorithm can also be trained for better performance against particular datasets (lesser known celebrities, etc.). If you want to learn more I've written about the top 5 things to know about Amazon Rekognition.
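As a sketch of what calling that API looks like, the function below wraps Rekognition's `DetectLabels` operation for an image stored in S3. The function name and defaults are my own, and the client is passed in as a parameter (in practice it would be `boto3.client("rekognition")`).

```python
def label_image(rekognition_client, bucket, key, min_confidence=80.0):
    """Return (label name, confidence) pairs for an image stored in S3.

    rekognition_client is expected to behave like boto3.client("rekognition").
    """
    response = rekognition_client.detect_labels(
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        MaxLabels=10,                   # cap the number of labels returned
        MinConfidence=min_confidence,   # drop labels the service is unsure about
    )
    return [
        (label["Name"], round(label["Confidence"], 1))
        for label in response["Labels"]
    ]
```

A call such as `label_image(boto3.client("rekognition"), "example-bucket", "photos/dog.jpg")` (bucket and key are hypothetical) would return a list of label/confidence pairs that can be stored as metadata alongside the image.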

Amazon Lex

Built on the same technology that powers Amazon's home speaker, Alexa, Lex is an API-based service that provides Automatic Speech Recognition (ASR), which identifies the words being used, along with Natural Language Understanding (NLU), which identifies intent, to enable custom voice or text-based chatbots. Lex is extensible, allowing the use of custom vocabularies.

Amazon Polly

Amazon Polly reverses the flow of speech-to-text, converting text back into spoken word. Out of the box there are several different voices of different genders, and there is support for more than two dozen languages. Using Lex together with Polly, it is possible to build a completely voice-powered application.

Amazon Comprehend

Amazon Comprehend is a Natural Language Processing (NLP) service designed to consume massive amounts of written text and output a number of insights, including:

  • Discovering relationships in text
  • Identifying the language of a text (is this Spanish or English?)
  • Extracting key phrases, places, people, brands or events
  • Understanding whether sentiment is positive or negative (was this a positive review?)
  • Automatically organizing a collection of text files by topic
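Two of the capabilities above, language identification and sentiment, can be chained in a few lines. This is a minimal sketch: the function name is my own and the client is passed in (in practice `boto3.client("comprehend")`). It uses the `DetectDominantLanguage` and `DetectSentiment` operations.

```python
def review_sentiment(comprehend_client, text):
    """Detect the language of a review, then its overall sentiment.

    comprehend_client is expected to behave like boto3.client("comprehend").
    """
    languages = comprehend_client.detect_dominant_language(Text=text)["Languages"]
    # Keep the candidate language with the highest confidence score.
    top = max(languages, key=lambda lang: lang["Score"])

    # Sentiment analysis only supports a subset of languages, so a real
    # application should handle the case where the detected one is not covered.
    sentiment = comprehend_client.detect_sentiment(
        Text=text, LanguageCode=top["LanguageCode"]
    )
    return top["LanguageCode"], sentiment["Sentiment"]
```

For a product review, this returns a pair such as a language code and one of the sentiment values the service reports (positive, negative, neutral or mixed), which answers both "is this Spanish or English?" and "was this a positive review?" in one pass.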

Amazon Translate

Amazon Translate is a neural machine translation service that understands over a dozen languages and enables:

  • Fluent translation of text
  • Localization for international users
  • Efficient translation of large volumes of text
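Because Translate returns plain text, its output can be handed straight to Polly to get speech in the target language. Here is a minimal sketch of that combination. The function name is my own, and both clients are passed in (in practice `boto3.client("translate")` and `boto3.client("polly")`).

```python
def speak_translation(translate_client, polly_client, text,
                      source_lang, target_lang, voice_id):
    """Translate text, then synthesize the translation as MP3 audio.

    The clients are expected to behave like boto3.client("translate")
    and boto3.client("polly").
    """
    translated = translate_client.translate_text(
        Text=text,
        SourceLanguageCode=source_lang,
        TargetLanguageCode=target_lang,
    )["TranslatedText"]

    speech = polly_client.synthesize_speech(
        Text=translated,
        OutputFormat="mp3",
        VoiceId=voice_id,   # should be a voice that speaks the target language
    )
    # AudioStream is a streaming body; read() returns the raw audio bytes.
    return translated, speech["AudioStream"].read()
```

For example, translating from `"en"` to `"es"` with a Spanish voice such as `"Lucia"` would return the Spanish text along with MP3 bytes that can be written to a file or streamed to a client.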

AWS DeepLens

AWS DeepLens is a hardware device with a custom on-the-edge inference engine that is capable of calculating its first inference just 10 minutes after unboxing.

[Image: DeepLens technical specifications]

Is the AWS Machine Learning Certification worth it?

For those of you considering getting a Machine Learning certification on one of the big cloud providers (AWS, Google, Azure), this is an overview of some of the ready-made AWS deep learning services, designed to showcase what the AWS machine learning ecosystem is capable of. If you are wondering if the AWS Machine Learning Certification is worth it, I answer that question by looking at Google search trend volume, Gartner's Magic Quadrant, and my own professional experience.

 

Date posted: January 18, 2022



About the Author

Joaquin Lippincott, CEO

Joaquin is a 20+ year technology veteran helping to lead businesses in the move to the Cloud. He frequently speaks on panels about the future of tech ranging from IoT and Machine Learning to the latest innovation in the entertainment industry.  He has helped to modernize software for industry leaders like Sony, Daimler, Intel, the Golden Globes, Siemens Wind Power, ABC, NBC, DC Comics, Warner Brothers & the Linux Foundation.

As the CEO and Founder of Metal Toad, an AWS Advanced Consulting Partner, his primary job is to "get the right people in the room".  This one responsibility is cross-functional and includes both external business development functions as well as internal delegation and leadership development.

A UCLA alumnus, he also serves in the community as a Board Member for the Los Angeles Area Chamber of Commerce, the Beverly Hills Chamber of Commerce, and Stand for Children Oregon - a public education political advocacy group. As an outspoken advocate for entry-level job creation in tech, he helped found the non-profit P4TH, an organization dedicated to increasing the number of entry-level jobs in the tech industry, and is in the process of organizing an Advisory Board for the Bixel Exchange, a Los Angeles non-profit that provides almost 200 tech internships every year.

 
