What is Deep Learning?
Deep learning is a category of artificial intelligence (AI) used to find patterns within various types of complex datasets, which can include textual, audio, and visual data. It is often conducted using large cloud-based datasets. All of the major cloud providers offer branded deep learning products, and several open-source libraries are publicly available to support it.
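Although it's a little in the weeds, it helps to be precise about what counts as "deep learning" versus machine learning or artificial intelligence. Deep learning is a subcategory of machine learning, which is itself a subcategory of artificial intelligence. The core test is whether the framework builds artificial neural networks with multiple layers, which are usually trained on GPUs (Graphics Processing Units), often networked together. Seen as a hierarchy: artificial intelligence contains machine learning, which in turn contains deep learning.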
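To make the idea of a layered neural network more concrete, here is a minimal sketch in plain Python. It is not taken from any of the frameworks discussed in this article. The layer sizes, the hand-picked weights, and the helper names (`dense`, `relu`, `forward`) are all assumptions made for illustration. A real framework would learn the weights from data and run the same kind of arithmetic on GPUs.

```python
import math


def relu(values):
    """Replace negative activations with zero."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum of all inputs plus a bias per unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def sigmoid(value):
    """Squash a number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


# Hypothetical, hand-picked weights. A real network learns these during training.
HIDDEN_1 = ([[0.5, -0.2], [0.1, 0.8], [-0.3, 0.4]], [0.0, 0.1, 0.0])  # 2 inputs -> 3 units
HIDDEN_2 = ([[0.2, 0.7, -0.5], [0.6, -0.1, 0.3]], [0.05, 0.0])        # 3 units -> 2 units
OUTPUT = ([[1.0, -1.0]], [0.0])                                       # 2 units -> 1 score


def forward(inputs):
    """Pass the inputs through two hidden layers and one output layer."""
    h1 = relu(dense(inputs, *HIDDEN_1))
    h2 = relu(dense(h1, *HIDDEN_2))
    (logit,) = dense(h2, *OUTPUT)
    return sigmoid(logit)


score = forward([1.0, 2.0])
print(f"network output: {score:.4f}")
```

The "deep" in deep learning refers to stacking layers like `HIDDEN_1` and `HIDDEN_2`, so that each layer works on the output of the one before it. Production networks use the same structure with far more layers and units.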
Deep Learning Frameworks
There are two major categories of deep learning tools:
- Open-source deep learning projects
- Cloud-branded deep learning tools
Open-source deep learning projects
Open-source deep learning frameworks include the following:
- MXNet - released and managed by the Apache Software Foundation
- Caffe - released and managed by the Berkeley Artificial Intelligence Research (BAIR) lab at UC Berkeley
- PyTorch - the merger of the Caffe2 & Torch projects, now underpinned by Facebook's AI Research (FAIR) Lab
- Keras - a broadly supported open-source Python project
- TensorFlow - a project managed by the Google Brain Team
- Theano - a Python library developed by the Montreal Institute for Learning Algorithms (MILA) and distributed through the Python Package Index (PyPI)
Additionally, there are a few now-deprecated platforms worth calling out:
- Microsoft Cognitive Toolkit (deprecated) - final release 4/18/2019
- Caffe2 (deprecated) - now part of PyTorch
- Torch (deprecated) - now part of PyTorch
While open-source deep learning projects are generally the "roll your own" approach, they do still operate within cloud platforms. For example, AWS prominently offers support for MXNet, TensorFlow, and PyTorch.
Cloud deep learning tools
In addition to supporting most (if not all) of the open-source deep learning projects, the major cloud providers (AWS, Google, Microsoft) all have their own deep learning tools. The difference between these and the open-source projects is that they are packaged as managed services and billed based on utilization. As an AWS Consulting Partner, we've written extensively on AWS deep learning.
Advantages of cloud-based deep learning
The advantages of working within a cloud-based deep learning ecosystem (like AWS or one of the other major providers) are:
- Ready to run - you don't have to spend time deciding where to deploy your models or store your data, getting those pieces connected, or setting up your local machine as a development hub. It's already ready to run.
- Connected to the internet - if you are working on a local machine and you want to share your data or conclusions with the world, you have to deploy it somehow. If you are running in the cloud it can be as simple as changing permissions.
- Collaborator friendly - since your platform is cloud-native, it's easy to collaborate with people.
- Guard rails - when attempting something for the first time, there are numerous ways to mess it up. With a pre-built service, the number of ways to get it wrong is smaller. That's not to say it's easy, but it's certainly easier than a "roll your own" framework.
Getting started in deep learning
When it comes to deep learning, it may be tempting to jump in and experiment at a low level, but any commercial venture attempting to create a new algorithm is likely doomed to failure. As seen above, all the major players (Amazon, Google, Facebook, Microsoft) are deeply involved in developing the underlying platforms and then using those platforms to underpin their commercial products. They continue to improve their algorithms and expand their feature sets every month. So unless you have the resources to compete with some of the largest, most innovative technology companies in the world: implement, don't create.
If you know all this and still want to build your own machine learning algorithm, or maybe you just want to understand the tech, I recommend building on top of AWS SageMaker or perhaps even the "point-and-click" version, SageMaker Canvas. It's easy to create an AWS account, and many services have a free tier. But be careful! It can be much easier than you might think to "leave the faucet on" and end up accidentally paying hundreds of dollars per month for something intended as a hobby project.
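To see how quickly an always-on resource adds up, here is a rough back-of-the-envelope sketch. The hourly rate below is a made-up placeholder, not an actual AWS price, and `monthly_cost` is a hypothetical helper written for this example. Check the current pricing page for the service and instance type you actually use.

```python
HOURS_PER_MONTH = 24 * 30  # rough 30-day month, good enough for an estimate


def monthly_cost(hourly_rate_usd, hours_running=HOURS_PER_MONTH, instance_count=1):
    """Estimate a monthly bill from an hourly rate, hours of uptime, and instance count."""
    return round(hourly_rate_usd * hours_running * instance_count, 2)


# Hypothetical rate of $0.25 per hour (an assumption, not a quoted price).
always_on = monthly_cost(0.25)                            # left running all month
evenings_only = monthly_cost(0.25, hours_running=2 * 30)  # two hours a day, then shut down

print(f"always on:     ${always_on:.2f} per month")
print(f"evenings only: ${evenings_only:.2f} per month")
```

Under these assumed numbers, a forgotten instance costs twelve times as much as one that is shut down after each session. It is worth stopping or deleting resources when you finish, and setting a billing alert as a backstop.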