Getting Started With Azure Machine Learning Studio
Cloudreach Cloud Architect Dwayne Monroe provides a brief introduction to Azure Machine Learning Studio and walks through an example project to get readers started.
This post is about Azure Machine Learning Studio, an Azure offering which makes it possible for non-specialists to benefit from the capabilities of machine learning and bring those benefits (such as fraud detection) to organizations of any size. Azure Machine Learning Studio provides a visual interface that gives you the ability to create, test and deploy statistical models without writing code (for example, Python).
Why Azure Machine Learning Studio?
Until very recently, data scientists and other experts writing complex code were essential to creating a solution using predictive analytics. Additionally, organizations had to invest in powerful and very expensive hardware to support such solutions. In the cloud era, the hardware obstacle has been removed. And now, with Azure Machine Learning Studio, the coding barrier to entry has been lowered.
Experts are still vital to building analytics solutions for the most challenging and large-scale situations (and Azure Machine Learning Service provides a platform to meet that need).
For many other situations, Azure Machine Learning Studio is exactly what you need.
In this post, we will review Azure Machine Learning Studio’s abilities at a high level and provide an example to help you get started.
Although Azure Machine Learning Studio doesn’t require coding ability or expert-level knowledge of Machine Learning, you should be familiar with common terms such as:
- Analysis models
- Training experiments
- Predictive experiments
Here’s a visual overview of the workflow (image from Microsoft):
To learn more about these concepts in depth, check out Microsoft’s AI courses.
Let’s get started…
Log in to Azure Machine Learning Studio at studio.azureml.net with your Azure admin account (to get a trial account, visit Microsoft’s Azure Free Account page).
Notice the menu selector in the upper-left-hand corner of the page? Select Studio.
After choosing Studio, you will see the main page. On this page, you will notice a tutorial and the template gallery which hosts many useful examples to help you get started.
When you minimize the template interface (by clicking the x on the right-hand side of the interface), take a look at the left-hand side of the main page.
PROJECTS – Experiments, datasets, notebooks and other resources that constitute an individual project
EXPERIMENTS – The experiments you’ve created
WEB SERVICES – Web services deployed from your experiments
NOTEBOOKS – Jupyter notebooks you’ve made (a Jupyter notebook is a document, created and shared through an open source web application, that stores live code)
DATASETS – Data – such as CSV files – uploaded to Studio
TRAINED MODELS – Models trained via experiments and saved in Studio
SETTINGS – Used to configure your account and resources
Notice the “NEW” option at the bottom left-hand side of the interface. Click it once to show the Experiment Tutorial.
Click the Experiment Tutorial tile to start a walkthrough.
Click “Get Started” to begin the tutorial. The experiment itself can be built in five steps, although the full process, including the deployment of a web service, takes more than five.
To show you the steps required to create a new experiment and describe the techniques used, the tutorial moves the cursor to “NEW” then “Experiment”. The next step (Step 1) in the process is choosing a dataset.
Using an income projection dataset for sample purposes, the tutorial moves to the next stage (Step 2) of the process: splitting the dataset.
This step randomly splits the dataset into two sets: training and test. You can read more about why data splitting is a part of the process at this Microsoft article.
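As a rough illustration of what the split step does, here is a minimal Python sketch of a random training/test split. The 70/30 fraction, the fixed seed and the `split_rows` helper are assumptions made for this example; the tutorial does not state which fraction it uses, and Studio performs the split through its visual module rather than through code like this.

```python
import random


def split_rows(rows, train_fraction=0.7, seed=42):
    """Randomly split rows into (training, test) lists.

    train_fraction and seed are illustrative defaults, not Studio settings.
    """
    shuffled = list(rows)                    # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)    # seeded shuffle makes the split repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Stand-in data: 100 row identifiers instead of real census records.
rows = list(range(100))
train_rows, test_rows = split_rows(rows)
print(len(train_rows), len(test_rows))
```

Every row lands in exactly one of the two sets, so the model is later judged on data it never saw during training.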
In Step 3, you apply a machine learning algorithm to the dataset, beginning the training process. The tutorial engine pre-selects an algorithm called “Two-Class Boosted Decision Tree”. A boosted decision tree is a learning method in which the second tree corrects errors made by the first tree, the third tree corrects errors made by the second, and so on. The advantage is a self-correcting method that steadily improves the model’s predictions. There’s more information about boosted decision trees at this Microsoft article.
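To make the "each tree corrects the previous one" idea concrete, here is a simplified Python sketch of boosting. It is not Studio's implementation: it uses one-split trees (stumps) on a single feature, fits each new stump to the errors left by the ones before it, and uses a made-up toy dataset. The function names, the learning rate and the data are all assumptions for illustration.

```python
LEARNING_RATE = 0.5  # illustrative value; damps each stump's correction


def fit_stump(xs, residuals):
    """Find the single threshold split that best fits the residuals."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue  # a split must put rows on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def predict_boosted(stumps, x):
    """Sum the damped corrections of every stump."""
    return sum(LEARNING_RATE * predict_stump(s, x) for s in stumps)


def fit_boosted(xs, ys, rounds):
    """Each round fits a stump to the errors left by the previous rounds."""
    stumps = []
    for _ in range(rounds):
        residuals = [y - predict_boosted(stumps, x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return stumps


def training_error(stumps, xs, ys):
    """Mean squared error of the ensemble on the training rows."""
    return sum((y - predict_boosted(stumps, x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up toy data: hours worked per week, and 1 if income is above 50K.
hours = [20, 25, 30, 35, 40, 45, 50, 55, 60]
high_income = [0, 0, 0, 0, 1, 0, 1, 1, 1]

error_after_1 = training_error(fit_boosted(hours, high_income, 1), hours, high_income)
error_after_20 = training_error(fit_boosted(hours, high_income, 20), hours, high_income)
print(error_after_1, error_after_20)
```

The training error after twenty rounds is lower than after one, because every added stump targets what the earlier ones got wrong.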
In Step 4, we use our trained model to create predictions based on the test dataset:
Now that the model has been trained, scored and evaluated (all performed during the prediction step), we can run our experiment (Step 5), i.e. see whether the predictive model performs as expected.
Azure Machine Learning Studio confirms that the model performs as expected.
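For readers curious about what scoring and evaluating involve, the following Python sketch turns predicted probabilities into labels and compares them with the known answers. The 0.5 threshold, the helper names and the five sample rows are assumptions for illustration only; Studio produces its own evaluation report through the visual interface.

```python
def score(probabilities, threshold=0.5):
    """Turn each predicted probability into a scored label."""
    return [">50K" if p >= threshold else "<=50K" for p in probabilities]


def evaluate(actual, scored, positive=">50K"):
    """Compare scored labels with the true labels from the test set."""
    pairs = list(zip(actual, scored))
    true_pos = sum(a == positive and s == positive for a, s in pairs)
    false_pos = sum(a != positive and s == positive for a, s in pairs)
    false_neg = sum(a == positive and s != positive for a, s in pairs)
    correct = sum(a == s for a, s in pairs)
    return {
        "accuracy": correct / len(pairs),
        "precision": true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0,
        "recall": true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0,
    }


# Made-up test rows: true income labels and the model's predicted probabilities.
actual_labels = [">50K", "<=50K", "<=50K", ">50K", "<=50K"]
predicted_probabilities = [0.9, 0.2, 0.6, 0.4, 0.1]

scored_labels = score(predicted_probabilities)
metrics = evaluate(actual_labels, scored_labels)
print(scored_labels)
print(metrics)
```

In this toy case three of the five rows are labeled correctly, so accuracy is 0.6, while precision and recall are both 0.5.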
The next stage of the process is creating a web service which makes the model available to applications.
We constrain the input schema to help the system infer what’s important (in this case, the “Income” column is excluded from consideration).
Then we configure the web service to ‘Infer the web service output schema’ by including only scored labels and scored probabilities (i.e., labels and probabilities scored during the experiment run in an earlier stage).
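The effect of these two schema choices can be pictured with a short Python sketch: the input drops the “Income” column, and the output keeps only the scored label and scored probability. The column names, the sample row and both helper functions are assumptions for this example; in Studio the same result is achieved by configuring modules, not by writing code.

```python
INPUT_COLUMNS_TO_DROP = {"income"}  # the value the service is meant to predict
OUTPUT_COLUMNS_TO_KEEP = ["Scored Labels", "Scored Probabilities"]


def to_service_input(row):
    """Remove the column callers cannot know in advance."""
    return {name: value for name, value in row.items()
            if name not in INPUT_COLUMNS_TO_DROP}


def to_service_output(scored_row):
    """Return only the prediction, not the echoed input columns."""
    return {name: scored_row[name] for name in OUTPUT_COLUMNS_TO_KEEP}


# Made-up row with an illustrative subset of columns.
census_row = {"age": 39, "education": "Bachelors", "hours-per-week": 40, "income": "<=50K"}
scored_row = {"age": 39, "education": "Bachelors", "hours-per-week": 40,
              "Scored Labels": "<=50K", "Scored Probabilities": 0.12}

service_input = to_service_input(census_row)
service_output = to_service_output(scored_row)
print(service_input)
print(service_output)
```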
With all parameters configured (dataset loaded, ML algorithm applied, experiments run, web service created, and input and output schemas set), the model is complete.
Now the web service API is ready to be called from mobile and desktop applications.
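As a hedged sketch of how an application might call such a web service, the Python below builds (but does not send) an HTTP request. The URL, API key, input name `input1` and column names are placeholders, and the JSON layout is an assumption modeled on the request-response format Studio shows on a web service's API help page; check that page for the exact schema of your own service.

```python
import json
import urllib.request


def build_scoring_request(url, api_key, column_names, rows):
    """Build a POST request carrying rows to score (assumed JSON layout)."""
    body = {
        "Inputs": {
            "input1": {"ColumnNames": column_names, "Values": rows},
        },
        "GlobalParameters": {},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


# Placeholder values: substitute the URL and key shown for your own service.
request = build_scoring_request(
    url="https://example.invalid/your-service/execute",
    api_key="YOUR_API_KEY",
    column_names=["age", "education", "hours-per-week"],
    rows=[["39", "Bachelors", "40"]],
)
print(request.get_method(), request.full_url)

# To actually send it (requires a real URL and key):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Note that the “Income” column is absent from the request, matching the input schema configured earlier.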
Let’s review what we’ve done. First the training experiment:
- The dataset is the Adult Census Income sample
- An ML algorithm is applied – in this case, the Two-Class Boosted Decision Tree
- The dataset is split into training and test sets
- The model is trained
- The model is scored to determine accuracy
- The model is evaluated
Next, let’s look at the predictive experiment:
- The web service input schema is configured (or, ‘inferred’)
- Relevant columns are selected from the dataset
- The model is scored
- The web service output schema is configured
With the training and predictive experiments built and run, and the web service API configured, a complete ML solution is ready for use.
Do you have any questions? Leave them in the comments below.
Find out how another Cloudreacher has been getting to know Microsoft Azure Machine Learning tools via a recent OpenHack event.