A Step-by-Step Guide to Azure Machine Learning

27 Aug 2024
Advanced

Azure Machine Learning

Azure Machine Learning is a complete cloud-based platform that speeds up the development, training, and deployment of machine learning models. To accommodate different project needs and skill levels, Azure Machine Learning offers a range of services and technologies.

In this Azure tutorial, we will learn what Azure Machine Learning is, along with its key features, its advantages, and how to build models with it. If you wish to become a Certified Azure Developer, consider our Azure Developer Certification Training.

What is Azure Machine Learning?

  • Microsoft offers a cloud service called Azure Machine Learning.
  • It facilitates the entire machine-learning lifecycle.
  • It makes building, training, and deploying machine learning models at scale easier.
  • Azure ML provides a multitude of tools, such as deployment services, data preprocessing, model training, and automated machine learning (AutoML).
  • It is designed to make creating machine learning models faster and more accessible to both new and seasoned data scientists.

Key features of Azure Machine Learning

Azure's machine learning service includes some helpful features that simplify things for a larger group of people. These features would come in quite handy, particularly if you are not accustomed to manually configuring the machine learning workflow and environment.
Azure Machine Learning has several key features that improve its functionality and usability:

  1. On-Demand Scalable Compute: Compute resources can be customized and scaled on demand according to the workload.
  2. Data Ingestion Engine: The data ingestion engine accepts a wide range of data sources.
  3. Machine Learning Workflow Orchestration: Azure makes machine learning workflow orchestration very easy.
  4. Machine Learning Model Management: Azure Machine Learning offers specialized features for managing machine learning models, which are useful if you prefer to test several models before deciding on the best one.
  5. Metrics and Monitoring: The metrics and services used for model training are easily accessible on the platform.
  6. Model Deployment: You can deploy your model directly from Azure ML.

Types of services in Azure Machine Learning

1. Azure ML Studio

  • Azure ML Studio is a visual workspace for creating, honing, and deploying models without requiring a lot of coding.

2. Azure ML Designer

  • A drag-and-drop interface for modeling experimentation and machine learning pipeline creation.

3. Azure ML Notebooks

  • Jupyter notebooks offer an interactive setting for experimentation, model building, and data discovery.

4. Azure ML Compute

  • Scalable compute resources, including distributed computing and GPU support, for model training and deployment.

Example

Here's a basic example of how you can use Python in Azure ML to create and run a simple AutoML experiment:

from azureml.core import Workspace, Dataset, Experiment
from azureml.train.automl import AutoMLConfig

# Connect to the Azure ML workspace
ws = Workspace.from_config()

# Load the registered tabular dataset
dataset = Dataset.get_by_name(ws, name='my_dataset')

# Define AutoML configuration; the label column is assumed to be named 'target'
automl_config = AutoMLConfig(
    task='classification',
    primary_metric='AUC_weighted',
    iteration_timeout_minutes=5,
    iterations=10,
    training_data=dataset,
    label_column_name='target'
)

# Run AutoML experiment
experiment = Experiment(ws, 'automl-experiment')
run = experiment.submit(automl_config)
run.wait_for_completion(show_output=True)

# Get the best run and its fitted model
best_run, best_model = run.get_output()
print(f'Best model: {best_model}')

This script connects to your Azure ML workspace, loads a dataset, defines an AutoML configuration, runs the experiment, and retrieves the best model.

A Step-by-Step Guide To Building Machine Learning Models in Azure

There are three methods to build machine learning models in Azure:
  1. Expert mode
  2. Azure ML Studio (based on automated machine learning)
  3. Designer mode

Method 1: Custom settings and building a model of our choice (Expert Mode)

Step 1: Create Azure Workspace

A workspace is a central location where the resources needed for model training are managed. Workspaces are useful for organizing resources by project, organizational unit, testing or production environment, and so on. In Azure, workspaces are defined inside resource groups. A workspace's resources include pipelines, notebooks, experiments, models, training data, and compute targets.

The following resources are also created when we build an Azure workspace:

  • A storage account for the data used to train models
  • Application Insights to monitor prediction services
  • Azure Key Vault for credential management

Users must authenticate with Azure Active Directory to access these resources in the Azure workspace. The steps to create an Azure workspace are listed below:

1. By using the Azure Portal, create a machine learning resource.

2. After the resource instance has been built, build the workspace and enter the necessary data, such as the workspace name, region, storage account, and application insights.
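
The portal steps above can also be scripted. The following is a minimal sketch that assumes the v1 SDK (the azureml-core package used elsewhere in this guide); the function name create_workspace and all values in settings are hypothetical placeholders, not part of the original walkthrough.

```python
def create_workspace(name, subscription_id, resource_group, location):
    """Create an Azure ML workspace with the v1 SDK (azureml-core).

    The import sits inside the function so this file can be loaded without
    the SDK installed; calling the function needs azureml-core and valid
    Azure credentials.
    """
    from azureml.core import Workspace

    ws = Workspace.create(
        name=name,
        subscription_id=subscription_id,
        resource_group=resource_group,
        location=location,
        create_resource_group=True,  # create the resource group if it is missing
    )
    # Save a config.json so later notebooks can call Workspace.from_config()
    ws.write_config()
    return ws


# Hypothetical placeholder values; replace them with your own before running.
settings = {
    "name": "<your-workspace-name>",
    "subscription_id": "<your-subscription-id>",
    "resource_group": "<your-resource-group>",
    "location": "<your-region>",
}

# Uncomment once the placeholders are filled in:
# ws = create_workspace(**settings)
```

The call is left commented out because it creates billable Azure resources.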

Step 2: Create a Compute Instance

Python code can be written and executed online using compute instances, which come pre-installed with a development environment. A workspace can contain more than one compute instance. We can choose a compute instance with the required CPU, GPU, RAM, and Storage based on our needs.

The steps to build a compute instance are listed below.

1. Choose a new compute instance from the navigation on the left.

2. Give the compute instance a name. Choose the required virtual machine size and start the compute instance.

Step 3: Create DataSet

1. To create a new Dataset, click Datasets. I used the "From local files" option to upload the data.

2. Upload the file and provide the name of the DataSet.

Step 4: Create an Azure Notebook and Connect to Workspace

1. Click on Create New Notebook in Machine Learning Studio.

2. Create a new Notebook

3. Import the Python package called azureml-core, which allows us to connect and create code that uses the workspace's resources.

from azureml.core import Workspace
ws = Workspace.from_config()
print(ws.name, ws.location, ws.resource_group, ws.subscription_id)
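
Workspace.from_config() identifies the workspace from a config.json file, which it looks for in the working directory or a .azureml subfolder. On a compute instance this file is normally already present. As a rough sketch of what the file holds, the snippet below writes one with placeholder values; the helper name write_workspace_config is hypothetical, and the demo writes into a temporary folder rather than your project.

```python
import json
import tempfile
from pathlib import Path


def write_workspace_config(folder, subscription_id, resource_group, workspace_name):
    """Write <folder>/.azureml/config.json in the layout from_config() reads."""
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    config_dir = Path(folder) / ".azureml"
    config_dir.mkdir(parents=True, exist_ok=True)
    config_path = config_dir / "config.json"
    config_path.write_text(json.dumps(config, indent=2))
    return config_path


# Placeholder values (assumptions): replace with your own Azure details.
config_path = write_workspace_config(
    tempfile.mkdtemp(),
    subscription_id="<your-subscription-id>",
    resource_group="<your-resource-group>",
    workspace_name="<your-workspace-name>",
)
print(config_path.read_text())
```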

For batch scoring or real-time inference, deploy trained models to various settings, such as Azure Kubernetes Service (AKS) or Azure Container Instances (ACI).

Step 5: Create a Training Script to train the model

In this phase, we write a script that trains the model and save it as a .py file in our folder. In a notebook, starting a cell with the following magic command writes the rest of the cell to that file:

%%writefile $training_folder/employees_training.py

1. To store all of the Python scripts, let's make a folder. The training scripts will be stored in a folder called employees-training.

import os

training_folder = 'employees-training'
os.makedirs(training_folder, exist_ok=True)

2. Locate the path of the CSV file that we uploaded as the Dataset. Data files are accessed through datastores, which hold the connection details for Azure storage services. The datastore name in my instance is "workspaceblobstorage". To see every registered data source, navigate to Home > Datasets > Registered Datasets.

datastore_name = 'workspaceblobstorage'
datastore_paths = [(datastore_name, 'employess.csv')]  

3. Obtain the context for the run. A run is one trial of an experiment, and a single experiment can contain several runs. The Run object lets us monitor the trial, log metrics, store the trial's output, and examine the results.

from azureml.core import Run

run = Run.get_context()

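4. Load the dataset into a Pandas DataFrame.

from azureml.core import Dataset, Datastore

# from_delimited_files expects (Datastore object, path) tuples, so resolve the datastore name first
paths = [(Datastore.get(run.experiment.workspace, name), path) for name, path in datastore_paths]
employees = Dataset.Tabular.from_delimited_files(path=paths).to_pandas_dataframe()

5. Create the datasets X and y. X holds the feature variables and y holds the output variable.

X, y = employees[['Manoj','Asmin','sourav','aman','shailesh','kumari','priyanka','manav']].values, employees['Employee'].values
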
6. To address the classification problem of determining whether an organization employs people or not, split the dataset into training and test sets and fit a logistic regression model. Here the data is split 70:30 between training and testing.

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=0)
# Set regularization hyperparameter
reg = 0.01
# Log the hyperparameter, then train a logistic regression model
run.log('Regularization Rate', float(reg))
model = LogisticRegression(C=1/reg, solver="liblinear").fit(X_train, y_train)

7. Next, we assess the model by calculating its accuracy and AUC.

import numpy as np
from sklearn.metrics import roc_auc_score

# calculate accuracy
y_pred = model.predict(X_test)
acc = np.average(y_pred == y_test)
print('Accuracy:', acc)
run.log('Accuracy', float(acc))
# calculate AUC from the predicted probability of the positive class
y_score = model.predict_proba(X_test)
auc = roc_auc_score(y_test, y_score[:, 1])
print('AUC: ' + str(auc))
run.log('AUC', float(auc))

8. After training, save the model. In this instance, the trained model is saved to the outputs folder.

import os
import joblib

# Save the trained model in the outputs folder
os.makedirs('outputs', exist_ok=True)
joblib.dump(value=model, filename='outputs/employees_model.pkl')
run.complete()

At this stage we have only created the training script and saved it in a folder. The script hasn't been run yet.
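
To make the two metrics from step 7 concrete, here is a small dependency-free sketch of what accuracy and AUC measure. It is an illustration only, not the scikit-learn implementation the script uses; the function names and the four-sample toy data are made up for the example.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, y_score):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (illustrative only)
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]                    # predicted probability of class 1
y_pred = [1 if s >= 0.5 else 0 for s in y_score]   # threshold at 0.5

print("Accuracy:", accuracy(y_true, y_pred))
print("AUC:", roc_auc(y_true, y_score))
```

Accuracy depends on the 0.5 threshold, while AUC depends only on how the scores rank positives against negatives, which is why the training script logs both.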

Step 6: Run the training Script as an Experiment

1. A set of trials representing several model runs is called an experiment. Each trial can use different code, data, and settings. The Run class represents each trial in an experiment, and the Experiment class represents the experiment as a whole. To conduct the experiment, we first build a Python environment.

from azureml.core import Environment, Experiment, ScriptRunConfig
from azureml.widgets import RunDetails

env = Environment.from_conda_specification("experiment_env", "environment.yml")

Here, environment.yml contains the environment specification.

2. After that, we create a ScriptRunConfig, which packages the information needed to submit a run, such as the script, compute target, and environment.

script_config = ScriptRunConfig(source_directory=training_folder,
                                script='employees_training.py',
                                environment=env)

3. After that, we pass the ScriptRunConfig to the experiment and submit the run.

experiment_name = 'train-employees'
experiment = Experiment(workspace=ws, name=experiment_name)
run = experiment.submit(config=script_config)

4. After that, we wait for the experiment to finish running.

RunDetails(run).show()
run.wait_for_completion()
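
The guide does not show the contents of environment.yml. The sketch below writes one plausible version; the package list and the Python version pin are assumptions based on what the training script imports (scikit-learn, pandas, numpy) plus azureml-defaults for the Azure ML runtime. The helper name write_environment_file is hypothetical, and the demo writes to a temporary folder.

```python
import tempfile
from pathlib import Path

# Assumed conda specification: adjust the packages and versions to your project.
ENVIRONMENT_YML = """\
name: experiment_env
dependencies:
  - python=3.8
  - scikit-learn
  - pandas
  - numpy
  - pip
  - pip:
      - azureml-defaults
"""


def write_environment_file(folder):
    """Write environment.yml into the given folder and return its path."""
    path = Path(folder) / "environment.yml"
    path.write_text(ENVIRONMENT_YML)
    return path


demo_path = write_environment_file(tempfile.mkdtemp())
print(demo_path.read_text())
```

In the notebook you would write the file next to the notebook itself, since Environment.from_conda_specification is given the relative path "environment.yml".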

Step 7: Retrieve the metrics and output of the run object and print it in Notebook.

To print metrics such as accuracy, AUC, and regularization rate, we can use the Run class's get_metrics method.

metrics = run.get_metrics()
for key in metrics.keys():
    print(key, metrics.get(key))

Step 8: Register the trained model.

Remember that the model was saved as a .pkl file in Step 5. In order to keep track of the model versions, we will now register the model in the workspace.

# Register the model
run.register_model(model_path='outputs/employees_model.pkl', model_name='employees_model',
                  tags={'Training context':'Script'},
                  properties={'AUC': run.get_metrics()['AUC'], 'Accuracy': run.get_metrics()['Accuracy']})

After the model has been trained and registered, the results of each run of the experiment are displayed in the left navigation.

When we click on the name of the experiment, we can see its different runs and the status of each.

We can view the related metrics, outputs, logs, etc. for every run. For this specific algorithm, the model's accuracy is 0.774 and the area under the curve is 0.848.
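
Once several runs have logged metrics, comparing them can be done on the dictionaries that get_metrics() returns. The following is a minimal sketch using plain dictionaries; the run names are made up, and the second run's values are hypothetical numbers added only so there is something to compare against.

```python
def best_run_by_metric(runs, metric, higher_is_better=True):
    """Return the name of the run with the best value for the given metric.

    `runs` maps a run name to a metrics dictionary, in the same shape that
    run.get_metrics() returns for a single run.
    """
    scored = {name: m[metric] for name, m in runs.items() if metric in m}
    if not scored:
        raise ValueError(f"no run logged the metric {metric!r}")
    pick = max if higher_is_better else min
    return pick(scored, key=scored.get)


runs = {
    "run_1": {"Accuracy": 0.774, "AUC": 0.848},  # values reported in this guide
    "run_2": {"Accuracy": 0.760, "AUC": 0.830},  # hypothetical comparison run
}

print(best_run_by_metric(runs, "AUC"))
```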

Method 2: Automated Machine Learning using Azure Machine Learning Studio

Step 1: Specify the labeled dataset to train on. I've made a fresh automated machine learning run using the same dataset.

Step 2: Set up the automated machine learning run by providing the experiment's name, target label, and compute target.
Step 3: Choose the task type (classification, regression, or time series) and the configuration settings, such as the feature settings.
Step 4: Examine the best model produced so far.
Step 5: After clicking the experiment, you can see that several kinds of algorithms were run multiple times to determine which model performs best.

Method 3: Training a Model using Azure Machine Learning Designer (Designer Mode)

Step 1: Build the training pipeline (process the data and normalize the features using drag and drop).
Step 2: Splitting the data.
For example, 30% of the data is held out for testing, while the remaining 70% is used for training.
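
For readers who want to see what a 70:30 split does to the rows, here is a plain-Python sketch. It is an illustration under simple assumptions (shuffle, then cut), not the designer's own implementation, and the function name is made up for the example.

```python
import random


def train_test_split_rows(rows, test_fraction=0.30, seed=0):
    """Shuffle a copy of the rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(10))            # stand-in for ten data rows
train, test = train_test_split_rows(rows)
print(len(train), len(test))      # 7 training rows and 3 test rows
```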
Step 3: Train the model
The model training process starts as soon as you select the training option in the pipeline.
Automated Generation of Explanations
The explanation generator is one of the ML designer's most useful features. It is turned off by default, which saves computing resources, so you need to turn it on explicitly in the Parameters setting.
Once the model has been trained with this option enabled, the designer generates explanations that give context and help us analyze the outcomes more effectively. They cover the model's performance and its aggregate feature importance:

1. An explainer for Model performance

2. Explainer for Aggregate Feature Importance

Step 4: Score the Model
Step 5: Evaluate the model
The model needs to be evaluated as the last step. We can use the following metrics to assess the model.

Lift curve, ROC curve, and precision-recall curve.

The confusion matrix helps us assess several metrics such as sensitivity and specificity and provides information about the ratios surrounding true positives and false positives.
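
As a small worked illustration of how sensitivity and specificity come out of the confusion matrix, the sketch below counts the four cells for binary labels and derives both ratios. The function names and the eight toy labels are made up for the example.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 is the positive class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative true label")
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels (illustrative only)
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]

print(confusion_counts(y_true, y_pred))
print(sensitivity_specificity(y_true, y_pred))
```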

Advantages of using Azure Machine Learning

Azure Machine Learning has a lot of benefits.

1. Simplified Development

  • By simplifying the process of developing models, Azure ML's automated tools and user-friendly interface free up developers to concentrate on experimentation and creativity.

2. Scalability

  • Machine learning models can be scaled to handle big datasets and sophisticated computations with efficiency thanks to Azure's cloud infrastructure.

3. Teamwork

  • Azure ML enables teamwork by providing integrated project management tools, shared workspaces, and version control.

4. Cost-effectiveness

  • Pay-as-you-go pricing ensures you pay only for the resources you actually use, so it's an affordable option for companies of all kinds.

5. Deployment Flexibility

  • Azure ML enables models to be deployed seamlessly to a range of targets, such as cloud-based apps, on-premises servers, and edge devices.

Conclusion

By using Azure Machine Learning's many tools and services, developers can expedite the whole machine learning lifecycle, from data preparation to model deployment and monitoring. This will enable machine learning solutions to be developed more quickly and efficiently. To gain a deeper comprehension of more Azure fundamentals, take a look at our Azure Certification course.

FAQs

These products are free up to the specified monthly amounts. Some are always free to all Azure customers, and some are free for 12 months to new customers only.

Azure Machine Learning Studio
It streamlines the process from data preparation to model deployment, offering a no-code or low-code experience that makes machine learning accessible to a broader range of users, from beginners to seasoned data scientists.

Azure ML provides a central model registry for the entire organization with a full lineage for models. 

About Author
Shailendra Chauhan (Microsoft MVP, Founder & CEO at ScholarHat)

Shailendra Chauhan, Founder and CEO of ScholarHat by DotNetTricks, is a renowned expert in System Design, Software Architecture, Azure Cloud, .NET, Angular, React, Node.js, Microservices, DevOps, and Cross-Platform Mobile App Development. His skill set extends into emerging fields like Data Science, Python, Azure AI/ML, and Generative AI, making him a well-rounded expert who bridges traditional development frameworks with cutting-edge advancements. Recognized as a Microsoft Most Valuable Professional (MVP) for an impressive 9 consecutive years (2016–2024), he has consistently demonstrated excellence in delivering impactful solutions and inspiring learners.

Shailendra’s unique, hands-on training programs and bestselling books have empowered thousands of professionals to excel in their careers and crack tough interviews. A visionary leader, he continues to revolutionize technology education with his innovative approach.