Microsoft Azure - DP100

azure
This note helps you prepare for the Azure Data Scientist Associate DP-100 exam. I took DP-100 in March 2021 and have included some important notes for study. In particular, syntax-type questions are very common. You need to work through the labs and make sure you understand and remember some of the syntax to pass this exam.
Author

noklam

Published

March 27, 2021

Last Updated: 2021-04-22

Warning - The Azure website states that the exam outline will be updated on May 20, 2021. Make sure you check what’s changed.

Introduction

There are 49 questions, and some questions carry more points than others. You need at least 700/1000 points to pass. In my exam, around 10-15% of the questions were quite hard, but as long as you score on the easy ones, you should be able to pass.

There are different types of questions.

  • Case Studies - Usually some scenarios are provided, and you cannot go back to previous questions once you have answered them. For other questions, you can review them anytime you want.
  • Matching - You need to pick a few choices and arrange them in order, e.g. how to create an environment that fulfills the requirements provided.
  • Multiple Choice
    • Typical MC - It may ask you to select among Azure Machine Learning Service/Azure Machine Learning Studio/Azure Databricks/Azure Kubernetes Service (AKS)
    • Syntax-type questions (there are quite a few of them)
    • Diagram (Azure ML Designer)

I struggled with the first few questions and was scratching my head for a while. Don’t panic if you just can’t remember the answer: take a guess and mark the question for review (you can do this in the UI). In the end, I scored 809/1000.

How to prepare for the exam

I spent roughly 15 hours preparing for this exam. Half of the time went to the labs, the other half to reading docs and past exam questions. Prior to this exam, I had a little cloud experience and I work as a Data Scientist, which gave me some edge. But you don’t need a lot of background knowledge; most of the data science concepts tested are very general and you can learn them from the labs.

Official Suggested Materials

  • ❗ https://docs.microsoft.com/en-us/learn/paths/build-ai-solutions-with-azure-ml-service/ - This should be your main focus; try to finish the labs and read the tutorials. You need to understand the use cases of the different products and familiarize yourself with the syntax of the Azure ML SDK etc.
  • https://docs.microsoft.com/en-us/learn/paths/create-no-code-predictive-models-azure-machine-learning/ - You should finish at least 1 of the labs to get some sense of the UI. It will be included in the exam for sure (2-3 questions maybe).
  • https://docs.microsoft.com/en-us/learn/paths/create-machine-learn-models/ - I didn’t spend much time on it as most of it covers basic data science concepts. You need to know how to apply different types of models (Regression/Classification/Time Series) & AutoML for a given scenario.

Key Concepts

I am pretty sure these concepts will come up in every exam set, so be prepared.

  • Workspace
  • DataStore/Blobstore
  • Compute Target
    • Cluster/VM/ACI link
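To make the Workspace concept a bit more concrete, here is a minimal sketch of how a workspace reference is usually obtained with the v1 Python SDK (`azureml-core`). `Workspace.from_config()` reads a `config.json` file holding the subscription ID, resource group and workspace name. The values below are placeholders and the helper names (`write_workspace_config`, `load_workspace`, `get_workspace_by_name`) are my own, not part of the SDK. The SDK imports are kept inside the functions, so they only run if `azureml-core` is installed and you are authenticated.

```python
import json
from pathlib import Path

# Placeholder values (hypothetical); replace them with your own Azure details.
CONFIG = {
    "subscription_id": "<your-subscription-id>",
    "resource_group": "<your-resource-group>",
    "workspace_name": "<your-workspace-name>",
}


def write_workspace_config(directory="."):
    """Write a config.json with the three keys Workspace.from_config() expects."""
    path = Path(directory) / "config.json"
    path.write_text(json.dumps(CONFIG, indent=2))
    return path


def load_workspace(config_path="config.json"):
    """Get a reference to an existing workspace from a config file."""
    from azureml.core import Workspace  # requires azureml-core

    return Workspace.from_config(path=config_path)


def get_workspace_by_name():
    """Get a reference to an existing workspace by passing the details directly."""
    from azureml.core import Workspace  # requires azureml-core

    return Workspace.get(
        name=CONFIG["workspace_name"],
        subscription_id=CONFIG["subscription_id"],
        resource_group=CONFIG["resource_group"],
    )
```

The point to remember for the exam is which three pieces of information identify a workspace, whether they come from a config file or from explicit arguments.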

Compute Target

Machine Learning Studio - single/multi

  • Development/Experiment
    • Local Machine/Cloud VM
  • Scale up to larger data/distributed - training compute target
    • Azure Machine Learning compute cluster
    • Azure Machine Learning compute instance
  • Deploy Compute Target (You need to be able to judge the most appropriate option based on the context.)
    • Local web service
    • Azure Kubernetes Service (AKS)
    • Azure Container Instances
    • Azure Machine Learning compute clusters (Batch Inference)
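As a rough sketch of the syntax behind these compute targets, the snippet below shows how a training cluster and a deployment configuration are typically created with the v1 SDK (`azureml-core`). The cluster name, VM size, node count and port are arbitrary example values, and the `DEPLOY_TARGETS` cheat sheet reflects my reading of the usual guidance rather than an official decision rule. SDK imports sit inside the functions, so they only run if `azureml-core` is installed.

```python
# Rule-of-thumb mapping from scenario to deployment target (my own summary).
DEPLOY_TARGETS = {
    "local debugging and testing": "Local web service",
    "low-scale CPU dev/test": "Azure Container Instances (ACI)",
    "production real-time inference at scale": "Azure Kubernetes Service (AKS)",
    "batch inference": "Azure Machine Learning compute cluster",
}


def create_training_cluster(ws, name="cpu-cluster"):
    """Reuse a compute cluster if it exists, otherwise provision a new one."""
    from azureml.core.compute import AmlCompute, ComputeTarget
    from azureml.core.compute_target import ComputeTargetException

    try:
        return ComputeTarget(workspace=ws, name=name)
    except ComputeTargetException:
        # Example sizing only; pick a VM size and node count for your workload.
        config = AmlCompute.provisioning_configuration(
            vm_size="STANDARD_DS11_V2", max_nodes=2
        )
        target = ComputeTarget.create(ws, name, config)
        target.wait_for_completion(show_output=True)
        return target


def deployment_config(target):
    """Build a deployment configuration for 'local', 'aci' or 'aks'."""
    if target not in ("local", "aci", "aks"):
        raise ValueError(f"unknown deployment target: {target}")

    from azureml.core.webservice import (
        AciWebservice,
        AksWebservice,
        LocalWebservice,
    )

    if target == "local":
        return LocalWebservice.deploy_configuration(port=8890)
    if target == "aci":
        return AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
    return AksWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
```

Notice that each deployment target has its own `*Webservice.deploy_configuration(...)` call; questions often hinge on matching the class to the scenario.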

DataStore

  • Azure Storage (blob and file containers)
  • Azure Data Lake stores
  • Azure SQL Database
  • Azure Databricks file system (DBFS)
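For the datastore syntax, here is a minimal sketch using the v1 SDK (`azureml-core`), assuming a blob container is being registered with an account key. The helper names are my own and every argument value is something you would supply yourself; only the `Datastore` calls and `ws.get_default_datastore()` come from the SDK.

```python
def blob_datastore_settings(datastore_name, container_name, account_name, account_key):
    """Collect the arguments needed to register a blob container as a datastore."""
    return {
        "datastore_name": datastore_name,
        "container_name": container_name,
        "account_name": account_name,
        "account_key": account_key,
    }


def register_blob_datastore(ws, settings):
    """Register an Azure blob container as a datastore in workspace `ws`."""
    from azureml.core import Datastore  # requires azureml-core

    return Datastore.register_azure_blob_container(workspace=ws, **settings)


def get_datastores(ws, name):
    """Return the workspace default datastore and a registered one by name."""
    from azureml.core import Datastore  # requires azureml-core

    default_ds = ws.get_default_datastore()
    named_ds = Datastore.get(ws, datastore_name=name)
    return default_ds, named_ds
```

Other storage types have their own `Datastore.register_*` methods, so check the SDK reference for the one matching your storage.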

Syntax Type Questions

For me, these are the hard questions. There are at least 10 questions that require you to remember some syntax. If you have not prepared for this, all the options will seem correct. These are the questions that I encountered in my exam.

Come back here to check your knowledge after you finish the labs.

  • Run vs mlflow (How to log a metric? What is the syntax with or without mlflow?)
    • run = Run.get_context()
    • mlflow
  • Workspace/Config (how to create a workspace? How to get a reference of a specific workspace? What are the required arguments?) link
  • Training a model and registering/deploying a model
  • AutoML - Think about what you would do if you are using AutoML.
    • How to retrieve the best iteration model provided the experiment name or Run ID?
    • What are the Early Stopping choices you can use? link
  • Pipelines - I didn’t prepare well for this part, and there were a few questions related to this topic. Again, familiarize yourself with the syntax.
    • How to create/publish/schedule a pipeline, what is the syntax?
    • Do you publish a pipeline or schedule the pipeline first? link
    • How to retrieve a published pipeline? link
  • How to troubleshoot a service? Where can you find the log? link
  • Explainer (What are the different use cases for different explainers? What are their limitations?) link
    • How many Explainers are available? (ans: 3!)
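For the first item in the list above (Run vs mlflow), the two logging styles can be sketched as follows. This assumes the v1 SDK (`azureml-core`) for the `Run` version, and `mlflow` plus `azureml-mlflow` for the MLflow version. The function names and the `metrics` dictionary are my own; the `run` argument is only there so the helper can be tried with a stand-in object outside Azure.

```python
def log_with_run(metrics, run=None):
    """Log metrics with the Azure ML Run API: run.log(name, value)."""
    if run is None:
        from azureml.core import Run  # requires azureml-core

        # Inside a submitted training script this returns the current run.
        run = Run.get_context()
    for name, value in metrics.items():
        run.log(name, value)
    return run


def log_with_mlflow(metrics, ws, experiment_name):
    """Log metrics with MLflow, using the workspace as the tracking server."""
    import mlflow  # requires mlflow and azureml-mlflow

    mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())
    mlflow.set_experiment(experiment_name)
    with mlflow.start_run():
        for name, value in metrics.items():
            mlflow.log_metric(name, value)
```

The difference to remember is `run.log(...)` on a `Run` object versus the module-level `mlflow.log_metric(...)` inside `mlflow.start_run()`.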

Example Questions

You can also use the exam simulator. The free demo version gives you access to 19 questions; if you are willing to pay, you can access all 120 questions. Out of the 19 questions, around 4~5 similar questions appeared in my DP-100 exam. If your goal is to pass this exam

Finally, good luck with your exam. Try to compare and contrast the workflows when you are doing the tutorials. Overall, Azure is well aware of the MLOps trend; the platform makes it quite easy to handle machine learning pipelines and deploy a model. Once you can relate it to your daily work, you will find most of the steps reasonable and easier to remember.