ExamGecko

Question 15 - DP-100 discussion


You use the Azure Machine Learning Python SDK to define a pipeline that consists of multiple steps.

When you run the pipeline, you observe that some steps do not run. The cached output from a previous run is used instead.

You need to ensure that every step in the pipeline is run, even if the parameters and contents of the source directory have not changed since the previous run.

What are two possible ways to achieve this goal? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

A. Use a PipelineData object that references a datastore other than the default datastore.
B. Set the regenerate_outputs property of the pipeline to True.
C. Set the allow_reuse property of each step in the pipeline to False.
D. Restart the compute cluster where the pipeline experiment is configured to run.
E. Set the outputs property of each step in the pipeline to True.
Suggested answer: B, C

Explanation:

B: If regenerate_outputs is set to True, a new submit will always force generation of all step outputs, and disallow data reuse for any step of this run. Once this run is complete, however, subsequent runs may reuse the results of this run.
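The behaviour described for B can be illustrated with a toy model. The sketch below is not the Azure Machine Learning SDK: `ToyPipeline`, its `submit` method, and its cache are hypothetical names that only mimic the semantics stated above, assuming one cached output per step. A forced submission regenerates every output, and a later ordinary submission may reuse the results of that forced run.

```python
class ToyPipeline:
    """Hypothetical stand-in for a pipeline with a per-step output cache."""

    def __init__(self, step_names):
        self.step_names = list(step_names)
        self._cache = {}        # step name -> number of the run that produced the output
        self._run_counter = 0

    def submit(self, regenerate_outputs=False):
        """Return {step name: "ran" or "reused"} for this submission."""
        self._run_counter += 1
        report = {}
        for name in self.step_names:
            if not regenerate_outputs and name in self._cache:
                # Cached output exists and reuse is not disallowed for this run.
                report[name] = "reused"
            else:
                # Run the step and record this run as the source of its output.
                self._cache[name] = self._run_counter
                report[name] = "ran"
        return report


pipeline = ToyPipeline(["prep", "train"])
first = pipeline.submit()                          # nothing cached yet: all steps run
forced = pipeline.submit(regenerate_outputs=True)  # cache ignored: all steps run again
later = pipeline.submit()                          # may reuse outputs of the forced run
```

In this model the forced run overwrites the cache, which is why the third submission reuses outputs produced by the second one rather than the first.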

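C: allow_reuse is enabled by default, so a step whose script, inputs, and parameters are unchanged reuses the output of its previous run. Setting allow_reuse to False on a step forces that step to run on every submission. Keep the following in mind when working with pipeline steps, input/output data, and step reuse.

If the data used in a step is in a datastore and allow_reuse is True, changes to that data are not detected. If the data is instead uploaded as part of the snapshot (under the step's source_directory), which is not recommended, the hash changes and triggers a rerun.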
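The reuse decision in C can be sketched as a fingerprint comparison. This is a simplified model and not how the SDK is implemented: `step_fingerprint` and `should_rerun` are hypothetical helpers, and the assumption is that only the files under the source directory, the parameters, and the datastore path (not the data behind it) contribute to the hash.

```python
import hashlib


def step_fingerprint(source_files, parameters, datastore_path):
    """Hash what this toy model treats as the step definition.

    source_files: {relative path: file bytes} under the source directory.
    datastore_path: a reference only; the data it points to is never read.
    """
    digest = hashlib.sha256()
    for path in sorted(source_files):
        digest.update(path.encode() + b"\0" + source_files[path] + b"\0")
    for key in sorted(parameters):
        digest.update(f"{key}={parameters[key]}\0".encode())
    digest.update(datastore_path.encode())
    return digest.hexdigest()


def should_rerun(previous, current, allow_reuse=True):
    """Rerun when reuse is disabled or the fingerprint has changed."""
    return (not allow_reuse) or previous != current


script = {"train.py": b"print('train')"}
params = {"lr": 0.1}

base = step_fingerprint(script, params, "datastore/data.csv")
# The data inside the datastore changed, but the reference did not.
datastore_changed = step_fingerprint(script, params, "datastore/data.csv")
# The data was placed in the source directory and then changed.
snapshot_changed = step_fingerprint(
    {**script, "data.csv": b"new rows"}, params, "datastore/data.csv"
)
```

With these values, `should_rerun(base, datastore_changed)` is `False`, while passing `allow_reuse=False` or comparing against `snapshot_changed` yields `True`.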

Reference:
https://docs.microsoft.com/en-us/python/api/azureml-pipeline-core/azureml.pipeline.core.pipelinestep
https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/machine-learning-pipelines/intro-to-pipelines/aml-pipelines-getting-started.ipynb

asked 02/10/2024
Christoph Reithmayr