Question 108 - Professional Machine Learning Engineer discussion

You work on the data science team for a multinational beverage company. You need to develop an ML model to predict the company's profitability for a new line of naturally flavored bottled waters in different locations. You are provided with historical data that includes product types, product sales volumes, expenses, and profits for all regions. What should you use as the input and output for your model?

A. Use latitude, longitude, and product type as features. Use profit as model output.
B. Use latitude, longitude, and product type as features. Use revenue and expenses as model outputs.
C. Use product type and the feature cross of latitude with longitude, followed by binning, as features. Use profit as model output.
D. Use product type and the feature cross of latitude with longitude, followed by binning, as features. Use revenue and expenses as model outputs.

Suggested answer: C

Explanation:

Option A is incorrect. Latitude and longitude, fed to the model as two separate numeric features, do not capture the interaction between them. It is the combination of the two that identifies a place, and the same product may be more or less profitable from place to place depending on the climate, culture, or preferences of the customers. This option also ignores the granularity of the location data. Exact coordinates are too fine: almost every location has its own pair of values, so the model cannot group nearby locations that are likely to behave alike. A unit as coarse as a whole country would have the opposite problem, because profitability can vary widely within it.
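The interaction point can be illustrated with a minimal sketch. The four data points below are hypothetical (coordinates centered and scaled to +/-1, profit set by quadrant), and `fit_weight` is a helper written only for this example. Under these assumptions, a model that adds separate latitude and longitude terms predicts the same value everywhere, while a single crossed feature fits the pattern.

```python
# Toy data (hypothetical): profit is high in two diagonal quadrants
# and low in the other two, so it depends on lat and lon jointly.
rows = [
    # (lat, lon, profit)
    (1.0, 1.0, 1.0),
    (-1.0, -1.0, 1.0),
    (1.0, -1.0, -1.0),
    (-1.0, 1.0, -1.0),
]

def fit_weight(feature, target):
    """Least-squares weight for a single zero-mean feature (no intercept)."""
    return sum(f * t for f, t in zip(feature, target)) / sum(f * f for f in feature)

lat = [r[0] for r in rows]
lon = [r[1] for r in rows]
profit = [r[2] for r in rows]

# Additive model: profit ~ w_lat * lat + w_lon * lon.
# In this toy data lat and lon are orthogonal and everything has zero mean,
# so each weight can be fit on its own and the intercept is zero.
w_lat = fit_weight(lat, profit)
w_lon = fit_weight(lon, profit)
additive_pred = [w_lat * a + w_lon * o for a, o in zip(lat, lon)]

# Model with one crossed feature: profit ~ w_cross * (lat * lon).
cross = [a * o for a, o in zip(lat, lon)]
w_cross = fit_weight(cross, profit)
cross_pred = [w_cross * c for c in cross]

print("additive:", additive_pred)
print("crossed: ", cross_pred)
```

Here the cross is a plain product of two numeric values, which is the simplest form of a feature cross; option C applies the same idea to binned coordinates.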

Option B is incorrect. It has the same drawbacks as option A: it neither captures the interaction between latitude and longitude nor accounts for the granularity of the location data. In addition, it does not directly predict profit, which is the target variable of interest. It predicts revenue and expenses instead, which are intermediate variables that depend on other factors such as the price, cost, or demand of the product. Profit would then have to be computed by subtracting predicted expenses from predicted revenue, so the errors of both predictions carry over into the result.
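The last point can be written out as a short derivation. The symbols are introduced here for illustration and do not appear in the original question: R, E, and P = R - E are the true revenue, expenses, and profit, hats mark model predictions, and the epsilons are the prediction errors.

```latex
\begin{align*}
\hat{R} &= R + \varepsilon_R, \qquad \hat{E} = E + \varepsilon_E \\
\hat{P} &= \hat{R} - \hat{E} \\
\hat{P} - P &= \varepsilon_R - \varepsilon_E \\
\operatorname{Var}(\hat{P} - P) &= \operatorname{Var}(\varepsilon_R)
  + \operatorname{Var}(\varepsilon_E)
  - 2\operatorname{Cov}(\varepsilon_R, \varepsilon_E)
\end{align*}
```

If the two errors are uncorrelated, the covariance term vanishes and the error variance of the derived profit is the sum of the two individual error variances.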

Option C is correct. Crossing latitude with longitude captures the interaction between the two, which may affect the profitability of the product. A feature cross is a synthetic feature that combines the values of two or more features into a single feature [1]. Binning accounts for the granularity of the location data: it groups continuous values into intervals, which can reduce the noise and complexity of the data [2]. In practice each coordinate is usually binned first and the resulting buckets are then crossed, so that every crossed value identifies one geographic grid cell. Finally, this option predicts profit directly by using it as the model output, and profit is the target variable of interest.
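A minimal sketch of the two steps, assuming fixed-width bins of 10 degrees and a few hypothetical locations; the function names and the cell naming scheme are illustrative, not taken from the question or from any library. The sketch bins each coordinate first and then crosses the two bucket indices, which is the usual ordering even though the option text names the cross first.

```python
import math

def bin_index(value, low, width):
    """Index of the fixed-width bucket that contains `value`.

    The upper edge of the range (for example latitude 90.0) is not
    clamped in this sketch.
    """
    return math.floor((value - low) / width)

def grid_cell(lat, lon, cell_deg=10.0):
    """Cross the latitude bucket with the longitude bucket into one cell id."""
    lat_bin = bin_index(lat, -90.0, cell_deg)
    lon_bin = bin_index(lon, -180.0, cell_deg)
    return f"lat{lat_bin}_x_lon{lon_bin}"

def one_hot(cell, vocabulary):
    """Encode a cell id as a 0/1 vector over the known cells."""
    return [1 if cell == v else 0 for v in vocabulary]

# Hypothetical locations as (latitude, longitude) in degrees.
locations = {
    "Paris": (48.86, 2.35),
    "Versailles": (48.80, 2.13),
    "Sydney": (-33.87, 151.21),
}

cells = {name: grid_cell(lat, lon) for name, (lat, lon) in locations.items()}
vocabulary = sorted(set(cells.values()))
features = {name: one_hot(cell, vocabulary) for name, cell in cells.items()}

for name in locations:
    print(name, cells[name], features[name])
```

With these assumptions the two nearby locations fall into the same cell and share one feature value, while the distant one gets its own. The cell size controls granularity: smaller cells separate locations more finely but leave fewer examples per cell.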

Option D is incorrect. It has the same feature-engineering advantages as option C: the feature cross captures the interaction between latitude and longitude, and binning accounts for the granularity of the location data. However, it does not directly predict profit, which is the target variable of interest. It predicts revenue and expenses instead, which are intermediate variables that depend on other factors, as explained for option B.

References:

1. Feature cross
2. Binning
3. Profitability
4. Revenue and expenses
5. Latitude and longitude
6. Product type

asked 18/09/2024
Edward Eric