ExamGecko
Question 80 - Professional Data Engineer discussion

How can you get a neural network to learn about relationships between categories in a categorical feature?

A. Create a multi-hot column
B. Create a one-hot column
C. Create a hash bucket
D. Create an embedding column

Suggested answer: D

Explanation:

There are two problems with one-hot encoding. First, it has high dimensionality: instead of a single value, as with a continuous feature, each input has one value (dimension) per category. This makes computation more time-consuming, especially if a feature has a very large number of categories. Second, it doesn't encode any relationships between the categories. They are completely independent of each other, so the network has no way of knowing which ones are similar.
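
As a rough illustration of both problems, the sketch below one-hot encodes a small, made-up vocabulary in plain Python. The category names and helper functions are assumptions chosen for the example, not part of the original question.

```python
# Hypothetical vocabulary, chosen only for illustration.
categories = ["cat", "dog", "car", "truck"]


def one_hot(category, vocabulary):
    """Return a vector with a single 1.0 at the category's position."""
    return [1.0 if item == category else 0.0 for item in vocabulary]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


vectors = {c: one_hot(c, categories) for c in categories}

# Problem 1: the vector width grows with the number of categories.
width = len(vectors["cat"])

# Problem 2: every pair of distinct categories has a dot product of 0,
# so "cat" looks exactly as unrelated to "dog" as it does to "truck".
cat_dog = dot(vectors["cat"], vectors["dog"])
cat_truck = dot(vectors["cat"], vectors["truck"])
```

With a vocabulary of a million categories, each vector would have a million entries, and the pairwise dot products would still all be zero.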

Both of these problems can be solved by representing a categorical feature with an embedding column. The idea is that each category gets a smaller vector with, say, 5 values in it. Unlike a one-hot vector, the values are usually not 0. They are weights, similar to the weights used for basic features in a neural network, and they are learned during training. The difference is that each category has its own set of weights (5 of them in this case).
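
One way to picture an embedding column is as a lookup table with one row of weights per category. The plain-Python sketch below is only an illustration under assumed names (`embedding_table`, `embed`, and the vocabulary are hypothetical); the weights are randomly initialized here, whereas a real model would adjust them during training.

```python
import random

# Hypothetical vocabulary and the 5-value width mentioned in the text.
categories = ["cat", "dog", "car", "truck"]
embedding_dim = 5

# One row of weights per category. Random values stand in for weights
# that a real network would learn during training.
rng = random.Random(0)
embedding_table = [
    [rng.uniform(-1.0, 1.0) for _ in range(embedding_dim)]
    for _ in categories
]


def embed(category):
    """Look up the row of weights that belongs to the category."""
    return embedding_table[categories.index(category)]


def one_hot_times_table(category):
    """Same result, computed as a one-hot vector times the weight table."""
    one_hot = [1.0 if c == category else 0.0 for c in categories]
    return [
        sum(one_hot[i] * embedding_table[i][j] for i in range(len(categories)))
        for j in range(embedding_dim)
    ]


dog_vector = embed("dog")
```

The lookup and the one-hot multiplication give the same 5-value vector, which is why an embedding can be read as a dense layer applied to a one-hot input, stored far more compactly.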

You can think of each value in the embedding vector as a feature of the category. So, if two categories are very similar to each other, then their embedding vectors should be very similar too.
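
To make the similarity idea concrete, here is a small sketch that compares embedding vectors with cosine similarity. The vectors are hand-picked for illustration rather than learned, and the category names are assumptions.

```python
import math

# Hypothetical 5-value embeddings, hand-picked for illustration.
# In a real model these weights would be learned during training.
embeddings = {
    "cat":   [0.9, 0.8, 0.1, 0.0, 0.2],
    "dog":   [0.8, 0.9, 0.2, 0.1, 0.1],
    "truck": [0.1, 0.0, 0.9, 0.8, 0.7],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


sim_cat_dog = cosine_similarity(embeddings["cat"], embeddings["dog"])
sim_cat_truck = cosine_similarity(embeddings["cat"], embeddings["truck"])
```

With these made-up values, "cat" and "dog" score much closer to 1 than "cat" and "truck" do, which is the kind of relationship a one-hot encoding cannot express.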

Reference: https://cloudacademy.com/google/introduction-to-google-cloud-machine-learningengine-course/a-wide-and-deep-model.html

asked 18/09/2024
manuele groppi