Question 216 - MLS-C01 discussion


A machine learning (ML) specialist needs to extract embedding vectors from a text series. The goal is to provide a ready-to-ingest feature space for a data scientist to develop downstream ML predictive models. The text consists of curated sentences in English. Many sentences use similar words but in different contexts. There are questions and answers among the sentences, and the embedding space must differentiate between them.

Which options can produce the required embedding vectors that capture word context and sequential QA information? (Choose two.)

A. Amazon SageMaker seq2seq algorithm
B. Amazon SageMaker BlazingText algorithm in Skip-gram mode
C. Amazon SageMaker Object2Vec algorithm
D. Amazon SageMaker BlazingText algorithm in continuous bag-of-words (CBOW) mode
E. Combination of the Amazon SageMaker BlazingText algorithm in Batch Skip-gram mode with a custom recurrent neural network (RNN)
Suggested answer: B, E

Explanation:

To capture word context and sequential QA information, the embedding vectors need to consider both the order and the meaning of the words in the text.

Option B, Amazon SageMaker BlazingText algorithm in Skip-gram mode, is a valid option because it can learn word embeddings that capture the semantic similarity and syntactic relations between words based on their co-occurrence in a window of words. Skip-gram mode can also handle rare words better than continuous bag-of-words (CBOW) mode [1].
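
For reference, a minimal sketch of what training BlazingText in Skip-gram mode might look like with the SageMaker Python SDK. The S3 paths, instance type, and hyperparameter values below are illustrative placeholders, not part of the question:

```python
# Minimal sketch: BlazingText in Skip-gram (word2vec) mode via the SageMaker Python SDK.
# Bucket names and hyperparameter values are placeholder assumptions.
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
role = sagemaker.get_execution_role()  # assumes a SageMaker notebook/Studio execution role
container = image_uris.retrieve("blazingtext", session.boto_region_name)

bt = Estimator(
    image_uri=container,
    role=role,
    instance_count=1,
    instance_type="ml.c5.2xlarge",
    output_path="s3://my-bucket/blazingtext/output",  # placeholder bucket
    sagemaker_session=session,
)

# "skipgram" learns one vector per word from its surrounding context window.
bt.set_hyperparameters(
    mode="skipgram",
    vector_dim=100,    # embedding dimension
    window_size=5,     # context window
    negative_samples=5,
    epochs=10,
    min_count=2,       # ignore very rare tokens
)

# The train channel expects a plain-text file with one space-tokenized sentence per line.
train_input = TrainingInput("s3://my-bucket/blazingtext/train/corpus.txt", content_type="text/plain")
bt.fit({"train": train_input})
```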

Option E, combination of the Amazon SageMaker BlazingText algorithm in Batch Skip-gram mode with a custom recurrent neural network (RNN), is another valid option because it leverages the advantages of Skip-gram mode and also uses an RNN to model the sequential nature of the text. An RNN can capture the temporal and long-range dependencies between words, which are important for QA tasks [2].
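
The question does not prescribe the RNN's architecture, but one plausible shape for it is a small PyTorch encoder that loads the exported Skip-gram vectors and pools an LSTM's outputs into a fixed-size sentence embedding. Everything below (class name, dimensions, pooling choice) is an illustrative assumption:

```python
# Assumed architecture: a bidirectional LSTM over pre-trained Skip-gram vectors
# that emits one embedding per sentence, so questions and answers can be
# differentiated by a downstream model.
import torch
import torch.nn as nn

class SentenceEncoder(nn.Module):
    def __init__(self, pretrained_vectors: torch.Tensor, hidden_dim: int = 128):
        super().__init__()
        # pretrained_vectors: (vocab_size, embed_dim) matrix exported from BlazingText.
        self.embedding = nn.Embedding.from_pretrained(pretrained_vectors, freeze=True)
        self.rnn = nn.LSTM(
            input_size=pretrained_vectors.size(1),
            hidden_size=hidden_dim,
            batch_first=True,
            bidirectional=True,
        )

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len) indices into the Skip-gram vocabulary.
        embedded = self.embedding(token_ids)
        outputs, _ = self.rnn(embedded)
        # Mean-pool over time to get a fixed-size sentence embedding (batch, 2 * hidden_dim).
        return outputs.mean(dim=1)

# Toy usage with random vectors standing in for the exported Skip-gram matrix.
vectors = torch.randn(1000, 100)
encoder = SentenceEncoder(vectors)
sentence_embeddings = encoder(torch.randint(0, 1000, (4, 12)))
print(sentence_embeddings.shape)  # torch.Size([4, 256])
```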

Option A, Amazon SageMaker seq2seq algorithm, is not a valid option because it is designed for sequence-to-sequence tasks such as machine translation, summarization, or chatbots. It does not produce embedding vectors for a text series, but rather generates an output sequence given an input sequence [3].

Option C, Amazon SageMaker Object2Vec algorithm, is not a valid option because it is designed for learning embeddings for pairs of objects, such as text-image, text-text, or image-image. It does not produce embedding vectors for a text series, but rather learns a similarity function between pairs of objects [4].
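
To make the "pairs of objects" point concrete, here is a minimal sketch of the JSON Lines training records Object2Vec consumes: each record supplies two tokenized inputs (in0, in1) plus a relationship label. The token ids and file name are made-up placeholders:

```python
# Illustrative only: Object2Vec trains on pairs, e.g. a question and a candidate answer.
import json

records = [
    {"label": 1, "in0": [12, 7, 431, 9], "in1": [88, 5, 102]},   # question / related answer
    {"label": 0, "in0": [12, 7, 431, 9], "in1": [64, 3, 290]},   # question / unrelated answer
]

with open("object2vec_train.jsonl", "w") as f:
    for record in records:
        f.write(json.dumps(record) + "\n")
```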

Option D, Amazon SageMaker BlazingText algorithm in continuous bag-of-words (CBOW) mode, is not a valid option because it does not capture word context as well as Skip-gram mode. CBOW mode predicts a word given its surrounding words, while Skip-gram mode predicts the surrounding words given a word. CBOW mode is faster and more suitable for frequent words, but Skip-gram mode can learn more meaningful embeddings for rare words [1].
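
The CBOW/Skip-gram distinction is not specific to BlazingText; gensim's Word2Vec exposes the same switch, which can be a quick way to compare the two modes locally. This is purely illustrative and not part of the SageMaker-based answer:

```python
# sg=0 selects CBOW, sg=1 selects Skip-gram (the equivalent of BlazingText's "mode").
from gensim.models import Word2Vec

sentences = [
    ["what", "is", "the", "capital", "of", "france"],
    ["the", "capital", "of", "france", "is", "paris"],
]

cbow = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=0)      # CBOW
skipgram = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1)  # Skip-gram

print(skipgram.wv["paris"].shape)  # (50,) -- one embedding vector per word
```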

References:

[1] Amazon SageMaker BlazingText
[2] Recurrent Neural Networks (RNNs)
[3] Amazon SageMaker Seq2Seq
[4] Amazon SageMaker Object2Vec
