Question

A company uses a long short-term memory (LSTM) model to evaluate the risk factors of a particular energy sector. The model reviews multi-page text documents to analyze each sentence of the text and categorize it as either a potential risk or no risk. The model is not performing well, even though the Data Scientist has experimented with many different network structures and tuned the corresponding hyperparameters.

Which approach will provide the MAXIMUM performance boost?

Oleksii Ivanov · Accepted Answer

Initialize the words by word2vec embeddings pretrained on a large collection of news articles related to the energy sector.
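
To make the idea concrete, here is a minimal sketch of how pretrained vectors could seed the embedding layer that feeds the LSTM. It assumes the word2vec vectors have already been loaded into a plain dictionary; the names `build_embedding_matrix`, `pretrained`, and `vocab`, and the toy three-dimensional vectors, are hypothetical and only for illustration. Words missing from the pretrained set get small random vectors.

```python
import numpy as np


def build_embedding_matrix(vocab, pretrained, dim, seed=0):
    """Return a (len(vocab), dim) matrix for initializing an embedding layer.

    vocab:      dict mapping word -> row index (hypothetical vocabulary)
    pretrained: dict mapping word -> vector of length dim (stand-in for
                word2vec vectors trained on energy-sector news)
    """
    rng = np.random.default_rng(seed)
    # Words without a pretrained vector start from small random values.
    matrix = rng.normal(scale=0.1, size=(len(vocab), dim)).astype(np.float32)
    hits = 0
    for word, idx in vocab.items():
        vec = pretrained.get(word)
        if vec is not None:
            matrix[idx] = vec  # copy the pretrained vector into its row
            hits += 1
    return matrix, hits


# Toy data, assumed for illustration only.
pretrained = {
    "pipeline": np.array([0.2, -0.1, 0.4], dtype=np.float32),
    "outage": np.array([-0.3, 0.5, 0.1], dtype=np.float32),
}
vocab = {"<pad>": 0, "pipeline": 1, "outage": 2, "tariff": 3}

matrix, hits = build_embedding_matrix(vocab, pretrained, dim=3)
print(matrix.shape, hits)
```

The resulting matrix would then be passed as the initial weights of the embedding layer in whichever framework the model uses, and either frozen or fine-tuned during training.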

Oleksii Ivanov · Answer

Initialize the words by term frequency-inverse document frequency (TF-IDF) vectors pretrained on a large collection of news articles related to the energy sector.

Oleksii Ivanov · Answer

Use gated recurrent units (GRUs) instead of LSTM and run the training process until the validation loss stops decreasing.
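
For reference, "run the training process until the validation loss stops decreasing" describes early stopping. A rough sketch of that stopping rule follows, assuming the per-epoch validation losses are available as a list; the function name `early_stopping_epoch`, the `patience` parameter, and the loss values are made up for illustration.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve on the
    best value seen so far for `patience` consecutive epochs.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    # No stop was triggered: training ran through every epoch.
    return len(val_losses) - 1


# Hypothetical validation losses, one per epoch.
losses = [0.90, 0.70, 0.65, 0.66, 0.68, 0.64]
stop = early_stopping_epoch(losses, patience=2)
print(stop)
```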

Oleksii Ivanov · Answer

Reduce the learning rate and run the training process until the training loss stops decreasing.
