Question 29 - MLS-C01 discussion


A pharmaceutical company performs periodic audits of clinical trial sites to quickly resolve critical findings. The company stores audit documents in text format. Auditors have requested help from a data science team to quickly analyze the documents. The auditors need to discover the 10 main topics within the documents to prioritize and distribute the review work among the auditing team members. Documents that describe adverse events must receive the highest priority.

A data scientist will use statistical modeling to discover abstract topics and to provide a list of the top words for each category to help the auditors assess the relevance of the topic.

Which algorithms are best suited to this scenario? (Choose two.)

A. Latent Dirichlet allocation (LDA)
B. Random Forest classifier
C. Neural topic modeling (NTM)
D. Linear support vector machine
E. Linear regression
Suggested answer: A, C

Explanation:

Latent Dirichlet allocation (LDA) and neural topic modeling (NTM) are best suited to this scenario because both are unsupervised learning methods that discover abstract topics from a collection of text documents. Each produces a list of the top words for every topic and a topic distribution for every document, which helps the auditors assess the relevance and priority of each topic [1, 2].
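To make the two outputs concrete (top words per topic and a topic mixture per document), here is a minimal sketch of LDA fitted by collapsed Gibbs sampling in plain Python. It is an illustration only, not any particular library's implementation: the function name `fit_lda`, the hyperparameter defaults, and the four toy documents are all assumptions made for this example, and the scenario would use 10 topics rather than 2.

```python
import random

def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01,
            top_n=3, seed=0):
    """Toy LDA via collapsed Gibbs sampling on pre-tokenized documents.

    Returns (top_words, doc_mix): the top_n words for each topic and the
    topic distribution for each document. Hypothetical helper for illustration.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]        # tokens in doc d assigned to topic k
    topic_word = [[0] * V for _ in range(num_topics)]   # times word w is assigned to topic k
    topic_total = [0] * num_topics                      # tokens assigned to topic k
    assignments = []

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                # Remove this token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                # Resample: weight is p(topic | document) * p(word | topic).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta) / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1

    # Top words: highest-count vocabulary entries for each topic.
    top_words = [
        [vocab[i] for i in sorted(range(V), key=lambda i: -topic_word[k][i])[:top_n]]
        for k in range(num_topics)
    ]
    # Smoothed per-document topic distribution; each row sums to 1.
    doc_mix = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    return top_words, doc_mix

# Made-up audit snippets, already tokenized (assumption for the example).
docs = [
    "patient reported adverse event nausea".split(),
    "adverse event rash reported patient".split(),
    "consent form missing signature".split(),
    "signature missing consent form date".split(),
]
top_words, doc_mix = fit_lda(docs, num_topics=2)
for k, words in enumerate(top_words):
    print(f"topic {k}: {words}")
for d, mix in enumerate(doc_mix):
    print(f"doc {d}: {[round(p, 2) for p in mix]}")
```

An auditor could read the top words to decide which topic corresponds to adverse events, then rank documents by their weight on that topic. No labels are needed at any point, which is what distinguishes this approach from the supervised options.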

The other options are not suitable because:

Option B: A random forest classifier is a supervised learning method that predicts class labels with an ensemble of decision trees. It is not suitable for discovering abstract topics from text documents, because it requires labeled data and predefined classes [3].

Option D: A linear support vector machine is a supervised learning method that classifies data by fitting a linear decision boundary between classes. It is not suitable for discovering abstract topics from text documents, because it requires labeled data and predefined classes [4].

Option E: Linear regression is a supervised learning method that uses a linear function to model the relationship between a dependent variable and one or more independent variables. It is not suitable for discovering abstract topics from text documents, because it requires labeled data and a continuous output variable [5].

References:

1: Latent Dirichlet Allocation

2: Neural Topic Modeling

3: Random Forest Classifier

4: Linear Support Vector Machine

5: Linear Regression

asked 16/09/2024
wendy brouwer