
iSQI CT-AI Practice Test - Questions Answers, Page 4


List of questions

Question 31


Which ONE of the following models BEST describes a way to model defect prediction by looking at the history of bugs in modules, using code quality metrics of historical versions of the modules as input?


A. Identifying the relationship between developers and the modules developed by them.

B. Search of similar code based on natural language processing.

C. Clustering of similar code modules to predict based on similarity.

D. Using a classification model to predict the presence of a defect by using code quality metrics as the input data.

Suggested answer: D
Explanation:

Defect prediction models aim to identify parts of the software that are likely to contain defects by analyzing historical data and code quality metrics. The primary goal is to use this predictive information to allocate testing and maintenance resources effectively. Let's break down why option D is the correct choice:

Understanding Classification Models:

Classification models are a type of supervised learning algorithm used to categorize or classify data into predefined classes or labels. In the context of defect prediction, the classification model would classify parts of the code as either 'defective' or 'non-defective' based on the input features.

Input Data - Code Quality Metrics:

The input data for these classification models typically includes various code quality metrics such as cyclomatic complexity, lines of code, number of methods, depth of inheritance, coupling between objects, etc. These metrics help the model learn patterns associated with defects.

Historical Data:

Historical versions of the code along with their defect records provide the labeled data needed for training the classification model. By analyzing this historical data, the model can learn which metrics are indicative of defects.

Why Option D is Correct:

Option D specifies using a classification model to predict the presence of defects by using code quality metrics as input data. This accurately describes the process of defect prediction using historical bug data and quality metrics.

Eliminating Other Options:

A. Identifying the relationship between developers and the modules developed by them: This does not directly involve predicting defects based on code quality metrics and historical data.

B. Search of similar code based on natural language processing: While useful for other purposes, this method does not describe defect prediction using classification models and code metrics.

C. Clustering of similar code modules to predict based on similarity: Clustering is an unsupervised learning technique and does not directly align with the supervised learning approach typically used in defect prediction models.

'Using AI for Defect Prediction' (ISTQB CT-AI Syllabus, Section 11.5.1) describes the use of classification models trained on historical code quality metrics to predict defects.
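As an illustrative sketch (not part of the syllabus), defect prediction as classification can be shown in a few lines: historical module metrics with known defect labels serve as labelled training data, and a minimal decision stump stands in for a full ML classifier. All metric values and labels below are made up.

```python
# Hypothetical sketch: defect prediction as binary classification.
# (cyclomatic complexity, lines of code) -> defective? for historical
# module versions; a single-threshold "decision stump" is the classifier.
history = [
    ((3, 120), 0), ((5, 200), 0), ((4, 150), 0), ((6, 180), 0),
    ((15, 900), 1), ((20, 1200), 1), ((18, 700), 1), ((25, 1500), 1),
]

def train_stump(data, feature=0):
    """Pick the metric threshold that best separates defective modules."""
    best_t, best_acc = None, -1.0
    for t in sorted({x[feature] for x, _ in data}):
        preds = [1 if x[feature] >= t else 0 for x, _ in data]
        acc = sum(p == y for p, (_, y) in zip(preds, data)) / len(data)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

threshold = train_stump(history)

def predict(metrics, t=threshold):
    """Classify a module as defective (1) or clean (0)."""
    return 1 if metrics[0] >= t else 0

print(predict((19, 800)))  # high complexity -> 1 (likely defective)
print(predict((4, 100)))   # low complexity -> 0 (likely clean)
```

A real defect-prediction model would use many more metrics and a stronger learner, but the supervised setup, metrics in and defect label out, is the same one option D describes.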


Question 32


Which ONE of the following options describes a scenario of A/B testing the LEAST?


A. A comparison of two different websites for the same company to observe from a user acceptance perspective.

B. A comparison of two different offers in a recommendation system to decide on the more effective offer for the same users.

C. A comparison of the performance of an ML system on two different input datasets.

D. A comparison of the performance of two different ML implementations on the same input data.

Suggested answer: C
Explanation:

A/B testing, also known as split testing, is a method used to compare two versions of a product or system to determine which one performs better. It is widely used in web development, marketing, and machine learning to optimize user experiences and model performance. Here's why option C is the least descriptive of an A/B testing scenario:

Understanding A/B Testing:

In A/B testing, two versions (A and B) of a system or feature are tested against each other. The objective is to measure which version performs better based on predefined metrics such as user engagement, conversion rates, or other performance indicators.

Application in Machine Learning:

In ML systems, A/B testing might involve comparing two different models, algorithms, or system configurations on the same set of data to observe which yields better results.

Why Option C is the Least Descriptive:

Option C describes comparing the performance of an ML system on two different input datasets. This scenario focuses on the input data variation rather than the comparison of system versions or features, which is the essence of A/B testing. A/B testing typically involves a controlled experiment with two versions being tested under the same conditions, not different datasets.

Clarifying the Other Options:

A. A comparison of two different websites for the same company to observe from a user acceptance perspective: This is a classic example of A/B testing where two versions of a website are compared.

B. A comparison of two different offers in a recommendation system to decide on the more effective offer for the same users: This is another example of A/B testing in a recommendation system.

D. A comparison of the performance of two different ML implementations on the same input data: This fits the A/B testing model where two implementations are compared under the same conditions.

ISTQB CT-AI Syllabus, Section 9.4, A/B Testing, explains the methodology and application of A/B testing in various contexts.

'Understanding A/B Testing' (ISTQB CT-AI Syllabus).
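A minimal sketch of the A/B setup (hypothetical, with synthetic traffic): two variants are exposed to comparable user groups under the same conditions and compared on one predefined metric, here conversion rate.

```python
# Hypothetical A/B test sketch: website variants A and B are shown to
# comparable user groups; the predefined metric is the conversion rate.
import random

random.seed(0)

def simulate_visits(conversion_prob, n=1000):
    """Synthetic traffic: 1 = converted, 0 = bounced."""
    return [1 if random.random() < conversion_prob else 0 for _ in range(n)]

variant_a = simulate_visits(0.10)   # baseline design (made-up true rate)
variant_b = simulate_visits(0.13)   # candidate design (made-up true rate)

rate_a = sum(variant_a) / len(variant_a)
rate_b = sum(variant_b) / len(variant_b)

winner = "B" if rate_b > rate_a else "A"
print(f"A: {rate_a:.3f}  B: {rate_b:.3f}  winner: {winner}")
```

Note the contrast with option C: here the system versions differ while the conditions stay fixed; in option C the dataset differs, which is not an A/B experiment. A production A/B test would also apply a significance test before declaring a winner.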


Question 33


Max. Score: 2

AI-enabled medical devices are used nowadays for automating certain parts of medical diagnostic processes. Since these are life-critical processes, the relevant authorities are considering bringing about suitable certifications for these AI-enabled medical devices. This certification may involve several facets of AI testing (I - V).

I. Autonomy

II. Maintainability

III. Safety

IV. Transparency

V. Side Effects

Which ONE of the following options contains the three MOST required aspects to be satisfied for the above scenario of certification of AI-enabled medical devices?


A. Aspects II, III and IV

B. Aspects I, II, and III

C. Aspects III, IV, and V

D. Aspects I, IV, and V

Suggested answer: C
Explanation:

For AI-enabled medical devices, the most required aspects for certification are safety, transparency, and side effects. Here's why:

Safety (Aspect III): Critical for ensuring that the AI system does not cause harm to patients.

Transparency (Aspect IV): Important for understanding and verifying the decisions made by the AI system.

Side Effects (Aspect V): Necessary to identify and mitigate any unintended consequences of the AI system.

Why Not Other Options:

Autonomy and Maintainability (Aspects I and II): While important, they are secondary to the immediate concerns of safety, transparency, and managing side effects in life-critical processes.


Question 34


Which ONE of the following options represents a technology MOST TYPICALLY used to implement AI?


A. Search engines

B. Procedural programming

C. Case control structures

D. Genetic algorithms

Suggested answer: D
Explanation:

Technology Most Typically Used to Implement AI: Genetic algorithms are a well-known technique used in AI. They are inspired by the process of natural selection and are used to find approximate solutions to optimization and search problems. Unlike search engines, procedural programming, or case control structures, genetic algorithms are specifically designed for evolving solutions and are commonly employed in AI implementations.

Reference: ISTQB_CT-AI_Syllabus_v1.0, Section 1.4 AI Technologies, which identifies different technologies used to implement AI.
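As a hedged illustration of the technique (not taken from the syllabus), a genetic algorithm evolves a population of candidate solutions via selection, crossover, and mutation. The classic "OneMax" toy problem below maximizes the number of 1s in a bitstring; all parameter values are arbitrary.

```python
# Hypothetical genetic-algorithm sketch: evolve bitstrings toward all-ones
# ("OneMax") using truncation selection, one-point crossover, and mutation.
import random

random.seed(42)
LENGTH, POP, GENERATIONS = 20, 30, 60

def fitness(bits):
    return sum(bits)                      # count of 1s; 20 is optimal

def mutate(bits, rate=0.02):
    return [b ^ (random.random() < rate) for b in bits]  # flip rare bits

def crossover(a, b):
    cut = random.randrange(1, LENGTH)     # one-point crossover
    return a[:cut] + b[cut:]

population = [[random.randint(0, 1) for _ in range(LENGTH)] for _ in range(POP)]
for _ in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    parents = population[: POP // 2]      # keep the fitter half (elitist)
    children = [
        mutate(crossover(random.choice(parents), random.choice(parents)))
        for _ in range(POP - len(parents))
    ]
    population = parents + children

best = max(population, key=fitness)
print(fitness(best))                      # close to LENGTH after evolution
```

The evolutionary loop (evaluate, select, recombine, mutate) is what distinguishes genetic algorithms from the fixed control flow of procedural programming or case control structures.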


Question 35


Which ONE of the following characteristics is LEAST likely to cause safety-related issues for an AI system?


A. Non-determinism

B. Robustness

C. High complexity

D. Self-learning

Suggested answer: B
Explanation:

The question asks which characteristic is least likely to cause safety-related issues for an AI system. Let's evaluate each option:

Non-determinism (A): Non-deterministic systems can produce different outcomes even with the same inputs, which can lead to unpredictable behavior and potential safety issues.

Robustness (B): Robustness refers to the ability of the system to handle errors, anomalies, and unexpected inputs gracefully. A robust system is less likely to cause safety issues because it can maintain functionality under varied conditions.

High complexity (C): High complexity in AI systems can lead to difficulties in understanding, predicting, and managing the system's behavior, which can cause safety-related issues.

Self-learning (D): Self-learning systems adapt based on new data, which can lead to unexpected changes in behavior. If not properly monitored and controlled, this can result in safety issues.

ISTQB CT-AI Syllabus Section 2.8 on Safety and AI discusses various factors affecting the safety of AI systems, emphasizing the importance of robustness in maintaining safe operation.


Question 36


A system was developed for screening the X-rays of patients for potential malignancy detection (skin cancer). A workflow system has been developed to screen multiple cancers by using several individually trained ML models chained together in the workflow.

Testing the pipeline could involve multiple kinds of tests (I - III):

I. Pairwise testing of combinations

II. Testing each individual model for accuracy

III. A/B testing of different sequences of models

Which ONE of the following options contains the kinds of tests that would be MOST APPROPRIATE to include in the strategy for optimal detection?


A. Only III

B. I and II

C. I and III

D. Only II

Suggested answer: B
Explanation:

The question asks which combination of tests would be most appropriate to include in the strategy for optimal detection in a workflow system using multiple ML models.

Pairwise testing of combinations (I): This method is useful for testing interactions between different components in the workflow to ensure they work well together, identifying potential issues in the integration.

Testing each individual model for accuracy (II): Ensuring that each model in the workflow performs accurately on its own is crucial before integrating them into a combined workflow.

A/B testing of different sequences of models (III): This involves comparing different sequences to determine which configuration yields the best results. While useful, it might not be as fundamental as pairwise and individual accuracy testing in the initial stages.

ISTQB CT-AI Syllabus Section 9.2 on Pairwise Testing and Section 9.3 on Testing ML Models emphasize the importance of testing interactions and individual model accuracy in complex ML workflows.
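A small sketch of how tests I and II might be combined in such a strategy (model names and accuracy figures are made up; real pairwise testing would cover interacting configuration parameters far more systematically):

```python
# Hypothetical sketch: per-model accuracy checks (II) plus enumeration of
# model pairs whose interactions need pairwise testing (I).
from itertools import combinations

# stand-in "models": name -> accuracy on its own held-out test set (made up)
model_accuracy = {"skin": 0.94, "lung": 0.91, "breast": 0.93}

# II. each individual model must clear an accuracy bar before integration
ACCURACY_BAR = 0.90
individually_ok = all(acc >= ACCURACY_BAR for acc in model_accuracy.values())

# I. pairwise testing: every pair of models that interact in the workflow
pairs_to_test = list(combinations(sorted(model_accuracy), 2))

print(individually_ok)   # True: every model clears the bar
print(pairs_to_test)     # 3 pairs for 3 models
```

Only once both checks pass would sequencing experiments (III) become worthwhile, which matches the reasoning for answer B above.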


Question 37


'BioSearch' is creating an AI model used for predicting cancer occurrence by examining X-ray images. The accuracy of the model in isolation has been found to be good. However, when the model was put to practice in the diagnosis lab, its users started complaining of the poor quality of results, especially the inability to detect real cancer cases, which led to the model's usage being stopped.

A testing expert was called in to find the deficiencies in the test planning which led to the above scenario.

Which ONE of the following options would you expect to MOST likely be the reason to be discovered by the test expert?


A. A lack of similarity between the training and testing data.

B. The input data has not been tested for quality prior to use for testing.

C. A lack of focus on choosing the right functional-performance metrics.

D. A lack of focus on non-functional requirements testing.

Suggested answer: A
Explanation:

The question asks which deficiency is most likely to be discovered by the test expert given the scenario of poor real-world performance despite good isolated accuracy.

A lack of similarity between the training and testing data (A): This is a common issue in ML where the model performs well on training data but poorly on real-world data due to a lack of representativeness in the training data. This leads to poor generalization to new, unseen data.

The input data has not been tested for quality prior to use for testing (B): While data quality is important, this option is less likely to be the primary reason for the described issue compared to the representativeness of training data.

A lack of focus on choosing the right functional-performance metrics (C): Proper metrics are crucial, but the issue described seems more related to the data mismatch rather than metric selection.

A lack of focus on non-functional requirements testing (D): Non-functional requirements are important, but the scenario specifically mentions issues with detecting real cancer cases, pointing more towards data issues.

ISTQB CT-AI Syllabus Section 4.2 on Training, Validation, and Test Datasets emphasizes the importance of using representative datasets to ensure the model generalizes well to real-world data.

Sample Exam Questions document, Question #40 addresses issues related to data representativeness and model generalization.
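The mismatch can be illustrated with a hypothetical sketch: a threshold classifier fitted on representative lab data keeps its recall on data from the same distribution, but degrades once the field data distribution shifts. All numbers are synthetic.

```python
# Hypothetical sketch of train/field data mismatch: a threshold classifier
# fitted on lab-like score distributions loses recall on shifted field data.
import random

random.seed(1)

def sample(mean, n=500):
    """Synthetic malignancy scores drawn from a normal distribution."""
    return [random.gauss(mean, 1.0) for _ in range(n)]

# training data: cancer cases score around 2.0, normals around 0.0
train_pos, train_neg = sample(2.0), sample(0.0)
threshold = (sum(train_pos) / len(train_pos)
             + sum(train_neg) / len(train_neg)) / 2   # midpoint rule

def recall(cases, t):
    """Fraction of true cancer cases the threshold flags."""
    return sum(x >= t for x in cases) / len(cases)

lab_recall = recall(train_pos, threshold)     # same distribution as training
field_recall = recall(sample(0.8), threshold) # shifted field distribution

print(round(lab_recall, 2), round(field_recall, 2))
```

The recall drop happens without any change to the model; only the data distribution moved, which is the failure mode option A describes.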



Question 38


An ML engineer is trying to determine the correctness of a new open-source implementation 'X' of a supervised regression algorithm. R-Square is one of the functional performance metrics used to determine the quality of the model.

Which ONE of the following would be an APPROPRIATE strategy to achieve this goal?


A. Add 10% of the rows randomly, create another model, and compare the R-Square scores of both the models.

B. Train various models by changing the order of input features and verify that the R-Square scores of these models vary significantly.

C. Compare the R-Square scores of the models obtained using two different implementations that utilize two different programming languages while using the same algorithm and the same training and testing data.

D. Drop 10% of the rows randomly, create another model, and compare the R-Square scores of both the models.

Suggested answer: C
Explanation:

A. Add 10% of the rows randomly, create another model, and compare the R-Square scores of both the models.

Adding more data to the training set can affect the R-Square score, but it does not directly verify the correctness of the implementation.

B. Train various models by changing the order of input features and verify that the R-Square scores of these models vary significantly.

Changing the order of input features should not significantly affect the R-Square score if the implementation is correct, but this approach is more about testing model robustness rather than correctness of the implementation.

C. Compare the R-Square scores of the models obtained using two different implementations that utilize two different programming languages while using the same algorithm and the same training and testing data.

This approach directly compares the performance of two implementations of the same algorithm. If both implementations produce similar R-Square scores on the same training and testing data, it suggests that the new implementation 'X' is correct.

D. Drop 10% of the rows randomly, create another model, and compare the R-Square scores of both the models.

Dropping data can lead to variations in the R-Square score but does not directly verify the correctness of the implementation.

Therefore, option C is the most appropriate strategy because it directly compares the performance of the new implementation 'X' with another implementation using the same algorithm and datasets, which helps in verifying the correctness of the implementation.
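A minimal back-to-back sketch of this strategy, using two independently written R-Square implementations in place of two programming languages (data values are made up):

```python
# Hypothetical back-to-back test: two independent R-Square implementations
# must agree on identical predictions and targets, mirroring the strategy
# of comparing implementation 'X' against a reference on the same data.

def r2_from_definition(y_true, y_pred):
    """R-Square as 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def r2_alternative(y_true, y_pred):
    """Same metric computed in a second, independently written way."""
    mean_y = sum(y_true) / len(y_true)
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    explained = ss_tot - sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return explained / ss_tot

y_true = [3.0, 5.0, 7.0, 9.0]   # made-up regression targets
y_pred = [2.8, 5.1, 7.2, 8.7]   # made-up model predictions

a, b = r2_from_definition(y_true, y_pred), r2_alternative(y_true, y_pred)
print(abs(a - b) < 1e-12)   # the two implementations agree
```

If implementation 'X' and a trusted reference produce matching scores on identical training and test data, that is strong evidence of a correct implementation; a persistent discrepancy localizes the defect to one of the implementations.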


Question 39


'Splendid Healthcare' has started developing a cancer detection system based on ML. The type of cancer they plan on detecting has a 2% prevalence rate in the population of a particular geography. It is required that the model performs well for both normal and cancer patients.

Which ONE of the following combinations requires MAXIMIZATION?


A. Maximize precision and accuracy

B. Maximize accuracy and recall

C. Maximize recall and precision

D. Maximize specificity and number of classes

Suggested answer: C
Explanation:

Prevalence Rate and Model Performance:

The cancer detection system being developed by 'Splendid Healthcare' needs to account for the fact that the type of cancer has a 2% prevalence rate in the population. This indicates that the dataset is highly imbalanced with far fewer positive (cancer) cases compared to negative (normal) cases.

Importance of Recall:

Recall, also known as sensitivity or true positive rate, measures the proportion of actual positive cases that are correctly identified by the model. In medical diagnosis, especially cancer detection, recall is critical because missing a positive case (false negative) could have severe consequences for the patient. Therefore, maximizing recall ensures that most, if not all, cancer cases are detected.

Importance of Precision:

Precision measures the proportion of predicted positive cases that are actually positive. High precision reduces the number of false positives, meaning fewer people will be incorrectly diagnosed with cancer. This is also important to avoid unnecessary anxiety and further invasive testing for those who do not have the disease.

Balancing Recall and Precision:

In scenarios where both false negatives and false positives have significant consequences, it is crucial to balance recall and precision. This balance ensures that the model is not only good at detecting positive cases but also accurate in its predictions, reducing both types of errors.

Accuracy and Specificity:

While accuracy (the proportion of total correct predictions) is important, it can be misleading in imbalanced datasets. In this case, high accuracy could simply result from the model predicting the majority class (normal) correctly. Specificity (true negative rate) is also important, but for a cancer detection system, recall and precision take precedence to ensure positive cases are correctly and accurately identified.

Conclusion:

Therefore, for a cancer detection system with a low prevalence rate, maximizing both recall and precision is crucial to ensure effective and accurate detection of cancer cases.
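The point about accuracy being misleading at 2% prevalence can be shown with a short hypothetical sketch: a degenerate model that never flags cancer still reaches 98% accuracy while its recall is zero.

```python
# Hypothetical sketch: at 2% prevalence, an "always normal" classifier
# scores high accuracy yet zero recall, so accuracy alone is misleading.
n_patients = 1000
n_cancer = 20                      # 2% prevalence
labels = [1] * n_cancer + [0] * (n_patients - n_cancer)

predictions = [0] * n_patients     # degenerate model: never flags cancer

accuracy = sum(p == y for p, y in zip(predictions, labels)) / n_patients
true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))
recall = true_positives / n_cancer

print(accuracy, recall)  # 0.98 0.0
```

This is why the explanation above prioritizes recall (catch the cancer cases) and precision (avoid false alarms) over raw accuracy on such an imbalanced dataset.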


Question 40


Which ONE of the following options describes the LEAST LIKELY usage of AI for detection of GUI changes due to changes in test objects?


A. Using a pixel comparison of the GUI before and after the change to check the differences.

B. Using computer vision to compare the GUI before and after the test object changes.

C. Using vision-based detection of the GUI layout changes before and after test object changes.

D. Using an ML-based classifier to flag whether changes in the GUI are to be flagged for humans.

Suggested answer: A
Explanation:

A. Using a pixel comparison of the GUI before and after the change to check the differences.

Pixel comparison is a traditional method and does not involve AI. It compares images at the pixel level, which can be effective but is not an intelligent approach. It is therefore the least likely usage of AI for detecting GUI changes.

B. Using computer vision to compare the GUI before and after the test object changes.

Computer vision involves using AI techniques to interpret and process images. It is a likely usage of AI for detecting changes in the GUI.

C. Using vision-based detection of the GUI layout changes before and after test object changes.

Vision-based detection is another AI technique where the layout and structure of the GUI are analyzed to detect changes. This is a typical application of AI.

D. Using a ML-based classifier to flag if changes in GUI are to be flagged for humans.

An ML-based classifier can intelligently determine significant changes and decide if they need human review, which is a sophisticated AI application.
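For contrast, here is a sketch of option A: plain pixel comparison is rule-based image diffing with no AI involved; it simply counts differing pixel values (the tiny "images" below are made up):

```python
# Hypothetical sketch of option A: pixel comparison is rule-based image
# diffing, not AI; it just counts pixels whose values differ.
before = [
    [0, 0, 255],
    [0, 255, 0],
]
after = [
    [0, 0, 255],
    [0, 255, 255],   # one pixel changed by the GUI update
]

diff_count = sum(
    b != a
    for row_b, row_a in zip(before, after)
    for b, a in zip(row_b, row_a)
)
changed = diff_count > 0
print(diff_count, changed)  # 1 True
```

Every step here is a fixed rule; nothing is learned from data, which is exactly why option A is the least likely "usage of AI" among the four.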

Total 80 questions