IIBA CBDA Practice Test - Questions Answers, Page 8
Question 71
[Diagram: boxplot titled "Year-over-Year Sales"]
Explanation:
According to the Guide to Business Data Analytics, a boxplot provides a visual summary of one or more groups of data values through their quartiles. In this case, the boxplot shows two years, 2017 and 2018, with distinct medians and interquartile ranges. The median is represented by the line inside the box, and the interquartile range by the height of the box itself. Outliers are marked with circles above and below the box. The boxplot shows that median sales for 2018 are higher than median sales for 2017, and that the interquartile range for 2018 is narrower than for 2017. This means that 2018 sales are more concentrated around the median and vary less than 2017 sales. Therefore, the correct answer is B.
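The two quantities a boxplot encodes, the median and the interquartile range, can be computed directly. The sketch below is only an illustration: the sales figures are invented (the diagram's underlying data is not available), and `box_summary` is a hypothetical helper name.

```python
from statistics import median, quantiles

# Hypothetical sales figures, invented for illustration only.
sales_2017 = [100, 150, 200, 250, 300, 350, 400]
sales_2018 = [280, 290, 300, 310, 320, 330, 340]

def box_summary(values):
    """Return (median, interquartile range) for a list of numbers."""
    q1, _, q3 = quantiles(values, n=4)  # the three quartile cut points
    return median(values), q3 - q1

median_2017, iqr_2017 = box_summary(sales_2017)
median_2018, iqr_2018 = box_summary(sales_2018)

# A higher median with a narrower IQR matches the pattern described above.
print(median_2018 > median_2017, iqr_2018 < iqr_2017)
```

With these made-up numbers, 2018 has the higher median and the narrower interquartile range, which is the pattern the explanation reads off the diagram.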
Question 72
A grocery store chain has requested help in determining how customer preferences are changing with regard to home delivery. An analytics team has completed researching the number of online orders received requesting home delivery versus in-store pickup. The business analyst has selected a model to enable a quick comparison between curbside pickup, in-store pickup, and home delivery for the last 3 years. Which model has the business analyst chosen?
Explanation:
A bar chart is a graphical representation of data that uses rectangular bars of different heights or lengths to show the values of one or more variables. A bar chart is suitable for comparing the number of online orders requesting each delivery option over the last 3 years, as it can show the frequency or proportion of each category across time. A bar chart can also help identify trends, patterns, or outliers in the data.
A pie chart is a circular chart that shows the relative sizes of data points in a whole by using different-sized and colored slices. A pie chart is not suitable here, as it can only show the distribution of one variable at a time and does not show changes over time. A pie chart can also be misleading or confusing if there are too many categories or if the slices are too similar in size.
A funnel chart is a type of chart that shows the stages of a process and the amount of data that passes through each stage. A funnel chart is not suitable here, as it does not show the categories of delivery options, but rather the progression of customers through a sales or marketing funnel. A funnel chart can help visualize the conversion rates, drop-off rates, or bottlenecks in a process.
A scatter plot is a type of chart that shows the relationship between two numerical variables by using dots to represent the values of each pair of data points. A scatter plot is not suitable here, as it does not show the categories of delivery options, but rather the correlation or association between two continuous variables. A scatter plot can help identify the direction, strength, and shape of the relationship, as well as any outliers or clusters in the data.
Question 73
Analytics is being used to estimate the number of machine breakdowns a company will experience next year. The business analyst provides an optimistic estimate of 10 breakdowns, a pessimistic estimate of 100 breakdowns, and a most likely value of 50 breakdowns. What type of estimation is being used?
Explanation:
According to the Guide to Business Data Analytics, PERT (Program Evaluation and Review Technique) is a type of estimation that uses three values: optimistic, pessimistic, and most likely. The PERT estimate is calculated as the weighted average of these three values, with more weight given to the most likely value. PERT can be used to estimate the duration, cost, or other variables of a project or activity, taking into account the uncertainty and variability of the data. PERT can help provide a realistic and reliable estimate based on the available information.
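As a worked illustration, the weighted average can be computed for the three values in the question. The sketch assumes the commonly used PERT weighting of (optimistic + 4 × most likely + pessimistic) / 6; the explanation above does not state the weights, and `pert_estimate` is a hypothetical function name.

```python
def pert_estimate(optimistic, most_likely, pessimistic):
    """Weighted average using the conventional 1-4-1 PERT weights (assumed here)."""
    return (optimistic + 4 * most_likely + pessimistic) / 6

# Values from the question: 10 optimistic, 50 most likely, 100 pessimistic.
breakdowns = pert_estimate(10, 50, 100)
print(round(breakdowns, 2))  # 51.67
```

Under that assumption the estimate is about 52 breakdowns, slightly above the most likely value because the pessimistic figure sits further from it than the optimistic one.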
Question 74
A data scientist at a consumer goods company has been asked to do a detailed analysis of customer profiles. The data scientist has identified an external data source that carries valuable additional information on their customers. The data scientist also identifies the address column as the most reliable column to join the internal data source with the external data source. Addresses may appear in different formats, for example:
File A = '13 Smith St'
File B = 'Unit 7, 13 Smith Street'
Which of the following techniques would be useful in this situation?
Explanation:
Probabilistic linkage is a technique that uses statistical methods to match records from different data sources based on the similarity of key variables, such as name, address, date of birth, etc. Probabilistic linkage can handle variations, errors, or missing values in the data, and assign a score or probability to each potential match. Probabilistic linkage would be useful in this situation, as the address column may have different formats, spellings, or abbreviations in the internal and external data sources, and a deterministic linkage (which requires exact matches) might miss some valid matches or create false matches.
Deterministic linkage is a technique that uses predefined rules or criteria to match records from different data sources based on the exact agreement of key variables, such as identifiers, codes, or hashes. Deterministic linkage would not be useful in this situation, as the address column may not have consistent or unique values in the internal and external data sources, and a probabilistic linkage (which allows for some variation or uncertainty) might find more accurate matches or avoid false matches.
Genetic linkage is a term used in genetics to describe the tendency of genes or DNA sequences that are located close together on a chromosome to be inherited together. Genetic linkage is not relevant to this situation, as it has nothing to do with matching records from different data sources based on the address column.
Cuff linkage is a term used in sewing to describe the process of attaching a cuff to a sleeve by stitching or fastening. Cuff linkage is not relevant to this situation, as it has nothing to do with matching records from different data sources based on the address column.
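The difference between exact and score-based matching can be sketched with the two addresses from the question. This is a deliberately simplified stand-in, not a full probabilistic linkage model: it normalizes each address and scores similarity with the standard library's `difflib.SequenceMatcher`. The abbreviation table, the `normalize` and `match_score` helpers, and the third address are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table; a real one would be far larger.
ABBREVIATIONS = {"st": "street", "rd": "road", "ave": "avenue"}

def normalize(address):
    """Lowercase, drop punctuation, and expand known abbreviations."""
    tokens = re.findall(r"[a-z0-9]+", address.lower())
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)

def match_score(a, b):
    """Similarity between 0 and 1 for two normalized addresses."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

file_a = "13 Smith St"
file_b = "Unit 7, 13 Smith Street"

exact_match = file_a == file_b                 # deterministic comparison fails
score = match_score(file_a, file_b)            # score-based comparison
other_score = match_score("42 Jones Rd", file_b)  # invented non-matching address
print(exact_match, score > other_score)
```

An exact comparison rejects the pair outright, while the score ranks it well above an unrelated address, so a threshold can decide which candidate pairs to accept or send for review.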
Question 75
There were 7 students enrolled in the Introduction to Artificial Intelligence course. The scores from the final exam were as follows: 64, 70, 80, 80, 90, 98, 100
What is the mean and median for the outlined scores?
Explanation:
The mean of a set of numbers is the sum of the numbers divided by how many numbers there are. The median is the middle value when the numbers are arranged in ascending or descending order.
To find the mean, add up all the scores and divide by 7, the number of students: (64 + 70 + 80 + 80 + 90 + 98 + 100) / 7 = 582 / 7 ≈ 83.14.
To find the median, arrange the scores in ascending order: 64, 70, 80, 80, 90, 98, 100. Since there is an odd number of scores, the median is the middle (fourth) score, which is 80.
Therefore, the mean and median for the outlined scores are 83.14 and 80, respectively.
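The arithmetic can be checked with Python's standard `statistics` module; this is simply a verification sketch using the scores from the question.

```python
from statistics import mean, median

scores = [64, 70, 80, 80, 90, 98, 100]

mean_score = mean(scores)      # 582 / 7
median_score = median(scores)  # middle value of the sorted list

print(round(mean_score, 2), median_score)  # 83.14 80
```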
Question 76
An analytics team is sourcing data for a new analytics initiative and is deciding between two comparable data sources. One source being considered is a very large dataset and another consists of three smaller sources. What advantage will the larger dataset provide over the three smaller sources?
Explanation:
A larger dataset may provide more significant results than three smaller sources, as it may have more statistical power to detect differences or relationships among variables. Statistical power is the probability of finding a statistically significant result when there is a true effect in the population. A larger dataset may have more power because a larger sample reduces sampling error and yields more precise estimates than a smaller one. More significant results may lead to more confident conclusions and recommendations for the analytics initiative.
Higher validity, more reproducibility, and higher reliability are not necessarily advantages of a larger dataset over three smaller sources, as they depend on other factors besides the size of the data. Validity is the degree to which the data and the analysis measure what they are intended to measure. Reproducibility is the degree to which the data and the analysis can be replicated by another analyst using the same methods and data sources. Reliability is the degree to which the data and the analysis produce consistent results under the same conditions. These qualities may be affected by the quality, accuracy, completeness, and relevance of the data, as well as the appropriateness, transparency, and rigor of the analysis methods. A larger dataset may not be valid, reproducible, or reliable if it has errors, biases, missing values, or irrelevant variables, or if the analysis methods are not suitable, documented, or verified.
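The link between sample size and sampling error can be illustrated with the standard error of a sample mean, which is the standard deviation divided by the square root of the sample size. The sketch below uses invented numbers (a standard deviation of 10 and two hypothetical sample sizes) purely to show the direction of the effect.

```python
import math

def standard_error(std_dev, n):
    """Standard error of the mean for a sample of size n."""
    return std_dev / math.sqrt(n)

# Hypothetical values: same spread, two different sample sizes.
small_source = standard_error(10, 100)
large_source = standard_error(10, 10_000)

print(small_source, large_source)
```

With the same spread, a sample 100 times larger gives a standard error 10 times smaller, which is why a real effect is easier to detect in the larger dataset.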
Question 77
To ensure their recommendation can be acted upon, the business analysis professional on the analytics team helps the team complete financial analysis to support their recommendation. As part of the financial analysis that's completed, the cost-benefit analysis shows positive net benefits starting in the 2nd year. The team feels this is sufficient to proceed with their strong endorsement of the recommendation. The business analysis professional:
Explanation:
According to the Guide to Business Data Analytics, a cost-benefit analysis is a technique that compares the costs and benefits of a project or decision over a period of time. The net benefit is the difference between the total benefits and the total costs. A positive net benefit indicates that the benefits outweigh the costs. However, a positive net benefit in one year does not necessarily mean that the project or decision is financially viable. The business analysis professional should also consider the cumulative net benefit, which is the sum of the net benefits over the entire time horizon. The cumulative net benefit reflects the overall value of the project or decision over that horizon; discounting the yearly figures additionally accounts for the time value of money and the opportunity cost of capital. A project or decision is only financially feasible if the cumulative net benefit is positive at the end of the time horizon. Therefore, the business analysis professional should disagree with the team and suggest that they review the cumulative net benefit before endorsing the recommendation.
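The gap between a positive yearly net benefit and a positive cumulative net benefit is easy to see with numbers. The yearly figures below are invented for illustration (the question gives none), and discounting is left out to keep the sketch short.

```python
from itertools import accumulate

# Hypothetical undiscounted net benefit per year (benefits minus costs).
yearly_net_benefit = [-100, 30, 40, 50]

# Running total of net benefits across the time horizon.
cumulative = list(accumulate(yearly_net_benefit))

# First year (1-based) in which the yearly figure is positive.
first_positive_year = next(
    year for year, value in enumerate(yearly_net_benefit, start=1) if value > 0
)
# First year in which the running total is positive, or None if it never is.
breakeven_year = next(
    (year for year, value in enumerate(cumulative, start=1) if value > 0), None
)

print(cumulative, first_positive_year, breakeven_year)
```

In this made-up case the yearly net benefit turns positive in year 2, as in the question, but the cumulative figure stays negative until year 4, so a shorter time horizon would leave the initiative in the red.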
Question 78
Which attribute in the Customerissues entity would be categorized as unstructured data?
* CustomerID
* ConcernCategory
* ConcernSubCategory
* AgentID
* ComplaintNotes
* IssueResolved(Y/N)
Explanation:
Unstructured data is data that does not have a predefined format, structure, or schema, and that cannot be easily stored, processed, or analyzed by traditional databases or tools. Unstructured data may include text, images, audio, video, or other types of data that are rich in information but complex and diverse in nature. In the Customerissues entity, the ComplaintNotes attribute would be categorized as unstructured data, as it may contain free-form text that captures the details, sentiments, or emotions of the customers' complaints, and that may vary in length, language, tone, or style. The ComplaintNotes attribute would require special techniques, such as natural language processing, text mining, or sentiment analysis, to extract meaningful insights from the unstructured data.
The other attributes in the Customerissues entity would be categorized as structured data, as they have a predefined format, structure, or schema and can be easily stored, processed, or analyzed by traditional databases or tools. Structured data may include numbers, dates, codes, categories, or other types of data that are simple and consistent in nature. The CustomerID, ConcernCategory, ConcernSubCategory, AgentID, and IssueResolved(Y/N) attributes may contain numeric, alphanumeric, or binary values that represent the identifiers, classifications, or statuses of the customers' issues, and that may have fixed lengths, ranges, or domains.
Question 79
A manufacturing company, specializing in turf maintenance equipment, has recently seen a decline in their lawn mower sales. As a result, the analytics team is asked to review the latest customer satisfaction survey results. An analyst on this team creates a report for senior management with attractive visuals, supported by the KPI results. Upon reviewing the report, the analyst's manager mentions that the report is missing the narrative. What does this mean?
Explanation:
A narrative is a written or spoken explanation of the data analysis results that tells a story with the data, provides additional context and background information, highlights the key insights and findings, and draws correlations and implications for the decision makers [1][2]. The report is missing the narrative, meaning that it does not communicate the meaning and value of the data analysis effectively, and it leaves the interpretation and action to the senior management without any guidance or recommendation [3][4].
Reference:
* 1: Guide to Business Data Analytics, IIBA, 2020, p. 67
* 2: Storytelling with Data, Cole Nussbaumer Knaflic, 2015, p. 9
* 3: Data Storytelling: The Essential Data Science Skill Everyone Needs, Brent Dykes, 2016, 1
* 4: The Power of Data Storytelling, Harvard Business Review, 2018, 2
Question 80
The analytics team scheduled a meeting with key stakeholders to present their recommendations. The team envisioned this as the final step of their work and fully expected complete acceptance of those recommendations, particularly given that very few questions were asked. They were surprised when they received word that the organization wasn't ready to move forward. What did they overlook?
Explanation:
The analytics team overlooked the fact that communicating information is not a one-way or one-time process, but rather a bi-directional and iterative one. This means that the team should not only present their recommendations, but also solicit feedback, address concerns, clarify doubts, and confirm understanding from the stakeholders. By doing so, the team can ensure that the stakeholders are fully engaged, informed, and aligned with the recommendations, and that any potential barriers or risks are identified and mitigated before moving forward.
Reference:
* Understanding the Guide to Business Data Analytics, page 9
* Business Analysis Certification in Data Analytics, CBDA | IIBA, CBDA Competencies, Domain 4: Interpret and Report Results
* CERTIFICATION IN BUSINESS DATA ANALYTICS HANDBOOK - IIBA, page 5, Step 3 -- Schedule and Take The Exam