DM WA-5

1. Consider the data set shown in Table 5.20 (page 439). (Chapter 5)

(a) Compute the support for itemsets {e}, {b, d}, and {b, d, e} by treating each transaction ID as a market basket.

(b) Use the results in part (a) to compute the confidence for the association rules {b, d} → {e} and {e} → {b, d}. Is confidence a symmetric measure?

(c) Repeat part (a) by treating each customer ID as a market basket. Each item should be treated as a binary variable (1 if an item appears in at least one transaction bought by the customer, and 0 otherwise). Use this result to compute the confidence for the association rules {b, d} → {e} and {e} → {b, d}. A short computational sketch of parts (a)–(c) follows below.
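Since Table 5.20 is not reproduced here, the following is a minimal sketch of the computations in parts (a)–(c). The record values below are hypothetical placeholders, not the textbook's data; only the method (support as the fraction of baskets containing an itemset, and conf(X → Y) = support(X ∪ Y) / support(X)) matches the question.

    # Hypothetical stand-in for Table 5.20: (customer_id, transaction_id, items).
    # The actual values are in the textbook and are NOT reproduced here.
    records = [
        (1, 1, {"a", "d", "e"}),
        (1, 2, {"a", "b", "c", "e"}),
        (2, 3, {"a", "b", "d", "e"}),
        (2, 4, {"a", "c", "d", "e"}),
        (3, 5, {"b", "c", "e"}),
        (3, 6, {"b", "d", "e"}),
        (4, 7, {"c", "d"}),
        (4, 8, {"a", "b", "c"}),
        (5, 9, {"a", "d", "e"}),
        (5, 10, {"a", "b", "e"}),
    ]

    def support(baskets, itemset):
        # Fraction of baskets that contain every item of `itemset`.
        itemset = set(itemset)
        return sum(itemset <= b for b in baskets) / len(baskets)

    def confidence(baskets, lhs, rhs):
        # conf(X -> Y) = support(X union Y) / support(X).
        return support(baskets, set(lhs) | set(rhs)) / support(baskets, lhs)

    # (a) Each transaction ID is one market basket.
    tx = [items for _, _, items in records]
    print(support(tx, {"e"}), support(tx, {"b", "d"}), support(tx, {"b", "d", "e"}))

    # (b) Both rules share the numerator support({b, d, e}) but divide by
    # different supports, so confidence is not symmetric in general.
    print(confidence(tx, {"b", "d"}, {"e"}), confidence(tx, {"e"}, {"b", "d"}))

    # (c) Each customer ID is one basket: the union of that customer's
    # transactions (binary "bought at least once" encoding).
    by_customer = {}
    for cust, _, items in records:
        by_customer.setdefault(cust, set()).update(items)
    cu = list(by_customer.values())
    print(confidence(cu, {"b", "d"}, {"e"}), confidence(cu, {"e"}, {"b", "d"}))

Running the same functions on the real Table 5.20 records gives the answers the question asks for; only the `records` list needs to be swapped out.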


2. Consider the transactions shown in Table 6.15, with an item taxonomy given in Figure 6.15 (page 515). (Chapter 6)

(a) What are the main challenges of mining association rules with item taxonomy?

(b) Consider the approach where each transaction t is replaced by an extended transaction t′ that contains all the items in t as well as their respective ancestors. For example, the transaction t = {Chips, Cookies} will be replaced by t′ = {Chips, Cookies, Snack Food, Food}. Use this approach to derive all frequent itemsets (up to size 4) with support ≥ 70%. A sketch of this extension step follows after part (c).

(c) Consider an alternative approach where the frequent itemsets are generated one level at a time. Initially, all the frequent itemsets involving items at the highest level of the hierarchy are generated. Next, we use the frequent itemsets discovered at the higher level of the hierarchy to generate candidate itemsets involving items at the lower levels of the hierarchy. For example, we generate the candidate itemset {Chips, Diet Soda} only if {Snack Food, Soda} is frequent. Use this approach to derive all frequent itemsets (up to size 4) with support ≥ 70%.
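As a sketch of the part (b) approach: the taxonomy fragment below is a hypothetical stand-in for Figure 6.15 (the full hierarchy is in the textbook and is not reproduced here). Each transaction is replaced by itself plus the ancestors of its items, and frequent itemsets are then counted over the extended transactions; at this scale a brute-force count suffices.

    from itertools import combinations

    # Hypothetical fragment of the Figure 6.15 taxonomy: child -> parent.
    parent = {
        "Chips": "Snack Food",
        "Cookies": "Snack Food",
        "Regular Soda": "Soda",
        "Diet Soda": "Soda",
        "Snack Food": "Food",
        "Soda": "Food",
    }

    def ancestors(item):
        # All ancestors of `item`, following child -> parent links to the root.
        found = set()
        while item in parent:
            item = parent[item]
            found.add(item)
        return found

    def extend(t):
        # t' = t plus the ancestors of every item in t (the part (b) approach).
        return set(t).union(*(ancestors(i) for i in t))

    def frequent_itemsets(baskets, minsup, max_size=4):
        # Brute-force counting of all itemsets up to max_size. For the
        # level-wise scheme in part (c), one would instead generate a
        # lower-level candidate only if its higher-level itemset is frequent.
        items = sorted(set().union(*baskets))
        result = []
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                sup = sum(set(combo) <= b for b in baskets) / len(baskets)
                if sup >= minsup:
                    result.append((set(combo), sup))
        return result

    print(extend({"Chips", "Cookies"}))
    # -> {'Chips', 'Cookies', 'Snack Food', 'Food'}

    # Usage with the real Table 6.15 data (`transactions` not defined here):
    # freqs = frequent_itemsets([extend(t) for t in transactions], 0.7)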

3. Consider a data set consisting of 2^20 data vectors, where each vector has 32 components and each component is a 4-byte value. Suppose that vector quantization is used for compression and that 2^16 prototype vectors are used. How many bytes of storage does the data set take before and after compression, and what is the compression ratio? (Chapter 7)
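A worked version of the arithmetic, under the standard reading of the numbers: 2^20 vectors of 32 × 4 = 128 bytes each, and 2^16 prototypes, so an index into the codebook fits in 16 bits = 2 bytes. The compressed size below charges the prototype codebook to the compressed representation, which is one common convention.

    n_vectors     = 2**20            # number of data vectors
    bytes_per_vec = 32 * 4           # 32 components x 4 bytes = 128 bytes

    before = n_vectors * bytes_per_vec    # 2^27 bytes = 128 MiB

    codebook = 2**16 * bytes_per_vec      # 2^23 bytes = 8 MiB of prototypes
    indices  = n_vectors * 2              # one 2-byte index per vector = 2 MiB
    after    = codebook + indices         # 10 MiB, codebook included

    print(before, after, before / after)  # 134217728 10485760 12.8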


Reference used:

https://icourse.club/uploads/files/154047b04cec59e96abf82fd882236d39f0e3be7
