IT Questions

Assignment 3 QTS


1)

Present an example where data mining is crucial to the success of a business. What data mining functionalities does this business need (e.g., think of the kinds of patterns that could be mined)?

2)

 


Suppose that the data for analysis includes the attribute grade. The grade values for the data tuples are (in increasing order) 10, 12, 13, 13, 16, 17, 17, 18, 19, 19, 22, 22, 22, 22, 27, 30, 30, 32, 32, 32, 32, 32, 37, 42, 43, 49, 67.

a) Find the interquartile range.

b) Draw a boxplot of the data.
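
As a rough check on parts a) and b), the quartiles can be computed with Python's standard library. This is a minimal sketch, not the only valid answer: it assumes the "exclusive" quartile convention (position p(n + 1) in the sorted data) and the usual 1.5 × IQR rule for boxplot whiskers. Other textbook conventions give slightly different quartiles.

```python
import statistics

grades = [10, 12, 13, 13, 16, 17, 17, 18, 19, 19, 22, 22, 22, 22,
          27, 30, 30, 32, 32, 32, 32, 32, 37, 42, 43, 49, 67]

# Quartiles by the "exclusive" method: position p * (n + 1) in the sorted data.
q1, median, q3 = statistics.quantiles(grades, n=4, method="exclusive")
iqr = q3 - q1

# Assumed boxplot convention: values beyond 1.5 * IQR from the box are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
inliers = [g for g in grades if lower_fence <= g <= upper_fence]
outliers = [g for g in grades if g < lower_fence or g > upper_fence]

print("Q1, median, Q3:", q1, median, q3)
print("IQR:", iqr)
print("whisker ends:", min(inliers), max(inliers))
print("outliers:", outliers)
```

Under these assumptions the box spans 17 to 32 with the median at 22, the whiskers end at 10 and 49, and 67 is drawn as a separate outlier point.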

3)

Consider the following relational database for the Central Zoo. Central Zoo wants to maintain information about its animals, the enclosures in which they live, and its zookeepers and the services they perform for the animals. In addition, Central Zoo has a program through which people can sponsor animals. Central Zoo wants to track its sponsors, their dependents, and associated data. Each animal has a unique animal number and each enclosure has a unique enclosure number. An animal can live in only one enclosure. An enclosure can have several animals in it or it can be currently empty. A zookeeper has a unique employee number. Every animal has been cared for by at least one and generally many zookeepers; each zookeeper has cared for at least one and generally many animals. Each time a zookeeper performs a specific, significant service for an animal, the service type, date, and time are recorded. A zookeeper may perform a particular service on a particular animal more than once on a given day. A sponsor, who has a unique sponsor number and a unique social security number, sponsors at least one and possibly several animals. An animal may have several sponsors or none. For each animal that a particular sponsor sponsors, the zoo wants to track the annual sponsorship contribution and renewal date. In addition, Central Zoo wants to keep track of each sponsor's dependents. A sponsor may have several dependents or none. A dependent is associated with exactly one sponsor.

a) Describe three OLAP uses of this data warehouse.

b) Design a multidimensional database using a star schema for a data warehouse for the Central Zoo business environment.

4)

 

Consider the market basket transactions in the following table:

CID    TID    Items Bought
10     1      {T, S, R}
20     2      {Q, P, T}
30     3      {T, R, O}
40     4      {Q, P, O}
50     5      {S, O, R}
50     6      {T, R, Q, P}
40     7      {Q, P, R}
30     8      {S, R}
20     9      {T, R, Q, P}
10     10     {S, O}

a) List 3 different association rules from the above table, by treating each Customer (CID) as a market basket. Each item should be treated as a binary variable (1 if an item appears in at least one transaction bought by the customer, and 0 otherwise.) Show the support and confidence for each rule.

b) Show the Frequent Pattern tree (FP tree) that would be made for the data set. Let min_sup = 30%.
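
For part a), support and confidence can be checked mechanically once the transactions are merged into one basket per customer. The sketch below assumes that reading of the question; the helper name rule_stats and the three rules are illustrative choices, not part of the assignment.

```python
from collections import defaultdict

# (CID, items) pairs copied from the transaction table; each letter is one item.
transactions = [
    (10, "TSR"), (20, "QPT"), (30, "TRO"), (40, "QPO"), (50, "SOR"),
    (50, "TRQP"), (40, "QPR"), (30, "SR"), (20, "TRQP"), (10, "SO"),
]

# One basket per customer: the union of everything that customer bought.
baskets = defaultdict(set)
for cid, items in transactions:
    baskets[cid].update(items)

def rule_stats(lhs, rhs):
    """Return (support, confidence) of the rule lhs => rhs over customer baskets."""
    lhs, rhs = set(lhs), set(rhs)
    n_lhs = sum(lhs <= b for b in baskets.values())
    n_both = sum(lhs | rhs <= b for b in baskets.values())
    return n_both / len(baskets), n_both / n_lhs

for lhs, rhs in [("Q", "P"), ("T", "R"), ("O", "T")]:
    support, confidence = rule_stats(lhs, rhs)
    print(f"{{{lhs}}} => {{{rhs}}}: support={support:.2f}, confidence={confidence:.2f}")
```

With five customer baskets, {O} ⇒ {T} holds for three of the four customers who bought O, giving support 0.60 and confidence 0.75.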

5)

Consider the following table, which describes the sales data for an electronics company according to the dimensions time, item, and location.

Pid    Quarter    Locid    Sales
300    2          3        30
300    3          3        13
300    4          3        20
400    2          3        35
400    3          3        25
400    4          3        55
500    2          3        13
500    3          3        15
500    4          3        15
300    2          4        40
300    3          4        27
300    4          4        15
400    2          4        31
400    3          4        50
400    4          4        25
500    2          4        25
500    3          4        45
500    4          4        10

a) Find the result of roll-up (drill-up) operation on location.

b) Find the result of drill-down operation on time from quarters to months.

c) Find the result of slice operation for time = "Q3".
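
Parts a) and c) can be sketched in a few lines of Python. The sketch assumes that roll-up on location means aggregating Locid away entirely (summing sales over all locations) and that "Q3" corresponds to Quarter = 3. Part b) is not covered, because the table holds no monthly figures to drill down into.

```python
from collections import defaultdict

# (pid, quarter, locid, sales) rows copied from the table.
rows = [
    (300, 2, 3, 30), (300, 3, 3, 13), (300, 4, 3, 20),
    (400, 2, 3, 35), (400, 3, 3, 25), (400, 4, 3, 55),
    (500, 2, 3, 13), (500, 3, 3, 15), (500, 4, 3, 15),
    (300, 2, 4, 40), (300, 3, 4, 27), (300, 4, 4, 15),
    (400, 2, 4, 31), (400, 3, 4, 50), (400, 4, 4, 25),
    (500, 2, 4, 25), (500, 3, 4, 45), (500, 4, 4, 10),
]

# Roll-up on location: drop locid and sum sales per (pid, quarter).
rollup = defaultdict(int)
for pid, quarter, locid, sales in rows:
    rollup[(pid, quarter)] += sales

# Slice for time = Q3: fix the quarter, keep the item and location dimensions.
slice_q3 = {(pid, locid): sales
            for pid, quarter, locid, sales in rows if quarter == 3}

for key in sorted(rollup):
    print("roll-up", key, rollup[key])
for key in sorted(slice_q3):
    print("slice Q3", key, slice_q3[key])
```

The roll-up leaves a two-dimensional (Pid, Quarter) table with nine cells, and the slice leaves a two-dimensional (Pid, Locid) table with six cells.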

 

6)

The following contingency table summarizes the relationship between people who drink tea and people who drink coffee. Here, coffee denotes people who drink coffee, ¬coffee people who do not drink coffee, tea people who drink tea, and ¬tea people who do not drink tea.

 

        coffee    ¬coffee
tea     20        5
¬tea    10        15

a) Suppose that the association rule “coffee ⇒ tea” is mined. Given a minimum support threshold of 25% and a minimum confidence threshold of 50%, is this association rule strong?

 

b) Based on the given data, can we conclude that coffee drinkers and tea drinkers are independent? If not, what kind of correlation relationship exists between the two? 
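
Both parts reduce to a few ratios over the four cell counts. The sketch below assumes lift is the correlation measure intended in part b); a chi-square test is a common alternative.

```python
# Cell counts from the contingency table.
tea_coffee, tea_no_coffee = 20, 5
no_tea_coffee, no_tea_no_coffee = 10, 15
total = tea_coffee + tea_no_coffee + no_tea_coffee + no_tea_no_coffee

p_coffee = (tea_coffee + no_tea_coffee) / total
p_tea = (tea_coffee + tea_no_coffee) / total
p_both = tea_coffee / total

# Rule "coffee => tea".
support = p_both
confidence = p_both / p_coffee

# Lift: 1 means independent, > 1 positive correlation, < 1 negative correlation.
lift = p_both / (p_coffee * p_tea)

print(f"support={support:.2%}, confidence={confidence:.2%}, lift={lift:.3f}")
```

Here support is 40% and confidence is about 66.7%, so the rule meets both thresholds. The lift is about 1.33, which is above 1 and points to a positive correlation rather than independence.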

 

7)

A database has ten transactions. Let min_sup = 30%.

 

TID     Items Bought
100     {A, B, D, E}
200     {B, C, D}
300     {A, B, D, E}
400     {A, C, D, E}
500     {B, C, D, E}
600     {B, D, E}
700     {C, D}
800     {A, B, C}
900     {A, D, E}
1000    {B, D}

 

a) Apply the Apriori algorithm to the above data set.

b) Show the FP tree that would be made for the data set.
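
For part a), a compact level-wise Apriori can be used to check a hand-worked answer. This is a minimal sketch under the assumption that min_sup = 30% of ten transactions means a support count of at least 3; the function name apriori is illustrative.

```python
from itertools import combinations

transactions = [
    {"A", "B", "D", "E"}, {"B", "C", "D"}, {"A", "B", "D", "E"},
    {"A", "C", "D", "E"}, {"B", "C", "D", "E"}, {"B", "D", "E"},
    {"C", "D"}, {"A", "B", "C"}, {"A", "D", "E"}, {"B", "D"},
]
MIN_COUNT = 3  # assumed: 30% of 10 transactions

def apriori(transactions, min_count):
    """Return {itemset: support count} for every frequent itemset."""
    items = sorted({i for t in transactions for i in t})
    current = [frozenset([i]) for i in items]  # candidate 1-itemsets
    frequent = {}
    k = 1
    while current:
        # Count each candidate and keep those meeting the threshold.
        counts = {c: sum(c <= t for t in transactions) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent

frequent = apriori(transactions, MIN_COUNT)
for itemset, count in sorted(frequent.items(), key=lambda x: (len(x[0]), sorted(x[0]))):
    print(sorted(itemset), count)
```

On this data the search stops at size 3: {A, D, E} and {B, D, E} are the only frequent 3-itemsets, and the single 4-itemset candidate {A, B, D, E} is pruned because {A, B, D} is not frequent.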

 

8)

A survey of college students determined the preference for cell phone providers. The following data were obtained.

                 Provider
Gender    T-Mobile    AT&T    Verizon    Other
Male      12          39      27         16
Female    8           22      24         12

 

Can we conclude that gender and cell phone provider are independent? (Hint: assume a significance level of 0.05.)
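
One way to approach this is a chi-square test of independence. The sketch below computes the statistic by hand and compares it with 7.815, the standard table value for 3 degrees of freedom at the 0.05 level. Treating the question as a chi-square test is an assumption about the intended method.

```python
# Observed counts; columns are T-Mobile, AT&T, Verizon, Other.
observed = {
    "Male": [12, 39, 27, 16],
    "Female": [8, 22, 24, 12],
}

row_totals = {g: sum(counts) for g, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(row_totals.values())

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells,
# where expected = row total * column total / grand total.
chi2 = 0.0
for gender, counts in observed.items():
    for obs, col_total in zip(counts, col_totals):
        expected = row_totals[gender] * col_total / grand_total
        chi2 += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(col_totals) - 1)
CRITICAL_0_05_DF3 = 7.815  # chi-square table value for df = 3, alpha = 0.05

print(f"chi2={chi2:.3f}, df={df}, reject independence: {chi2 > CRITICAL_0_05_DF3}")
```

The statistic comes out near 1.43, well below the critical value, so at the 0.05 level the data give no grounds to reject the hypothesis that gender and provider are independent.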
