Business Intelligence Research Paper

Please refer to the attached document for the actual assignment. I am also attaching the Chapter 5 and 6 PPTs.

Graded Assignment:  Knowledge and Skills Paper

Paper Section 1: Reflection and Literature Review

Using Microsoft Word and professional APA format, prepare a professional written paper, supported by three research sources, based on what you have learned from Chapters 5 and 6. This section of the paper should be a minimum of two pages.

Paper Section 2:  Applied Learning Exercises

In this section of the professional paper, apply what you have learned from Chapters 5 and 6 to descriptively address and answer the problems below. Important Note: Do not type the actual written problems within the paper itself.

1. Examine how new data-capture devices such as radio-frequency identification (RFID) tags help organizations accurately identify and segment their customers for activities such as targeted marketing. Many of these applications involve data mining. Scan the literature and the Web and then propose five potential new data mining applications that can use the data created with RFID technology. What issues could arise if a country’s laws required such devices to be embedded in everyone’s body for a national identification system?

2. Survey and compare some data mining tools and vendors. Start with fairisaac.com and egain.com. Consult dmreview.com and identify some data mining products and service providers that are not mentioned in this chapter. One of my favorites to explore is RapidMiner (https://rapidminer.com/); an educational license option can be found at https://rapidminer.com/educational-program/

3. Explore the Web sites of several neural network vendors, such as California Scientific Software (calsci.com), NeuralWare (neuralware.com), and Ward Systems Group (wardsystems.com), and review some of their products. Download at least two demos and install, run, and compare them.

4. Important Note: With limited time for a college class, perfection is not expected, but making the effort to explore various tools and attempting to learn about them is critical when considering a career in information technology and associated disciplines.

Important Note :  There is no specific page requirement for this section of the paper but make sure any content provided fully addresses each problem.

Paper Section 3:  Conclusions

After addressing the problems, conclude your paper with details on how you will use this knowledge and these skills to support your professional and/or academic goals. This section of the paper should be around one page, including a custom and original process flow or flow diagram that visually represents how you will apply this knowledge going forward. This diagram can be created using the “Smart Art” tools in Microsoft Word.

Paper Section 4:  APA Reference Page

The three or more sources of research used to support this overall paper should be included in proper APA format in the final section of the paper.

Paper Review and Preparation to submit for Grading

Please make sure to proofread your paper prior to submission. This professional paper should be well written and free of grammatical or typographical errors. Also remember not to plagiarize!

Chapter 5:
Data Mining

Business Intelligence and Analytics: Systems for Decision Support
(10th Edition)

Copyright © 2014 Pearson Education, Inc.

Learning Objectives
Define data mining as an enabling technology for business intelligence
Understand the objectives and benefits of business analytics and data mining
Recognize the wide range of applications of data mining
Learn the standardized data mining processes
CRISP-DM
SEMMA
KDD
(Continued…)

Learning Objectives
Understand the steps involved in data preprocessing for data mining
Learn different methods and algorithms of data mining
Build awareness of the existing data mining software tools
Commercial versus free/open source
Understand the pitfalls and myths of data mining

Opening Vignette…
Cabela’s Reels in More Customers with Advanced Analytics and Data Mining
Decision situation
Problem
Proposed solution
Results
Answer & discuss the case questions.

Questions for the
Opening Vignette
Why should retailers, especially omni-channel retailers, pay extra attention to advanced analytics and data mining?
What are the top challenges for multi-channel retailers? Can you think of other industry segments that face similar problems/challenges?
What are the sources of data that retailers such as Cabela’s use for their data mining projects?
What does it mean to have a “single view of the customer”? How can it be accomplished?

Data Mining Concepts/Definitions
Why Data Mining?
More intense competition at the global scale.
Recognition of the value in data sources.
Availability of quality data on customers, vendors, transactions, Web, etc.
Consolidation and integration of data repositories into data warehouses.
The exponential increase in data processing and storage capabilities; and decrease in cost.
Movement toward conversion of information resources into nonphysical form.

Definition of Data Mining
The nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data stored in structured databases (Fayyad et al., 1996).
Keywords in this definition: Process, nontrivial, valid, novel, potentially useful, understandable.
Data mining: a misnomer?
Other names: knowledge extraction, pattern analysis, knowledge discovery, information harvesting, pattern searching, data dredging,…

Data Mining is at the Intersection of Many Disciplines

Data Mining Characteristics/Objectives
Source of data for DM is often a consolidated data warehouse (not always!).
DM environment is usually a client-server or a Web-based information systems architecture.
Data is the most critical ingredient for DM, which may include soft/unstructured data.
The miner is often an end user.
Striking it rich requires creative thinking.
Data mining tools’ capabilities and ease of use are essential (Web, parallel processing, etc.).

Application Case 5.1
Smarter Insurance: Infinity P&C Improves Customer Service and Combats Fraud with Predictive Analytics
Questions For Discussion
How did Infinity P&C improve customer service with data mining?
What were the challenges, the proposed solution, and the obtained results?
What was their implementation strategy? Why is it important to produce results as early as possible in data mining studies?

Data in Data Mining
Data: a collection of facts usually obtained as the result of experiences, observations, or experiments.
Data may consist of numbers, words, images, …
Data: lowest level of abstraction (from which information and knowledge are derived).

What Does DM Do? How Does it Work?
DM extracts patterns from data
Pattern? A mathematical (numeric and/or symbolic) relationship among data items
Types of patterns
Association
Prediction
Cluster (segmentation)
Sequential (or time series) relationships

Application Case 5.2
Harnessing Analytics to Combat Crime: Predictive Analytics Helps Memphis Police Department Pinpoint Crime and Focus Police Resources
Questions For Discussion
How did the Memphis Police Department use data mining to better combat crime?
What were the challenges, the proposed solution, and the obtained results?

A Taxonomy for
Data Mining Tasks

Data Mining Tasks (cont.)
Time-series forecasting
Part of sequence or link analysis?
Visualization
Another data mining task?
Types of DM
Hypothesis-driven data mining
Discovery-driven data mining

Data Mining Applications
Customer Relationship Management
Maximize return on marketing campaigns
Improve customer retention (churn analysis)
Maximize customer value (cross-, up-selling)
Identify and treat most valued customers
Banking & Other Financial
Automate the loan application process
Detecting fraudulent transactions
Maximize customer value (cross-, up-selling)
Optimizing cash reserves with forecasting

Data Mining Applications (cont.)
Retailing and Logistics
Optimize inventory levels at different locations
Improve the store layout and sales promotions
Optimize logistics by predicting seasonal effects
Minimize losses due to limited shelf life
Manufacturing and Maintenance
Predict/prevent machinery failures
Identify anomalies in production systems to optimize the use of manufacturing capacity
Discover novel patterns to improve product quality

Data Mining Applications (cont.)
Brokerage and Securities Trading
Predict changes on certain bond prices
Forecast the direction of stock fluctuations
Assess the effect of events on market movements
Identify and prevent fraudulent activities in trading
Insurance
Forecast claim costs for better business planning
Determine optimal rate plans
Optimize marketing to specific customers
Identify and prevent fraudulent claim activities

Data Mining Applications (cont.)
Computer hardware and software
Science and engineering
Government and defense
Homeland security and law enforcement
Travel industry
Healthcare
Medicine
Entertainment industry
Sports
Etc.
Increasingly more popular application areas for data mining

Application Case 5.3
A Mine on Terrorist Funding
Questions For Discussion
How can data mining be used to fight terrorism? Comment on what else can be done beyond what is covered in this short application case.
Do you think data mining, while essential for fighting terrorist cells, also jeopardizes individuals’ rights of privacy?

Data Mining Process
A manifestation of best practices
A systematic way to conduct DM projects
Different groups have different versions
Most common standard processes:
CRISP-DM (Cross-Industry Standard Process for Data Mining)
SEMMA (Sample, Explore, Modify, Model, and Assess)
KDD (Knowledge Discovery in Databases)

Data Mining Process
Source: KDNuggets.com

Data Mining Process: CRISP-DM

Data Mining Process: CRISP-DM
Step 1: Business Understanding
Step 2: Data Understanding
Step 3: Data Preparation (!)
Step 4: Model Building
Step 5: Testing and Evaluation
Step 6: Deployment
The process is highly repetitive and experimental (DM: art versus science?)
Accounts for ~85% of total project time

Data Preparation – A Critical DM Task

Data Mining Process: SEMMA

Application Case 5.4
Data Mining in Cancer Research
Questions For Discussion
How can data mining be used for ultimately curing illnesses like cancer?
What do you think are the promises and major challenges for data miners in contributing to medical and biological research endeavors?

Data Mining Methods: Classification
Most frequently used DM method
Part of the machine-learning family
Employ supervised learning
Learn from past data, classify new data
The output variable is categorical (nominal or ordinal) in nature
Classification versus regression?
Classification versus clustering?

Assessment Methods for Classification
Predictive accuracy
Hit rate
Speed
Model building; predicting
Robustness
Scalability
Interpretability
Transparency, explainability

Accuracy of Classification Models
In classification problems, the primary source for accuracy estimation is the confusion matrix
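
As a minimal sketch of how accuracy measures follow from a two-class confusion matrix, the function below computes them from the four cell counts. The function name and the example counts are hypothetical and chosen only for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return common accuracy measures from the four confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,          # share of all cases predicted correctly
        "true_positive_rate": tp / (tp + fn),   # also called recall or sensitivity
        "true_negative_rate": tn / (tn + fp),   # also called specificity
        "precision": tp / (tp + fp),            # share of predicted positives that are right
    }


# Hypothetical counts for a two-class problem with 100 test cases.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting several measures side by side is useful because a model can score well on overall accuracy while still missing most cases of a rare class.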

Estimation Methodologies for Classification
Simple split (or holdout or test sample estimation)
Split the data into two mutually exclusive sets: training (~70%) and testing (~30%)

For ANN, the data is split into three sub-sets (training [~60%], validation [~20%], testing [~20%])
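
The simple split can be sketched in a few lines of standard-library Python. The helper name `split_data`, the fixed seed, and the 100 placeholder records are assumptions made for illustration; the percentages are the approximate ones quoted above.

```python
import random


def split_data(records, fractions, seed=0):
    """Shuffle records and cut them into mutually exclusive subsets.

    `fractions` gives the share of every subset except the last one,
    which receives whatever remains.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    subsets, start = [], 0
    for fraction in fractions:
        end = start + round(len(shuffled) * fraction)
        subsets.append(shuffled[start:end])
        start = end
    subsets.append(shuffled[start:])
    return subsets


records = list(range(100))  # stand-in for 100 preprocessed records

# Two-way holdout split: ~70% training, ~30% testing.
train, test = split_data(records, [0.7])

# Three-way split for a neural network: ~60% / ~20% / ~20%.
ann_train, ann_valid, ann_test = split_data(records, [0.6, 0.2])
```

Shuffling before cutting matters when the records are stored in some meaningful order (by date or by class, for example); otherwise the test set may not resemble the training set.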

Estimation Methodologies for Classification
k-Fold Cross Validation (rotation estimation)
Split the data into k mutually exclusive subsets
Use each subset as testing while using the rest of the subsets as training
Repeat the experimentation for k times
Aggregate the test results for a true estimation of prediction accuracy
Other estimation methodologies
Leave-one-out, bootstrapping, jackknifing
Area under the ROC curve
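
The k-fold procedure described above can be sketched as follows. To keep the example self-contained, the "model" is only a majority-class guess, which is an assumption for illustration and not a method from the chapter; any real classifier could take its place.

```python
from collections import Counter


def k_fold_indices(n_records, k):
    """Yield (train_indices, test_indices) pairs for k mutually exclusive folds."""
    folds = [list(range(i, n_records, k)) for i in range(k)]
    for test_no, test_idx in enumerate(folds):
        train_idx = [i for fold_no, fold in enumerate(folds)
                     if fold_no != test_no for i in fold]
        yield train_idx, test_idx


def cross_validate(labels, k):
    """Average test accuracy of a majority-class 'model' over k folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        # "Training": remember the most common label in the training folds.
        majority = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
        # "Testing": fraction of held-out labels the majority guess gets right.
        hits = sum(labels[i] == majority for i in test_idx)
        scores.append(hits / len(test_idx))
    # Aggregate the k test results into one accuracy estimate.
    return sum(scores) / len(scores)


labels = ["yes"] * 8 + ["no"] * 2  # hypothetical class labels
estimate = cross_validate(labels, k=5)
print(estimate)
```

Every record is used for testing exactly once, which is why the aggregated figure tends to be a steadier estimate than a single holdout split.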

Estimation Methodologies for Classification – ROC Curve

Classification Techniques
Decision tree analysis
Statistical analysis
Neural networks
Support vector machines
Case-based reasoning
Bayesian classifiers
Genetic algorithms
Rough sets

Decision Trees
Employs the divide-and-conquer method
Recursively divides a training set until each division consists of examples from one class
A general algorithm for decision tree building
1. Create a root node and assign all of the training data to it.
2. Select the best splitting attribute.
3. Add a branch to the root node for each value of the split. Split the data into mutually exclusive subsets along the lines of the specific split.
4. Repeat steps 2 and 3 for each and every leaf node until the stopping criteria are reached.

Decision Trees
DT algorithms mainly differ on
Splitting criteria
Which variable, what value, etc.
Stopping criteria
When to stop building the tree
Pruning (generalization method)
Pre-pruning versus post-pruning
Most popular DT algorithms include
ID3, C4.5, C5; CART; CHAID; M5

Decision Trees
Alternative splitting criteria
Gini index determines the purity of a specific class as a result of a decision to branch along a particular attribute/value
Used in CART
Information gain uses entropy to measure the extent of uncertainty or randomness of a particular attribute/value split
Used in ID3, C4.5, C5
Chi-square statistics (used in CHAID)
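
To make the first two splitting criteria concrete, here is a small sketch using the standard textbook formulas: Gini impurity as one minus the sum of squared class proportions, and information gain as the drop in entropy after a split. The function names and the toy labels are hypothetical.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 0 for a pure group, larger for more mixed groups."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Entropy in bits: 0 for a pure group, 1 for a 50/50 two-class group."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


def weighted_impurity(groups, impurity):
    """Size-weighted average impurity of the groups produced by a split."""
    total = sum(len(group) for group in groups)
    return sum(len(group) / total * impurity(group) for group in groups)


def information_gain(parent, groups):
    """Reduction in entropy achieved by splitting parent into groups."""
    return entropy(parent) - weighted_impurity(groups, entropy)


# Hypothetical node with 5 "yes" and 5 "no" cases, split on some attribute.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"]
right = ["yes"] + ["no"] * 4

print(gini(parent), weighted_impurity([left, right], gini))
print(information_gain(parent, [left, right]))
```

A tree builder would evaluate every candidate split this way and keep the one with the lowest weighted Gini value or the highest information gain.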

Application Case 5.5
2degrees Gets a 1275 Percent Boost in Churn Identification
Questions For Discussion
What does 2degrees do? Why is it important for 2degrees to accurately identify churn?
What were the challenges, the proposed solution, and the obtained results?
How can data mining help in identifying customer churn? How do some companies do it without using data mining tools and techniques?
Why is it important for Delta Lloyd Group to comply with industry regulations?

Cluster Analysis for Data Mining
Used for automatic identification of natural groupings of things
Part of the machine-learning family
Employ unsupervised learning
Learns the clusters of things from past data, then assigns new instances
There is not an output variable
Also known as segmentation

Cluster Analysis for Data Mining
Clustering results may be used to
Identify natural groupings of customers
Identify rules for assigning new cases to classes for targeting/diagnostic purposes
Provide characterization, definition, labeling of populations
Decrease the size and complexity of problems for other data mining methods
Identify outliers in a specific domain (e.g., rare-event detection)

Cluster Analysis for Data Mining
Analysis methods
Statistical methods (including both hierarchical and nonhierarchical), such as k-means, k-modes, and so on.
Neural networks (adaptive resonance theory [ART], self-organizing map [SOM])
Fuzzy logic (e.g., fuzzy c-means algorithm)
Genetic algorithms

Cluster Analysis for Data Mining
How many clusters?
There is not a “truly optimal” way to calculate it
Heuristics are often used
Most cluster analysis methods involve the use of a distance measure to calculate the closeness between pairs of items.
Euclidean versus Manhattan/Rectilinear distance
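
The two distance measures differ only in how coordinate differences are combined, as this short sketch shows. The function names and the example points are chosen for illustration.

```python
import math


def euclidean(a, b):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Rectilinear distance: sum of the absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


p, q = (0, 0), (3, 4)
print(euclidean(p, q))  # 5.0
print(manhattan(p, q))  # 7
```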

Cluster Analysis for Data Mining
k-Means Clustering Algorithm
k : pre-determined number of clusters
Algorithm (Step 0: determine value of k)
Step 1: Randomly generate k points as initial cluster centers.
Step 2: Assign each point to the nearest cluster center.
Step 3: Re-compute the new cluster centers.
Repetition step: Repeat steps 2 and 3 until some convergence criterion is met (usually that the assignment of points to clusters becomes stable).
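
The steps above can be sketched in plain Python. Two choices here are assumptions for illustration: the initial centers are drawn from the data points themselves (one common way to "randomly generate" them), and closeness is measured with squared Euclidean distance.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iter=100, seed=0):
    """Return (centers, assignment) where assignment[i] is the cluster of points[i]."""
    rng = random.Random(seed)
    # Step 1: pick k initial centers (assumption: sampled from the data points).
    centers = rng.sample(points, k)
    assignment = None
    for _ in range(max_iter):
        # Step 2: assign each point to the nearest cluster center.
        new_assignment = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Convergence: stop once the assignment no longer changes.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 3: re-compute each center as the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster ends up empty
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centers, assignment


# Hypothetical two-dimensional data with two visibly separate groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, assignment = k_means(points, k=2)
print(centers)
print(assignment)
```

Because the starting centers are random, different seeds can lead to different results on less clearly separated data, which is one reason the algorithm is often run several times.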

Cluster Analysis for Data Mining –
k-Means Clustering Algorithm

Association Rule Mining
A very popular DM method in business
Finds interesting relationships (affinities) between variables (items or events)
Part of machine learning family
Employs unsupervised learning
There is no output variable
Also known as market basket analysis
Often used as an example to describe DM to ordinary people, such as the famous “relationship between diapers and beers!”

Association Rule Mining
Input: the simple point-of-sale transaction data
Output: Most frequent affinities among items
Example: according to the transaction data…
“Customers who bought a laptop computer and virus protection software also bought an extended service plan 70 percent of the time.”
How do you use such a pattern/knowledge?
Put the items next to each other
Promote the items as a package
Place items far apart from each other!

Association Rule Mining
Representative applications of association rule mining include
In business: cross-marketing, cross-selling, store design, catalog design, e-commerce site design, optimization of online advertising, product pricing, and sales/promotion configuration
In medicine: relationships between symptoms and illnesses; diagnosis and patient characteristics and treatments (to be used in medical DSS); and genes and their functions (to be used in genomics projects)

Association Rule Mining
Are all association rules interesting and useful?
A Generic Rule: X ⇒ Y [S%, C%]
X, Y: products and/or services
X: Left-hand-side (LHS)
Y: Right-hand-side (RHS)
S: Support: how often X and Y go together
C: Confidence: how often Y goes together with X
Example: {Laptop Computer, Antivirus Software} ⇒ {Extended Service Plan} [30%, 70%]
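
Support and confidence can be computed directly from transaction data, as in the sketch below. The ten transactions are invented for illustration, so the resulting figures (30% support, 75% confidence) are close to, but not the same as, those in the example rule.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, lhs, rhs):
    """Of the transactions containing lhs, the fraction that also contain rhs."""
    both = set(lhs) | set(rhs)
    return support(transactions, both) / support(transactions, lhs)


# Ten hypothetical point-of-sale transactions.
transactions = (
    [{"laptop", "antivirus", "service plan"}] * 3
    + [{"laptop", "antivirus"}]
    + [{"laptop"}] * 2
    + [{"printer"}] * 4
)

lhs = {"laptop", "antivirus"}
rhs = {"service plan"}
rule_support = support(transactions, lhs | rhs)
rule_confidence = confidence(transactions, lhs, rhs)
print(f"support = {rule_support:.0%}, confidence = {rule_confidence:.0%}")
```

A rule is usually kept only when both numbers clear user-chosen minimum thresholds, which is one way to filter out rules that are not interesting or useful.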

Association Rule Mining
Algorithms are available for generating association rules
Apriori
Eclat
FP-Growth
+ Derivatives and hybrids of the three
The algorithms help identify the frequent itemsets, which are then converted to association rules

Association Rule Mining
Apriori Algorithm
Finds subsets that are common to at least a minimum number of the itemsets
Uses a bottom-up approach
frequent subsets are extended one item at a time (the size of frequent subsets increases from one-item subsets to two-item subsets, then three-item subsets, and so on), and
groups of candidates at each level are tested against the data for minimum support.
(see the figure)
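
The bottom-up search can be sketched as a short level-wise loop. The six transactions below are the SKU lists from the chapter's Apriori figure; the function name and the minimum support count of 3 are assumptions made for this illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for all itemsets meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]  # one-item candidates
    frequent = {}
    size = 1
    while candidates:
        # Test the candidates at this level against the data.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Extend the frequent itemsets by one item to form the next level.
        size += 1
        joined = {a | b for a, b in combinations(level, 2) if len(a | b) == size}
        # Keep a candidate only if all of its smaller subsets were frequent.
        candidates = [
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        ]
    return frequent


transactions = [
    {1, 2, 3, 4}, {2, 3, 4}, {2, 3}, {1, 2, 4}, {1, 2, 3, 4}, {2, 4},
]
frequent = apriori(transactions, min_support=3)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The pruning step is what keeps the search manageable: an itemset cannot be frequent if any of its subsets is infrequent, so most large combinations are never counted.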

Association Rule Mining
Apriori Algorithm

Data Mining
Software
Commercial
IBM SPSS Modeler (formerly Clementine)
SAS – Enterprise Miner
IBM – Intelligent Miner
StatSoft – Statistica Data Miner
… many more
Free and/or Open Source
R
RapidMiner
Weka…
Source: KDNuggets.com

Big Data Software Tools
and Platforms

Application Case 5.6
Data Mining Goes to Hollywood: Predicting Financial Success of Movies
Questions For Discussion
Decision situation
Problem
Proposed solution
Results
Answer & discuss the case questions.

Application Case 5.6
Data Mining Goes to Hollywood!

Dependent Variable
Independent Variables
A Typical Classification Problem

Application Case 5.6
Data Mining Goes to Hollywood!
The DM Process Map in IBM SPSS Modeler

Application Case 5.6
Data Mining Goes to Hollywood!

Application Case 5.7
Data Mining & Privacy Issues
Predicting Customer Buying Patterns—
The Target Story
Questions For Discussion
What do you think about data mining and its implication for privacy? What is the threshold between discovery of knowledge and infringement of privacy?
Did Target go too far? Did it do anything illegal? What do you think Target should have done? What do you think Target should do next (quit these types of practices)?

Data Mining Myths
Data mining …
provides instant solutions/predictions
is not yet viable for business applications
requires a separate, dedicated database
can only be done by those with advanced degrees
is only for large firms that have lots of customer data
is another name for the good-old statistics

Common Data Mining Blunders
Selecting the wrong problem for data mining
Ignoring what your sponsor thinks data mining is and what it really can/cannot do
Not leaving sufficient time for data acquisition, selection, and preparation
Looking only at aggregated results and not at individual records/predictions
Being sloppy about keeping track of the data mining procedure and results
…more in the book

End of the Chapter

Questions, comments

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America.

[Figure: Data mining at the intersection of many disciplines: Statistics; Management Science & Information Systems; Artificial Intelligence; Databases; Pattern Recognition; Machine Learning; Mathematical Modeling]

[Figure: A taxonomy of data in data mining]
Data
  Structured
    Categorical: Nominal, Ordinal
    Numerical: Interval, Ratio
  Unstructured or Semi-Structured: Multimedia, Textual, HTML/XML

[Figure: A taxonomy for data mining tasks]
Data Mining Task       Learning Method   Popular Algorithms
Prediction             Supervised        Classification and Regression Trees, ANN, SVM, Genetic Algorithms
  Classification       Supervised        Decision trees, ANN/MLP, SVM, Rough sets, Genetic Algorithms
  Regression           Supervised        Linear/Nonlinear Regression, Regression trees, ANN/MLP, SVM
Association            Unsupervised      Apriori, OneR, ZeroR, Eclat
  Link analysis        Unsupervised      Expectation Maximization, Apriori Algorithm, Graph-based Matching
  Sequence analysis    Unsupervised      Apriori Algorithm, FP-Growth technique
Clustering             Unsupervised      K-means, ANN/SOM
  Outlier analysis     Unsupervised      K-means, Expectation Maximization (EM)

[Figure: The CRISP-DM cycle around the data sources: 1 Business Understanding, 2 Data Understanding, 3 Data Preparation, 4 Model Building, 5 Testing and Evaluation, 6 Deployment]

[Figure: Data preparation steps, from real-world data to well-formed data]
Data Consolidation: collect data; select data; integrate data
Data Cleaning: impute missing values; reduce noise in data; eliminate inconsistencies
Data Transformation: normalize data; discretize/aggregate data; construct new attributes
Data Reduction: reduce number of variables; reduce number of cases; balance skewed data

[Figure: The SEMMA process]
Sample (generate a representative sample of the data)
Explore (visualization and basic description of the data)
Modify (select variables, transform variable representations)
Model (use a variety of statistical and machine learning models)
Assess (evaluate the accuracy and usefulness of the models)

$$\text{True Positive Rate} = \frac{TP}{TP + FN}$$

$$\text{True Negative Rate} = \frac{TN}{TN + FP}$$

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\text{Precision} = \frac{TP}{TP + FP}$$

$$\text{Recall} = \frac{TP}{TP + FN}$$

[Figure: Confusion matrix for a two-class problem]
                       True Class: Positive         True Class: Negative
Predicted: Positive    True Positive Count (TP)     False Positive Count (FP)
Predicted: Negative    False Negative Count (FN)    True Negative Count (TN)

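[Figure: Simple split estimation. Preprocessed data is divided into training data (2/3), used for model development to produce the classifier, and testing data (1/3), used for model assessment (scoring) to estimate prediction accuracy]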

[Figure: ROC curves for classifiers A, B, and C. x-axis: False Positive Rate (1 – Specificity); y-axis: True Positive Rate (Sensitivity)]

[Figure: Identification of frequent itemsets in the Apriori algorithm]
Raw transaction data, SKUs (item numbers) per transaction: {1, 2, 3, 4}; {2, 3, 4}; {2, 3}; {1, 2, 4}; {1, 2, 3, 4}; {2, 4}
Step 1, one-item itemsets (support): {1} (3); {2} (6); {3} (4); {4} (5)
Step 2, two-item itemsets (support): {1, 2} (3); {1, 3} (2); {1, 4} (3); {2, 3} (4); {2, 4} (5); {3, 4} (3)
Step 3, three-item itemsets (support): {1, 2, 4} (3); {2, 3, 4} (3)

[Figure: Data mining software tools, with counts as labeled in the chart (Source: KDNuggets.com)]
Predixion Software (3); WordStat (3); 11 Ants Analytics (4); Teradata Miner (4); RapidInsight/Veera (5); Angoss (7); SAP (BusinessObjects/Sybase/Hana) (7); XLSTAT (7); Salford SPM/CART/MARS/TreeNet/RF (9); Revolution Computing (11); C4.5/C5.0/See5 (13); Bayesia (14); KXEN (14); Zementis (14); Stata (15); IBM Cognos (16); Miner3D (19); Mathematica (23); JMP (32); Other commercial software (32); Oracle Data Miner (35); Tableau (35); TIBCO Spotfire / S+ / Miner (37); Other free software (39); Microsoft SQL Server (40); Orange (42); SAS Enterprise Miner (46); IBM SPSS Modeler (54); IBM SPSS Statistics (62); MATLAB (80); Rapid-I RapidAnalytics (83); SAS (101); StatSoft Statistica (112); Weka / Pentaho (118); KNIME (174); Rapid-I RapidMiner (213); Excel (238); R (245)

[Figure: Big Data software tools and platforms, with counts as labeled in the chart]
Other Hadoop-based tools (10); Other Big Data software (21); NoSQL databases (33); Amazon Web Services (AWS) (36); Apache Hadoop/Hbase/Pig/Hive (67)

[Figure: Languages, with counts as labeled in the chart]
F# (5); Awk/Gawk/Shell (31); Perl (37); Other languages (57); C/C++ (66); Python (119); Java (138); SQL (185); R (245)

Independent Variable   Number of Values   Possible Values
MPAA Rating            5                  G, PG, PG-13, R, NR
Competition            3                  High, Medium, Low
Star value             3                  High, Medium, Low
Genre                  10                 Sci-Fi, Historic Epic Drama, Modern Drama, Politically Related, Thriller, Horror, Comedy, Cartoon, Action, Documentary
Special effects        3                  High, Medium, Low
Sequel                 1                  Yes, No
Number of screens      1                  Positive integer

Class No.   Range (in $ millions)
1           < 1 (Flop)
2           > 1 and < 10
3           > 10 and < 20
4           > 20 and < 40
5           > 40 and < 65
6           > 65 and < 100
7           > 100 and < 150
8           > 150 and < 200
9           > 200 (Blockbuster)

[Figure: The DM process map in IBM SPSS Modeler, showing the model development process and the model assessment process]

Prediction Models
                         Individual Models               Ensemble Models
Performance Measure      SVM       ANN       C&RT        Random Forest   Boosted Tree   Fusion (Average)
Count (Bingo)            192       182       140         189             187            194
Count (1-Away)           104       120       126         121             104            120
Accuracy (% Bingo)       55.49%    52.60%    40.46%      54.62%          54.05%         56.07%
Accuracy (% 1-Away)      85.55%    87.28%    76.88%      89.60%          84.10%         90.75%
Standard deviation       0.93      0.87      1.05        0.76            0.84           0.63
* Training set: 1998-2005 movies; Test set: 2006 movies

Chapter 6:

Techniques for Predictive Modeling

Business Intelligence and Analytics: Systems for Decision Support

(10th Edition)

Learning Objectives
Understand the concept and definitions of artificial neural networks (ANN)
Learn the different types of ANN architectures
Know how learning happens in ANN
Become familiar with ANN applications
Understand the sensitivity analysis in ANN
Understand the concept and structure of support vector machines (SVM)
(Continued…)

Learning Objectives
Learn the advantages and disadvantages of SVM compared to ANN
Understand the concept and formulation of k-nearest neighbor algorithm (kNN)
Learn the process of applying kNN
Learn the advantages and disadvantages of kNN compared to ANN and SVM

Opening Vignette…
Predictive Modeling Helps Better Understand and Manage Complex Medical Procedures
Situation
Problem
Solution
Results
Answer & discuss the case questions.

Questions for the Opening Vignette
Why is it important to study medical procedures? What is the value in predicting outcomes?
What factors do you think are the most important in better understanding and managing healthcare?
What would be the impact of predictive modeling on healthcare and medicine? Can predictive modeling replace medical or managerial personnel?
What were the outcomes of the study? Who can use these results? How can they be implemented?
Search the Internet to locate two additional cases in managing complex medical procedures.

Opening Vignette –
A Process Map for Training and Testing Four Predictive Models

Opening Vignette
The Comparison of Four Models

Neural Network Concepts
Neural networks (NN): a brain metaphor for information processing
Neural computing
Artificial neural network (ANN)
Many uses for ANN:
pattern recognition, forecasting, prediction, and classification
Many application areas
finance, marketing, manufacturing, operations, information systems, and so on

Biological Neural Networks
Two interconnected brain cells (neurons)

Processing Information in ANN
A single neuron (processing element – PE) with inputs and outputs

Biology Analogy

Application Case 6.1
Neural Networks Are Helping to Save Lives in the Mining Industry

Questions for
Discussion
How did neural networks help save lives in the mining industry?
What were the challenges, the proposed solution, and the obtained results?

Elements of ANN
Processing element (PE)
Network architecture
Hidden layers
Parallel processing
Network information processing
Inputs
Outputs
Connection weights
Summation function

Elements of ANN
Neural Network with One Hidden Layer

Elements of ANN
Summation Function for a Single Neuron (a), and
Several Neurons (b)

Elements of ANN
Transformation (Transfer) Function
Linear function
Sigmoid (logistic activation) function, output range (0, 1)
Hyperbolic tangent function, output range (-1, 1)
Threshold value?
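
As a minimal sketch of the summation and transfer functions listed above, the Python below reuses the values from the chapter's worked figure (inputs 3, 1, 2 and weights 0.2, 0.4, 0.1). The function names are illustrative and do not come from the slides.

```python
import math

def weighted_sum(inputs, weights):
    """Summation function: S = sum of x_i * w_i over all inputs."""
    return sum(x * w for x, w in zip(inputs, weights))

def sigmoid(s):
    """Sigmoid transfer function; output lies between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-s))

def tanh_transfer(s):
    """Hyperbolic tangent transfer function; output lies between -1 and 1."""
    return math.tanh(s)

inputs = [3, 1, 2]
weights = [0.2, 0.4, 0.1]

s = weighted_sum(inputs, weights)   # 3(0.2) + 1(0.4) + 2(0.1)
y_t = sigmoid(s)

print(round(s, 2), round(y_t, 2))   # 1.2 0.77
```

Swapping `sigmoid` for `tanh_transfer` changes only the output range, which is why the choice of transfer function is usually tied to how the target variable is scaled.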

Neural Network Architectures
Architecture of a neural network is driven by the task it is intended to address
Classification, regression, clustering, general optimization, association, ...
Most popular architecture: feedforward multilayer perceptron (MLP) with the backpropagation learning algorithm
Used for both classification and regression type problems
Others – Recurrent, self-organizing feature maps, Hopfield networks, …

Neural Network Architectures
Feed-Forward Neural Networks
Feed-forward MLP with 1 Hidden Layer

Neural Network Architectures
Recurrent Neural Networks

Other Popular ANN Paradigms
Self-Organizing Maps (SOM)
First introduced by the Finnish Professor Teuvo Kohonen
Applies to clustering type problems

Other Popular ANN Paradigms
Hopfield Networks
First introduced by John Hopfield
Highly interconnected neurons
Applies to solving complex computational problems (e.g., optimization problems)

Application Case 6.2
Predictive Modeling is Powering the Power Generators
Questions for Discussion
What are the key environmental concerns in the electric power industry?
What are the main application areas for predictive modeling in the electric power industry?
How was predictive modeling used to address a variety of problems in the electric power industry?

Development Process of an ANN

An MLP ANN Structure for the Box-Office Prediction Problem
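
The figure for this slide labels seven input variables with their number of values and nine output classes defined by box-office (BO) ranges. The sketch below checks that those counts add up to the 27 input PEs shown in the figure, and maps a receipts figure to its class. It assumes one input PE per category value plus one for the numeric variable, and it assumes a value exactly on a boundary belongs to the higher class; the figure states neither.

```python
import bisect

# Number of values per categorical input, as labeled in the figure.
categorical_inputs = {
    "MPAA Rating": 5,
    "Competition": 3,
    "Star Value": 3,
    "Genre": 10,
    "Technical Effects": 3,
    "Sequel": 2,
}
numeric_inputs = ["Number of Screens"]

# Assumption: one PE per category value, plus one PE per numeric input.
input_pes = sum(categorical_inputs.values()) + len(numeric_inputs)

# Class boundaries in millions of dollars, taken from the class labels.
CLASS_BOUNDARIES = [1, 10, 20, 40, 65, 100, 150, 200]

def box_office_class(receipts_millions):
    """Return class 1 (flop) through class 9 (blockbuster)."""
    return bisect.bisect_right(CLASS_BOUNDARIES, receipts_millions) + 1

print(input_pes)                # 27
print(box_office_class(0.5))    # 1
print(box_office_class(50))     # 5
print(box_office_class(250))    # 9
```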

Testing a Trained ANN Model
Data is split into three parts
Training (~60%)
Validation (~20%)
Testing (~20%)
k-fold cross validation
Less bias
Time consuming
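
The two testing schemes above can be sketched in a few lines of Python. This is an illustration only: the helper names, the fixed seed, and the 100-record toy dataset are assumptions, and the 60/20/20 proportions follow the approximate figures on the slide.

```python
import random

def three_way_split(records, train=0.6, valid=0.2, seed=0):
    """Shuffle, then cut into training, validation, and testing parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_valid = round(len(shuffled) * valid)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])

def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs; every record is tested exactly once."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train_idx, test_idx

data = list(range(100))
train_set, valid_set, test_set = three_way_split(data)
print(len(train_set), len(valid_set), len(test_set))

for train_idx, test_idx in k_fold_indices(len(data), k=5):
    print(len(train_idx), len(test_idx))
```

With k folds the model is trained k times, which is where the extra time cost noted on the slide comes from.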

ANN Learning Process
A Supervised Learning Process
Three-step process:
1. Compute temporary outputs.
2. Compare outputs with desired targets.
3. Adjust the weights and repeat the process.

Backpropagation Learning
Backpropagation of Error for a Single Neuron

Backpropagation Learning
The learning algorithm procedure
1. Initialize weights with random values and set other network parameters.
2. Read in the inputs and the desired outputs.
3. Compute the actual output (by working forward through the layers).
4. Compute the error (difference between the actual and desired output).
5. Change the weights by working backward through the hidden layers.
6. Repeat steps 2-5 until weights stabilize.
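
The procedure above can be sketched for the simplest case, a single sigmoid neuron with no hidden layer, where the backward pass reduces to the delta rule. This is a simplified illustration and not the full multilayer algorithm; the learning rate, tolerance, seed, and the logical-OR training data are all assumptions chosen for the example.

```python
import math
import random

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def predict(weights, inputs):
    """Forward pass; the last weight acts as a bias on a constant input of 1."""
    x = list(inputs) + [1.0]
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))

def train_neuron(samples, learning_rate=0.5, max_epochs=5000, tol=1e-4, seed=0):
    rng = random.Random(seed)
    n_inputs = len(samples[0][0])
    # Step 1: initialize weights with small random values.
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
    for _ in range(max_epochs):
        largest_change = 0.0
        for inputs, target in samples:            # Step 2: inputs and desired output
            x = list(inputs) + [1.0]
            y = predict(weights, inputs)          # Step 3: actual output
            error = target - y                    # Step 4: error
            delta = error * y * (1.0 - y)         # error scaled by the sigmoid slope
            for i, xi in enumerate(x):            # Step 5: adjust the weights
                change = learning_rate * delta * xi
                weights[i] += change
                largest_change = max(largest_change, abs(change))
        if largest_change < tol:                  # Step 6: stop once weights stabilize
            break
    return weights

or_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights = train_neuron(or_samples)
for inputs, target in or_samples:
    print(inputs, target, round(predict(weights, inputs), 2))
```

In a multilayer network the same error signal is passed backward through each hidden layer, so every layer's weights receive an update of this form.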

Illuminating the Black Box: Sensitivity Analysis on ANN
A common criticism of ANN: the lack of transparency/explainability
The black-box syndrome!
Answer: sensitivity analysis
Conducted on a trained ANN
The inputs are perturbed while the relative change on the output is measured/recorded
Results illustrate the relative importance of input variables
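
A minimal sketch of this idea: perturb one input at a time, record the change in the output, and normalize the changes into relative importance scores. The `model` function is a hypothetical stand-in for a trained ANN (a single sigmoid neuron with fixed weights), and the perturbation step of 0.1 is an arbitrary choice.

```python
import math

def model(inputs):
    """Hypothetical stand-in for a trained ANN, treated as a black box."""
    weights = [0.2, 0.4, 0.1]
    s = sum(x * w for x, w in zip(inputs, weights))
    return 1.0 / (1.0 + math.exp(-s))

def sensitivity(predict, baseline, step=0.1):
    """Perturb each input in turn and return normalized output changes."""
    base_output = predict(baseline)
    changes = []
    for i in range(len(baseline)):
        perturbed = list(baseline)
        perturbed[i] += step
        changes.append(abs(predict(perturbed) - base_output))
    total = sum(changes)
    if total == 0:
        return [0.0 for _ in changes]
    return [c / total for c in changes]

relative_importance = sensitivity(model, [3, 1, 2])
for i, score in enumerate(relative_importance, start=1):
    print(f"input {i}: {score:.2f}")
```

The analysis only calls `predict`, so it never inspects the weights. That is what lets it work on a model that is otherwise opaque.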

Sensitivity Analysis on ANN Models
For a good example, see Application Case 6.3
Sensitivity analysis reveals the most important injury severity factors in traffic accidents

Application Case 6.3
Sensitivity Analysis Reveals Injury Severity Factors in Traffic Accidents
Questions for Discussion
How does sensitivity analysis shed light on the black box (i.e., neural networks)?
Why would someone choose to use a black-box tool like neural networks over theoretically sound, mostly transparent statistical tools like logistic regression?
In this case, how did NNs and sensitivity analysis help identify injury-severity factors in traffic accidents?

Support Vector Machines (SVM)
SVM are among the most popular machine-learning techniques.
SVM belong to the family of generalized linear models (capable of representing non-linear relationships in a linear fashion).
SVM achieve a classification or regression decision based on the value of the linear combination of input features.
Because of their architectural similarities, SVM are also closely associated with ANN.
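
To illustrate a decision "based on the value of the linear combination of input features", the sketch below computes that value and uses its sign as the class. The weights and bias are hypothetical numbers, not the result of training an SVM.

```python
def decision_value(weights, bias, features):
    """Linear combination of the input features plus a bias term."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def classify(weights, bias, features):
    """Return +1 or -1 depending on which side of the hyperplane the point lies."""
    return 1 if decision_value(weights, bias, features) >= 0 else -1

# Hypothetical coefficients, chosen only for illustration.
weights, bias = [1.0, -2.0], 0.5

print(decision_value(weights, bias, [3.0, 1.0]), classify(weights, bias, [3.0, 1.0]))  # 1.5 1
print(decision_value(weights, bias, [0.0, 1.0]), classify(weights, bias, [0.0, 1.0]))  # -1.5 -1
```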

Support Vector Machines (SVM)
Goal of SVM: to generate mathematical functions that map input variables to desired outputs for classification or regression type prediction problems.
First, SVM uses nonlinear kernel functions to transform non-linear relationships among the variables into linearly separable feature spaces.
Then, the maximum-margin hyperplanes are constructed to optimally separate different classes from each other based on the training dataset.
SVM has a solid mathematical foundation!

Support Vector Machines (SVM)
A hyperplane is a geometric concept used to describe the separation surface between different classes of things.
In SVM, two parallel hyperplanes are constructed, one on each side of the separating hyperplane, with the aim of maximizing the distance between them (the margin).
A kernel function in SVM uses the kernel trick (a method for using a linear classifier algorithm to solve a nonlinear problem).
The most commonly used kernel function is the radial basis function (RBF).
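
Two quantities from this slide can be written down directly. The sketch below uses the standard textbook forms, which the slides do not spell out: the RBF kernel as exp(-gamma * squared distance), and the margin width as 2 / ||w|| for a hyperplane scaled so that the two parallel hyperplanes sit at w·x + b = +1 and -1. The `gamma` default is an arbitrary illustrative value.

```python
import math

def rbf_kernel(x, z, gamma=0.5):
    """Radial basis function kernel: similarity decays with squared distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

def margin_width(w):
    """Distance between the hyperplanes w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(sum(c * c for c in w))

print(rbf_kernel((1, 2), (1, 2)))            # identical points give 1.0
print(round(rbf_kernel((0, 0), (1, 1)), 4))  # exp(-1), about 0.3679
print(margin_width((3, 4)))                  # 2 / 5 = 0.4
```

A smaller weight vector gives a wider margin, which is why maximizing the margin is usually posed as minimizing ||w||.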

Support Vector Machines (SVM)
Many linear classifiers (hyperplanes) may separate the data

Application Case 6.4
Managing Student Retention with Predictive Modeling
Questions for Discussion
Why is attrition one of the most important issues in higher education?
How can predictive analytics (ANN, SVM, and so forth) be used to better manage student retention?
What are the main challenges and potential solutions to the use of analytics in retention management?

Application Case 6.4
Managing Student Retention with Predictive Modeling

How Does an SVM Work?
Following a machine-learning process, an SVM learns from the historic cases.
The Process of Building SVM
1. Preprocess the data
Scrub and transform the data.
2. Develop the model.
Select the kernel type (RBF is often a natural choice).
Determine the kernel parameters for the selected kernel type.
If the results are satisfactory, finalize the model; otherwise change the kernel type and/or kernel parameters to achieve the desired accuracy level.
3. Extract and deploy the model.
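
Step 1 of the process, transforming the data, typically means turning nominal values into numbers and putting numeric variables on a comparable scale before a kernel is applied. The sketch below shows three common transforms using only the standard library. The helper names and sample values are assumptions for illustration, and the slides do not prescribe these exact formulas.

```python
import statistics

def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]

def one_hot(value, categories):
    """Turn a nominal value into a list of 0/1 indicator variables."""
    return [1 if value == c else 0 for c in categories]

print(min_max_normalize([10, 20, 30]))               # [0.0, 0.5, 1.0]
print([round(z, 3) for z in standardize([10, 20, 30])])
print(one_hot("Medium", ["High", "Medium", "Low"]))  # [0, 1, 0]
```

Scaling matters for distance-based kernels such as RBF, because an unscaled variable with a large range would otherwise dominate the distance.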

The Process of Building an SVM

SVM Applications
SVMs are the most widely used kernel-learning algorithms for a wide range of classification and regression problems.
SVMs represent the state of the art by virtue of their excellent generalization performance, superior prediction power, ease of use, and rigorous theoretical foundation.
Most comparative studies show their superiority in both regression and classification type prediction problems.
SVM versus ANN?

k-Nearest Neighbor Method (k-NN)
ANNs and SVMs: time-demanding, computationally intensive iterative derivations
k-NN is a simple, intuitive prediction method that produces very competitive results
k-NN is a prediction method for classification as well as regression type problems (similar to ANN and SVM)
k-NN is a type of instance-based learning (or lazy learning) – most of the work takes place at the time of prediction (not at modeling)
k : the number of neighbors used

k-Nearest Neighbor Method (k-NN)
The answer depends on the value of k
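
A small k-NN classifier makes this dependence on k concrete. In the sketch below, the training points and class labels are invented so that the three nearest neighbors favor one class while the five nearest favor the other. Euclidean distance and majority voting are assumed as the similarity measure and decision rule.

```python
import math
from collections import Counter

def knn_classify(training, query, k):
    """Label the query by majority vote among its k nearest training cases."""
    nearest = sorted(training, key=lambda rec: math.dist(rec[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Invented training cases: (features, class label).
training = [
    ((1.0, 0.0), "square"),
    ((0.0, 1.2), "square"),
    ((-1.5, 0.0), "circle"),
    ((0.0, -2.0), "circle"),
    ((2.5, 0.0), "circle"),
    ((4.0, 4.0), "square"),
]
query = (0.0, 0.0)

print(knn_classify(training, query, k=3))   # square
print(knn_classify(training, query, k=5))   # circle
```

All the work happens inside `knn_classify` at prediction time. There is no training step, which is what the "lazy learning" label refers to.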

The Process of k-NN Method

k-NN Model Parameter
Similarity Measure: The Distance Metric
Numeric versus nominal values?
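
The slide's own formula did not survive extraction, so the sketch below shows common choices without claiming they match it. For numeric values it uses the Minkowski distance, which gives Manhattan distance at q = 1 and Euclidean distance at q = 2. For nominal values it uses a simple mismatch count. The function names and sample values are illustrative.

```python
def minkowski(a, b, q=2):
    """Minkowski distance between two numeric vectors."""
    return sum(abs(x - y) ** q for x, y in zip(a, b)) ** (1.0 / q)

def nominal_distance(a, b):
    """Count the attributes on which two nominal records disagree."""
    return sum(0 if x == y else 1 for x, y in zip(a, b))

print(minkowski((0, 0), (3, 4), q=2))   # Euclidean: 5.0
print(minkowski((0, 0), (3, 4), q=1))   # Manhattan: 7.0
print(nominal_distance(("PG", "High"), ("PG", "Low")))   # 1
```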

k-NN Model Parameter
Number of Neighbors (the value of k)
The best value depends on the data
Larger values reduce the effect of noise but also make boundaries between classes less distinct
An “optimal” value can be found heuristically
Cross-validation is often used to determine the best value for k and the distance measure
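
One way to pick k by cross-validation is sketched below using leave-one-out validation, the special case where each fold holds a single record. The dataset, the candidate values of k, and the use of Euclidean distance are assumptions made for illustration.

```python
import math
from collections import Counter

def knn_predict(training, query, k):
    """Majority vote among the k nearest training cases (Euclidean distance)."""
    nearest = sorted(training, key=lambda rec: math.dist(rec[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def leave_one_out_accuracy(records, k):
    """Hold out each record in turn and predict it from all the others."""
    hits = 0
    for i, (features, label) in enumerate(records):
        rest = records[:i] + records[i + 1:]
        if knn_predict(rest, features, k) == label:
            hits += 1
    return hits / len(records)

# Invented data: two groups plus one "b" case that sits close to the "a" group.
records = [
    ((1.0, 1.0), "a"), ((1.5, 2.0), "a"), ((2.0, 1.0), "a"), ((2.5, 2.5), "a"),
    ((6.0, 6.0), "b"), ((6.5, 7.0), "b"), ((7.0, 6.0), "b"), ((3.0, 3.0), "b"),
]

scores = {k: leave_one_out_accuracy(records, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

The same loop can be repeated over alternative distance measures to choose both parameters together.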

Application Case 6.5
Efficient Image Recognition and Categorization with k-NN

Questions for Discussion
Why is image recognition/classification a worthy but difficult problem?
How can k-NN be effectively used for image recognition/classification applications?

End of the Chapter

Questions, comments

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher. Printed in the United States of America.

Figure: Biological Neural Networks
Labels on each of the two neurons: soma, axon, synapse, dendrites

Figure: Processing Information in ANN
A neuron (PE) receives inputs $x_1, x_2, \ldots, x_n$ through weights $w_1, w_2, \ldots, w_n$ and produces outputs $Y_1, Y_2, \ldots, Y_n$
Summation: $S = \sum_{i=1}^{n} X_i W_i$
Transfer function: $f(S)$

Figure: Neural Network with One Hidden Layer
Inputs $x_1, x_2, x_3$ pass through an input layer, a hidden layer, and an output layer of PEs to give the output $Y_1$
Each PE computes a weighted sum (S) followed by a transfer function (f)

Figure: Summation Function for a Single Neuron (a) and Several Neurons (b)
PE: Processing Element (or neuron)
(a) Single neuron: $Y = X_1 W_1 + X_2 W_2$
(b) Multiple neurons:
$Y_1 = X_1 W_{11} + X_2 W_{21}$
$Y_2 = X_1 W_{12} + X_2 W_{22}$
$Y_3 = X_2 W_{23}$

Figure: Transformation (Transfer) Function, worked example
Inputs: $X_1 = 3$, $X_2 = 1$, $X_3 = 2$
Weights: $W_1 = 0.2$, $W_2 = 0.4$, $W_3 = 0.1$
Summation function: $Y = 3(0.2) + 1(0.4) + 2(0.1) = 1.2$
Transfer function: $Y_T = 1/(1 + e^{-1.2}) = 0.77$

Figure: Feed-forward MLP with 1 Hidden Layer
Input layer: socio-demographic, religious, financial, and other variables
Output layer: voted “yes” or “no” to legalizing gaming (predicted vs. actual)

Figure: An MLP ANN Structure for the Box-Office Prediction Problem
Inputs:
MPAA Rating (5): G, PG, PG13, R, NR
Competition (3): High, Medium, Low
Star Value (3): High, Medium, Low
Genre (10): Sci-Fi, Action, …
Technical Effects (3): High, Medium, Low
Sequel (2): Yes, No
Number of Screens: positive integer
Output classes (BO: box-office receipts):
Class 1, FLOP: BO < 1M
Class 2: 1M < BO < 10M
Class 3: 10M < BO < 20M
Class 4: 20M < BO < 40M
Class 5: 40M < BO < 65M
Class 6: 65M < BO < 100M
Class 7: 100M < BO < 150M
Class 8: 150M < BO < 200M
Class 9, BLOCKBUSTER: BO > 200M
Layers: input layer (27 PEs), hidden layer I (18 PEs), hidden layer II (16 PEs), output layer (9 PEs)

Figure: A Supervised Learning Process
ANN model, then compute output, then “Is desired output achieved?” If yes, stop learning; if no, adjust weights and compute the output again

Figure: Backpropagation of Error for a Single Neuron
Inputs $x_1, x_2, \ldots, x_n$ with weights $w_1, w_2, \ldots, w_n$
Summation: $S = \sum_{i=1}^{n} X_i W_i$
Transfer function: $Y_i = f(S)$
Error: $a(Z_i - Y_i)$

Figure: Sensitivity Analysis on ANN
Systematically perturbed inputs, then the trained ANN (“the black-box”), then the observed change in outputs

Figure: Support Vector Machines
Axes $X_1$ and $X_2$; candidate separating lines $L_1$, $L_2$, $L_3$; the margin; the maximum-margin hyperplane

Figure: The Process of Building an SVM
Flow: training data, pre-processed data, validated SVM model, prediction model; experimentation (“Training/Testing”) takes place while the model is developed
1. Pre-Process the Data
Scrub the data: “Identify and handle missing, incorrect, and noisy”
Transform the data: “Numerisize, normalize and standardize the data”
2. Develop the Model
Select the kernel type: “Choose from RBF, Sigmoid or Polynomial kernel types”
Determine the kernel values: “Use v-fold cross validation or employ ‘grid-search’”
3. Deploy the Model
Extract the model coefficients
Code the trained model into the decision support system
Monitor and maintain the model

Figure: k-Nearest Neighbor Method (k-NN)
Axes X and Y; a new case at $(X_i, Y_i)$; neighborhoods drawn for k = 3 and k = 5

Figure: The Process of k-NN Method
Historic data is divided into a training set and a validation set
Parameter setting: distance measure, value of “k”
Predicting: classify (or forecast) new cases (new data) using k number of most similar cases
