Business Intelligence

In the heart disease diagnosis case study (Application Case 11.4, pp. 486-87), what was a major benefit of the SIPMES expert system? Does this type of system have a high coincidence factor? If so, why is that helpful?


or

Expound on these activities within knowledge engineering: knowledge acquisition, knowledge representation, and knowledge validation…

or

The key to effective knowledge management is extracting and encoding expert knowledge so that it can be accessed and reused by others. Offer an example or application of this, describe a related benefit, and discuss whether there are any disadvantages of using KM.
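One way to picture "encoding expert knowledge" is as if-then rules that a program can apply on behalf of the original expert. The sketch below is purely illustrative: the rules and findings are made up for this example, not taken from the textbook or any real diagnostic system.

```python
# Minimal sketch of expert knowledge encoded as reusable if-then rules.
# The rules below are hypothetical, for illustration only.

def diagnose(findings):
    """Apply encoded 'expert' rules to a set of observed findings."""
    rules = [
        # (required findings, conclusion)
        ({"chest_pain", "shortness_of_breath"}, "refer for cardiac evaluation"),
        ({"fever", "cough"}, "suspect respiratory infection"),
    ]
    conclusions = []
    for conditions, conclusion in rules:
        if conditions <= findings:  # fire rule only if all conditions are present
            conclusions.append(conclusion)
    return conclusions

print(diagnose({"chest_pain", "shortness_of_breath", "fatigue"}))
# → ['refer for cardiac evaluation']
```

Once knowledge is captured in this form, anyone can reuse it without consulting the expert directly; the corresponding disadvantage is that the rules must be validated and maintained as the expert's knowledge evolves.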


BUSINESS INTELLIGENCE
AND ANALYTICS
RAMESH SHARDA
DURSUN DELEN
EFRAIM TURBAN
TENTH EDITION

TENTH EDITION
BUSINESS INTELLIGENCE
AND ANALYTICS:
SYSTEMS FOR DECISION SUPPORT
Ramesh Sharda
Oklahoma State University
Dursun Delen
Oklahoma State University
Efraim Turban
University of Hawaii
With contributions by
J. E. Aronson
The University of Georgia
Ting-Peng Liang
National Sun Yat-sen University
David King
JDA Software Group, Inc.
PEARSON
Boston Columbus Indianapolis New York San Francisco Upper Saddle River
Amsterdam Cape Town Dubai London Madrid Milan Munich Paris Montreal Toronto
Delhi Mexico City Sao Paulo Sydney Hong Kong Seoul Singapore Taipei Tokyo

Editor in Chief: Stephanie Wall
Executive Editor: Bob Horan
Program Manager Team Lead: Ashley Santora
Program Manager: Denise Vaughn
Executive Marketing Manager: Anne Fahlgren
Project Manager Team Lead: Judy Leale
Project Manager: Tom Benfatti
Operations Specialist: Michelle Klein
Creative Director: Jayne Conte
Cover Designer: Suzanne Behnke
Digital Production Project Manager: Lisa Rinaldi
Full-Service Project Management: George Jacob, Integra Software Solutions.
Printer/Binder: Edwards Brothers Malloy-Jackson Road
Cover Printer: Lehigh/Phoenix-Hagerstown
Text Font: Garamond
Credits and acknowledgments borrowed from other sources and reproduced, with permission, in this textbook
appear on the appropriate page within text.
Microsoft and/or its respective suppliers make no representations about the suitability of the information
contained in the documents and related graphics published as part of the services for any purpose. All such
documents and related graphics are provided “as is” without warranty of any kind. Microsoft and/or its
respective suppliers hereby disclaim all warranties and conditions with regard to this information, including
all warranties and conditions of merchantability, whether express, implied or statutory, fitness for a particular
purpose, title and non-infringement. In no event shall Microsoft and/or its respective suppliers be liable for
any special, indirect or consequential damages or any damages whatsoever resulting from loss of use, data or
profits, whether in an action of contract, negligence or other tortious action, arising out of or in connection
with the use or performance of information available from the services.
The documents and related graphics contained herein could include technical inaccuracies or typographical
errors. Changes are periodically added to the information herein. Microsoft and/or its respective suppliers may
make improvements and/or changes in the product(s) and/or the program(s) described herein at any time.
Partial screen shots may be viewed in full within the software version specified.
Microsoft® Windows®, and Microsoft Office® are registered trademarks of the Microsoft Corporation in the U.S.A.
and other countries. This book is not sponsored or endorsed by or affiliated with the Microsoft Corporation.
Copyright © 2015, 2011, 2007 by Pearson Education, Inc., One Lake Street, Upper Saddle River,
New Jersey 07458. All rights reserved. Manufactured in the United States of America. This publication
is protected by Copyright, and permission should be obtained from the publisher prior to any prohibited
reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic,
mechanical, photocopying, recording, or likewise. To obtain permission(s) to use material from this work,
please submit a written request to Pearson Education, Inc., Permissions Department, One Lake Street,
Upper Saddle River, New Jersey 07458, or you may fax your request to 201-236-3290.
Many of the designations by manufacturers and sellers to distinguish their products are claimed as trademarks.
Where those designations appear in this book, and the publisher was aware of a trademark claim, the
designations have been printed in initial caps or all caps.
Library of Congress Cataloging-in-Publication Data
Turban, Efraim.
[Decision support and expert systems]
Business intelligence and analytics: systems for decision support/Ramesh Sharda, Oklahoma State University,
Dursun Delen, Oklahoma State University, Efraim Turban, University of Hawaii; With contributions
by J. E. Aronson, The University of Georgia, Ting-Peng Liang, National Sun Yat-sen University,
David King, JDA Software Group, Inc.-Tenth edition.
pages cm
ISBN-13: 978-0-13-305090-5
ISBN-10: 0-13-305090-4
1. Management-Data processing. 2. Decision support systems. 3. Expert systems (Computer science)
4. Business intelligence. I. Title.
HD30.2.T87 2014
658.4'038011-dc23
2013028826
10 9 8 7 6 5 4 3 2 1
ISBN 10: 0-13-305090-4
ISBN 13: 978-0-13-305090-5

BRIEF CONTENTS
Preface xxi
About the Authors xxix
PART I Decision Making and Analytics: An Overview 1
Chapter 1 An Overview of Business Intelligence, Analytics,
and Decision Support 2
Chapter 2 Foundations and Technologies for Decision Making 37
PART II Descriptive Analytics 77
Chapter 3 Data Warehousing 78
Chapter 4 Business Reporting, Visual Analytics, and Business
Performance Management 135
PART III Predictive Analytics 185
Chapter 5 Data Mining 186
Chapter 6 Techniques for Predictive Modeling 243
Chapter 7 Text Analytics, Text Mining, and Sentiment Analysis 288
Chapter 8 Web Analytics, Web Mining, and Social Analytics 338
PART IV Prescriptive Analytics 391
Chapter 9 Model-Based Decision Making: Optimization and Multi-
Criteria Systems 392
Chapter 10 Modeling and Analysis: Heuristic Search Methods and
Simulation 435
Chapter 11 Automated Decision Systems and Expert Systems 469
Chapter 12 Knowledge Management and Collaborative Systems 507
PART V Big Data and Future Directions for Business
Analytics 541
Chapter 13 Big Data and Analytics 542
Chapter 14 Business Analytics: Emerging Trends and Future
Impacts 592
Glossary 634
Index 648

CONTENTS
Preface xxi
About the Authors xxix
Part I Decision Making and Analytics: An Overview 1
Chapter 1 An Overview of Business Intelligence, Analytics, and
Decision Support 2
1.1 Opening Vignette: Magpie Sensing Employs Analytics to
Manage a Vaccine Supply Chain Effectively and Safely 3
1.2 Changing Business Environments and Computerized
Decision Support 5
The Business Pressures-Responses-Support Model 5
1.3 Managerial Decision Making 7
The Nature of Managers’ Work 7
The Decision-Making Process 8
1.4 Information Systems Support for Decision Making 9
1.5 An Early Framework for Computerized Decision
Support 11
The Gorry and Scott-Morton Classical Framework 11
Computer Support for Structured Decisions 12
Computer Support for Unstructured Decisions 13
Computer Support for Semistructured Problems 13
1.6 The Concept of Decision Support Systems (DSS) 13
DSS as an Umbrella Term 13
Evolution of DSS into Business Intelligence 14
1.7 A Framework for Business Intelligence (BI) 14
Definitions of BI 14
A Brief History of BI 14
The Architecture of BI 15
Styles of BI 15
The Origins and Drivers of BI 16
A Multimedia Exercise in Business Intelligence 16
~ APPLICATION CASE 1.1 Sabre Helps Its Clients Through Dashboards
and Analytics 17
The DSS-BI Connection 18
1.8 Business Analytics Overview 19
Descriptive Analytics 20
~ APPLICATION CASE 1.2 Eliminating Inefficiencies at Seattle
Children’s Hospital 21
~ APPLICATION CASE 1.3 Analysis at the Speed of Thought 22
Predictive Analytics 22

~ APPLICATION CASE 1.4 Moneyball: Analytics in Sports and Movies 23
~ APPLICATION CASE 1.5 Analyzing Athletic Injuries 24
Prescriptive Analytics 24
~ APPLICATION CASE 1.6 Industrial and Commercial Bank of China
(ICBC) Employs Models to Reconfigure Its Branch Network 25
Analytics Applied to Different Domains 26
Analytics or Data Science? 26
1.9 Brief Introduction to Big Data Analytics 27
What Is Big Data? 27
~ APPLICATION CASE 1.7 Gilt Groupe’s Flash Sales Streamlined by Big
Data Analytics 29
1.10 Plan of the Book 29
Part I: Business Analytics: An Overview 29
Part II: Descriptive Analytics 30
Part III: Predictive Analytics 30
Part IV: Prescriptive Analytics 31
Part V: Big Data and Future Directions for Business Analytics 31
1.11 Resources, Links, and the Teradata University Network
Connection 31
Resources and Links 31
Vendors, Products, and Demos 31
Periodicals 31
The Teradata University Network Connection 32
The Book’s Web Site 32
Chapter Highlights 32 • Key Terms 33
Questions for Discussion 33 • Exercises 33
~ END-OF-CHAPTER APPLICATION CASE Nationwide Insurance Used Bl
to Enhance Customer Service 34
References 35
Chapter 2 Foundations and Technologies for Decision Making 37
2.1 Opening Vignette: Decision Modeling at HP Using
Spreadsheets 38
2.2 Decision Making: Introduction and Definitions 40
Characteristics of Decision Making 40
A Working Definition of Decision Making 41
Decision-Making Disciplines 41
Decision Style and Decision Makers 41
2.3 Phases of the Decision-Making Process 42
2.4 Decision Making: The Intelligence Phase 44
Problem (or Opportunity) Identification 45
~ APPLICATION CASE 2.1 Making Elevators Go Faster! 45
Problem Classification 46
Problem Decomposition 46
Problem Ownership 46

2.5 Decision Making: The Design Phase 47
Models 47
Mathematical (Quantitative) Models 47
The Benefits of Models 47
Selection of a Principle of Choice 48
Normative Models 49
Suboptimization 49
Descriptive Models 50
Good Enough, or Satisficing 51
Developing (Generating) Alternatives 52
Measuring Outcomes 53
Risk 53
Scenarios 54
Possible Scenarios 54
Errors in Decision Making 54
2.6 Decision Making: The Choice Phase 55
2.7 Decision Making: The Implementation Phase 55
2.8 How Decisions Are Supported 56
Support for the Intelligence Phase 56
Support for the Design Phase 57
Support for the Choice Phase 58
Support for the Implementation Phase 58
2.9 Decision Support Systems: Capabilities 59
A DSS Application 59
2.10 DSS Classifications 61
The AIS SIGDSS Classification for DSS 61
Other DSS Categories 63
Custom-Made Systems Versus Ready-Made Systems 63
2.11 Components of Decision Support Systems 64
The Data Management Subsystem 65
The Model Management Subsystem 65
~ APPLICATION CASE 2.2 Station Casinos Wins by Building Customer
Relationships Using Its Data 66
~ APPLICATION CASE 2.3 SNAP DSS Helps OneNet Make
Telecommunications Rate Decisions 68
The User Interface Subsystem 68
The Knowledge-Based Management Subsystem 69
~ APPLICATION CASE 2.4 From a Game Winner to a Doctor! 70
Chapter Highlights 72 • Key Terms 73
Questions for Discussion 73 • Exercises 74
~ END-OF-CHAPTER APPLICATION CASE Logistics Optimization in a
Major Shipping Company (CSAV) 74
References 75

Part II Descriptive Analytics 77
Chapter 3 Data Warehousing 78
3.1 Opening Vignette: Isle of Capri Casinos Is Winning with
Enterprise Data Warehouse 79
3.2 Data Warehousing Definitions and Concepts 81
What Is a Data Warehouse? 81
A Historical Perspective to Data Warehousing 81
Characteristics of Data Warehousing 83
Data Marts 84
Operational Data Stores 84
Enterprise Data Warehouses (EDW) 85
Metadata 85
~ APPLICATION CASE 3.1 A Better Data Plan: Well-Established TELCOs
Leverage Data Warehousing and Analytics to Stay on Top in a
Competitive Industry 85
3.3 Data Warehousing Process Overview 87
~ APPLICATION CASE 3.2 Data Warehousing Helps MultiCare Save
More Lives 88
3.4 Data Warehousing Architectures 90
Alternative Data Warehousing Architectures 93
Which Architecture Is the Best? 96
3.5 Data Integration and the Extraction, Transformation, and
Load (ETL) Processes 97
Data Integration 98
~ APPLICATION CASE 3.3 BP Lubricants Achieves BIGS Success 98
Extraction, Transformation, and Load 100
3.6 Data Warehouse Development 102
~ APPLICATION CASE 3.4 Things Go Better with Coke’s Data
Warehouse 103
Data Warehouse Development Approaches 103
~ APPLICATION CASE 3.5 Starwood Hotels & Resorts Manages Hotel
Profitability with Data Warehousing 106
Additional Data Warehouse Development Considerations 107
Representation of Data in Data Warehouse 108
Analysis of Data in the Data Warehouse 109
OLAP Versus OLTP 110
OLAP Operations 110
3.7 Data Warehousing Implementation Issues 113
~ APPLICATION CASE 3.6 EDW Helps Connect State Agencies in
Michigan 115
Massive Data Warehouses and Scalability 116
3.8 Real-Time Data Warehousing 117
~ APPLICATION CASE 3.7 Egg Pie Fries the Competition in Near Real
Time 118

3.9 Data Warehouse Administration, Security Issues, and Future
Trends 121
The Future of Data Warehousing 123
3.10 Resources, Links, and the Teradata University Network
Connection 126
Resources and Links 126
Cases 126
Vendors, Products, and Demos 127
Periodicals 127
Additional References 127
The Teradata University Network (TUN) Connection 127
Chapter Highlights 128 • Key Terms 128
Questions for Discussion 128 • Exercises 129
~ END-OF-CHAPTER APPLICATION CASE Continental Airlines Flies High
with Its Real-Time Data Warehouse 131
References 132
Chapter 4 Business Reporting, Visual Analytics, and Business
Performance Management 135
4.1 Opening Vignette: Self-Service Reporting Environment
Saves Millions for Corporate Customers 136
4.2 Business Reporting Definitions and Concepts 139
What Is a Business Report? 140
~ APPLICATION CASE 4.1 Delta Lloyd Group Ensures Accuracy and
Efficiency in Financial Reporting 141
Components of the Business Reporting System 143
~ APPLICATION CASE 4.2 Flood of Paper Ends at FEMA 144
4.3 Data and Information Visualization 145
~ APPLICATION CASE 4.3 Tableau Saves Blastrac Thousands of Dollars
with Simplified Information Sharing 146
A Brief History of Data Visualization 147
~ APPLICATION CASE 4.4 TIBCO Spotfire Provides Dana-Farber Cancer
Institute with Unprecedented Insight into Cancer Vaccine Clinical
Trials 149
4.4 Different Types of Charts and Graphs 150
Basic Charts and Graphs 150
Specialized Charts and Graphs 151
4.5 The Emergence of Data Visualization and Visual
Analytics 154
Visual Analytics 156
High-Powered Visual Analytics Environments 158
4.6 Performance Dashboards 160
~ APPLICATION CASE 4.5 Dallas Cowboys Score Big with Tableau and
Teknion 161

Dashboard Design 162
~ APPLICATION CASE 4.6 Saudi Telecom Company Excels with
Information Visualization 163
What to Look For in a Dashboard 164
Best Practices in Dashboard Design 165
Benchmark Key Performance Indicators with Industry Standards 165
Wrap the Dashboard Metrics with Contextual Metadata 165
Validate the Dashboard Design by a Usability Specialist 165
Prioritize and Rank Alerts/Exceptions Streamed to the Dashboard 165
Enrich Dashboard with Business Users’ Comments 165
Present Information in Three Different Levels 166
Pick the Right Visual Construct Using Dashboard Design Principles 166
Provide for Guided Analytics 166
4.7 Business Performance Management 166
Closed-Loop BPM Cycle 167
~ APPLICATION CASE 4.7 IBM Cognos Express Helps Mace for Faster
and Better Business Reporting 169
4.8 Performance Measurement 170
Key Performance Indicator (KPI) 171
Performance Measurement System 172
4.9 Balanced Scorecards 172
The Four Perspectives 173
The Meaning of Balance in BSC 174
Dashboards Versus Scorecards 174
4.10 Six Sigma as a Performance Measurement System 175
The DMAIC Performance Model 176
Balanced Scorecard Versus Six Sigma 176
Effective Performance Measurement 177
~ APPLICATION CASE 4.8 Expedia.com’s Customer Satisfaction
Scorecard 178
Chapter Highlights 179 • Key Terms 180
Questions for Discussion 181 • Exercises 181
~ END-OF-CHAPTER APPLICATION CASE Smart Business Reporting
Helps Healthcare Providers Deliver Better Care 182
References 184
Part III Predictive Analytics 185
Chapter 5 Data Mining 186
5.1 Opening Vignette: Cabela’s Reels in More Customers with
Advanced Analytics and Data Mining 187
5.2 Data Mining Concepts and Applications 189
~ APPLICATION CASE 5.1 Smarter Insurance: Infinity P&C Improves
Customer Service and Combats Fraud with Predictive Analytics 191

Definitions, Characteristics, and Benefits 192
~ APPLICATION CASE 5.2 Harnessing Analytics to Combat Crime:
Predictive Analytics Helps Memphis Police Department Pinpoint Crime
and Focus Police Resources 196
How Data Mining Works 197
Data Mining Versus Statistics 200
5.3 Data Mining Applications 201
~ APPLICATION CASE 5.3 A Mine on Terrorist Funding 203
5.4 Data Mining Process 204
Step 1: Business Understanding 205
Step 2: Data Understanding 205
Step 3: Data Preparation 206
Step 4: Model Building 208
~ APPLICATION CASE 5.4 Data Mining in Cancer Research 210
Step 5: Testing and Evaluation 211
Step 6: Deployment 211
Other Data Mining Standardized Processes and Methodologies 212
5.5 Data Mining Methods 214
Classification 214
Estimating the True Accuracy of Classification Models 215
Cluster Analysis for Data Mining 220
~ APPLICATION CASE 5.5 2degrees Gets a 1275 Percent Boost in Churn
Identification 221
Association Rule Mining 224
5.6 Data Mining Software Tools 228
~ APPLICATION CASE 5.6 Data Mining Goes to Hollywood: Predicting
Financial Success of Movies 231
5.7 Data Mining Privacy Issues, Myths, and Blunders 234
Data Mining and Privacy Issues 234
~ APPLICATION CASE 5.7 Predicting Customer Buying Patterns-The
Target Story 235
Data Mining Myths and Blunders 236
Chapter Highlights 237 • Key Terms 238
Questions for Discussion 238 • Exercises 239
~ END-OF-CHAPTER APPLICATION CASE Macys.com Enhances Its
Customers’ Shopping Experience with Analytics 241
References 241
Chapter 6 Techniques for Predictive Modeling 243
6.1 Opening Vignette: Predictive Modeling Helps Better
Understand and Manage Complex Medical
Procedures 244
6.2 Basic Concepts of Neural Networks 247
Biological and Artificial Neural Networks 248
~ APPLICATION CASE 6.1 Neural Networks Are Helping to Save Lives in
the Mining Industry 250
Elements of ANN 251

Network Information Processing 252
Neural Network Architectures 254
~ APPLICATION CASE 6.2 Predictive Modeling Is Powering the Power
Generators 256
6.3 Developing Neural Network-Based Systems 258
The General ANN Learning Process 259
Backpropagation 260
6.4 Illuminating the Black Box of ANN with Sensitivity
Analysis 262
~ APPLICATION CASE 6.3 Sensitivity Analysis Reveals Injury Severity
Factors in Traffic Accidents 264
6.5 Support Vector Machines 265
~ APPLICATION CASE 6.4 Managing Student Retention with Predictive
Modeling 266
Mathematical Formulation of SVMs 270
Primal Form 271
Dual Form 271
Soft Margin 271
Nonlinear Classification 272
Kernel Trick 272
6.6 A Process-Based Approach to the Use of SVM 273
Support Vector Machines Versus Artificial Neural Networks 274
6.7 Nearest Neighbor Method for Prediction 275
Similarity Measure: The Distance Metric 276
Parameter Selection 277
~ APPLICATION CASE 6.5 Efficient Image Recognition and
Categorization with kNN 278
Chapter Highlights 280 • Key Terms 280
Questions for Discussion 281 • Exercises 281
~ END-OF-CHAPTER APPLICATION CASE Coors Improves Beer Flavors
with Neural Networks 284
References 285
Chapter 7 Text Analytics, Text Mining, and Sentiment Analysis 288
7.1 Opening Vignette: Machine Versus Men on Jeopardy!: The
Story of Watson 289
7.2 Text Analytics and Text Mining Concepts and
Definitions 291
~ APPLICATION CASE 7.1 Text Mining for Patent Analysis 295
7.3 Natural Language Processing 296
~ APPLICATION CASE 7.2 Text Mining Improves Hong Kong
Government’s Ability to Anticipate and Address Public Complaints 298
7.4 Text Mining Applications 300
Marketing Applications 301
Security Applications 301
~ APPLICATION CASE 7.3 Mining for Lies 302
Biomedical Applications 304

Academic Applications 305
~ APPLICATION CASE 7.4 Text Mining and Sentiment Analysis Help
Improve Customer Service Performance 306
7.5 Text Mining Process 307
Task 1: Establish the Corpus 308
Task 2: Create the Term-Document Matrix 309
Task 3: Extract the Knowledge 312
~ APPLICATION CASE 7.5 Research Literature Survey with Text
Mining 314
7.6 Text Mining Tools 317
Commercial Software Tools 317
Free Software Tools 317
~ APPLICATION CASE 7.6 A Potpourri of Text Mining Case Synopses 318
7.7 Sentiment Analysis Overview 319
~ APPLICATION CASE 7.7 Whirlpool Achieves Customer Loyalty and
Product Success with Text Analytics 321
7.8 Sentiment Analysis Applications 323
7.9 Sentiment Analysis Process 325
Methods for Polarity Identification 326
Using a Lexicon 327
Using a Collection of Training Documents 328
Identifying Semantic Orientation of Sentences and Phrases 328
Identifying Semantic Orientation of Document 328
7.10 Sentiment Analysis and Speech Analytics 329
How Is It Done? 329
~ APPLICATION CASE 7.8 Cutting Through the Confusion: Blue Cross
Blue Shield of North Carolina Uses Nexidia’s Speech Analytics to Ease
Member Experience in Healthcare 331
Chapter Highlights 333 • Key Terms 333
Questions for Discussion 334 • Exercises 334
~ END-OF-CHAPTER APPLICATION CASE BBVA Seamlessly Monitors
and Improves Its Online Reputation 335
References 336
Chapter 8 Web Analytics, Web Mining, and Social Analytics 338
8.1 Opening Vignette: Security First Insurance Deepens
Connection with Policyholders 339
8.2 Web Mining Overview 341
8.3 Web Content and Web Structure Mining 344
…. APPLICATION CASE 8.1 Identifying Extremist Groups with Web Link
and Content Analysis 346
8.4 Search Engines 347
Anatomy of a Search Engine 347
1. Development Cycle 348
Web Crawler 348
Document Indexer 348

2. Response Cycle 349
Query Analyzer 349
Document Matcher/Ranker 349
How Does Google Do It? 351
~ APPLICATION CASE 8.2 IGN Increases Search Traffic by 1500 Percent 353
8.5 Search Engine Optimization 354
Methods for Search Engine Optimization 355
~ APPLICATION CASE 8.3 Understanding Why Customers Abandon
Shopping Carts Results in $10 Million Sales Increase 357
8.6 Web Usage Mining (Web Analytics) 358
Web Analytics Technologies 359
~ APPLICATION CASE 8.4 Allegro Boosts Online Click-Through Rates by
500 Percent with Web Analysis 360
Web Analytics Metrics 362
Web Site Usability 362
Traffic Sources 363
Visitor Profiles 364
Conversion Statistics 364
8.7 Web Analytics Maturity Model and Web Analytics Tools 366
Web Analytics Tools 368
Putting It All Together-A Web Site Optimization Ecosystem 370
A Framework for Voice of the Customer Strategy 372
8.8 Social Analytics and Social Network Analysis 373
Social Network Analysis 374
Social Network Analysis Metrics 375
~ APPLICATION CASE 8.5 Social Network Analysis Helps
Telecommunication Firms 375
Connections 376
Distributions 376
Segmentation 377
8.9 Social Media Definitions and Concepts 377
How Do People Use Social Media? 378
~ APPLICATION CASE 8.6 Measuring the Impact of Social Media at
Lollapalooza 379
8.10 Social Media Analytics 380
Measuring the Social Media Impact 381
Best Practices in Social Media Analytics 381
~ APPLICATION CASE 8.7 eHarmony Uses Social Media to Help Take the
Mystery Out of Online Dating 383
Social Media Analytics Tools and Vendors 384
Chapter Highlights 386 • Key Terms 387
Questions for Discussion 387 • Exercises 388
~ END-OF-CHAPTER APPLICATION CASE Keeping Students on Track with
Web and Predictive Analytics 388
References 390

Part IV Prescriptive Analytics 391
Chapter 9 Model-Based Decision Making: Optimization and
Multi-Criteria Systems 392
9.1 Opening Vignette: Midwest ISO Saves Billions by Better
Planning of Power Plant Operations and Capacity
Planning 393
9.2 Decision Support Systems Modeling 394
~ APPLICATION CASE 9.1 Optimal Transport for ExxonMobil
Downstream Through a DSS 395
Current Modeling Issues 396
~ APPLICATION CASE 9.2 Forecasting/Predictive Analytics Proves to Be
a Good Gamble for Harrah’s Cherokee Casino and Hotel 397
9.3 Structure of Mathematical Models for Decision Support 399
The Components of Decision Support Mathematical Models 399
The Structure of Mathematical Models 401
9.4 Certainty, Uncertainty, and Risk 401
Decision Making Under Certainty 402
Decision Making Under Uncertainty 402
Decision Making Under Risk (Risk Analysis) 402
~ APPLICATION CASE 9.3 American Airlines Uses
Should-Cost Modeling to Assess the Uncertainty of Bids
for Shipment Routes 403
9.5 Decision Modeling with Spreadsheets 404
~ APPLICATION CASE 9.4 Showcase Scheduling at Fred Astaire East
Side Dance Studio 404
9.6 Mathematical Programming Optimization 407
~ APPLICATION CASE 9.5 Spreadsheet Model Helps Assign Medical
Residents 407
Mathematical Programming 408
Linear Programming 408
Modeling in LP: An Example 409
Implementation 414
9.7 Multiple Goals, Sensitivity Analysis, What-If Analysis,
and Goal Seeking 416
Multiple Goals 416
Sensitivity Analysis 417
What-If Analysis 418
Goal Seeking 418
9.8 Decision Analysis with Decision Tables and Decision
Trees 420
Decision Tables 420
Decision Trees 422
9.9 Multi-Criteria Decision Making With Pairwise
Comparisons 423
The Analytic Hierarchy Process 423

~ APPLICATION CASE 9.6 U.S. HUD Saves the House by Using
AHP for Selecting IT Projects 423
Tutorial on Applying Analytic Hierarchy Process Using Web-HIPRE 425
Chapter Highlights 429 • Key Terms 430
Questions for Discussion 430 • Exercises 430
~ END-OF-CHAPTER APPLICATION CASE Pre-Positioning of Emergency
Items for CARE International 433
References 434
Chapter 10 Modeling and Analysis: Heuristic Search Methods and
Simulation 435
10.1 Opening Vignette: System Dynamics Allows Fluor
Corporation to Better Plan for Project and Change
Management 436
10.2 Problem-Solving Search Methods 437
Analytical Techniques 438
Algorithms 438
Blind Searching 439
Heuristic Searching 439
~ APPLICATION CASE 10.1 Chilean Government Uses Heuristics to
Make Decisions on School Lunch Providers 439
10.3 Genetic Algorithms and Developing GA Applications 441
Example: The Vector Game 441
Terminology of Genetic Algorithms 443
How Do Genetic Algorithms Work? 443
Limitations of Genetic Algorithms 445
Genetic Algorithm Applications 445
10.4 Simulation 446
~ APPLICATION CASE 10.2 Improving Maintenance Decision Making in
the Finnish Air Force Through Simulation 446
~ APPLICATION CASE 10.3 Simulating Effects of Hepatitis B
Interventions 447
Major Characteristics of Simulation 448
Advantages of Simulation 449
Disadvantages of Simulation 450
The Methodology of Simulation 450
Simulation Types 451
Monte Carlo Simulation 452
Discrete Event Simulation 453
10.5 Visual Interactive Simulation 453
Conventional Simulation Inadequacies 453
Visual Interactive Simulation 453
Visual Interactive Models and DSS 454
~ APPLICATION CASE 10.4 Improving Job-Shop Scheduling Decisions
Through RFID: A Simulation-Based Assessment 454
Simulation Software 457

10.6 System Dynamics Modeling 458
10.7 Agent-Based Modeling 461
~ APPLICATION CASE 10.5 Agent-Based Simulation Helps Analyze
Spread of a Pandemic Outbreak 463
Chapter Highlights 464 • Key Terms 464
Questions for Discussion 465 • Exercises 465
~ END-OF-CHAPTER APPLICATION CASE HP Applies Management
Science Modeling to Optimize Its Supply Chain and Wins a Major
Award 465
References 467
Chapter 11 Automated Decision Systems and Expert Systems 469
11.1 Opening Vignette: InterContinental Hotel Group Uses
Decision Rules for Optimal Hotel Room Rates 470
11.2 Automated Decision Systems 471
~ APPLICATION CASE 11.1 Giant Food Stores Prices the Entire
Store 472
11.3 The Artificial Intelligence Field 475
11.4 Basic Concepts of Expert Systems 477
Experts 477
Expertise 478
Features of ES 478
~ APPLICATION CASE 11.2 Expert System Helps in Identifying Sport
Talents 480
11.5 Applications of Expert Systems 480
~ APPLICATION CASE 11.3 Expert System Aids in Identification of
Chemical, Biological, and Radiological Agents 481
Classical Applications of ES 481
Newer Applications of ES 482
Areas for ES Applications 483
11.6 Structure of Expert Systems 484
Knowledge Acquisition Subsystem 484
Knowledge Base 485
Inference Engine 485
User Interface 485
Blackboard (Workplace) 485
Explanation Subsystem (Justifier) 486
Knowledge-Refining System 486
~ APPLICATION CASE 11.4 Diagnosing Heart Diseases by Signal
Processing 486
11.7 Knowledge Engineering 487
Knowledge Acquisition 488
Knowledge Verification and Validation 490
Knowledge Representation 490
Inferencing 491
Explanation and Justification 496

11.8 Problem Areas Suitable for Expert Systems 497
11.9 Development of Expert Systems 498
Defining the Nature and Scope of the Problem 499
Identifying Proper Experts 499
Acquiring Knowledge 499
Selecting the Building Tools 499
Coding the System 501
Evaluating the System 501
~ APPLICATION CASE 11.5 Clinical Decision Support System for Tendon
Injuries 501
11.10 Concluding Remarks 502
Chapter Highlights 503 • Key Terms 503
Questions for Discussion 504 • Exercises 504
~ END-OF-CHAPTER APPLICATION CASE Tax Collections Optimization
for New York State 504
References 505
Chapter 12 Knowledge Management and Collaborative Systems 507
12.1 Opening Vignette: Expertise Transfer System to Train
Future Army Personnel 508
12.2 Introduction to Knowledge Management 512
Knowledge Management Concepts and Definitions 513
Knowledge 513
Explicit and Tacit Knowledge 515
12.3 Approaches to Knowledge Management 516
The Process Approach to Knowledge Management 517
The Practice Approach to Knowledge Management 517
Hybrid Approaches to Knowledge Management 518
Knowledge Repositories 518
12.4 Information Technology (IT) in Knowledge
Management 520
The KMS Cycle 520
Components of KMS 521
Technologies That Support Knowledge Management 521
12.5 Making Decisions in Groups: Characteristics, Process,
Benefits, and Dysfunctions 523
Characteristics of Groupwork 523
The Group Decision-Making Process 524
The Benefits and Limitations of Groupwork 524
12.6 Supporting Groupwork with Computerized Systems 526
An Overview of Group Support Systems (GSS) 526
Groupware 527
Time/Place Framework 527
12.7 Tools for Indirect Support of Decision Making 528
Groupware Tools 528

Groupware 530
Collaborative Workflow 530
Web 2.0 530
Wikis 531
Collaborative Networks 531
12.8 Direct Computerized Support for Decision Making:
From Group Decision Support Systems to Group Support
Systems 532
Group Decision Support Systems (GDSS) 532
Group Support Systems 533
How GDSS (or GSS) Improve Groupwork 533
Facilities for GDSS 534
Chapter Highlights 535 • Key Terms 536
Questions for Discussion 536 • Exercises 536
~ END-OF-CHAPTER APPLICATION CASE Solving Crimes by Sharing
Digital Forensic Knowledge 537
References 539
Part V Big Data and Future Directions for Business
Analytics 541
Chapter 13 Big Data and Analytics 542
13.1 Opening Vignette: Big Data Meets Big Science at CERN 543
13.2 Definition of Big Data 546
The Vs That Define Big Data 547
~ APPLICATION CASE 13.1 Big Data Analytics Helps Luxottica Improve
Its Marketing Effectiveness 550
13.3 Fundamentals of Big Data Analytics 551
Business Problems Addressed by Big Data Analytics 554
~ APPLICATION CASE 13.2 Top 5 Investment Bank Achieves Single
Source of Truth 555
13.4 Big Data Technologies 556
MapReduce 557
Why Use MapReduce? 558
Hadoop 558
How Does Hadoop Work? 558
Hadoop Technical Components 559
Hadoop: The Pros and Cons 560
NoSQL 562
~ APPLICATION CASE 13.3 eBay’s Big Data Solution 563
13.5 Data Scientist 565
Where Do Data Scientists Come From? 565
~ APPLICATION CASE 13.4 Big Data and Analytics in Politics 568
13.6 Big Data and Data Warehousing 569
Use Case(s) for Hadoop 570
Use Case(s) for Data Warehousing 571

The Gray Areas (Any One of the Two Would Do the Job) 572
Coexistence of Hadoop and Data Warehouse 572
13.7 Big Data Vendors 574
~ APPLICATION CASE 13.5 Dublin City Council Is Leveraging Big Data
to Reduce Traffic Congestion 575
~ APPLICATION CASE 13.6 Creditreform Boosts Credit Rating Quality
with Big Data Visual Analytics 580
13.8 Big Data and Stream Analytics 581
Stream Analytics Versus Perpetual Analytics 582
Critical Event Processing 582
Data Stream Mining 583
13.9 Applications of Stream Analytics 584
e-commerce 584
Telecommunications 584
~ APPLICATION CASE 13.7 Turning Machine-Generated Streaming Data
into Valuable Business Insights 585
Law Enforcement and Cyber Security 586
Power Industry 587
Financial Services 587
Health Sciences 587
Government 587
Chapter Highlights 588 • Key Terms 588
Questions for Discussion 588 • Exercises 589
~ END-OF-CHAPTER APPLICATION CASE Discovery Health Turns Big
Data into Better Healthcare 589
References 591
Chapter 14 Business Analytics: Emerging Trends and Future
Impacts 592
14.1 Opening Vignette: Oklahoma Gas and Electric Employs
Analytics to Promote Smart Energy Use 593
14.2 Location-Based Analytics for Organizations 594
Geospatial Analytics 594
~ APPLICATION CASE 14.1 Great Clips Employs Spatial Analytics to
Shave Time in Location Decisions 596
A Multimedia Exercise in Analytics Employing Geospatial Analytics 597
Real-Time Location Intelligence 598
~ APPLICATION CASE 14.2 Quiznos Targets Customers for Its
Sandwiches 599
14.3 Analytics Applications for Consumers 600
~ APPLICATION CASE 14.3 A Life Coach in Your Pocket 601
14.4 Recommendation Engines 603
14.5 Web 2.0 and Online Social Networking 604
Representative Characteristics of Web 2.0 605
Social Networking 605
A Definition and Basic Information 606
Implications of Business and Enterprise Social Networks 606
14.6 Cloud Computing and BI 607
Service-Oriented DSS 608
Data-as-a-Service (DaaS) 608
Information-as-a-Service (Information on Demand) (IaaS) 611
Analytics-as-a-Service (AaaS) 611
14.7 Impacts of Analytics in Organizations: An Overview 613
New Organizational Units 613
Restructuring Business Processes and Virtual Teams 614
The Impacts of ADS Systems 614
Job Satisfaction 614
Job Stress and Anxiety 614
Analytics’ Impact on Managers’ Activities and Their Performance 615
14.8 Issues of Legality, Privacy, and Ethics 616
Legal Issues 616
Privacy 617
Recent Technology Issues in Privacy and Analytics 618
Ethics in Decision Making and Support 619
14.9 An Overview of the Analytics Ecosystem 620
Analytics Industry Clusters 620
Data Infrastructure Providers 620
Data Warehouse Industry 621
Middleware Industry 622
Data Aggregators/Distributors 622
Analytics-Focused Software Developers 622
Reporting/Analytics 622
Predictive Analytics 623
Prescriptive Analytics 623
Application Developers or System Integrators: Industry Specific or General 624
Analytics User Organizations 625
Analytics Industry Analysts and Influencers 627
Academic Providers and Certification Agencies 628
Chapter Highlights 629 • Key Terms 629
Questions for Discussion 629 • Exercises 630
~ END-OF-CHAPTER APPLICATION CASE Southern States Cooperative
Optimizes Its Catalog Campaign 630
References 632
Glossary 634
Index 648

PREFACE
Analytics has become the technology driver of this decade. Companies such as IBM, Oracle, Microsoft, and others are creating new organizational units focused on analytics that help businesses become more effective and efficient in their operations. Decision makers are using more computerized tools to support their work. Even consumers are using analytics tools directly or indirectly to make decisions on routine activities such as shopping, healthcare, and entertainment. The field of decision support systems (DSS)/business intelligence (BI) is evolving rapidly to become more focused on innovative applications of data streams that until recently were not even captured, much less analyzed in any significant way. New applications turn up daily in healthcare, sports, entertainment, supply chain management, utilities, and virtually every industry imaginable.
The theme of this revised edition is BI and analytics for enterprise decision support. In addition to traditional decision support applications, this edition expands the reader’s understanding of the various types of analytics by providing examples, products, services, and exercises, and by discussing Web-related issues throughout the text. We highlight Web intelligence/Web analytics, which parallel BI/business analytics (BA) for e-commerce and other Web applications. The book is supported by a Web site (pearsonhighered.com/sharda) and also by an independent site at dssbibook.com. We will also provide links to software tutorials through a special section of the Web site.
The purpose of this book is to introduce the reader to these technologies, which are generally called analytics but have been known by other names. The core technology consists of DSS, BI, and various decision-making techniques. We use these terms interchangeably. This book presents the fundamentals of the techniques and the manner in which these systems are constructed and used. We follow an EEE approach to introducing these topics: Exposure, Experience, and Explore. The book primarily provides exposure to various analytics techniques and their applications. The idea is that a student will be inspired to learn from how other organizations have employed analytics to make decisions or to gain a competitive edge. We believe that such exposure to what is being done with analytics and how it can be achieved is the key component of learning about analytics. In describing the techniques, we also introduce specific software tools that can be used for developing such applications. The book is not limited to any one software tool, so students can experience these techniques using any number of available software tools. Specific suggestions are given in each chapter, but the student and the professor are able to use this book with many different software tools. Our book’s companion Web site will include specific software guides, but students can gain experience with these techniques in many different ways. Finally, we hope that this exposure and experience enable and motivate readers to explore the potential of these techniques in their own domain. To facilitate such exploration, we include exercises that direct readers to Teradata University Network and other sites, including team-oriented exercises where appropriate. We will also highlight new and innovative applications that we learn about on the book’s companion Web sites.
Most of the specific improvements made in this tenth edition concentrate on three areas: reorganization, content update, and a sharper focus. Despite the many changes, we have preserved the comprehensiveness and user friendliness that have made the text a market leader. We have also reduced the book’s size by eliminating older and redundant material and by combining material that was not used by a majority of professors. At the same time, we have kept several of the classical references intact. Finally, we present accurate and updated material that is not available in any other text. We next describe the changes in the tenth edition.
WHAT’S NEW IN THE TENTH EDITION?
With the goal of improving the text, this edition marks a major reorganization of the text
to reflect the focus on analytics. The last two editions transformed the book from the
traditional DSS to BI and fostered a tight linkage with the Teradata University Network
(TUN). This edition is now organized around three major types of analytics. The new
edition has many timely additions, and dated content has been deleted. The following
major specific changes have been made:
• New organization. The book is now organized around three types of analytics: descriptive, predictive, and prescriptive, a classification promoted by INFORMS. After introducing the topics of DSS/BI and analytics in Chapter 1 and covering the foundations of decision making and decision support in Chapter 2, the book begins with an overview of data warehousing and data foundations in Chapter 3. This part then covers descriptive or reporting analytics, specifically, visualization and business performance measurement. Chapters 5-8 cover predictive analytics. Chapters 9-12 cover prescriptive and decision analytics as well as other decision support systems topics. Some of the coverage from Chapters 3-4 in previous editions will now be found in the new Chapters 9 and 10. Chapter 11 covers expert systems as well as the new rule-based systems that are commonly built for implementing analytics. Chapter 12 combines two topics that were key chapters in earlier editions: knowledge management and collaborative systems. Chapter 13 is a new chapter that introduces big data and analytics. Chapter 14 concludes the book with a discussion of emerging trends and topics in business analytics, including location intelligence, mobile computing, cloud-based analytics, and privacy/ethical considerations in analytics. This chapter also includes an overview of the analytics ecosystem to help the reader explore all of the different ways one can participate and grow in the analytics environment. Thus, the book marks a significant departure from the earlier editions in organization. Of course, it is still possible to teach a course with a traditional DSS focus with this book by covering Chapters 1-4, Chapters 9-12, and possibly Chapter 14.
• New chapters. The following chapters have been added:
Chapter 8, “Web Analytics, Web Mining, and Social Analytics.” This chapter covers the popular topics of Web analytics and social media analytics. It is an almost entirely new chapter (95% new material).
Chapter 13, “Big Data and Analytics.” This chapter introduces the hot topics of Big Data and analytics. It covers the basics of the major components and characteristics of Big Data techniques. It is also a new chapter (99% new material).
Chapter 14, “Business Analytics: Emerging Trends and Future Impacts.” This chapter examines several new phenomena that are already changing or are likely to change analytics. It includes coverage of geospatial analytics, location-based analytics applications, consumer-oriented analytical applications, mobile platforms, and cloud-based analytics. It also updates some coverage from the previous edition on ethical and privacy considerations. It concludes with a major discussion of the analytics ecosystem (90% new material).
• Streamlined coverage. We have made the book shorter by keeping the most commonly used content. We also mostly eliminated the preformatted online content. Instead, we will use a Web site to provide updated content and links on a regular basis. We also reduced the number of references in each chapter.
• Revamped author team. Building upon the excellent content that has been prepared by the authors of the previous editions (Turban, Aronson, Liang, King, Sharda, and Delen), this edition was revised by Ramesh Sharda and Dursun Delen. Both Ramesh and Dursun have worked extensively in DSS and analytics and have industry as well as research experience.
• A live-update Web site. Adopters of the textbook will have access to a Web site that
will include links to news stories, software, tutorials, and even YouTube videos related
to topics covered in the book. This site will be accessible at http://dssbibook.com.
• Revised and updated content. Almost all of the chapters have new opening vignettes and closing cases that are based on recent stories and events. In addition, application cases throughout the book have been updated to include recent examples of applications of a specific technique/model. These application case stories now include suggested questions for discussion to encourage class discussion as well as further exploration of the specific case and related materials. New Web site links have been added throughout the book. We also deleted many older product links and references. Finally, most chapters have new exercises, Internet assignments, and discussion questions throughout.
Specific changes made in chapters that have been retained from the previous editions are summarized next:
Chapter 1, “An Overview of Business Intelligence, Analytics, and Decision Support,” introduces the three types of analytics as proposed by INFORMS: descriptive, predictive, and prescriptive analytics. As noted earlier, this classification is used in guiding the complete reorganization of the book itself. It includes about 50 percent new material. All of the case stories are new.
Chapter 2, “Foundations and Technologies for Decision Making,” combines material from earlier Chapters 1, 2, and 3 to provide a basic foundation for decision making in general and computer-supported decision making in particular. It eliminates some duplication that was present in Chapters 1-3 of the previous editions. It includes 35 percent new material. Most of the cases are new.
Chapter 3, “Data Warehousing”
• 30 percent new material, including the cases
• New opening case
• Mostly new cases throughout
• NEW: A historical perspective on data warehousing: how did we get here?
• Better coverage of multidimensional modeling (star schema and snowflake schema)
• Updated coverage of the future of data warehousing
Chapter 4, “Business Reporting, Visual Analytics, and Business Performance
Management”
• 60 percent of the material is new-especially in visual analytics and reporting
• Most of the cases are new
Chapter 5, “Data Mining”
• 25 percent of the material is new
• Most of the cases are new
Chapter 6, “Techniques for Predictive Modeling”
• 55 percent of the material is new
• Most of the cases are new
• New sections on SVM and kNN
Chapter 7, “Text Analytics, Text Mining, and Sentiment Analysis”
• 50 percent of the material is new
• Most of the cases are new
• New section (1/3 of the chapter) on sentiment analysis
Chapter 8, “Web Analytics, Web Mining, and Social Analytics” (New Chapter)
• 95 percent of the material is new
Chapter 9, “Model-Based Decision Making: Optimization and Multi-Criteria Systems”
• All new cases
• Expanded coverage of analytic hierarchy process
• New examples of mixed-integer programming applications and exercises
• About 50 percent new material
In addition, all the Microsoft Excel-related coverage has been updated to work with
Microsoft Excel 2010.
Chapter 10, “Modeling and Analysis: Heuristic Search Methods and Simulation”
• This chapter now introduces genetic algorithms and various types of simulation
models
• It includes new coverage of other types of simulation modeling such as agent-based
modeling and system dynamics modeling
• New cases throughout
• About 60 percent new material
Chapter 11, “Automated Decision Systems and Expert Systems”
• Expanded coverage of automated decision systems including examples from the
airline industry
• New examples of expert systems
• New cases
• About 50 percent new material
Chapter 12, “Knowledge Management and Collaborative Systems”
• Significantly condensed coverage of these two topics combined into one chapter
• New examples of KM applications
• About 25 percent new material
Chapters 13 and 14 are mostly new chapters, as described earlier.
We have retained many of the enhancements made in the last editions and updated
the content. These are summarized next:
• Links to Teradata University Network (TUN). Most chapters include new links to TUN (teradatauniversitynetwork.com). We encourage instructors to register and join teradatauniversitynetwork.com and explore the various content available through the site. The cases, white papers, and software exercises available through TUN will keep your class fresh and timely.
• Book title. As is already evident, the book’s title and focus have changed
substantially.
• Software support. The TUN Web site provides software support at no charge. It also provides links to free data mining and other software. In addition, the site provides exercises in the use of such software.
THE SUPPLEMENT PACKAGE: PEARSONHIGHERED.COM/SHARDA
A comprehensive and flexible technology-support package is available to enhance the
teaching and learning experience. The following instructor and student supplements are
available on the book’s Web site, pearsonhighered.com/sharda:
• Instructor’s Manual. The Instructor’s Manual includes learning objectives for the entire course and for each chapter, answers to the questions and exercises at the end of each chapter, and teaching suggestions (including instructions for projects). The Instructor’s Manual is available on the secure faculty section of pearsonhighered.com/sharda.

• Test Item File and TestGen Software. The Test Item File is a comprehensive collection of true/false, multiple-choice, fill-in-the-blank, and essay questions. The questions are rated by difficulty level, and the answers are referenced by book page number. The Test Item File is available in Microsoft Word and in TestGen. Pearson Education’s test-generating software is available from www.pearsonhighered.com/ire. The software is PC/MAC compatible and preloaded with all of the Test Item File questions. You can manually or randomly view test questions and drag-and-drop to create a test. You can add or modify test-bank questions as needed. Our TestGens are converted for use in Blackboard, WebCT, Moodle, D2L, and Angel. These conversions can be found on pearsonhighered.com/sharda. The TestGen is also available in Respondus and can be found on www.respondus.com.
• PowerPoint slides. PowerPoint slides are available that illuminate and build on key concepts in the text. Faculty can download the PowerPoint slides from pearsonhighered.com/sharda.
ACKNOWLEDGMENTS
Many individuals have provided suggestions and criticisms since the publication of the first edition of this book. Dozens of students participated in class testing of various chapters, software, and problems and assisted in collecting material. It is not possible to name everyone who participated in this project, but our thanks go to all of them. Certain individuals made significant contributions, and they deserve special recognition.
First, we appreciate the efforts of those individuals who provided formal reviews of the first through tenth editions (school affiliations as of the date of review):
Robert Blanning, Vanderbilt University
Ranjit Bose, University of New Mexico
Warren Briggs, Suffolk University
Lee Roy Bronner, Morgan State University
Charles Butler, Colorado State University
Sohail S. Chaudry, University of Wisconsin-La Crosse
Kathy Chudoba, Florida State University
Wingyan Chung, University of Texas
Woo Young Chung, University of Memphis
Paul “Buddy” Clark, South Carolina State University
Pi’Sheng Deng, California State University-Stanislaus
Joyce Elam, Florida International University
Kurt Engemann, Iona College
Gary Farrar, Jacksonville University
George Federman, Santa Clara City College
Jerry Fjermestad, New Jersey Institute of Technology
Joey George, Florida State University
Paul Gray, Claremont Graduate School
Orv Greynholds, Capital College (Laurel, Maryland)
Martin Grossman, Bridgewater State College
Ray Jacobs, Ashland University
Leonard Jessup, Indiana University
Jeffrey Johnson, Utah State University
Jahangir Karimi, University of Colorado Denver
Saul Kassicieh, University of New Mexico
Anand S. Kunnathur, University of Toledo
Shao-ju Lee, California State University at Northridge
Yair Levy, Nova Southeastern University
Hank Lucas, New York University
Jane Mackay, Texas Christian University
George M. Marakas, University of Maryland
Dick Mason, Southern Methodist University
Nick McGaughey, San Jose State University
Ido Millet, Pennsylvania State University-Erie
Benjamin Mittman, Northwestern University
Larry Moore, Virginia Polytechnic Institute and State University
Simitra Mukherjee, Nova Southeastern University
Marianne Murphy, Northeastern University
Peter Mykytyn, Southern Illinois University
Natalie Nazarenko, SUNY College at Fredonia
Souren Paul, Southern Illinois University
Joshua Pauli, Dakota State University
Roger Alan Pick, University of Missouri-St. Louis
W. “RP” Raghupathi, California State University-Chico
Loren Rees, Virginia Polytechnic Institute and State University
David Russell, Western New England College
Steve Ruth, George Mason University
Vartan Safarian, Winona State University
Glenn Shephard, San Jose State University
Jung P. Shim, Mississippi State University
Meenu Singh, Murray State University
Randy Smith, University of Virginia
James T.C. Teng, University of South Carolina
John VanGigch, California State University at Sacramento
David Van Over, University of Idaho
Paul J.A. van Vliet, University of Nebraska at Omaha
B. S. Vijayaraman, University of Akron
Howard Charles Walton, Gettysburg College
Diane B. Walz, University of Texas at San Antonio
Paul R. Watkins, University of Southern California
Randy S. Weinberg, Saint Cloud State University
Jennifer Williams, University of Southern Indiana
Steve Zanakis, Florida International University
Fan Zhao, Florida Gulf Coast University
Several individuals contributed material to the text or the supporting material. Susan Baxley and Dr. David Schrader of Teradata provided special help in identifying new TUN content for the book and arranging permissions for the same. Peter Horner, editor of OR/MS Today, allowed us to summarize new application stories from OR/MS Today and Analytics Magazine. We also thank INFORMS for their permission to highlight content from Interfaces. Prof. Rick Wilson contributed some examples and exercise questions for Chapter 9. Assistance from Natraj Ponna, Daniel Asamoah, Amir Hassan-Zadeh, Kartik Dasika, Clara Gregory, and Amy Wallace (all of Oklahoma State University) is gratefully acknowledged for this edition. We also acknowledge Narges Kasiri (Ithaca College) for the write-up on system dynamics modeling and Jongswas Chongwatpol (NIDA, Thailand) for the material on SIMIO software. For the previous edition, we acknowledge the contributions of Dave King (JDA Software Group, Inc.) and Jerry Wagner (University of Nebraska-Omaha). Major contributors for earlier editions include Mike Goul (Arizona State University) and Leila A. Halawi (Bethune-Cookman College), who provided material for the chapter on data warehousing; Christy Cheung (Hong Kong Baptist University), who contributed to the chapter on knowledge management; Linda Lai (Macau Polytechnic University of China); Dave King (JDA Software Group, Inc.); Lou Frenzel, an independent consultant whose books Crash Course in Artificial Intelligence and Expert Systems and Understanding of Expert Systems (both published by Howard W. Sams, New York, 1987) provided material for the early editions; Larry Medsker (American University), who contributed substantial material on neural networks; and Richard V. McCarthy (Quinnipiac University), who performed major revisions in the seventh edition.
Previous editions of the book have also benefited greatly from the efforts of many individuals who contributed advice and interesting material (such as problems), gave feedback on material, or helped with class testing. These individuals are Warren Briggs (Suffolk University), Frank DeBalough (University of Southern California), Mei-Ting Cheung (University of Hong Kong), Alan Dennis (Indiana University), George Easton (San Diego State University), Janet Fisher (California State University, Los Angeles), David Friend (Pilot Software, Inc.), the late Paul Gray (Claremont Graduate School), Mike Henry (OSU), Dustin Huntington (Exsys, Inc.), Subramanian Rama Iyer (Oklahoma State University), Angie Jungermann (Oklahoma State University), Elena Karahanna (The University of Georgia), Mike McAulliffe (The University of Georgia), Chad Peterson (The University of Georgia), Neil Rabjohn (York University), Jim Ragusa (University of Central Florida), Alan Rowe (University of Southern California), Steve Ruth (George Mason University), Linus Schrage (University of Chicago), Antonie Stam (University of Missouri), Ron Swift (NCR Corp.), Merril Warkentin (then at Northeastern University), Paul Watkins (The University of Southern California), Ben Mortagy (Claremont Graduate School of Management), Dan Walsh (Bellcore), Richard Watson (The University of Georgia), and the many other instructors and students who have provided feedback.
Several vendors cooperated by providing development and/or demonstration software: Expert Choice, Inc. (Pittsburgh, Pennsylvania), Nancy Clark of Exsys, Inc. (Albuquerque, New Mexico), Jim Godsey of GroupSystems, Inc. (Broomfield, Colorado), Raimo Hamalainen of Helsinki University of Technology, Gregory Piatetsky-Shapiro of KDnuggets.com, Logic Programming Associates (UK), Gary Lynn of NeuroDimension Inc. (Gainesville, Florida), Palisade Software (Newfield, New York), Jerry Wagner of Planners Lab (Omaha, Nebraska), Promised Land Technologies (New Haven, Connecticut), Salford Systems (La Jolla, California), Sense Networks (New York, New York), Gary Miner of StatSoft, Inc. (Tulsa, Oklahoma), Ward Systems Group, Inc. (Frederick, Maryland), Idea Fisher Systems, Inc. (Irving, California), and Wordtech Systems (Orinda, California).
Special thanks to the Teradata University Network and especially to Hugh Watson, Michael Goul, and Susan Baxley, Program Director, for their encouragement to tie this book with TUN and for providing useful material for the book.
Many individuals helped us with administrative matters and editing, proofreading, and preparation. The project began with Jack Repcheck (a former Macmillan editor), who initiated this project with the support of Hank Lucas (New York University). Judy Lang collaborated with all of us, provided editing, and guided us during the entire project through the eighth edition.
Finally, the Pearson team is to be commended: Executive Editor Bob Horan, who orchestrated this project; Kitty Jarrett, who copyedited the manuscript; and the production team, Tom Benfatti at Pearson and George Jacob and staff at Integra Software Services, who transformed the manuscript into a book.
We would like to thank all these individuals and corporations. Without their help,
the creation of this book would not have been possible. Ramesh and Dursun want to
specifically acknowledge the contributions of previous coauthors Janine Aronson, David
King, and T. P. Liang, whose original contributions constitute significant components of
the book.
R.S.
D.D.
E.T.
Note that Web site URLs are dynamic. As this book went to press, we verified that all the cited Web sites were active and valid. Web sites to which we refer in the text sometimes change or are discontinued because companies change names, are bought or sold, merge, or fail. Sometimes Web sites are down for maintenance, repair, or redesign. Most organizations have dropped the initial “www” designation for their sites, but some still use it. If you have a problem connecting to a Web site that we mention, please be patient and simply run a Web search to try to identify the new site. Most times, the new site can be found quickly. Some sites also require a free registration before allowing you to see the content. We apologize in advance for this inconvenience.

ABOUT THE AUTHORS
Ramesh Sharda (M.B.A., Ph.D., University of Wisconsin-Madison) is director of the Ph.D. in Business for Executives Program and the Institute for Research in Information Systems (IRIS), ConocoPhillips Chair of Management of Technology, and a Regents Professor of Management Science and Information Systems in the Spears School of Business at Oklahoma State University (OSU). About 200 papers describing his research have been published in major journals, including Operations Research, Management Science, Information Systems Research, Decision Support Systems, and Journal of MIS. He cofounded the AIS SIG on Decision Support Systems and Knowledge Management (SIGDSS). Dr. Sharda serves on several editorial boards, including those of INFORMS Journal on Computing, Decision Support Systems, and ACM Transactions on Management Information Systems. He has authored and edited several textbooks and research books and serves as the co-editor of several book series (Integrated Series in Information Systems, Operations Research/Computer Science Interfaces, and Annals of Information Systems) with Springer. He is also currently serving as the executive director of the Teradata University Network. His current research interests are in decision support systems, business analytics, and technologies for managing information overload.
Dursun Delen (Ph.D., Oklahoma State University) holds the Spears and Patterson Chairs in Business Analytics, is Director of Research for the Center for Health Systems Innovation, and is Professor of Management Science and Information Systems in the Spears School of Business at Oklahoma State University (OSU). Prior to his academic career, he worked for a privately owned research and consultancy company, Knowledge Based Systems Inc., in College Station, Texas, as a research scientist for five years, during which he led a number of decision support and other information systems-related research projects funded by federal agencies such as DoD, NASA, NIST, and DOE. Dr. Delen’s research has appeared in major journals including Decision Support Systems, Communications of the ACM, Computers and Operations Research, Computers in Industry, Journal of Production Operations Management, Artificial Intelligence in Medicine, and Expert Systems with Applications, among others. He recently published four textbooks: Advanced Data Mining Techniques with Springer, 2008; Decision Support and Business Intelligence Systems with Prentice Hall, 2010; Business Intelligence: A Managerial Approach, with Prentice Hall, 2010; and Practical Text Mining, with Elsevier, 2012. He is often invited to national and international conferences for keynote addresses on topics related to data/text mining, business intelligence, decision support systems, and knowledge management. He served as the general co-chair for the 4th International Conference on Network Computing and Advanced Information Management (September 2-4, 2008, in Seoul, South Korea) and regularly chairs tracks and mini-tracks at various information systems conferences. He is the associate editor-in-chief for International Journal of Experimental Algorithms, associate editor for International Journal of RF Technologies and Journal of Decision Analytics, and is on the editorial boards of five other technical journals. His research and teaching interests are in data and text mining, decision support systems, knowledge management, business intelligence, and enterprise modeling.
Efraim Turban (M.B.A., Ph.D., University of California, Berkeley) is a visiting scholar at the Pacific Institute for Information System Management, University of Hawaii. Prior to this, he was on the staff of several universities, including City University of Hong Kong; Lehigh University; Florida International University; California State University, Long Beach; Eastern Illinois University; and the University of Southern California. Dr. Turban is the author of more than 100 refereed papers published in leading journals, such as Management Science, MIS Quarterly, and Decision Support Systems. He is also the author of 20 books, including Electronic Commerce: A Managerial Perspective and Information Technology for Management. He is also a consultant to major corporations worldwide. Dr. Turban’s current areas of interest are Web-based decision support systems, social commerce, and collaborative decision making.

PART I
Decision Making and Analytics: An Overview
LEARNING OBJECTIVES FOR PART I
• Understand the need for business analytics
• Understand the foundations and key issues of
managerial decision making
• Understand the major categories and
applications of business analytics
• Learn the major frameworks of computerized
decision support: analytics, decision support
systems (DSS), and business intelligence (BI)
This book deals with a collection of computer technologies that support managerial work (essentially,
decision making). These technologies have had a profound impact on corporate strategy, performance,
and competitiveness. These techniques broadly encompass analytics, business intelligence,
and decision support systems, as shown throughout the book. In Part I, we first provide an overview
of the whole book in one chapter. We cover several topics in this chapter. The first topic is managerial
decision making and its computerized support; the second is frameworks for decision support. We
then introduce business analytics and business intelligence. We also provide examples of applications
of these analytical techniques, as well as a preview of the entire book. The second chapter within
Part I introduces the foundational methods for decision making and relates these to computerized
decision support. It also covers the components and technologies of decision support systems.

Chapter 1
An Overview of Business Intelligence, Analytics, and Decision Support
LEARNING OBJECTIVES
• Understand today’s turbulent business
environment and describe how
organizations survive and even excel in
such an environment (solving problems
and exploiting opportunities)
• Understand the need for computerized
support of managerial decision making
• Understand an early framework for
managerial decision making
• Learn the conceptual foundations of
the decision support systems (DSS)1
methodology
• Describe the business intelligence (BI)
methodology and concepts and relate
them to DSS
• Understand the various types of analytics
• List the major tools of computerized
decision support
The business environment (climate) is constantly changing, and it is becoming more
and more complex. Organizations, private and public, are under pressures that
force them to respond quickly to changing conditions and to be innovative in the
way they operate. Such activities require organizations to be agile and to make frequent
and quick strategic, tactical, and operational decisions, some of which are very complex.
Making such decisions may require considerable amounts of relevant data, information,
and knowledge. Processing these, in the framework of the needed decisions, must be
done quickly, frequently in real time, and usually requires some computerized support.
This book is about using business analytics as computerized support for manage-
rial decision making. It concentrates on both the theoretical and conceptual founda-
tions of decision support, as well as on the commercial tools and techniques that are
available. This introductory chapter provides more details on these topics as well as an
overview of the book. This chapter has the following sections:
1.1 Opening Vignette: Magpie Sensing Employs Analytics to Manage a Vaccine
Supply Chain Effectively and Safely 3
1.2 Changing Business Environments and Computerized Decision Support 5
1 The acronym DSS is treated as both singular and plural throughout this book. Similarly, other acronyms, such
as MIS and GSS, designate both plural and singular forms. This is also true of the word analytics.

Chapter 1 • An Overview of Business Intelligence, Analytics, and Decision Support 3
1.3 Managerial Decision Making 7
1.4 Information Systems Support for Decision Making 9
1.5 An Early Framework for Computerized Decision Support 11
1.6 The Concept of Decision Support Systems (DSS) 13
1.7 A Framework for Business Intelligence (BI) 14
1.8 Business Analytics Overview 19
1.9 Brief Introduction to Big Data Analytics 27
1.10 Plan of the Book 29
1.11 Resources, Links, and the Teradata University Network Connection 31
1.1 OPENING VIGNETTE: Magpie Sensing Employs
Analytics to Manage a Vaccine Supply Chain
Effectively and Safely
Cold chain in healthcare is defined as the temperature-controlled supply chain involving a
system of transporting and storing vaccines and pharmaceutical drugs. It consists of three
major components: transport and storage equipment, trained personnel, and efficient
management procedures. The majority of the vaccines in the cold chain are typically
maintained at a temperature of 35–46 degrees Fahrenheit (2–8 degrees Centigrade). Maintaining
cold chain integrity is extremely important for healthcare product manufacturers.
Especially for the vaccines, improper storage and handling practices that compromise
vaccine viability prove a costly, time-consuming affair. Vaccines must be stored properly
from manufacture until they are available for use. Any extreme temperatures of heat or
cold will reduce vaccine potency; such vaccines, if administered, might not yield effective
results or could cause adverse effects.
Effectively maintaining the temperatures of storage units throughout the healthcare
supply chain in real time (i.e., from the gathering of resources through manufacturing,
distribution, and dispensing of the products) is the most effective solution desired
in the cold chain. Also, the location-tagged real-time environmental data about the storage
units helps in monitoring the cold chain for spoiled products. The chain of custody can
be easily identified to assign product liability.
A study conducted by the Centers for Disease Control and Prevention (CDC) looked at
the handling of cold chain vaccines by 45 healthcare providers around the United States
and reported that three-quarters of the providers experienced serious cold chain violations.
A WAY TOWARD A POSSIBLE SOLUTION
Magpie Sensing, a start-up project under Ebers Smith and Douglas Associated LLC,
provides a suite of cold chain monitoring and analysis technologies for the healthcare
industry. Its product is a shippable, wireless temperature and humidity monitor that provides real-time,
location-aware tracking of cold chain products during shipment. Magpie Sensing’s solu-
tions rely on rich analytics algorithms that leverage the data gathered from the monitor-
ing devices to improve the efficiency of cold chain processes and predict cold storage
problems before they occur.
Magpie Sensing applies all three types of analytical techniques (descriptive, predictive,
and prescriptive analytics) to turn the raw data returned from the monitoring devices
into actionable recommendations and warnings.
The properties of the cold storage system, which include the set point of the storage
system’s thermostat, the typical range of temperature values in the storage system, and

the duty cycle of the system’s compressor, are monitored and reported in real time. This
information helps trained personnel to ensure that the storage unit is properly configured
to store a particular product. All the temperature information is displayed on a Web dash-
board that shows a graph of the temperature inside the specific storage unit.
Based on information derived from the monitoring devices, Magpie’s predictive ana-
lytic algorithms can determine the set point of the storage unit’s thermostat and alert the
system’s users if the system is incorrectly configured, depending upon the various types
of products stored. This offers a solution to the users of consumer refrigerators where
the thermostat is not temperature graded. Magpie’s system also sends alerts about pos-
sible temperature violations based on the storage unit’s average temperature and subse-
quent compressor cycle runs, which may drop the temperature below the freezing point.
Magpie's predictive analytics further report possible human errors, such as failure to shut
the storage unit doors or the presence of an incomplete seal, by analyzing the temperature
trend and alerting users via Web interface, text message, or audible alert before the
temperature bounds are actually violated. In a similar way, a compressor or a power
failure can be detected; the estimated time before the storage unit reaches an unsafe
temperature also is reported, which prepares the users to look for backup solutions,
such as using dry ice, until power is restored.
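The kind of trend-based early warning described above can be sketched in a few lines. This is an illustrative reconstruction only, not Magpie Sensing's actual (proprietary) algorithm: the 35–46 degree Fahrenheit safe range follows the CDC figure quoted earlier, while the sample readings, the 5-minute reporting interval, and the simple least-squares trend model are assumptions.

```python
# Hypothetical sketch of trend-based cold chain alerting. Thresholds follow the
# CDC range cited above; readings, interval, and model are illustrative.

SAFE_LOW_F = 35.0   # lower bound of the safe vaccine range, degrees Fahrenheit
SAFE_HIGH_F = 46.0  # upper bound

def minutes_until_violation(readings, interval_min=5):
    """Fit a least-squares linear trend to recent temperature readings and
    estimate how many minutes remain before the safe range is violated.
    Returns None if no violation is projected (flat trend or too few points)."""
    n = len(readings)
    if n < 2:
        return None
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(readings) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, readings))
    den = sum((x - mean_x) ** 2 for x in xs)
    if den == 0 or num == 0:
        return None  # no discernible trend
    slope = num / den  # degrees per reading interval
    current = readings[-1]
    bound = SAFE_HIGH_F if slope > 0 else SAFE_LOW_F
    intervals_left = (bound - current) / slope
    if intervals_left < 0:
        return 0.0  # already at or past the bound
    return intervals_left * interval_min

# A door left ajar: temperature creeping upward every 5 minutes
print(minutes_until_violation([40.0, 41.0, 42.0, 43.0]))
```

A steadily rising series like the one above projects a violation of the upper bound in 15 minutes, giving users the cushion of time the vignette describes in which to intervene.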
In addition to predictive analytics, Magpie Sensing’s analytics systems can provide
prescriptive recommendations for improving the cold storage processes and business
decision making. Prescriptive analytics help users dial in the optimal temperature setting,
which helps to achieve the right balance between freezing and spoilage risk; this, in turn,
provides a cushion of time to react to the situation before the products spoil. Its prescriptive
analytics also gather useful meta-information on cold storage units, including the times of
day that are busiest and periods where the system’s doors are opened, which can be used
to provide additional design plans and institutional policies that ensure that the system is
being properly maintained and not overused.
Furthermore, prescriptive analytics can be used to guide equipment purchase deci-
sions by constantly analyzing the performance of current storage units. Based on the
storage system’s efficiency, decisions on distributing the products across available storage
units can be made based on the product’s sensitivity.
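One simple way to realize this distribution idea is a greedy matching that pairs the most temperature-sensitive products with the most stable storage units. The products, units, scores, and the greedy rule itself are all invented for illustration; the vignette does not specify how Magpie's system makes this decision.

```python
# Illustrative sketch only: greedy assignment of products to storage units.
# "Sensitivity" and "stability" scores are hypothetical.

def assign_products(products, units):
    """products: list of (name, sensitivity); higher = more sensitive.
    units: list of (name, stability); higher = tighter temperature control.
    Greedy rule: the most sensitive product gets the most stable unit."""
    ranked_products = sorted(products, key=lambda p: p[1], reverse=True)
    ranked_units = sorted(units, key=lambda u: u[1], reverse=True)
    return {p[0]: u[0] for p, u in zip(ranked_products, ranked_units)}

products = [("flu vaccine", 0.9), ("saline", 0.2), ("insulin", 0.7)]
units = [("fridge A", 0.95), ("fridge B", 0.80), ("fridge C", 0.60)]
print(assign_products(products, units))
# {'flu vaccine': 'fridge A', 'insulin': 'fridge B', 'saline': 'fridge C'}
```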
Using Magpie Sensing’s cold chain analytics, additional manufacturing time and
expenditure can be eliminated by ensuring that product safety can be secured throughout
the supply chain and effective products can be administered to the patients. Compliance
with state and federal safety regulations can be better achieved through automatic data
gathering and reporting about the products involved in the cold chain.
QUESTIONS FOR THE OPENING VIGNETTE
1. What information is provided by the descriptive analytics employed at Magpie
Sensing?
2. What type of support is provided by the predictive analytics employed at Magpie
Sensing?
3. How does prescriptive analytics help in business decision making?
4. In what ways can actionable information be reported in real time to concerned
users of the system?
5. In what other situations might real-time monitoring applications be needed?
WHAT WE CAN LEARN FROM THIS VIGNETTE
This vignette illustrates how data from a business process can be used to generate insights
at various levels. First, the graphical analysis of the data (termed reporting analytics) allows

users to get a good feel for the situation. Then, additional analysis using data mining
techniques can be used to estimate what future behavior would be like. This is the domain
of predictive analytics. Such analysis can then be taken to create specific recommendations
for operators. This is an example of what we call prescriptive analytics. Finally, this open-
ing vignette also suggests that innovative applications of analytics can create new business
ventures. Identifying opportunities for applications of analytics and assisting with decision
making in specific domains is an emerging entrepreneurial opportunity.
Sources: Magpiesensing.com, “Magpie Sensing Cold Chain Analytics and Monitoring,” magpiesensing.com/
wp-content/uploads/2013/01/ColdChainAnalyticsMagpieSensing-Whitepaper (accessed July 2013);
Centers for Disease Control and Prevention, Vaccine Storage and Handling, http://www.cdc.gov/vaccines/pubs/
pinkbook/vac-storage.html#storage (accessed July 2013); A. Zaleski, “Magpie Analytics System Tracks Cold-
Chain Products to Keep Vaccines, Reagents Fresh” (2012), technicallybaltimore.com/profiles/startups/magpie-
analytics-system-tracks-cold-chain-products-to-keep-vaccines-reagents-fresh (accessed February 2013).
1.2 CHANGING BUSINESS ENVIRONMENTS AND COMPUTERIZED
DECISION SUPPORT
The opening vignette illustrates how a company can employ technologies to make sense
of data and make better decisions. Companies are moving aggressively to computerized
support of their operations. To understand why companies are embracing computer-
ized support, including business intelligence, we developed a model called the Business
Pressures-Responses-Support Model, which is shown in Figure 1.1.
The Business Pressures-Responses-Support Model
The Business Pressures-Responses-Support Model, as its name indicates, has three com-
ponents: business pressures that result from today’s business climate, responses (actions
taken) by companies to counter the pressures (or to take advantage of the opportunities
available in the environment), and computerized support that facilitates the monitoring
of the environment and enhances the response actions taken by organizations.
FIGURE 1.1 The Business Pressures-Responses-Support Model. [Figure: business environmental factors (globalization, customer demand, government regulations, market conditions, competition, etc.) exert pressures on and present opportunities to the organization; the organization responds (strategy, partners' collaboration, real-time response, agility, increased productivity, new vendors, new business models, etc.); decisions and support (analyses, predictions, decisions) rest on integrated computerized decision support and business intelligence.]

THE BUSINESS ENVIRONMENT The environment in which organizations operate today
is becoming more and more complex. This complexity creates opportunities on the one
hand and problems on the other. Take globalization as an example. Today, you can eas-
ily find suppliers and customers in many countries, which means you can buy cheaper
materials and sell more of your products and services; great opportunities exist. However,
globalization also means more and stronger competitors. Business environment factors
can be divided into four major categories: markets, consumer demands, technology, and
societal. These categories are summarized in Table 1.1.
Note that the intensity of most of these factors increases with time, leading to
more pressures, more competition, and so on. In addition, organizations and departments
within organizations face decreased budgets and amplified pressures from top managers
to increase performance and profit. In this kind of environment, managers must respond
quickly, innovate, and be agile. Let’s see how they do it.
ORGANIZATIONAL RESPONSES: BE REACTIVE, ANTICIPATIVE, ADAPTIVE, AND PROACTIVE
Both private and public organizations are aware of today’s business environment and
pressures. They use different actions to counter the pressures. Vodafone New Zealand
Ltd (Krivda, 2008), for example, turned to BI to improve communication and to support
executives in its effort to retain existing customers and increase revenue from these cus-
tomers. Managers may take other actions, including the following:
• Employ strategic planning.
• Use new and innovative business models.
• Restructure business processes.
• Participate in business alliances.
• Improve corporate information systems.
• Improve partnership relationships.
TABLE 1.1 Business Environment Factors That Create Pressures on Organizations

Markets
  Strong competition
  Expanding global markets
  Booming electronic markets on the Internet
  Innovative marketing methods
  Opportunities for outsourcing with IT support
  Need for real-time, on-demand transactions
Consumer demands
  Desire for customization
  Desire for quality, diversity of products, and speed of delivery
  Customers getting powerful and less loyal
Technology
  More innovations, new products, and new services
  Increasing obsolescence rate
  Increasing information overload
  Social networking, Web 2.0 and beyond
Societal
  Growing government regulations and deregulation
  Workforce more diversified, older, and composed of more women
  Prime concerns of homeland security and terrorist attacks
  Necessity of Sarbanes-Oxley Act and other reporting-related legislation
  Increasing social responsibility of companies
  Greater emphasis on sustainability

• Encourage innovation and creativity.
• Improve customer service and relationships.
• Employ social media and mobile platforms for e-commerce and beyond.
• Move to make-to-order production and on-demand manufacturing and services.
• Use new IT to improve communication, data access (discovery of information), and
collaboration.
• Respond quickly to competitors’ actions (e.g., in pricing, promotions, new products
and services).
• Automate many tasks of white-collar employees.
• Automate certain decision processes, especially those dealing with customers.
• Improve decision making by employing analytics.
Many, if not all, of these actions require some computerized support. These and other
response actions are frequently facilitated by computerized decision support (DSS).
CLOSING THE STRATEGY GAP One of the major objectives of computerized decision
support is to facilitate closing the gap between the current performance of an organi-
zation and its desired performance, as expressed in its mission, objectives, and goals,
and the strategy to achieve them. In order to understand why computerized support
is needed and how it is provided, especially for decision-making support, let’s look at
managerial decision making.
SECTION 1.2 REVIEW QUESTIONS
1. List the components of and explain the Business Pressures-Responses-Support
Model.
2. What are some of the major factors in today’s business environment?
3. What are some of the major response activities that organizations take?
1.3 MANAGERIAL DECISION MAKING
Management is a process by which organizational goals are achieved by using
resources. The resources are considered inputs, and attainment of goals is viewed as
the output of the process. The degree of success of the organization and the manager
is often measured by the ratio of outputs to inputs. This ratio is an indication of the
organization's productivity, which is a reflection of the organizational and managerial
performance.
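As a toy illustration of this outputs-to-inputs ratio (the dollar figures below are invented):

```python
# Toy illustration of the productivity ratio described above; figures are invented.

def productivity(outputs, inputs):
    """Ratio of goal attainment (outputs) to resources consumed (inputs)."""
    return outputs / inputs

# e.g., $500K of delivered value produced from $400K of resources
print(productivity(500_000, 400_000))  # 1.25
```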
The level of productivity or the success of management depends on the performance
of managerial functions, such as planning, organizing, directing, and controlling.
To perform their functions, managers engage in a continuous process of making
decisions. Making a decision means selecting the best alternative from two or more
solutions.
The Nature of Managers’ Work
Mintzberg's (2008) classic study of top managers and several replicated studies suggest
that managers perform 10 major roles that can be classified into three major categories:
interpersonal, informational, and decisional (see Table 1.2).
To perform these roles, managers need information that is delivered efficiently and
in a timely manner to personal computers (PCs) on their desktops and to mobile devices.
This information is delivered by networks, generally via Web technologies.
In addition to obtaining information necessary to better perform their roles, manag-
ers use computers directly to support and improve decision making, which is a key task

TABLE 1.2 Mintzberg's 10 Managerial Roles

Interpersonal
  Figurehead: Is symbolic head; obliged to perform a number of routine duties of a legal or social nature
  Leader: Is responsible for the motivation and activation of subordinates; responsible for staffing, training, and associated duties
  Liaison: Maintains self-developed network of outside contacts and informers who provide favors and information
Informational
  Monitor: Seeks and receives a wide variety of special information (much of it current) to develop a thorough understanding of the organization and environment; emerges as the nerve center of the organization's internal and external information
  Disseminator: Transmits information received from outsiders or from subordinates to members of the organization; some of this information is factual, and some involves interpretation and integration
  Spokesperson: Transmits information to outsiders about the organization's plans, policies, actions, results, and so forth; serves as an expert on the organization's industry
Decisional
  Entrepreneur: Searches the organization and its environment for opportunities and initiates improvement projects to bring about change; supervises design of certain projects
  Disturbance handler: Is responsible for corrective action when the organization faces important, unexpected disturbances
  Resource allocator: Is responsible for the allocation of organizational resources of all kinds; in effect, is responsible for the making or approval of all significant organizational decisions
  Negotiator: Is responsible for representing the organization at major negotiations

Sources: Compiled from H. A. Mintzberg, The Nature of Managerial Work, Prentice Hall, Englewood Cliffs, NJ, 1980; and H. A. Mintzberg, The Rise and Fall of Strategic Planning, The Free Press, New York, 1993.
that is part of most of these roles. Many managerial activities in all roles revolve around
decision making. Managers, especially those at high managerial levels, are primarily deci-
sion makers. We review the decision-making process next but will study it in more detail
in the next chapter.
The Decision-Making Process
For years, managers considered decision making purely an art-a talent acquired over a
long period through experience (i.e., learning by trial-and-error) and by using intuition.
Management was considered an art because a variety of individual styles could be used
in approaching and successfully solving the same types of managerial problems. These
styles were often based on creativity, judgment, intuition, and experience rather than
on systematic quantitative methods grounded in a scientific approach. However, recent
research suggests that companies with top managers who are more focused on persistent
work (almost dullness) tend to outperform those with leaders whose main strengths are
interpersonal communication skills (Kaplan et al., 2008; Brooks, 2009). It is more impor-
tant to emphasize methodical, thoughtful, analytical decision making rather than flashi-
ness and interpersonal communication skills.

Managers usually make decisions by following a four-step process (we learn more
about these in Chapter 2):
1. Define the problem (i.e., a decision situation that may deal with some difficulty or
with an opportunity).
2. Construct a model that describes the real-world problem.
3. Identify possible solutions to the modeled problem and evaluate the solutions.
4. Compare, choose, and recommend a potential solution to the problem.
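Steps 2 through 4 above can be made concrete with a simple weighted-scoring model, one common way to model a choice among alternatives. The criteria, weights, and vendors below are hypothetical, chosen only to illustrate the process.

```python
# A generic weighted-scoring sketch of steps 2-4: model the decision as weighted
# criteria, evaluate each alternative, and recommend the highest scorer.
# Criteria, weights, and alternatives are hypothetical.

def recommend(alternatives, weights):
    """alternatives: {name: {criterion: score}}; weights: {criterion: weight}.
    Returns (best_name, best_total) under a weighted-sum model."""
    def total(scores):
        return sum(weights[c] * s for c, s in scores.items())
    name = max(alternatives, key=lambda n: total(alternatives[n]))
    return name, total(alternatives[name])

weights = {"cost": 0.5, "speed": 0.3, "risk": 0.2}  # higher score = better
alternatives = {
    "vendor A": {"cost": 7, "speed": 9, "risk": 6},
    "vendor B": {"cost": 9, "speed": 6, "risk": 7},
}
name, score = recommend(alternatives, weights)
print(name, round(score, 2))  # the recommended alternative and its total score
```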
To follow this process, one must make sure that sufficient alternative solutions are
being considered, that the consequences of using these alternatives can be reasonably
predicted, and that comparisons are done properly. However, the environmental factors
listed in Table 1.1 make such an evaluation process difficult for the following reasons:
• Technology, information systems, advanced search engines, and globalization result
in more and more alternatives from which to choose.
• Government regulations and the need for compliance, political instability and ter-
rorism, competition, and changing consumer demands produce more uncertainty,
making it more difficult to predict consequences and the future.
• Other factors are the need to make rapid decisions, the frequent and unpredictable
changes that make trial-and-error learning difficult, and the potential costs of making
mistakes.
• These environments are growing more complex every day. Therefore, making deci-
sions today is indeed a complex task.
Because of these trends and changes, it is nearly impossible to rely on a trial-and-
error approach to management, especially for decisions for which the factors shown in
Table 1.1 are strong influences. Managers must be more sophisticated; they must use the
new tools and techniques of their fields. Most of those tools and techniques are discussed
in this book. Using them to support decision making can be extremely rewarding in
making effective decisions. In the following section, we look at why we need computer
support and how it is provided.
SECTION 1.3 REVIEW QUESTIONS
1. Describe the three major managerial roles, and list some of the specific activities in each.
2. Why have some argued that management is the same as decision making?
3. Describe the four steps managers take in making a decision.
1.4 INFORMATION SYSTEMS SUPPORT FOR DECISION MAKING
From traditional uses in payroll and bookkeeping functions, computerized systems have
penetrated complex managerial areas ranging from the design and management of auto-
mated factories to the application of analytical methods for the evaluation of proposed
mergers and acquisitions. Nearly all executives know that information technology is vital
to their business and extensively use information technologies.
Computer applications have moved from transaction processing and monitoring
activities to problem analysis and solution applications, and much of the activity is done
with Web-based technologies, in many cases accessed through mobile devices. Analytics
and BI tools such as data warehousing, data mining, online analytical processing (OLAP),
dashboards, and the use of the Web for decision support are the cornerstones of today's
modern management. Managers must have high-speed, networked information sys-
tems (wireline or wireless) to assist them with their most important task: making deci-
sions. Besides the obvious growth in hardware, software, and network capacities, some

developments have clearly contributed to facilitating growth of decision support and
analytics in a number of ways, including the following:
• Group communication and collaboration. Many decisions are made today by
groups whose members may be in different locations. Groups can collaborate and
communicate readily by using Web-based tools as well as the ubiquitous smartphones.
Collaboration is especially important along the supply chain, where partners-all the
way from vendors to customers-must share information. Assembling a group of
decision makers, especially experts, in one place can be costly. Information systems
can improve the collaboration process of a group and enable its members to be at dif-
ferent locations (saving travel costs). We will study some applications in Chapter 12.
• Improved data management. Many decisions involve complex computations.
Data for these can be stored in different databases anywhere in the organization
and even possibly at Web sites outside the organization. The data may include text,
sound, graphics, and video, and they can be in different languages. It may be neces-
sary to transmit data quickly from distant locations. Systems today can search, store,
and transmit needed data quickly, economically, securely, and transparently.
• Managing giant data warehouses and Big Data. Large data warehouses, like
the ones operated by Walmart, contain terabytes and even petabytes of data. Special
methods, including parallel computing, are available to organize, search, and mine
the data. The costs related to data warehousing are declining. Technologies that fall
under the broad category of Big Data have enabled massive data coming from a
variety of sources and in many different forms, which allows a very different view
into organizational performance that was not possible in the past.
• Analytical support. With more data and analysis technologies, more alterna-
tives can be evaluated, forecasts can be improved, risk analysis can be performed
quickly, and the views of experts (some of whom may be in remote locations) can
be collected quickly and at a reduced cost. Expertise can even be derived directly
from analytical systems. With such tools, decision makers can perform complex
simulations, check many possible scenarios, and assess diverse impacts quickly and
economically. This, of course, is the focus of several chapters in the book.
• Overcoming cognitive limits in processing and storing information. According
to Simon (1977), the human mind has only a limited ability to process and store
information. People sometimes find it difficult to recall and use information in an
error-free fashion due to their cognitive limits. The term cognitive limits indicates
that an individual's problem-solving capability is limited when a wide range of diverse
information and knowledge is required. Computerized systems enable people to
overcome their cognitive limits by quickly accessing and processing vast amounts of
stored information (see Chapter 2).
• Knowledge management. Organizations have gathered vast stores of information
about their own operations, customers, internal procedures, employee interactions,
and so forth through the unstructured and structured communications taking
place among the various stakeholders. Knowledge management systems (KMS,
Chapter 12) have become sources of formal and informal support for decision
making to managers, although sometimes they may not even be called KMS.
• Anywhere, any time support. Using wireless technology, managers can access
information anytime and from any place, analyze and interpret it, and communicate
with those involved. This perhaps is the biggest change that has occurred in the last
few years. The speed at which information needs to be processed and converted
into decisions has truly changed expectations for both consumers and businesses.
These and other capabilities have been driving the use of computerized decision support
since the late 1960s, but especially since the mid-1990s. The growth of mobile technologies,

social media platforms, and analytical tools has enabled a much higher level of information
systems support for managers. In the next sections we study a historical classification of
decision support tasks, which leads to an introduction to decision support systems. We
will then present an overview of technologies that have been broadly referred to as
business intelligence, and from there broaden our horizons to introduce various types of analytics.
SECTION 1.4 REVIEW QUESTIONS
1. What are some of the key system-oriented trends that have fostered IS-supported
decision making to a new level?
2. List some capabilities of information systems that can facilitate managerial decision
making.
3. How can a computer help overcome the cognitive limits of humans?
1.5 AN EARLY FRAMEWORK FOR COMPUTERIZED DECISION SUPPORT
An early framework for computerized decision support includes several major concepts
that are used in forthcoming sections and chapters of this book. Gorry and Scott-Morton
created and used this framework in the early 1970s, and the framework then evolved into
a new technology called DSS.
The Gorry and Scott-Morton Classical Framework
Gorry and Scott-Morton (1971) proposed a framework that is a 3-by-3 matrix, as shown in
Figure 1.2. The two dimensions are the degree of structuredness and the types of control.
Type of Decision by Type of Control (cells numbered 1-9):

Structured
  1. Operational control: accounts receivable, accounts payable, order entry
  2. Managerial control: budget analysis, short-term forecasting, personnel reports, make-or-buy
  3. Strategic planning: financial management, investment portfolio, warehouse location, distribution systems
Semistructured
  4. Operational control: production scheduling, inventory control
  5. Managerial control: credit evaluation, budget preparation, plant layout, project scheduling, reward system design, inventory categorization
  6. Strategic planning: building a new plant, mergers & acquisitions, new product planning, compensation planning, quality assurance, HR policies, inventory planning
Unstructured
  7. Operational control: buying software, approving loans, operating a help desk, selecting a cover for a magazine
  8. Managerial control: negotiating, recruiting an executive, buying hardware, lobbying
  9. Strategic planning: R&D planning, new tech development, social responsibility planning

FIGURE 1.2 Decision Support Frameworks.

12 Part I • Decision Making and Analytics: An Overview
DEGREE OF STRUCTUREDNESS The left side of Figure 1.2 is based on Simon’s (1977) idea
that decision-making processes fall along a continuum that ranges from highly structured
(sometimes called programmed) to highly unstructured (i.e., nonprogrammed) decisions.
Structured processes are routine and typically repetitive problems for which standard
solution methods exist. Unstructured processes are fuzzy, complex problems for which
there are no cut-and-dried solution methods.
An unstructured problem is one where the articulation of the problem or the solu-
tion approach may be unstructured in itself. In a structured problem, the procedures
for obtaining the best (or at least a good enough) solution are known. Whether the prob-
lem involves finding an appropriate inventory level or choosing an optimal investment
strategy, the objectives are clearly defined. Common objectives are cost minimization and
profit maximization.
Semistructured problems fall between structured and unstructured problems, hav-
ing some structured elements and some unstructured elements. Keen and Scott-Morton
(1978) mentioned trading bonds, setting marketing budgets for consumer products, and
performing capital acquisition analysis as semistructured problems.
TYPES OF CONTROL The second half of the Gorry and Scott-Morton framework
(refer to Figure 1.2) is based on Anthony's (1965) taxonomy, which defines three
broad categories that encompass all managerial activities: strategic planning, which
involves defining long-range goals and policies for resource allocation; manage-
ment control, the acquisition and efficient use of resources in the accomplishment of
organizational goals; and operational control, the efficient and effective execution of
specific tasks.
THE DECISION SUPPORT MATRIX Anthony’s and Simon’s taxonomies are combined in the
nine-cell decision support matrix shown in Figure 1.2. The initial purpose of this matrix
was to suggest different types of computerized support to different cells in the matrix.
Gorry and Scott-Morton suggested, for example, that for semistructured decisions and
unstructured decisions, conventional management information systems (MIS) and man-
agement science (MS) tools are insufficient. Human intellect and a different approach to
computer technologies are necessary. They proposed the use of a supportive information
system, which they called a DSS.
Note that the more structured and operational control-oriented tasks (such as
those in cells 1, 2, and 4) are usually performed by lower-level managers, whereas
the tasks in cells 6, 8, and 9 are the responsibility of top executives or highly trained
specialists.
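The framework's nine cells lend themselves to a simple lookup structure. Below is a minimal sketch in Python, using the cell numbers and example tasks from Figure 1.2; the dictionary layout and function name are our own illustration, not part of the framework itself.

```python
# A sketch of the Gorry and Scott-Morton matrix as a lookup table.
# Cell numbers and example tasks follow Figure 1.2; the data structure
# is merely one convenient way to represent the framework.

DECISION_MATRIX = {
    ("structured",     "operational"): (1, ["accounts receivable", "order entry"]),
    ("structured",     "managerial"):  (2, ["budget analysis", "short-term forecasting"]),
    ("structured",     "strategic"):   (3, ["financial management", "warehouse location"]),
    ("semistructured", "operational"): (4, ["production scheduling", "inventory control"]),
    ("semistructured", "managerial"):  (5, ["credit evaluation", "budget preparation"]),
    ("semistructured", "strategic"):   (6, ["building a new plant", "new product planning"]),
    ("unstructured",   "operational"): (7, ["buying software", "operating a help desk"]),
    ("unstructured",   "managerial"):  (8, ["negotiating", "recruiting an executive"]),
    ("unstructured",   "strategic"):   (9, ["R&D planning", "social responsibility planning"]),
}

def cell(structure: str, control: str) -> int:
    """Return the cell number (1-9) for a structure/control pair."""
    return DECISION_MATRIX[(structure, control)][0]
```

A lookup such as `cell("structured", "operational")` returns 1, the cell typically handled by lower-level managers.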
Computer Support for Structured Decisions
Computers have historically supported structured and some semistructured decisions,
especially those that involve operational and managerial control, since the 1960s.
Operational and managerial control decisions are made in all functional areas, especially
in finance and production (i.e., operations) management.
Structured problems, which are encountered repeatedly, have a high level of structure.
It is therefore possible to abstract, analyze, and classify them into specific categories.
For example, a make-or-buy decision is one category. Other examples of categories
are capital budgeting, allocation of resources, distribution, procurement, planning, and
inventory control decisions. For each category of decision, an easy-to-apply prescribed
model and solution approach have been developed, generally as quantitative formulas.
Therefore, it is possible to use a scientific approach for automating portions of manage-
rial decision making.
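Inventory control is one such category: the classic economic order quantity (EOQ) model, for instance, prescribes the order size directly from a formula. A minimal sketch follows; the demand and cost figures are invented for illustration, and the book does not present this particular model here.

```python
from math import sqrt

def eoq(annual_demand: float, order_cost: float, holding_cost: float) -> float:
    """Economic order quantity: the order size minimizing total ordering
    plus holding cost, Q* = sqrt(2 * D * S / H)."""
    return sqrt(2 * annual_demand * order_cost / holding_cost)

# Hypothetical figures: 1,200 units/year demand, $50 per order,
# $6 per unit per year holding cost.
q = eoq(1200, 50, 6)   # about 141 units per order
```

Because the objective (cost minimization) and the procedure are both known, a decision like this can be automated end to end.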

Computer Support for Unstructured Decisions
Unstructured problems can be only partially supported by standard computerized quan-
titative methods. It is usually necessary to develop customized solutions. However, such
solutions may benefit from data and information generated from corporate or external
data sources. Intuition and judgment may play a large role in these types of decisions, as
may computerized communication and collaboration technologies, as well as knowledge
management (see Chapter 12).
Computer Support for Semistructured Problems
Solving semistructured problems may involve a combination of standard solution pro-
cedures and human judgment. Management science can provide models for the portion
of a decision-making problem that is structured. For the unstructured portion, a DSS can
improve the quality of the information on which the decision is based by providing, for
example, not only a single solution but also a range of alternative solutions, along with
their potential impacts. These capabilities help managers to better understand the nature
of problems and, thus, to make better decisions.
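The idea of presenting a range of alternatives along with their potential impacts, rather than a single answer, can be sketched as follows. The pricing alternatives and the linear demand model are entirely invented for illustration.

```python
def compare_alternatives(alternatives, impact):
    """For the unstructured portion of a problem, rank a range of
    alternatives by their estimated impact instead of returning one answer."""
    return sorted(
        ((name, impact(params)) for name, params in alternatives.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )

# Hypothetical pricing decision; the demand model is invented.
def profit(p):
    price, unit_cost = p
    demand = max(0, 1000 - 8 * price)   # assumed linear demand curve
    return (price - unit_cost) * demand

options = {"low price": (40, 25), "mid price": (55, 25), "high price": (70, 25)}
ranked = compare_alternatives(options, profit)
# ranks "high price" first (19800), then "mid price", then "low price"
```

The manager still makes the final call; the system's contribution is the side-by-side view of impacts.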
SECTION 1.5 REVIEW QUESTIONS
1. What are structured, unstructured, and semistructured decisions? Provide two exam-
ples of each.
2. Define operational control, managerial control, and strategic planning. Provide two
examples of each.
3. What are the nine cells of the decision framework? Explain what each is for.
4. How can computers provide support for making structured decisions?
5. How can computers provide support to semistructured and unstructured decisions?
1.6 THE CONCEPT OF DECISION SUPPORT SYSTEMS (DSS)
In the early 1970s, Scott-Morton first articulated the major concepts of DSS. He defined
decision support systems (DSS) as “interactive computer-based systems, which help
decision makers utilize data and models to solve unstructured problems” (Gorry and
Scott-Morton, 1971). The following is another classic DSS definition, provided by Keen
and Scott-Morton (1978):
Decision support systems couple the intellectual resources of individuals with
the capabilities of the computer to improve the quality of decisions. It is a
computer-based support system for management decision makers who deal
with semistructured problems.
Note that the term decision support system, like management information system (MIS)
and other terms in the field of IT, is a content-free expression (i.e., it means different
things to different people). Therefore, there is no universally accepted definition of DSS.
(We present additional definitions in Chapter 2.) Actually, DSS can be viewed as a
conceptual methodology, that is, a broad, umbrella term. However, some view DSS as a
narrower, specific decision support application.
DSS as an Umbrella Term
The term DSS can be used as an umbrella term to describe any computerized system that
supports decision making in an organization. An organization may have a knowledge

management system to guide all its personnel in their problem solving. Another organization
may have separate support systems for marketing, finance, and accounting; a supply
chain management (SCM) system for production; and several rule-based systems for
product repair diagnostics and help desks. DSS encompasses them all.
Evolution of DSS into Business Intelligence
In the early days of DSS, managers let their staff do some supportive analysis by using
DSS tools. As PC technology advanced, a new generation of managers evolved, one
that was comfortable with computing and knew that technology can directly help
make intelligent business decisions faster. New tools such as OLAP, data warehousing,
data mining, and intelligent systems, delivered via Web technology, added promised
capabilities and easy access to tools, models, and data for computer-aided decision
making. These tools started to appear under the names BI and business analytics in
the mid-1990s. We introduce these concepts next, and relate the DSS and BI concepts
in the following sections.
SECTION 1.6 REVIEW QUESTIONS
1. Provide two definitions of DSS.
2. Describe DSS as an umbrella term.
1.7 A FRAMEWORK FOR BUSINESS INTELLIGENCE (BI)
The decision support concepts presented in Sections 1.5 and 1.6 have been implemented
incrementally, under different names, by many vendors that have created tools and meth-
odologies for decision support. As the enterprise-wide systems grew, managers were
able to access user-friendly reports that enabled them to make decisions quickly . These
systems, which were generally called executive information systems (EIS), then began to
offer additional visualization, alerts , and performance measurement capabilities. By 2006,
the major commercial products and services appeared under the umbrella term business
intelligence (BI).
Definitions of BI
Business intelligence (BI) is an umbrella term that combines architectures, tools, databases,
analytical tools, applications, and methodologies. It is, like DSS, a content-free
expression, so it means different things to different people. Part of the confusion about
BI lies in the flurry of acronyms and buzzwords that are associated with it (e.g., business
performance management [BPM]). BI's major objective is to enable interactive access
(sometimes in real time) to data, to enable manipulation of data, and to give business
managers and analysts the ability to conduct appropriate analyses. By analyzing historical
and current data, situations, and performances, decision makers get valuable insights that
enable them to make more informed and better decisions. The process of BI is based on
the transformation of data to information, then to decisions, and finally to actions.
A Brief History of BI
The term BI was coined by the Gartner Group in the mid-1990s. However, the concept is
much older; it has its roots in the MIS reporting systems of the 1970s. During that period,
reporting systems were static, two-dimensional, and had no analytical capabilities. In the
early 1980s, the concept of executive information systems (EIS) emerged. This concept
expanded the computerized support to top-level managers and executives. Some of the

[Figure 1.3 shows the evolution of BI: capabilities such as querying and reporting, data
warehouses and data marts, EIS/ESS, financial reporting, OLAP, digital cockpits and
dashboards, scorecards and dashboards, workflow, alerts and notifications, data mining,
predictive analytics, broadcasting tools, and portals all feeding into business intelligence.]

FIGURE 1.3 Evolution of Business Intelligence (BI).
capabilities introduced were dynamic multidimensional (ad hoc or on-demand) reporting,
forecasting and prediction, trend analysis, drill-down to details, status access, and critical
success factors. These features appeared in dozens of commercial products until the
mid-1990s. Then the same capabilities and some new ones appeared under the name BI.
Today, a good BI-based enterprise information system contains all the information executives
need. So, the original concept of EIS was transformed into BI. By 2005, BI systems
started to include artificial intelligence capabilities as well as powerful analytical capabilities.
Figure 1.3 illustrates the various tools and techniques that may be included in a BI
system. It illustrates the evolution of BI as well. The tools shown in Figure 1.3 provide the
capabilities of BI. The most sophisticated BI products include most of these capabilities;
others specialize in only some of them. We will study several of these capabilities in more
detail in Chapters 5 through 9.
The Architecture of BI
A BI system has four major components: a data warehouse, with its source data; business
analytics, a collection of tools for manipulating, mining, and analyzing the data in the data
warehouse; business performance management (BPM) for monitoring and analyzing performance;
and a user interface (e.g., a dashboard). The relationship among these components is
illustrated in Figure 1.4. We will discuss these components in detail in Chapters 3 through 9.
Styles of BI
The architecture of BI depends on its applications. MicroStrategy Corp. distinguishes five
styles of BI and offers special tools for each. The five styles are report delivery and alert-
ing; enterprise reporting (using dashboards and scorecards); cube analysis (also known as
slice-and-dice analysis); ad hoc queries; and statistics and data mining.

[Figure 1.4 shows data flowing from source systems into the data warehouse environment,
where technical staff build the data warehouse by organizing, summarizing, and
standardizing the data; the business analytics environment, which manipulates the data
and produces results (with intelligent systems as a future component); performance and
strategy, where managers and executives set BPM strategies; and a user interface such as
a browser, portal, or dashboard.]

FIGURE 1.4 A High-Level Architecture of BI. Source: Based on W. Eckerson, Smart Companies in the
21st Century: The Secrets of Creating Successful Business Intelligent Solutions. The Data Warehousing
Institute, Seattle, WA, 2003, p. 32, Illustration 5.
The Origins and Drivers of BI
Where did modern approaches to data warehousing (DW) and BI come from? What are
their roots, and how do those roots affect the way organizations are managing these initia-
tives today? Today’s investments in information technology are under increased scrutiny
in terms of their bottom-line impact and potential. The same is true of DW and the BI
applications that make these initiatives possible.
Organizations are being compelled to capture , understand, and harness their data
to support decision making in order to improve business operations . Legislation and
regulation (e.g., the Sarbanes-Oxley Act of 2002) now require business leaders to docu-
ment their business processes and to sign off on the legitimacy of the information they
rely on and report to stakeholders. Moreover, business cycle times are now extremely
compressed; faster, more informed, and better decision making is therefore a competitive
imperative. Managers need the right infonnation at the right time and in the right place.
This is the mantra for modern approaches to BI.
Organizations have to work smart. Paying careful attention to the management of BI
initiatives is a necessary aspect of doing business. It is no surprise, then, that organizations
are increasingly championing BI. You will hear about more BI successes and the funda-
mentals of those successes in Chapters 3 through 9. Examples of many applications of BI
are provided in Table 1.3. Application Case 1.1 illustrates one such application of BI that
has helped many airlines, as well as the companies offering such services to the airlines.
A Multimedia Exercise in Business Intelligence
Teradata University Network (TUN) includes some videos along the lines of the televi-
sion show CSI to illustrate concepts of analytics in different industries. These are called
“BSI Videos (Business Scenario Investigations).” Not only are these entertaining, but
they also provide the class with some questions for discussion. For starters, please go to
teradatauniversitynetwork.com/teach-and-learn/library-item/?Libraryltemld=889.
Watch the video that appears on YouTube. Essentially, you have to assume the role of a
customer service center professional. An incoming flight is running late, and several pas-
sengers are likely to miss their connecting flights. There are seats on one outgoing flight
that can accommodate two of the four passengers. Which two passengers should be given

TABLE 1.3 Business Value of BI Analytical Applications

Customer segmentation
  Business question: What market segments do my customers fall into, and what are their characteristics?
  Business value: Personalize customer relationships for higher satisfaction and retention.

Propensity to buy
  Business question: Which customers are most likely to respond to my promotion?
  Business value: Target customers based on their need to increase their loyalty to your product line. Also, increase campaign profitability by focusing on those most likely to buy.

Customer profitability
  Business question: What is the lifetime profitability of my customer?
  Business value: Make individual business interaction decisions based on the overall profitability of customers.

Fraud detection
  Business question: How can I tell which transactions are likely to be fraudulent?
  Business value: Quickly determine fraud and take immediate action to minimize cost.

Customer attrition
  Business question: Which customer is at risk of leaving?
  Business value: Prevent loss of high-value customers and let go of lower-value customers.

Channel optimization
  Business question: What is the best channel to reach my customer in each segment?
  Business value: Interact with customers based on their preference and your need to manage cost.
Source: A. Ziama and J. Kasher, Data Mining Primer for the Data Warehousing Professional. Teradata, Dayton, OH, 2004.
Application Case 1.1
Sabre Helps Its Clients Through Dashboards and Analytics
Sabre is one of the world leaders in the travel indus-
try, providing both business-to-consumer services as
well as business-to-business services. It serves travel-
ers, travel agents, corporations, and travel suppliers
through its four main companies: Travelocity, Sabre
Travel Network, Sabre Airline Solutions, and Sabre
Hospitality Solutions. The current volatile global eco-
nomic environment poses significant competitive chal-
lenges to the airline industry. To stay ahead of the
competition, Sabre Airline Solutions recognized that
airline executives needed enhanced tools for manag-
ing their business decisions by eliminating the tradi-
tional, manual, time-consuming process of collect-
ing and aggregating financial and other information
needed for actionable initiatives. This enables real-time
decision support at airlines throughout the world that
maximize their (and, in turn, Sabre’s) return on infor-
mation by driving insights, actionable intelligence, and
value for customers from the growing data.
Sabre developed an Enterprise Travel Data
Warehouse (ETDW) using Teradata to hold its mas-
sive reservations data. ETDW is updated in near-real
time with batches that run every 15 minutes, gathering
data from all of Sabre’s businesses. Sabre uses its
ETDW to create Sabre Executive Dashboards that pro-
vide near-real-time executive insights using a Cognos
8 BI platform with Oracle Data Integrator and Oracle
Goldengate technology infrastructure. The Executive
Dashboards offer their client airlines’ top-level managers
and decision makers a timely, automated, user-friendly
solution, aggregating critical performance metrics in a
succinct way and providing at a glance a 360-degree
view of the overall health of the airline. At one airline,
Sabre’s Executive Dashboards provide senior management
with a daily and intra-day snapshot of key performance
indicators in a single application, replacing the
once-a-week, 8-hour process of generating the same
report from various data sources. The use of dashboards
is not limited to the external customers; Sabre also uses
them for their assessment of internal operational performance.
The dashboards help Sabre’s customers to have
a clear understanding of the data through the visual
displays that incorporate interactive drill-down capa-
bilities. It replaces flat presentations and allows for
more focused review of the data with less effort and
(Continued)

Application Case 1.1 (Continued)
time. This facilitates team dialog by making the data/
metrics pertaining to sales performance, including
ticketing, seats sold and flown, operational perfor-
mance such as data on flight movement and track-
ing, customer reservations, inventory, and revenue
across an airline’s multiple distribution channels, avail-
able to many stakeholders. The dashboard systems
provide scalable infrastructure, graphical user interface
(GUI) support, data integration, and data aggregation
that empower airline executives to be more proactive
in taking actions that lead to positive impacts on the
overall health of their airline.
With its ETDW, Sabre could also develop other
Web-based analytical and reporting solutions that lev-
erage data to gain customer insights through analysis
of customer profiles and their sales interactions to cal-
culate customer value. This enables better customer
segmentation and insights for value-added services.
QUESTIONS FOR DISCUSSION
1. What is traditional reporting? How is it used in
organizations?
2. How can analytics be used to transform tradi-
tional reporting?
3. How can interactive reporting assist organiza-
tions in decision making?
What We Can Learn from This Application
Case
This Application Case shows that organizations
that earlier used reporting only for tracking their
internal business activities and meeting compliance
requirements set out by the government are now
moving toward generating actionable intelligence
from their transactional business data. Reporting
has become broader as organizations are now try-
ing to analyze archived transactional data to under-
stand underlying hidden trends and patterns that
would enable them to make better decisions by
gaining insights into problematic areas and resolv-
ing them to pursue current and future market
opportunities. Reporting has advanced to interac-
tive online reports that enable users to pull and
quickly build custom reports as required and even
present the reports aided by visualization tools
that have the ability to connect to the database,
providing the capabilities of digging deep into
summarized data .
Source: Teradata.com, “Sabre Airline Solutions,” teradata.com/t/
case-studies/Sabre-Airline-Solutions-EB6281 (accessed
February 2013).
priority? You are given information about customers’ profiles and relationship with the air-
line. Your decisions might change as you learn more about those customers’ profiles.
Watch the video, pause it as appropriate, and answer the questions on which pas-
sengers should be given priority. Then resume the video to get more information. After
the video is complete, you can see the slides related to this video and how the analysis
was prepared on a slide set at teradatauniversitynetwork.com/templates/Download.
aspx?Contentltemld=891. Please note that access to this content requires initial registration.
This multimedia excursion provides an example of how additional information made
available through an enterprise data warehouse can assist in decision making.
The DSS-BI Connection
By now, you should be able to see some of the similarities and differences between DSS
and BI. First, their architectures are very similar because BI evolved from DSS. However,
BI implies the use of a data warehouse, whereas DSS may or may not have such a feature.
BI is, therefore, more appropriate for large organizations (because data warehouses are
expensive to build and maintain), but DSS can be appropriate to any type of organization.
Second, most DSS are constructed to directly support specific decision making. BI
systems, in general, are geared to provide accurate and timely information, and they sup-
port decision support indirectly. This situation is changing, however, as more and more
decision support tools are being added to BI software packages.

Third, BI has an executive and strategy orientation, especially in its BPM and dash-
board components. DSS, in contrast, is oriented toward analysts.
Fourth, most BI systems are constructed with commercially available tools and com-
ponents that are fitted to the needs of organizations. In building DSS, the interest may
be in constructing solutions to very unstructured problems. In such situations, more pro-
gramming (e.g., using tools such as Excel) may be needed to customize the solutions.
Fifth, DSS methodologies and even some tools were developed mostly in the academic
world. BI methodologies and tools were developed mostly by software companies.
(See Zaman, 2005, for information on how BI has evolved.)
Sixth, many of the tools that BI uses are also considered DSS tools. For example,
data mining and predictive analysis are core tools in both areas.
Although some people equate DSS with BI, these systems are not, at present, the
same. It is interesting to note that some people believe that DSS is a part of BI-one of its
analytical tools. Others think that BI is a special case of DSS that deals mostly with report-
ing, communication, and collaboration (a form of data-oriented DSS) . Another explana-
tion (Watson, 2005) is that BI is a result of a continuous revolution and, as such, DSS is
one of Bi’s original elements. In this book, we separate DSS from BI. However, we point
to the DSS-BI connection frequently. Further, as noted in the next section onward, in
many circles BI has been subsumed by the new term analytics or data science.
SECTION 1.7 REVIEW QUESTIONS
1. Define BI.
2. List and describe the major components of BI.
3. What are the major similarities and differences of DSS and BI?
1.8 BUSINESS ANALYTICS OVERVIEW
The word “analytics” has replaced the previous individual components of computerized
decision support technologies that have been available under various labels in the past.
Indeed, many practitioners and academics now use the word analytics in place of BI.
Although many authors and consultants have defined it slightly differently, one can view
analytics as the process of developing actionable decisions or recommendations for actions
based upon insights generated from historical data. The Institute for Operations Research
and Management Science (INFORMS) has created a major initiative to organize and pro-
mote analytics. According to INFORMS, analytics represents the combination of computer
technology, management science techniques, and statistics to solve real problems. Of
course, many other organizations have proposed their own interpretations and motivation
for analytics. For example, SAS Institute Inc. proposed eight levels of analytics that begin
with standardized reports from a computer system. These reports essentially provide a
sense of what is happening with an organization. Additional technologies have enabled
us to create more customized reports that can be generated on an ad hoc basis. The next
extension of reporting takes us to online analytical processing (OLAP)-type queries that
allow a user to dig deeper and determine the specific source of concern or opportunities.
Technologies available today can also automatically issue alerts for a decision maker
when performance issues warrant such alerts. At a consumer level we see such alerts for
weather or other issues. But similar alerts can also be generated in specific settings when
sales fall above or below a certain level within a certain time period or when the inventory
for a specific product is running low. All of these applications are made possible through
analysis and queries on data being collected by an organization. The next level of analysis
might entail statistical analysis to better understand patterns. These can then be taken a
step further to develop forecasts or models for predicting how customers might respond to

[Figure 1.5 presents two views of the three types of analytics: one as steps of a ladder
from descriptive (reporting, visualization, periodic and ad hoc reporting, trend analysis)
to predictive (statistical analysis and data mining) to prescriptive (management science
models and solution), the other as three interconnected, overlapping circles.]

FIGURE 1.5 Three Types of Analytics.
a specific marketing campaign or ongoing service/product offerings. When an organization
has a good view of what is happening and what is likely to happen, it can also employ
other techniques to make the best decisions under the circumstances. These eight levels of
analytics are described in more detail in a white paper by SAS (sas.com/news/sascom/
analytics_levels).
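The alerting level described above amounts to a simple threshold rule. Below is a minimal sketch in Python; the product names, sales figures, and the 100-500 target band are all invented for illustration.

```python
def check_alerts(sales_by_product, lower, upper):
    """Flag products whose sales fall outside the expected band --
    the kind of rule an automated alerting system would fire on."""
    alerts = []
    for product, sales in sales_by_product.items():
        if sales < lower:
            alerts.append((product, sales, "below expected range"))
        elif sales > upper:
            alerts.append((product, sales, "above expected range"))
    return alerts

# Hypothetical weekly sales; the band 100-500 is an assumed target range.
weekly = {"widgets": 80, "gadgets": 250, "gizmos": 640}
alerts = check_alerts(weekly, lower=100, upper=500)
# flags widgets (too low) and gizmos (too high)
```

In practice the thresholds would come from plans or historical baselines, and the rule would run automatically as new data arrives.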
This idea of looking at all the data to understand what is happening, what will
happen, and how to make the best of it has also been encapsulated by INFORMS in
proposing three levels of analytics. These three levels are identified (informs.org/
Community/Analytics) as descriptive, predictive, and prescriptive. Figure 1.5 presents
two graphical views of these three levels of analytics. One view suggests that these three
are somewhat independent steps (of a ladder) and one type of analytics application leads
to another. The interconnected circles view suggests that there is actually some overlap
across these three types of analytics. In either case, the interconnected nature of different
types of analytics applications is evident. We next introduce these three levels of analytics.
Descriptive Analytics
Descriptive or reporting analytics refers to knowing what is happening in the
organization and understanding some underlying trends and causes of such occur-
rences. This involves, first of all, consolidation of data sources and availability of

all relevant data in a form that enables appropriate reporting and analysis. Usually
development of this data infrastructure is part of data warehouses, which we study in
Chapter 3. From this data infrastructure we can develop appropriate reports, queries,
alerts, and trends using various reporting tools and techniques. We study these in
Chapter 4.
A significant technology that has become a key player in this area is visualization.
Using the latest visualization tools in the marketplace, we can now develop powerful
insights into the operations of our organization. Application Cases 1.2 and 1.3 highlight
some such applications in the healthcare domain. Color renderings of such applications
are available on the companion Web site and also on Tableau's Web site. Chapter 4
covers visualization in more detail.
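At its core, a descriptive report consolidates records and aggregates them along some dimension. Below is a minimal sketch; the sales records and field names are our own invention, used only to illustrate the idea.

```python
from collections import defaultdict

def summarize(records, group_key, value_key):
    """Consolidate raw records into a per-group total and average --
    the core of a descriptive 'what is happening' report."""
    totals = defaultdict(lambda: [0.0, 0])
    for rec in records:
        acc = totals[rec[group_key]]
        acc[0] += rec[value_key]
        acc[1] += 1
    return {g: {"total": t, "avg": t / n} for g, (t, n) in totals.items()}

# Invented transactions for illustration.
sales = [
    {"region": "East", "amount": 100.0},
    {"region": "East", "amount": 300.0},
    {"region": "West", "amount": 200.0},
]
report = summarize(sales, "region", "amount")
# {'East': {'total': 400.0, 'avg': 200.0}, 'West': {'total': 200.0, 'avg': 200.0}}
```

Real reporting tools add drill-down, trend lines, and visualization on top of exactly this kind of aggregation.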
Application Case 1.2
Eliminating Inefficiencies at Seattle Children’s Hospital
Seattle Children’s was the seventh highest ranked
children’s hospital in 2011 , according to U.S. News
& World Report. For any organization that is com-
mitted to saving lives, identifying and removing the
inefficiencies from systems and processes so that
more resources become available to cater to patient
care become very important. At Seattle Children’s ,
management is continuously looking for new ways
to improve the quality, safety, and processes from
the time a patient is admitted to the time they are
discharged. To this end, they spend a lot of time in
analyzing the data associated w ith the patient visits.
To quickly turn patient and hospital data into
insights, Seattle Children’s implemented Tableau
Software’s business intelligence application. It pro-
vides a browser based on easy-to-use analytics to the
stakeholders; this makes it intuitive for individuals to
create visualizations and to understand what the data
has to offer. The data analysts, business managers,
and financial analysts as well as clinicians, doctors,
and researchers are all using descriptive analytics
to solve different problems in a much faster way.
They are developing visual systems on their own,
resulting in dashboards and scorecards that help
in defining the standards, the current performance
achieved measured against the standards, and how
these systems will grow into the future. Through the
use of monthly and daily dashboards, day-to-day
decision making at Seattle Children's has improved
significantly.
Seattle Children's measures patient wait-times
and analyzes them with the help of visualizations
to discover the root causes and contributing factors
for patient waiting. They found that early delays
cascaded during the day. They focused on on-time
appointments of patient services as one of the solutions
to improving patient overall waiting time and
increasing the availability of beds. Seattle Children's
saved about $3 million from the supply chain, and
with the help of tools like Tableau, they are find-
ing new ways to increase savings while treating as
many patients as possible by making the existing
processes more efficient.
QUESTIONS FOR DISCUSSION
1. Who are the users of the tool?
2. What is a dashboard?
3. How does visualization help in decision making?
4. What are the significant results achieved by the use of Tableau?
What We Can Learn from This Application
Case
This Application Case shows that reporting analyt-
ics involving visualizations such as dashboards can
offer major insights into existing data and show how
a variety of users in different domains and depart-
ments can contribute toward process and qual-
ity improvements in an organization. Furthermore,
exploring the data visually can help in identifying
the root causes of problems and provide a basis for
working toward possible solutions.
Source: Tableausoftware.com, "Eliminating Waste at Seattle Children's," tableausoftware.com/eliminating-waste-at-seattle-childrens (accessed February 2013).

22 Part I • Decision Making and Analytics: An Overview
Application Case 1.3
Analysis at the Speed of Thought
Kaleida Health, the largest healthcare provider in western New York, has more than 10,000 employees, five hospitals, a number of clinics and nursing homes, and a visiting-nurse association that deals with millions of patient records. Kaleida's traditional reporting tools were inadequate to handle the growing data, and they were faced with the challenge of finding a business intelligence tool that could handle large data sets effortlessly, quickly, and with a much deeper analytic capability.
At Kaleida, many of the calculations are now done in Tableau, primarily by pulling the data from Oracle databases into Excel and importing the data into Tableau. For many of the monthly analytic reports, data is extracted directly into Tableau from the data warehouse; many of the data queries are saved and rerun, resulting in time savings when dealing with millions of records, each having more than 40 fields. Besides speed, Kaleida also uses Tableau to merge different tables for generating extracts.
Using Tableau, Kaleida can analyze emergency room data to determine the number of patients who visit more than 10 times a year. The data often reveal that people frequently use emergency room and ambulance services inappropriately for stomachaches, headaches, and fevers. Kaleida can manage resource utilization (the use and cost of supplies), which will ultimately lead to efficiency and standardization of supplies management across the system.
Kaleida now has its own business intelligence department and uses Tableau to compare itself to other hospitals across the country. Comparisons are made on various aspects, such as length of patient stay, hospital practices, market share, and partnerships with doctors.
QUESTIONS FOR DISCUSSION
1. What are the desired functionalities of a reporting tool?
2. What advantages were derived by using a reporting tool in the case?
What We Can Learn from This Application
Case
Correct selection of a reporting tool is extremely important, especially if an organization wants to derive value from reporting. The generated reports and visualizations should be easily discernible; they should help people in different sectors make sense of the reports, identify the problematic areas, and contribute toward improving them. In the future, many organizations will require reporting analytics tools that are fast and capable of handling huge amounts of data efficiently to generate desired reports without the need for third-party consultants and service providers. A truly useful reporting tool can exempt organizations from unnecessary expenditure.
Source: Tableausoftware.com, "Kaleida Health Finds Efficiencies, Stays Competitive," tableausoftware.com/learn/stories/user-experience-speed-thought-kaleida-health (accessed February 2013).
Predictive Analytics
Predictive analytics aims to determine what is likely to happen in the future. This analysis is based on statistical techniques as well as other, more recently developed techniques that fall under the general category of data mining. The goal of these techniques is to be able to predict whether the customer is likely to switch to a competitor ("churn"), what the customer is likely to buy next and how much, what promotion a customer would respond to, or whether this customer is a good credit risk. A number of techniques are used in developing predictive analytical applications, including various classification algorithms. For example, as described in Chapters 5 and 6, we can use classification techniques such as decision tree models and neural networks to predict how well a motion picture will do at the box office. We can also use clustering algorithms to segment customers into different clusters so that specific promotions can be targeted to them. Finally, we can

Chapter 1 • An Overview of Business Intelligence, Analytics, and Decision Support 23
use association mining techniques to estimate relationships between different purchasing behaviors. That is, if a customer buys one product, what else is the customer likely to purchase? Such analysis can assist a retailer in recommending or promoting related products. For example, any product search on Amazon.com results in the retailer also suggesting other similar products that may interest the customer. We will study these techniques and their applications in Chapters 6 through 9. Application Cases 1.4 and 1.5 highlight some similar applications. Application Case 1.4 introduces a movie you may have heard of: Moneyball. It is perhaps one of the best examples of the application of predictive analytics in sports.
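As an illustration of the classification idea described above, the following sketch trains a decision tree to flag customers likely to churn. It is our own toy example, not from the book: the feature names, data values, and the use of scikit-learn are all assumptions made for illustration.

```python
# A toy churn classifier using a decision tree; all data values are invented.
from sklearn.tree import DecisionTreeClassifier

# Each row: [tenure_years, monthly_spend]; label 1 = churned, 0 = stayed
X = [[1, 90], [2, 85], [1, 95], [8, 40], [10, 35], [7, 45]]
y = [1, 1, 1, 0, 0, 0]

model = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# A short-tenure, high-spend prospect resembles the past churners
print(model.predict([[2, 95]])[0])  # 1 (predicted to churn)
```

Real churn models are trained on far richer feature sets, but the mechanics (fit on labeled history, predict for new customers) are the same.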
Application Case 1.4
Moneyball: Analytics in Sports and Movies
Moneyball, a biographical sports drama film, was released in 2011 and directed by Bennett Miller. The film was based on Michael Lewis's book, Moneyball. The movie gave a detailed account of the Oakland Athletics baseball team during the 2002 season and the Oakland general manager's efforts to assemble a competitive team.
The Oakland Athletics suffered a big loss to the New York Yankees in the 2001 postseason. As a result, Oakland lost many of its star players to free agency and ended up with a weak team with unfavorable financial prospects. The general manager's efforts to reassemble a competitive team were thwarted because Oakland had a limited payroll. The scouts for the Oakland Athletics followed the old baseball custom of making subjective decisions when selecting team members. The general manager then met a young computer whiz with an economics degree from Yale and decided to appoint him as the new assistant general manager.
The assistant general manager had a deep passion for baseball and had the expertise to crunch the numbers for the game. His love for the game led him to develop a radical way of understanding baseball statistics. He was a disciple of Bill James, a marginal figure who offered rationalized techniques to analyze baseball. James looked at baseball statistics in a different way, crunching the numbers purely on facts and eliminating subjectivity. James pioneered the nontraditional analysis method called the Sabermetric approach, a name derived from SABR, the Society for American Baseball Research.
The assistant general manager followed the Sabermetric approach by building a prediction model to help the Oakland Athletics select players based on their "on-base percentage" (OBP), a statistic that measured how often a batter reached base for any reason other than fielding error, fielder's choice, dropped/uncaught third strike, fielder's obstruction, or catcher's interference. Rather than relying on the scouts' experience and intuition, the assistant general manager selected players based almost exclusively on OBP.
Spoiler Alert: The new team beat all odds, won 20 consecutive games, and set an American League record.
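The OBP statistic at the center of the case has a standard definition: times reached base (hits, walks, hit-by-pitch) divided by at-bats plus walks plus hit-by-pitch plus sacrifice flies. A small function makes the calculation concrete; the season line in the example is invented, not a real player's record.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Standard OBP: (H + BB + HBP) / (AB + BB + HBP + SF)."""
    reached = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sac_flies
    return reached / chances

# An invented season line: 150 H, 80 BB, 5 HBP, 500 AB, 5 SF
print(round(on_base_percentage(150, 80, 5, 500, 5), 3))  # 0.398
```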
QUESTIONS FOR DISCUSSION
1. How is predictive analytics applied in Moneyball?
2. What is the difference between objective and
subjective approaches in decision making?
What We Can Learn from This Application
Case
Analytics finds its use in a variety of industries. It helps organizations rethink their traditional problem-solving approaches, which are most often subjective, relying on the same old processes to find a solution. Analytics takes the radical approach of using historical data to find fact-based solutions that will remain appropriate even for future decisions.
Sources: Wikipedia, "On-Base Percentage," en.wikipedia.org/wiki/On_base_percentage (accessed January 2013); Wikipedia, "Sabermetrics," en.wikipedia.org/wiki/Sabermetrics (accessed January 2013).

Application Case 1.5
Analyzing Athletic Injuries
Any athletic activity is prone to injuries. If injuries are not handled properly, the team suffers. Using analytics to understand injuries can help in deriving valuable insights that enable coaches and team doctors to manage the team composition, understand player profiles, and ultimately aid in better decision making concerning which players might be available to play at any given time.
In an exploratory study, Oklahoma State University analyzed American football-related sport injuries by using reporting and predictive analytics. The project followed the CRISP-DM methodology to understand the problem of making recommendations on managing injuries, understand the various data elements collected about injuries, clean the data, develop visualizations to draw various inferences, build predictive models to analyze the injury healing time period, and draw sequence rules to predict the relationships among the injuries and the various body parts afflicted with injuries.
The injury data set consisted of more than 560 football injury records, which were categorized into injury-specific variables (body part/site/laterality, action taken, severity, injury type, and injury start and healing dates) and player/sport-specific variables (player ID, position played, activity, onset, and game location). Healing time was calculated for each record and classified into different sets of time periods: 0-1 month, 1-2 months, 2-4 months, 4-6 months, and 6-24 months.
Various visualizations were built to draw inferences from the injury data set, depicting the healing time period associated with players' positions, the severity of injuries and the healing time period, the treatment offered and the associated healing time period, major injuries afflicting body parts, and so forth.
Neural network models were built to predict each of the healing categories using IBM SPSS Modeler. Some of the predictor variables were current status of injury, severity, body part, body site, type of injury, activity, event location, action taken, and position played. The success of classifying the healing category was quite good: Accuracy was 79.6 percent. Based on the analysis, many business recommendations were suggested, including employing more specialists' input from injury onset instead of letting the training room staff screen the injured players; training players at defensive positions to avoid being injured; and holding practice to thoroughly safety-check mechanisms.
QUESTIONS FOR DISCUSSION
1. What types of analytics are applied in the injury analysis?
2. How do visualizations aid in understanding the data and delivering insights into the data?
3. What is a classification problem?
4. What can be derived by performing sequence analysis?
What We Can Learn from This Application
Case
For any analytics project, it is always important to understand the business domain and the current state of the business problem through extensive analysis of the only available resource: historical data. Visualizations often provide a great tool for gaining initial insights into data, which can be further refined based on expert opinions to identify the relative importance of the data elements related to the problem. Visualizations also aid in generating ideas for obscure business problems, which can be pursued in building predictive models that could help organizations in decision making.
Prescriptive Analytics
The third category of analytics is termed prescriptive analytics. The goal of prescriptive analytics is to recognize what is going on as well as the likely forecast and to make decisions to achieve the best performance possible. This group of techniques has historically been studied under the umbrella of operations research or management science and has generally been aimed at optimizing the performance of a system. The goal here is to provide a decision or a recommendation for a specific action. These recommendations can take the form of a specific yes/no decision for a problem, a specific amount (say, a price for a specific item or an airfare to charge), or a complete set of production plans. The decisions may be presented to a decision maker in a report or may be used directly in an automated decision rules system (e.g., in airline pricing systems). Thus, these types of analytics can also be termed decision or normative analytics. Application Case 1.6 gives an example of such prescriptive analytics applications. We will learn about some of these techniques and several additional applications in Chapters 10 through 12.
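As a minimal sketch of the optimization flavor of prescriptive analytics, consider a tiny invented production-planning problem: choose quantities of two products to maximize profit under capacity limits. The numbers and the use of SciPy's linprog are our own illustration, not material from the book.

```python
# Choose production quantities x1, x2 to maximize 3*x1 + 5*x2 profit,
# subject to capacity limits. linprog minimizes, so we negate the objective.
from scipy.optimize import linprog

c = [-3, -5]                 # negated unit profits
A_ub = [[1, 0],              # x1          <= 4  (line A hours)
        [0, 2],              # 2*x2        <= 12 (line B hours)
        [3, 2]]              # 3*x1 + 2*x2 <= 18 (shared assembly hours)
b_ub = [4, 12, 18]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
plan = [round(v, 2) for v in res.x]
print(plan, round(-res.fun, 2))  # [2.0, 6.0] 36.0
```

The solver's output is exactly the kind of recommendation described above: a complete production plan rather than a forecast.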
Application Case 1.6
Industrial and Commercial Bank of China (ICBC) Employs Models
to Reconfigure Its Branch Network
The Industrial and Commercial Bank of China (ICBC) has more than 16,000 branches and serves over 230 million individual customers and 3.6 million corporate clients. Its daily financial transactions total about $180 million. It is also the largest publicly traded bank in the world in terms of market capitalization, deposit volume, and profitability. To stay competitive and increase profitability, ICBC was faced with the challenge of quickly adapting to the fast-paced economic growth, urbanization, and increase in personal wealth of the Chinese. Changes had to be implemented in over 300 cities with high variability in customer behavior and financial status. Obviously, the nature of the challenges in such a huge economy meant that a large-scale optimization solution had to be developed to locate branches in the right places, with the right services, to serve the right customers.
With their existing method, ICBC used to decide where to open new branches through a scoring model in which different variables with varying weights were used as inputs. Some of the variables were customer flow, number of residential households, and number of competitors in the intended geographic region. This method was deficient in determining the customer distribution of a geographic area. The existing method was also unable to optimize the distribution of bank branches in the branch network. With support from IBM, a branch reconfiguration (BR) tool was developed. Inputs for the BR system are in three parts:
a. Geographic data with 83 different categories
b. Demographic and economic data with 22 different categories
c. Branch transactions and performance data that consisted of more than 60 million transaction records each day
These three inputs helped generate accurate customer distributions for each area and, hence, helped the bank optimize its branch network. The BR system consisted of a market potential calculation model, a branch network optimization model, and a branch site evaluation model. In the market potential model, the customer volume and value are measured based on input data and expert knowledge. For instance, expert knowledge would help determine whether personal income should be weighted more than gross domestic product (GDP). The geographic areas are also demarcated into cells, and the preference of one cell over another is determined. In the branch network optimization model, mixed integer programming is used to locate branches in candidate cells so that they cover the largest market potential areas. In the branch site evaluation model, the value of establishing bank branches at specific locations is determined.
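The branch network optimization model above is a mixed integer program; its spirit can be sketched with a brute-force stand-in on a handful of invented candidate cells. The cell names, potentials, and budget below are illustrative assumptions, not figures from the ICBC case.

```python
# Pick k candidate cells to maximize covered market potential -- a tiny,
# exhaustive stand-in for the mixed integer program described in the case.
from itertools import combinations

potential = {"A": 40, "B": 25, "C": 60, "D": 10, "E": 35}  # invented units
k = 2  # number of branches the budget allows

best = max(combinations(potential, k),
           key=lambda cells: sum(potential[c] for c in cells))
print(sorted(best), sum(potential[c] for c in best))  # ['A', 'C'] 100
```

At ICBC's scale (thousands of cells and many interacting constraints), such enumeration is infeasible, which is exactly why mixed integer programming solvers are used instead.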
Since 2006, the development of the BR has been improved through an iterative process. ICBC's branch reconfiguration tool has increased deposits by $21.2 billion since its inception. This increase in deposits is because the bank can now reach more customers with the right services by use of its optimization tool. In a specific example, when BR was implemented in Suzhou in 2010, deposits increased to $13.67 billion from an initial level of $7.56 billion in 2007. Hence, the BR tool assisted in an increase of deposits to the tune of $6.11 billion between 2007 and 2010. This project was selected as a finalist in the 2011 Edelman Competition, which is run by INFORMS to promote actual applications of management science/operations research models.
QUESTIONS FOR DISCUSSION
1. How can analytical techniques help organizations to retain competitive advantage?
2. How can descriptive and predictive analytics help in pursuing prescriptive analytics?
3. What kinds of prescriptive analytic techniques are employed in the case study?
4. Are prescriptive models, once built, good forever?
What We Can Learn from This Application
Case
Many organizations in the world are now embracing analytical techniques to stay competitive and achieve growth, and many provide consulting solutions to businesses in employing prescriptive analytical solutions. It is equally important to have proactive decision makers in organizations who are aware of the changing economic environment as well as advancements in the field of analytics to ensure that appropriate models are employed. This case shows an example of using geographic market segmentation and customer behavioral segmentation techniques to isolate the profitability of customers, and of employing optimization techniques to locate the branches that deliver high profitability in each geographic segment.
Source: X. Wang et al., "Branch Reconfiguration Practice Through Operations Research in Industrial and Commercial Bank of China," Interfaces, January/February 2012, Vol. 42, No. 1, pp. 33-44; DOI: 10.1287/inte.1110.0614.
Analytics Applied to Different Domains
Applications of analytics in various industry sectors have spawned many related areas, or at least buzzwords. It is almost fashionable to attach the word analytics to any specific industry or type of data. Besides the general category of text analytics (aimed at getting value out of text, to be studied in Chapter 6) and Web analytics (analyzing Web data streams, covered in Chapter 7), many industry- or problem-specific analytics professions/streams have emerged. Examples of such areas are marketing analytics, retail analytics, fraud analytics, transportation analytics, health analytics, sports analytics, talent analytics, behavioral analytics, and so forth. For example, Application Case 1.1 could also be termed a case study in airline analytics. Application Cases 1.2 and 1.3 would belong to health analytics; Application Cases 1.4 and 1.5 to sports analytics; Application Case 1.6 to bank analytics; and Application Case 1.7 to retail analytics. The End-of-Chapter Application Case could be termed insurance analytics. Literally, any systematic analysis of data in a specific sector is being labeled as "(fill-in-blanks)" analytics. Although this may result in overselling the concepts of analytics, the benefit is that more people in specific industries are aware of the power and potential of analytics. It also provides a focus to professionals developing and applying the concepts of analytics in a vertical sector. Although many of the techniques for developing analytics applications may be common, there are unique issues within each vertical segment that influence how the data may be collected, processed, and analyzed, and how the applications are implemented. Thus, the differentiation of analytics based on a vertical focus is good for the overall growth of the discipline.
Analytics or Data Science?
Even as the concept of analytics is gaining popularity in industry and academic circles, another term has already been introduced and is becoming popular: data science. Thus, the practitioners of data science are data scientists. D. J. Patil of LinkedIn is sometimes credited with coining the term data science. There have been some attempts to describe the differences between data analysts and data scientists (e.g., see the study at emc.com/collateral/about/news/emc-data-science-study-wp). One view is that data analyst is just another term for professionals who were doing business intelligence in the form of data compilation, cleaning, reporting, and perhaps some visualization. Their skill sets included Excel, some SQL knowledge, and reporting. A reader of Section 1.8 would recognize that as descriptive or reporting analytics. In contrast, a data scientist is responsible for predictive analysis, statistical analysis, and more advanced analytical tools and algorithms. They may have a deeper knowledge of algorithms and may recognize them under various labels: data mining, knowledge discovery, machine learning, and so forth. Some of these professionals may also need deeper programming knowledge to be able to write code for data cleaning and analysis in current Web-oriented languages such as Java and Python. Again, our readers should recognize these as falling under the predictive and prescriptive analytics umbrella. Our view is that the distinction between analytics and data science is more one of degree of technical knowledge and skill sets than of function. It may also be more of a distinction across disciplines. Computer science, statistics, and applied mathematics programs appear to prefer the data science label, reserving the analytics label for more business-oriented professionals. As another example of this, applied physics professionals have proposed using network science as the term for describing analytics that relate to groups of people: social networks, supply chain networks, and so forth. See barabasilab.neu.edu/networksciencebook/downlPDF.html for an evolving textbook on this topic.
Aside from a clear difference in the skill sets of professionals who only have to do descriptive/reporting analytics versus those who engage in all three types of analytics, the distinction between the two labels is fuzzy, at best. We observe that graduates of our analytics programs tend to be responsible for tasks more in line with data science professionals (as defined by some circles) than just reporting analytics. This book is clearly aimed at introducing the capabilities and functionality of all analytics (which includes data science), not just reporting analytics. From now on, we will use these terms interchangeably.
SECTION 1.8 REVIEW QUESTIONS
1. Define analytics.
2. What is descriptive analytics? What various tools are employed in descriptive analytics?
3. How is descriptive analytics different from traditional reporting?
4. What is a data warehouse? How can data warehousing technology help in ena-
bling analytics?
5. What is predictive analytics? How can organizations employ predictive analytics?
6. What is prescriptive analytics? What kinds of problems can be solved by prescrip-
tive analytics?
7. Define modeling from the analytics perspective.
8. Is it a good idea to follow a hierarchy of descriptive and predictive analytics before
applying prescriptive analytics?
9. How can analytics aid in objective decision making?
1.9 BRIEF INTRODUCTION TO BIG DATA ANALYTICS
What Is Big Data?
Our brains work extremely quickly and are efficient and versatile in processing large amounts of all kinds of data: images, text, sounds, smells, and video. We process all different forms of data relatively easily. Computers, on the other hand, are still finding it hard to keep up with the pace at which data is generated, let alone analyze it quickly. This is why we have the problem of Big Data. So what is Big Data? Simply put, it is data that cannot

be stored in a single storage unit. Big Data typically refers to data that is arriving in many different forms, be they structured, unstructured, or in a stream. Major sources of such data are clickstreams from Web sites, postings on social media sites such as Facebook, or data from traffic, sensors, or weather. A Web search engine like Google needs to search and index billions of Web pages in order to give you relevant search results in a fraction of a second. Although this is not done in real time, generating an index of all the Web pages on the Internet is not an easy task. Luckily for Google, it was able to solve this problem. Among other tools, it has employed Big Data analytical techniques.
There are two aspects to managing data on this scale: storing and processing. If we could purchase an extremely expensive storage solution to store all the data in one place on one unit, making this unit fault tolerant would involve major expense. An ingenious solution was proposed that involved storing the data in chunks on different machines connected by a network, putting a copy or two of each chunk in different locations on the network, both logically and physically. It was originally used at Google (then called the Google File System) and was later developed and released as an Apache project called the Hadoop Distributed File System (HDFS).
However, storing this data is only half the problem. Data is worthless if it does not provide business value, and for it to provide business value, it has to be analyzed. How are such vast amounts of data analyzed? Passing all computation to one powerful computer does not work; at this scale it would create a huge overhead on even such a powerful computer. Another ingenious solution was proposed: Push the computation to the data, instead of pushing the data to a computing node. This was a new paradigm, and it gave rise to a whole new way of processing data. This is what we know today as the MapReduce programming paradigm, which made processing Big Data a reality. MapReduce was originally developed at Google, and a subsequent version was released by the Apache project called Hadoop MapReduce.
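The MapReduce idea just described (emit key/value pairs in a map step, group them by key, then reduce each group) can be shown in a few lines of plain Python. A real Hadoop job distributes these steps across many machines; this single-process sketch, with invented input documents, only illustrates the data flow:

```python
# Single-process word count in the MapReduce style.
from collections import defaultdict

docs = ["big data big value", "data value"]

# Map: emit a (word, 1) pair for every word in every document
mapped = [(word, 1) for doc in docs for word in doc.split()]

# Shuffle: group the emitted values by key
groups = defaultdict(list)
for word, count in mapped:
    groups[word].append(count)

# Reduce: sum the counts for each word
counts = {word: sum(vals) for word, vals in groups.items()}
print(counts)  # {'big': 2, 'data': 2, 'value': 2}
```

Because each key's group can be reduced independently, the reduce work parallelizes naturally across machines, which is the point of the paradigm.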
Today, when we talk about storing, processing, or analyzing Big Data, HDFS and MapReduce are involved at some level. Other relevant standards and software solutions have been proposed. Although the major toolkit is available as open source, several companies have been launched to provide training or specialized analytical hardware or software services in this space. Some examples are HortonWorks, Cloudera, and Teradata Aster.
Over the past few years, what was called Big Data changed more and more as Big Data applications appeared. The need to process data coming in at a rapid rate added velocity to the equation. One example of fast data processing is algorithmic trading, the use of electronic platforms based on algorithms for trading shares on the financial market, which operates on the order of microseconds. The need to process different kinds of data added variety to the equation. An example of the wide variety of data is sentiment analysis, which uses various forms of data from social media platforms and customer responses to gauge sentiments. Today, Big Data is associated with almost any kind of large data that has the characteristics of volume, velocity, and variety. Application Case 1.7 illustrates one example of Big Data analytics. We will study Big Data characteristics in more detail in Chapters 3 and 13.
SECTION 1.9 REVIEW QUESTIONS
1. What is Big Data analytics?
2. What are the sources of Big Data?
3. What are the characteristics of Big Data?
4. What processing technique is applied to process Big Data?

Application Case 1.7
Gilt Groupe’s Flash Sales Streamlined by Big Data Analytics
Gilt Groupe is an online destination offering flash sales for major brands by selling their clothing and accessories. It offers its members exclusive discounts on high-end clothing and other apparel. After registering with Gilt, customers are sent e-mails containing a variety of offers. Customers are given a 36-48 hour window to make purchases using these offers. There are about 30 different sales each day. Whereas a typical department store turns over its inventory two or three times a year, Gilt does it eight to 10 times a year. Thus, Gilt has to manage its inventory extremely well or it could incur extremely high inventory costs. To do this, analytics software developed at Gilt keeps track of every customer click, ranging from what brands the customers click on to what colors they choose, what styles they pick, and what they end up buying. Gilt then tries to predict what these customers are more likely to buy and stocks inventory according to these predictions. Customers are sent customized alerts to sale offers depending on the suggestions of the analytics software.
That, however, is not the whole process. The software also monitors which offers the customers choose from the recommended offers, in order to make more accurate predictions and to increase the effectiveness of its personalized recommendations. Some customers do not check e-mail that often. Gilt's analytics software keeps track of responses to offers and sends the same offer 3 days later to those customers who haven't responded. Gilt also keeps track of what customers are saying in general about Gilt's products by analyzing Twitter feeds for sentiment. Gilt's recommendation software is based on Teradata Aster's technology solution, which includes Big Data analytics technologies.
QUESTIONS FOR DISCUSSION
1. What makes this case study an example of Big
Data analytics?
2. What types of decisions does Gilt Groupe have
to make?
What We Can Learn from This Application Case
There is continuous growth in the amount of structured and unstructured data, and many organizations are now tapping these data to make actionable decisions. Big Data analytics is now enabled by advancements in technologies that aid in the storage and processing of vast amounts of rapidly growing data.
Source: Asterdata.com, "Gilt Groupe Speaks on Digital Marketing Optimization," asterdata.com/gilt_groupe_video.php (accessed February 2013).
1.10 PLAN OF THE BOOK
The previous sections have given you an understanding of the need for using information technology in decision making; an IT-oriented view of various types of decisions; and the evolution of decision support systems into business intelligence, and now into analytics. In the last two sections we have seen an overview of various types of analytics and their applications. Now we are ready for a more detailed managerial excursion into these topics, along with some potentially deep hands-on experience in some of the technical topics. The 14 chapters of this book are organized into five parts, as shown in Figure 1.6.
Part I: Business Analytics: An Overview
In Chapter 1, we provided an introduction, definitions, and an overview of decision support systems, business intelligence, and analytics, including Big Data analytics. Chapter 2 covers the basic phases of the decision-making process and introduces decision support systems in more detail.

30 Part I • Decision Making and Analytics: An Overview
Part I: Decision Making and Analytics: An Overview
Chapter 1: An Overview of Business Intelligence, Analytics, and Decision Support
Chapter 2: Foundations and Technologies for Decision Making

Part II: Descriptive Analytics
Chapter 3: Data Warehousing
Chapter 4: Business Reporting, Visual Analytics, and Business Performance Management

Part III: Predictive Analytics
Chapter 5: Data Mining
Chapter 6: Techniques for Predictive Modeling
Chapter 7: Text Analytics, Text Mining, and Sentiment Analysis
Chapter 8: Web Analytics, Web Mining, and Social Analytics

Part IV: Prescriptive Analytics
Chapter 9: Model-Based Decision Making: Optimization and Multi-Criteria Systems
Chapter 10: Modeling and Analysis: Heuristic Search Methods and Simulation
Chapter 11: Automated Decision Systems and Expert Systems
Chapter 12: Knowledge Management and Collaborative Systems

Part V: Big Data and Future Directions for Business Analytics
Chapter 13: Big Data and Analytics
Chapter 14: Business Analytics: Emerging Trends and Future Impacts

Part VI: Online Supplements
Software Demos
Data Files for Exercises
PowerPoint Slides

FIGURE 1.6 Plan of the Book.

Part II: Descriptive Analytics
Part II begins with an introduction to data warehousing issues, applications, and technologies in Chapter 3. Data represent the fundamental backbone of any decision support and analytics application. Chapter 4 describes business reporting, visualization technologies, and applications. It also includes a brief overview of business performance management techniques and applications, a topic that has been a key part of traditional BI.
Part III: Predictive Analytics
Part III comprises a large part of the book. It begins with an introduction to predictive analytics applications in Chapter 5. It includes many of the common application techniques: classification, clustering, association mining, and so forth. Chapter 6 includes a technical description of selected data mining techniques, especially neural network models. Chapter 7 focuses on text mining applications. Similarly, Chapter 8 focuses on Web analytics, including social media analytics, sentiment analysis, and other related topics.

Chapter 1 • An Overview of Business Intelligence, Analytics, and Decision Support 31
Part IV: Prescriptive Analytics
Part IV introduces decision analytic techniques, which are also called prescriptive analytics. Specifically, Chapter 9 covers selected models that may be implemented in spreadsheet environments. It also covers a popular multi-objective decision technique, the analytic hierarchy process.
Chapter 10 then introduces other model-based decision-making techniques, especially heuristic models and simulation. Chapter 11 introduces automated decision systems, including expert systems. This part concludes with a brief discussion of knowledge management and group support systems in Chapter 12.
Part V: Big Data and Future Directions for Business Analytics
Part V begins with a more detailed coverage of Big Data and analytics in Chapter 13. Chapter 14 attempts to integrate all the material covered in this book and concludes with a discussion of emerging trends, such as how the ubiquity of wireless and GPS devices and other sensors is resulting in the creation of massive new databases and unique applications. A new breed of data mining and BI companies is emerging to analyze these new databases and create a much better and deeper understanding of customers' behaviors and movements. The chapter also covers cloud-based analytics, recommendation systems, and a brief discussion of the security and privacy dimensions of analytics. It concludes the book by presenting a discussion of the analytics ecosystem. An understanding of the ecosystem and the various players in the analytics industry highlights the various career opportunities for students and practitioners of analytics.
1.11 RESOURCES, LINKS, AND THE TERADATA UNIVERSITY
NETWORK CONNECTION
The use of this chapter and most other chapters in this book can be enhanced by the tools described in the following sections.
Resources and Links
We recommend the following major resources and links:
• The Data Warehousing Institute (tdwi.org)
• Information Management (information-management.com)
• DSS Resources (dssresources.com)
• Microsoft Enterprise Consortium (enterprise.waltoncollege.uark.edu/mec.asp)
Vendors, Products, and Demos
Most vendors provide software demos of their products and applications. Information
about products, architecture, and software is available at dssresources.com.
Periodicals
We recommend the following periodicals:
• Decision Support Systems
• CIO Insight (cioinsight.com)
• Technology Evaluation (technologyevaluation.com)
• Baseline Magazine (baselinemag.com)

The Teradata University Network Connection
This book is tightly connected with the free resources provided by Teradata University Network (TUN; see teradatauniversitynetwork.com). The TUN portal is divided into two major parts: one for students and one for faculty. This book is connected to the TUN portal via a special section at the end of each chapter. That section includes appropriate links for the specific chapter, pointing to relevant resources. In addition, we provide hands-on exercises, using software and other material (e.g., cases) available at TUN.
The Book’s Web Site
This book's Web site, pearsonhighered.com/turban, contains supplemental textual material organized as Web chapters that correspond to the printed book's chapters. The topics of these chapters are listed in the online chapter table of contents. Other content is also available on an independent Web site (dssbibook.com).²
Chapter Highlights
• The business environment is becoming complex and is rapidly changing, making decision making more difficult.
• Businesses must respond and adapt to the changing environment rapidly by making faster and better decisions.
• The time frame for making decisions is shrinking, whereas the global nature of decision making is expanding, necessitating the development and use of computerized DSS.
• Computerized support for managers is often essential for the survival of an organization.
• An early decision support framework divides decision situations into nine categories, depending on the degree of structuredness and managerial activities. Each category is supported differently.
• Structured repetitive decisions are supported by standard quantitative analysis methods, such as MS, MIS, and rule-based automated decision support.
• DSS use data, models, and sometimes knowledge management to find solutions for semistructured and some unstructured problems.
• BI methods utilize a central repository called a data warehouse that enables efficient data mining, OLAP, BPM, and data visualization.
• BI architecture includes a data warehouse, business analytics tools used by end users, and a user interface (such as a dashboard).
• Many organizations employ descriptive analytics to replace their traditional flat reporting with interactive reporting that provides insights, trends, and patterns in the transactional data.
• Predictive analytics enable organizations to establish predictive rules that drive business outcomes through historical analysis of existing customer behavior.
• Prescriptive analytics help in building models that involve forecasting and optimization techniques based on the principles of operations research and management science to help organizations make better decisions.
• Big Data analytics focuses on unstructured, large data sets that may also include vastly different types of data for analysis.
• Analytics as a field is also known by industry-specific application names such as sports analytics. It is also known by other related names such as data science or network science.
2 As this book went to press, we verified that all the cited Web sites were active and valid. However, URLs are dynamic. Web sites to which we refer in the text sometimes change or are discontinued because companies change names, are bought or sold, merge, or fail. Sometimes Web sites are down for maintenance, repair, or redesign. Many organizations have dropped the initial "www" designation for their sites, but some still use it. If you have a problem connecting to a Web site that we mention, please be patient and simply run a Web search to try to identify the possible new site. Most times, you can quickly find the new site through one of the popular search engines. We apologize in advance for this inconvenience.

Key Terms
business intelligence (BI), dashboard, data mining, decision (or normative) analytics, decision support system (DSS), descriptive (or reporting) analytics, predictive analytics, prescriptive analytics, semistructured problem, structured problem, unstructured problem

Questions for Discussion
1. Give examples for the content of each cell in Figure 1.2.
2. Survey the literature from the past 6 months to find one application each for DSS, BI, and analytics. Summarize the applications on one page and submit it with the exact sources.
3. Observe an organization with which you are familiar. List three decisions it makes in each of the following categories: strategic planning, management control (tactical planning), and operational planning and control.
4. Distinguish BI from DSS.
5. Compare and contrast predictive analytics with prescriptive and descriptive analytics. Use examples.

Exercises
Teradata University Network (TUN) and Other Hands-On Exercises
1. Go to teradatauniversitynetwork.com. Using the registration your instructor provides, log on and learn the content of the site. You will receive assignments related to this site. Prepare a list of 20 items in the site that you think could be beneficial to you.
2. Enter the TUN site and select "cases, projects and assignments." Then select the case study "Harrah's High Payoff from Customer Information." Answer the following questions about this case:
a. What information does the data mining generate?
b. How is this information helpful to management in decision making? (Be specific.)
c. List the types of data that are mined.
d. Is this a DSS or BI application? Why?
3. Go to teradatauniversitynetwork.com and find the paper titled "Data Warehousing Supports Corporate Strategy at First American Corporation" (by Watson, Wixom, and Goodhue). Read the paper and answer the following questions:
a. What were the drivers for the DW/BI project in the company?
b. What strategic advantages were realized?
c. What operational and tactical advantages were achieved?
d. What were the critical success factors (CSF) for the implementation?
4. Go to analytics-magazine.org/issues/digital-editions and find the January/February 2012 edition titled "Special Issue: The Future of Healthcare." Read the article "Predictive Analytics-Saving Lives and Lowering Medical Bills." Answer the following questions:
a. What is the problem that is being addressed by applying predictive analytics?
b. What is the FICO Medication Adherence Score?
c. How is a prediction model trained to predict the FICO Medication Adherence Score? Did the prediction model classify the FICO Medication Adherence Score?
d. Zoom in on Figure 4 and explain what kind of technique is applied on the generated results.
e. List some of the actionable decisions that were based on the results of the predictions.
5. Go to analytics-magazine.org/issues/digital-editions and find the January/February 2013 edition titled "Work Social." Read the article "Big Data, Analytics and Elections" and answer the following questions:
a. What kinds of Big Data were analyzed in the article? Comment on some of the sources of Big Data.
b. Explain the term integrated system. What other technical term suits integrated system?
c. What kinds of data analysis techniques are employed in the project? Comment on some initiatives that resulted from data analysis.
d. What are the different prediction problems answered by the models?
e. List some of the actionable decisions taken that were based on the prediction results.
f. Identify two applications of Big Data analytics that are not listed in the article.

6. Search the Internet for material regarding the work of managers and the role analytics play. What kind of references to consulting firms, academic departments, and programs do you find? What major areas are represented? Select five sites that cover one area and report your findings.
7. Explore the public areas of dssresources.com. Prepare a list of its major available resources. You might want to refer to this site as you work through the book.
8. Go to microstrategy.com. Find information on the five styles of BI. Prepare a summary table for each style.
9. Go to oracle.com and click the Hyperion link under Applications. Determine what the company's major products are. Relate these to the support technologies cited in this chapter.

End-of-Chapter Application Case
Nationwide Insurance Used BI to Enhance Customer Service
Nationwide Mutual Insurance Company, headquartered in Columbus, Ohio, is one of the largest insurance and financial services companies, with $23 billion in revenues and more than $160 billion in statutory assets. It offers a comprehensive range of products through its family of 100-plus companies, with insurance products for auto, motorcycle, boat, life, homeowners, and farms. It also offers financial products and services including annuities, mortgages, mutual funds, pensions, and investment management.
Nationwide strives to achieve greater efficiency in all operations by managing its expenses along with its ability to grow its revenue. It recognizes the use of its strategic asset of information combined with analytics to outpace competitors in strategic and operational decision making even in complex and unpredictable environments.
Historically, Nationwide's business units worked independently and with a lot of autonomy. This led to duplication of efforts, widely dissimilar data processing environments, and extreme data redundancy, resulting in higher expenses. The situation got complicated when Nationwide pursued any mergers or acquisitions.
Nationwide, using enterprise data warehouse technology from Teradata, set out to create, from the ground up, a single, authoritative environment for clean, consistent, and complete data that can be effectively used for best-practice analytics to make strategic and tactical business decisions in the areas of customer growth, retention, product profitability, cost containment, and productivity improvements. Nationwide transformed its siloed business units, which were supported by stove-piped data environments, into integrated units by using cutting-edge analytics that work with clear, consolidated data from all of its business units. The Teradata data warehouse at Nationwide has grown from 400 gigabytes to more than 100 terabytes and supports 85 percent of Nationwide's business with more than 2,500 users.
Integrated Customer Knowledge
Nationwide's Customer Knowledge Store (CKS) initiative developed a customer-centric database that integrated customer, product, and externally acquired data from more than 48 sources into a single customer data mart to deliver a holistic view of customers. This data mart was coupled with Teradata's customer relationship management application to create and manage effective customer marketing campaigns that use behavioral analysis of customer interactions to drive customer management actions (CMAs) for target segments. Nationwide added more sophisticated customer analytics that looked at customer portfolios and the effectiveness of various marketing campaigns. This data analysis helped Nationwide to initiate proactive customer communications around customer life events like marriage, birth of a child, or home purchase and had significant impact on improving customer satisfaction. Also, by integrating customer contact history, product ownership, and payment information, Nationwide's behavioral analytics teams further created prioritized models that could identify which specific customer interaction was important for a customer at any given time. This resulted in a one percentage point improvement in customer retention rates and significant improvement in customer enthusiasm scores. Nationwide also achieved 3 percent annual growth in incremental sales by using CKS.
There are other uses of the customer database. In one of the initiatives, by integrating customer telephone data from multiple systems into CKS, the relationship managers at Nationwide try to be proactive in contacting customers in advance of a possible weather catastrophe, such as a hurricane or flood, to provide the primary policyholder information and explain the claims processes. These and other analytic insights now drive Nationwide to provide extremely personal customer service.
Financial Operations
A similar performance payoff from integrated information was also noted in financial operations. Nationwide's decentralized management style resulted in a fragmented financial reporting environment that included more than 14 general ledgers, 20 charts of accounts, 17 separate data repositories, 12 different reporting tools, and hundreds of thousands of spreadsheets. There was no common central view of the business, which resulted in labor-intensive, slow, and inaccurate reporting. About 75 percent of the effort was spent on acquiring, cleaning, consolidating, and validating the data, and very little time was spent on meaningful analysis of the data.
The Financial Performance Management initiative implemented a new operating approach that worked on a single data and technology architecture with a common set of systems standardizing the process of reporting. It enabled Nationwide to operate analytical centers of excellence with world-class planning, capital management, risk assessment, and other decision support capabilities that delivered timely, accurate, and efficient accounting, reporting, and analytical services.
The data from more than 200 operational systems were sent to the enterprise-wide data warehouse and then distributed to various applications and analytics. This resulted in a 50 percent improvement in the monthly closing process, with closing intervals reduced from 14 days to 7 days.
Postmerger Data Integration
Nationwide's Goal State Rate Management initiative enabled the company to merge Allied Insurance's automobile policy system into its existing system. Both Nationwide and Allied source systems were custom-built applications that did not share any common values or process data in the same manner. Nationwide's IT department decided to bring all the data from source systems into a centralized data warehouse, organized in an integrated fashion that resulted in standard dimensional reporting and helped Nationwide in performing what-if analyses. The data analysis team could identify previously unknown potential differences in the data environment, where premium rates were calculated differently between the Nationwide and Allied sides. Correcting all of these benefited Nationwide's policyholders because they were safeguarded from experiencing wide premium rate swings.
Enhanced Reporting
Nationwide's legacy reporting system, which catered to the needs of property and casualty business units, took weeks to compile and deliver the needed reports to the agents. Nationwide determined that it needed better access to sales and policy information to reach its sales targets. It chose a single data warehouse approach and, after careful assessment of the needs of sales management and individual agents, selected a business intelligence platform that would integrate dynamic enterprise dashboards into its reporting systems, making it easy for the agents and associates to view policy information at a glance. The new reporting system, dubbed Revenue Connection, also enabled users to analyze the information with a lot of interactive and drill-down-to-details capabilities at various levels that eliminated the need to generate custom ad hoc reports. Revenue Connection virtually eliminated requests for manual policy audits, resulting in huge savings in time and money for the business and technology teams. The reports were produced in 4 to 45 seconds, rather than days or weeks, and productivity in some units improved by 20 to 30 percent.

QUESTIONS FOR DISCUSSION
1. Why did Nationwide need an enterprise-wide data warehouse?
2. How did integrated data drive the business value?
3. What forms of analytics are employed at Nationwide?
4. With integrated data available in an enterprise data warehouse, what other applications could Nationwide potentially develop?

What We Can Learn from This Application Case
The proper use of integrated information in organizations can help achieve better business outcomes. Many organizations now rely on data warehousing technologies to perform online analytical processing on the data to derive valuable insights. The insights are used to develop predictive models that further enable the growth of the organizations by more precisely assessing customer needs. Increasingly, organizations are moving toward deriving value from analytical applications in real time with the help of integrated data from real-time data warehousing technologies.

Source: Teradata.com, "Nationwide, Delivering an On Your Side Experience," teradata.com/WorkArea/linkit.aspx?LinkIdentifier=id&ItemID=14714 (accessed February 2013).

References
Anthony, R. N. (1965). Planning and Control Systems: A Framework for Analysis. Cambridge, MA: Harvard University Graduate School of Business.
Asterdata.com. "Gilt Groupe Speaks on Digital Marketing Optimization." www.asterdata.com/gilt_groupe_video.php (accessed February 2013).
Barabasilab.neu.edu. "Network Science." barabasilab.neu.edu/networksciencebook/downlPDF.html (accessed February 2013).
Brooks, D. (2009, May 18). "In Praise of Dullness." New York Times, nytimes.com/2009/05/19/opinion/19brooks.html (accessed February 2013).

Centers for Disease Control and Prevention, Vaccines for Children Program. "Module 6 of the VFC Operations Guide." cdc.gov/vaccines/pubs/pinkbook/vac-storage.html#storage (accessed January 2013).
Eckerson, W. (2003). Smart Companies in the 21st Century: The Secrets of Creating Successful Business Intelligent Solutions. Seattle, WA: The Data Warehousing Institute.
Emc.com. "Data Science Revealed: A Data-Driven Glimpse into the Burgeoning New Field." emc.com/collateral/about/news/emc-data-science-study-wp.pdf (accessed February 2013).
Gorry, G. A., and M. S. Scott-Morton. (1971). "A Framework for Management Information Systems." Sloan Management Review, Vol. 13, No. 1, pp. 55-70.
INFORMS. "Analytics Section Overview." informs.org/Community/Analytics (accessed February 2013).
Keen, P. G. W., and M. S. Scott-Morton. (1978). Decision Support Systems: An Organizational Perspective. Reading, MA: Addison-Wesley.
Krivda, C. D. (2008, March). "Dialing Up Growth in a Mature Market." Teradata Magazine, pp. 1-3.
Magpiesensing.com. "MagpieSensing Cold Chain Analytics and Monitoring." magpiesensing.com/wp-content/uploads/2013/01/ColdChainAnalyticsMagpieSensingWhitepaper (accessed January 2013).
Mintzberg, H. A. (1980). The Nature of Managerial Work. Englewood Cliffs, NJ: Prentice Hall.
Mintzberg, H. A. (1993). The Rise and Fall of Strategic Planning. New York: The Free Press.
Simon, H. (1977). The New Science of Management Decision. Englewood Cliffs, NJ: Prentice Hall.
Tableausoftware.com. "Eliminating Waste at Seattle Children's." tableausoftware.com/eliminating-waste-at-seattle-childrens (accessed February 2013).
Tableausoftware.com. "Kaleida Health Finds Efficiencies, Stays Competitive." tableausoftware.com/learn/stories/user-experience-speed-thought-kaleida-health (accessed February 2013).
Teradata.com. "Nationwide, Delivering an On Your Side Experience." teradata.com/case-studies/delivering-on-your-side-experience (accessed February 2013).
Teradata.com. "Sabre Airline Solutions." teradata.com/t/case-studies/Sabre-Airline-Solutions-EB6281 (accessed February 2013).
Wang, X., et al. (2012, January/February). "Branch Reconfiguration Practice Through Operations Research in Industrial and Commercial Bank of China." Interfaces, Vol. 42, No. 1, pp. 33-44.
Watson, H. (2005, Winter). "Sorting Out What's New in Decision Support." Business Intelligence Journal.
Wikipedia. "On Base Percentage." en.wikipedia.org/wiki/On_base_percentage (accessed January 2013).
Wikipedia. "Sabermetrics." en.wikipedia.org/wiki/Sabermetrics (accessed January 2013).
Zaleski, A. (2012). "Magpie Analytics System Tracks Cold-Chain Products to Keep Vaccines, Reagents Fresh." TechnicallyBaltimore.com (accessed February 2013).
Zaman, M. (2009, April). "Business Intelligence: Its Ins and Outs." technologyevaluation.com (accessed February 2013).
Ziama, A., and J. Kasher. (2004). "Data Mining Primer for the Data Warehousing Professional." Dayton, OH: Teradata.

CHAPTER 2
Foundations and Technologies
for Decision Making
LEARNING OBJECTIVES
• Understand the conceptual foundations
of decision making
• Understand Simon’s four phases of
decision making: intelligence, design,
choice, and implementation
• Understand the essential definition
of DSS
• Understand important DSS classifications
• Learn how DSS support for decision making can be provided in practice
• Understand DSS components and how they integrate

Our major focus in this book is the support of decision making through computer-based information systems. The purpose of this chapter is to describe the conceptual foundations of decision making and how decision support is provided. This chapter includes the following sections:
2.1 Opening Vignette: Decision Modeling at HP Using Spreadsheets 38
2.2 Decision Making: Introduction and Definitions 40
2.3 Phases of the Decision-Making Process 42
2.4 Decision Making: The Intelligence Phase 44
2.5 Decision Making: The Design Phase 47
2.6 Decision Making: The Choice Phase 55
2.7 Decision Making: The Implementation Phase 55
2.8 How Decisions Are Supported 56
2.9 Decision Support Systems: Capabilities 59
2.10 DSS Classifications 61
2.11 Components of Decision Support Systems 64

2.1 OPENING VIGNETTE: Decision Modeling at HP Using
Spreadsheets
HP is a major manufacturer of computers, printers, and many industrial products. Its vast
product line leads to many decision problems. Olavson and Fry (2008) have worked on
many spreadsheet models for assisting decision makers at HP and have identified several
lessons from both their successes and their failures when it comes to constructing and
applying spreadsheet-based tools. They define a tool as "a reusable, analytical solution designed to be handed off to nontechnical end users to assist them in solving a repeated business problem."
When trying to solve a problem, HP developers consider the three phases in devel-
oping a model. The first phase is problem framing, where they consider the following
questions in order to develop the best solution for the problem:
• Will analytics solve the problem?
• Can an existing solution be leveraged?
• Is a tool needed?
The first question is important because the problem may not be of an analytic nature, and therefore, a spreadsheet tool may not be of much help in the long run without fixing the nonanalytical part of the problem first. For example, many inventory-related issues arise because of the inherent differences between the goals of marketing and supply chain groups. Marketing likes to have the maximum variety in the product line, whereas supply chain management focuses on reducing the inventory costs. This difference is partially outside the scope of any model. Coming up with nonmodeling solutions is important as well. If the problem arises due to "misalignment" of incentives or unclear lines of authority or plans, no model can help. Thus, it is important to identify the root issue.
The second question is important because sometimes an existing tool may solve a
problem that then saves time and money. Sometimes modifying an existing tool may solve
the problem, again saving some time and money, but sometimes a custom tool is neces-
sary to solve the problem. This is clearly worthwhile to explore.
The third question is important because sometimes a new computer-based system is not required to solve the problem. The developers have found that they often use analytically derived decision guidelines instead of a tool. This solution requires less time for development and training, has lower maintenance requirements, and also provides simpler and more intuitive results. That is, after they have explored the problem in more depth, the developers may determine that it is better to present decision rules that can be easily implemented as guidelines for decision making rather than asking the managers to run some type of a computer model. This results in easier training, better understanding of the rules being proposed, and increased acceptance. It also typically leads to lower development costs and reduced time for deployment.
If a model has to be built, the developers move on to the second phase-the actual
design and development of the tools. Adhering to five guidelines tends to increase the
probability that the new tool will be successful. The first guideline is to develop a prototype as quickly as possible. This allows the developers to test the designs, demonstrate various features and ideas for the new tools, get early feedback from the end users to see what works for them and what needs to be changed, and test adoption. Developing a prototype also prevents the developers from overbuilding the tool and yet allows them to construct more scalable and standardized software applications later. Additionally, by developing a prototype, developers can stop the process once the tool is "good enough," rather than building a standardized solution that would take longer to build and be more expensive.

Chapter 2 • Foundations and Technologies for Decision Making 39
The second guideline is to "build insight, not black boxes." The HP spreadsheet model developers believe that this is important, because often just entering some data and receiving a calculated output is not enough. The users need to be able to think of alternative scenarios, and the tool does not support this if it is a "black box" that provides only one recommendation. They argue that a tool is best only if it provides information to help make and support decisions rather than just give the answers. They also believe that an interactive tool helps the users to understand the problem better, therefore leading to more informed decisions.
The third guideline is to "remove unneeded complexity before handoff." This is
important, because as a tool becomes more complex it requires more training and expertise,
more data, and more recalibrations. The risk of bugs and misuse also increases.
Sometimes it is best to study the problem, begin modeling and analysis, and then start
shaping the program into a simple-to-use tool for the end user.
The fourth guideline is to "partner with end users in discovery and design." By working
with the end users, the developers get a better feel for the problem and a better idea
of what the end users want. It also increases the end users' ability to use analytic tools.
The end users also gain a better understanding of the problem and how it is solved using
the new tool. Additionally, including the end users in the development process enhances
the decision makers' analytical knowledge and capabilities. By working together, their
knowledge and skills complement each other in the final solution.
The fifth guideline is to "develop an Operations Research (OR) champion." By involving
end users in the development process, the developers create champions for the new
tools who then go back to their departments or companies and encourage their coworkers
to accept and use them. The champions are then the experts on the tools in their areas
and can help those being introduced to the new tools. Having champions increases
the possibility that the tools will be adopted into the businesses successfully.
The final stage is the handoff, when the final tools that provide complete solutions
are given to the businesses. When planning the handoff, it is important to answer the following
questions:
• Who will use the tool?
• Who owns the decisions that the tool will support?
• Who else must be involved?
• Who is responsible for maintenance and enhancement of the tool?
• When will the tool be used?
• How will the use of the tool fit in with other processes?
• Does it change the processes?
• Does it generate input into those processes?
• How will the tool impact business performance?
• Are the existing metrics sufficient to reward this aspect of performance?
• How should the metrics and incentives be changed to maximize impact to the business
from the tool and process?
By keeping these lessons in mind, developers and proponents of computerized decision
support in general and spreadsheet-based models in particular are likely to enjoy
greater success.
QUESTIONS FOR THE OPENING VIGNETTE
1. What are some of the key questions to be asked in supporting decision making
through DSS?
2. What guidelines can be learned from this vignette about developing DSS?
3. What lessons should be kept in mind for successful model implementation?

40 Part I • Decision Making and Analytics: An Overview
WHAT WE CAN LEARN FROM THIS VIGNETTE
This vignette relates to providing decision support in a large organization:
• Before building a model, decision makers should develop a good understanding of
the problem that needs to be addressed.
• A model may not be necessary to address the problem.
• Before developing a new tool, decision makers should explore reuse of existing tools.
• The goal of model building is to gain better insight into the problem, not just to
generate more numbers.
• Implementation plans should be developed along with the model.
Source: Based on T. Olavson and C. Fry, "Spreadsheet Decision-Support Tools: Lessons Learned at Hewlett-
Packard," Interfaces, Vol. 38, No. 4, July/August 2008, pp. 300-310.
2.2 DECISION MAKING: INTRODUCTION AND DEFINITIONS
We are about to examine how decision making is practiced and some of the underlying
theories and models of decision making. You will also learn about the various traits of
decision makers, including what characterizes a good decision maker. Knowing this can
help you to understand the types of decision support tools that managers can use to
make more effective decisions. In the following sections, we discuss various aspects of
decision making.
Characteristics of Decision Making
In addition to the characteristics presented in the opening vignette, decision making
may involve the following:
• Groupthink (i.e., group members accept the solution without thinking for themselves)
can lead to bad decisions.
• Decision makers are interested in evaluating what-if scenarios.
• Experimentation with a real system (e.g., develop a schedule, try it, and see how
well it works) may result in failure.
• Experimentation with a real system is possible only for one set of conditions at a
time and can be disastrous.
• Changes in the decision-making environment may occur continuously, leading to
invalidating assumptions about a situation (e.g., deliveries around holiday times may
increase, requiring a different view of the problem).
• Changes in the decision-making environment may affect decision quality by imposing
time pressure on the decision maker.
• Collecting information and analyzing a problem takes time and can be expensive. It
is difficult to determine when to stop and make a decision.
• There may not be sufficient information to make an intelligent decision.
• Too much information may be available (i.e., information overload).
To determine how real decision makers make decisions, we must first understand the
process and the important issues involved in decision making. Then we can understand
appropriate methodologies for assisting decision makers and the contributions information
systems can make. Only then can we develop DSS to help decision makers.
This chapter is organized based on the three key words that form the term DSS:
decision, support, and systems. A decision maker should not simply apply IT tools
blindly. Rather, the decision maker gets support through a rational approach that

simplifies reality and provides a relatively quick and inexpensive means of considering
various alternative courses of action to arrive at the best (or at least a very good) solution
to the problem.
A Working Definition of Decision Making
Decision making is a process of choosing among two or more alternative courses of
action for the purpose of attaining one or more goals. According to Simon (1977), managerial
decision making is synonymous with the entire management process. Consider
the important managerial function of planning. Planning involves a series of decisions:
What should be done? When? Where? Why? How? By whom? Managers set goals, or plan;
hence, planning implies decision making. Other managerial functions, such as organizing
and controlling, also involve decision making.
Decision-Making Disciplines
Decision making is directly influenced by several major disciplines, some of which are
behavioral and some of which are scientific in nature. We must be aware of how their
philosophies can affect our ability to make decisions and provide support. Behavioral
disciplines include anthropology, law, philosophy, political science, psychology, social
psychology, and sociology. Scientific disciplines include computer science, decision
analysis, economics, engineering, the hard sciences (e.g., biology, chemistry, physics),
management science/operations research, mathematics, and statistics.
An important characteristic of management support systems (MSS) is their emphasis
on the effectiveness, or "goodness," of the decision produced rather than on the
computational efficiency of obtaining it; the latter is usually a major concern of a transaction
processing system. Most Web-based DSS are focused on improving decision effectiveness.
Efficiency may be a by-product.
Decision Style and Decision Makers
In the following sections, we examine the notion of decision style and specific aspects
of decision makers.
DECISION STYLE Decision style is the manner by which decision makers think and react
to problems. This includes the way they perceive a problem, their cognitive responses,
and how values and beliefs vary from individual to individual and from situation to
situation. As a result, people make decisions in different ways. Although there is a general
process of decision making, it is far from linear. People do not follow the same steps
of the process in the same sequence, nor do they use all the steps. Furthermore, the
emphasis, time allotment, and priorities given to each step vary significantly, not only
from one person to another, but also from one situation to the next. The manner in which
managers make decisions (and the way they interact with other people) describes their
decision style. Because decision styles depend on the factors described earlier, there are
many decision styles. Personality temperament tests are often used to determine decision
styles. Because there are many such tests, it is important to try to equate them in determining
decision style. However, the various tests measure somewhat different aspects of
personality, so they cannot be equated.
Researchers have identified a number of decision-making styles. These include heuristic
and analytic styles. One can also distinguish between autocratic versus democratic
styles. Another style is consultative (with individuals or groups). Of course, there are
many combinations and variations of styles. For example, a person can be analytic and
autocratic, or consultative (with individuals) and heuristic.

For a computerized system to successfully support a manager, it should fit the
decision situation as well as the decision style. Therefore, the system should be flexible
and adaptable to different users. The ability to ask what-if and goal-seeking questions
provides flexibility in this direction. A Web-based interface using graphics is a desirable
feature in supporting certain decision styles. If a DSS is to support varying styles, skills,
and knowledge, it should not attempt to enforce a specific process. Rather, it should help
decision makers use and develop their own styles, skills, and knowledge.
Different decision styles require different types of support. A major factor that determines
the type of support required is whether the decision maker is an individual or a
group. Individual decision makers need access to data and to experts who can provide
advice, whereas groups additionally need collaboration tools. Web-based DSS can provide
support to both.
A lot of information is available on the Web about cognitive styles and decision
styles (e.g., see Birkman International, Inc., birkman.com; Keirsey Temperament Sorter
and Keirsey Temperament Theory-II, keirsey.com). Many personality/temperament tests
are available to help managers identify their own styles and those of their employees.
Identifying an individual's style can help establish the most effective communication
patterns and ideal tasks for which the person is suited.
DECISION MAKERS Decisions are often made by individuals, especially at lower managerial
levels and in small organizations. There may be conflicting objectives even for a sole
decision maker. For example, when making an investment decision, an individual investor
may consider the rate of return on the investment, liquidity, and safety as objectives. Finally,
decisions may be fully automated (but only after a human decision maker decides to do so!).
This discussion of decision making focuses in large part on an individual decision
maker. However, most major decisions in medium-sized and large organizations are made
by groups. Obviously, there are often conflicting objectives in a group decision-making
setting. Groups can be of variable size and may include people from different departments
or from different organizations. Collaborating individuals may have different cognitive
styles, personality types, and decision styles. Some clash, whereas others are mutually
enhancing. Consensus can be a difficult political problem. Therefore, the process of decision
making by a group can be very complicated. Computerized support can greatly enhance
group decision making. Computer support can be provided at a broad level, enabling
members of whole departments, divisions, or even entire organizations to collaborate
online. Such support has evolved over the past few years into enterprise information
systems (EIS) and includes group support systems (GSS), enterprise resource management
(ERM)/enterprise resource planning (ERP), supply chain management (SCM), knowledge
management systems (KMS), and customer relationship management (CRM) systems.
SECTION 2.2 REVIEW QUESTIONS
1. What are the various aspects of decision making?
2. Identify similarities and differences between individual and group decision making.
3. Define decision style and describe why it is important to consider in the decision-making
process.
4. What are the benefits of mathematical models?
2.3 PHASES OF THE DECISION-MAKING PROCESS
It is advisable to follow a systematic decision-making process. Simon (1977) said that this
involves three major phases: intelligence, design, and choice. He later added a fourth phase,
implementation. Monitoring can be considered a fifth phase, a form of feedback. However,

[Figure 2.1 depicts the decision-making/modeling process as a flow through four phases, with simplification and assumptions linking reality to the model. Intelligence phase: organization objectives; search and scanning procedures; data collection; problem identification, ownership, classification, and statement. Design phase: formulate a model; validation of the model; set criteria for choice; search for alternatives; predict and measure outcomes. Choice phase: solution to the model; sensitivity analysis; selection of the best (good) alternative(s); plan for implementation; verification and testing of the proposed solution. Implementation of solution: success ends the process; failure feeds back to an earlier phase.]
FIGURE 2.1 The Decision-Making/Modeling Process.
we view monitoring as the intelligence phase applied to the implementation phase. Simon's
model is the most concise and yet complete characterization of rational decision making.
A conceptual picture of the decision-making process is shown in Figure 2.1.
There is a continuous flow of activity from intelligence to design to choice (see the
bold lines in Figure 2.1), but at any phase, there may be a return to a previous phase
(feedback). Modeling is an essential part of this process. The seemingly chaotic nature of
following a haphazard path from problem discovery to solution via decision making can
be explained by these feedback loops.
The decision-making process starts with the intelligence phase; in this phase, the
decision maker examines reality and identifies and defines the problem. Problem ownership
is established as well. In the design phase, a model that represents the system is constructed.
This is done by making assumptions that simplify reality and writing down the relationships
among all the variables. The model is then validated, and criteria are determined in a principle
of choice for evaluation of the alternative courses of action that are identified. Often, the
process of model development identifies alternative solutions and vice versa.
The choice phase includes selection of a proposed solution to the model (not
necessarily to the problem it represents). This solution is tested to determine its viability.
When the proposed solution seems reasonable, we are ready for the last phase: implementation
of the decision (not necessarily of a system). Successful implementation results
in solving the real problem. Failure leads to a return to an earlier phase of the process. In
fact, we can return to an earlier phase during any of the latter three phases. The decision-making
situations described in the opening vignette follow Simon's four-phase model, as
do almost all other decision-making situations. Web impacts on the four phases, and vice
versa, are shown in Table 2.1.

TABLE 2.1 Simon’s Four Phases of Decision Making and the Web
Phase: Intelligence
Web impacts: Access to information to identify problems and opportunities from internal and external data sources; access to analytics methods to identify opportunities; collaboration through group support systems (GSS) and knowledge management systems (KMS).
Impacts on the Web: Identification of opportunities for e-commerce, Web infrastructure, hardware and software tools, etc.; intelligent agents, which reduce the burden of information overload; smart search engines.

Phase: Design
Web impacts: Access to data, models, and solution methods; use of online analytical processing (OLAP), data mining, and data warehouses; collaboration through GSS and KMS; similar solutions available from KMS.
Impacts on the Web: Brainstorming methods (e.g., GSS) to collaborate in Web infrastructure design; models and solutions of Web infrastructure issues.

Phase: Choice
Web impacts: Access to methods to evaluate the impacts of proposed solutions.
Impacts on the Web: DSS tools, which examine and establish criteria from models to determine Web, intranet, and extranet infrastructure; DSS tools, which determine how to route messages.

Phase: Implementation
Web impacts: Web-based collaboration tools (e.g., GSS) and KMS, which can assist in implementing decisions; tools that monitor the performance of e-commerce and other sites, including intranets, extranets, and the Internet.
Impacts on the Web: Decisions implemented on browser and server design and access, which ultimately determined how to set up the various components that have evolved into the Internet.
Note that there are many other decision-making processes. Notable among them is
the Kepner-Tregoe method (Kepner and Tregoe, 1998), which has been adopted by many
firms because its tools are readily available from Kepner-Tregoe, Inc. (kepner-tregoe.
com). We have found that these alternative models, including the Kepner-Tregoe method,
readily map into Simon's four-phase model.
We next turn to a detailed discussion of the four phases identified by Simon.
SECTION 2.3 REVIEW QUESTIONS
1. List and briefly describe Simon's four phases of decision making.
2. What are the impacts of the Web on the phases of decision making?
2.4 DECISION MAKING: THE INTELLIGENCE PHASE
Intelligence in decision making involves scanning the environment, either intermittently
or continuously. It includes several activities aimed at identifying problem situations or
opportunities. It may also include monitoring the results of the implementation phase of
a decision-making process.

Problem (or Opportunity) Identification
The intelligence phase begins with the identification of organizational goals and objectives
related to an issue of concern (e.g., inventory management, job selection, lack of or incorrect
Web presence) and determination of whether they are being met. Problems occur because of
dissatisfaction with the status quo. Dissatisfaction is the result of a difference between what
people desire (or expect) and what is occurring. In this first phase, a decision maker attempts
to determine whether a problem exists, identify its symptoms, determine its magnitude, and
explicitly define it. Often, what is described as a problem (e.g., excessive costs) may be
only a symptom (i.e., measure) of a problem (e.g., improper inventory levels). Because real-world
problems are usually complicated by many interrelated factors, it is sometimes difficult
to distinguish between the symptoms and the real problem. New opportunities and problems
certainly may be uncovered while investigating the causes of symptoms. For example,
Application Case 2.1 describes a classic story of recognizing the correct problem.
The existence of a problem can be determined by monitoring and analyzing the
organization's productivity level. The measurement of productivity and the construction
of a model are based on real data. The collection of data and the estimation of future data
are among the most difficult steps in the analysis. The following are some issues that may
arise during data collection and estimation and thus plague decision makers:
• Data are not available. As a result, the model is made with, and relies on, potentially
inaccurate estimates.
• Obtaining data may be expensive.
• Data may not be accurate or precise enough.
• Data estimation is often subjective.
• Data may be insecure.
• Important data that influence the results may be qualitative (soft).
• There may be too many data (i.e., information overload).
Application Case 2.1
Making Elevators Go Faster!
This story has been reported in numerous places
and has almost become a classic example to explain
the need for problem identification. Ackoff (as cited
in Larson, 1987) described the problem of managing
complaints about slow elevators in a tall hotel tower.
After trying many solutions for reducing the complaints
(staggering elevators to go to different floors,
adding operators, and so on), the management determined
that the real problem was not the actual
waiting time but rather the perceived waiting time.
So the solution was to install full-length mirrors on
elevator doors on each floor. As Hesse and Woolsey
(1975) put it, "the women would look at themselves
in the mirrors and make adjustments, while the men
would look at the women, and before they knew it,
the elevator was there." By reducing the perceived
waiting time, the problem went away. Baker and
Cameron (1996) give several other examples of distractions,
including lighting, displays, and so on, that
organizations use to reduce perceived waiting time.
If the real problem is identified as perceived waiting
time, it can make a big difference in the proposed
solutions and their costs. For example, full-length
mirrors probably cost a whole lot less than adding
an elevator!
Sources: Based on J. Baker and M. Cameron, "The Effects of
the Service Environment on Affect and Consumer Perception of
Waiting Time: An Integrative Review and Research Propositions,"
Journal of the Academy of Marketing Science, Vol. 24, September
1996, pp. 338-349; R. Hesse and G. Woolsey, Applied Management
Science: A Quick and Dirty Approach, SRA Inc., Chicago, 1975;
and R. C. Larson, "Perspectives on Queues: Social Justice and the
Psychology of Queuing," Operations Research, Vol. 35, No. 6,
November/December 1987, pp. 895-905.

• Outcomes (or results) may occur over an extended period. As a result, revenues,
expenses, and profits will be recorded at different points in time. To
overcome this difficulty, a present-value approach can be used if the results are
quantifiable.
• It is assumed that future data will be similar to historical data. If this is not the case,
the nature of the change has to be predicted and included in the analysis.
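The present-value approach mentioned in the first bullet can be sketched in a few lines of Python. This is a hypothetical illustration; the cash flows and discount rate are made-up numbers, not from the text:

```python
def present_value(cash_flows, rate):
    """Discount a stream of future outcomes so that revenues and expenses
    recorded at different points in time become comparable.
    cash_flows[t] is assumed to arrive at the end of year t+1."""
    return sum(cf / (1 + rate) ** (t + 1) for t, cf in enumerate(cash_flows))

# Three annual net revenues of 1,000 discounted at 10% per year:
pv = present_value([1000, 1000, 1000], rate=0.10)
print(round(pv, 2))  # 2486.85
```

Discounting makes a dollar received in year 3 worth less than one received in year 1, so outcomes spread over time can be compared as a single number.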
When the preliminary investigation is completed, it is possible to determine whether
a problem really exists, where it is located, and how significant it is. A key issue is whether
an information system is reporting a problem or only the symptoms of a problem. For
example, if reports indicate that sales are down, there is a problem, but the situation is
no doubt symptomatic of a deeper problem. It is critical to know the real problem. Sometimes
it may be a problem of perception, incentive mismatch, or organizational processes rather
than a poor decision model.
Problem Classification
Problem classification is the conceptualization of a problem in an attempt to place it in
a definable category, possibly leading to a standard solution approach. An important
approach classifies problems according to the degree of structuredness evident in them.
This ranges from totally structured (i.e., programmed) to totally unstructured (i.e., unprogrammed),
as described in Chapter 1.
Problem Decomposition
Many complex problems can be divided into subproblems. Solving the simpler subproblems
may help in solving a complex problem. Also, seemingly poorly structured problems
sometimes have highly structured subproblems. Just as a semistructured problem results
when some phases of decision making are structured whereas other phases are unstructured,
so when some subproblems of a decision-making problem are structured with
others unstructured, the problem itself is semistructured. As a DSS is developed and the
decision maker and development staff learn more about the problem, it gains structure.
Decomposition also facilitates communication among decision makers. Decomposition is
one of the most important aspects of the analytic hierarchy process (AHP, discussed
in Chapter 11), which helps decision makers incorporate both qualitative and quantitative
factors into their decision-making models.
Problem Ownership
In the intelligence phase, it is important to establish problem ownership. A problem
exists in an organization only if someone or some group takes on the responsibility of
attacking it and if the organization has the ability to solve it. The assignment of authority
to solve the problem is called problem ownership. For example, a manager may
feel that he or she has a problem because interest rates are too high. Because interest
rate levels are determined at the national and international levels, and most managers
can do nothing about them, high interest rates are the problem of the government, not
a problem for a specific company to solve. The problem companies actually face is
how to operate in a high-interest-rate environment. For an individual company, the
interest rate level should be handled as an uncontrollable (environmental) factor to be
predicted.
When problem ownership is not established, either someone is not doing his or
her job or the problem at hand has yet to be identified as belonging to anyone. It is then
important for someone to either volunteer to own it or assign it to someone.
The intelligence phase ends with a formal problem statement.

SECTION 2.4 REVIEW QUESTIONS
1. What is the difference between a problem and its symptoms?
2. Why is it important to classify a problem?
3. What is meant by problem decomposition?
4. Why is establishing problem ownership so important in the decision-making process?
2.5 DECISION MAKING: THE DESIGN PHASE
The design phase involves finding or developing and analyzing possible courses of action.
These include understanding the problem and testing solutions for feasibility. A model
of the decision-making problem is constructed, tested, and validated. Let us first define
a model.
Models¹
A major characteristic of a DSS and many BI tools (notably those of business analytics) is the
inclusion of at least one model. The basic idea is to perform the DSS analysis on a model
of reality rather than on the real system. A model is a simplified representation or abstraction
of reality. It is usually simplified because reality is too complex to describe exactly and
because much of the complexity is actually irrelevant in solving a specific problem.
Mathematical (Quantitative) Models
The complexity of relationships in many organizational systems is described mathematically.
Most DSS analyses are performed numerically with mathematical or other quantitative
models.
The Benefits of Models
We use models for the following reasons:
• Manipulating a model (changing decision variables or the environment) is much
easier than manipulating a real system. Experimentation is easier and does not
interfere with the organization's daily operations.
• Models enable the compression of time. Years of operations can be simulated in
minutes or seconds of computer time.
• The cost of modeling analysis is much lower than the cost of a similar experiment
conducted on a real system.
• The cost of making mistakes during a trial-and-error experiment is much lower
when models are used than with real systems.
• The business environment involves considerable uncertainty. With modeling, a
manager can estimate the risks resulting from specific actions.
• Mathematical models enable the analysis of a very large, sometimes infinite, number
of possible solutions. Even in simple problems, managers often have a large number
of alternatives from which to choose.
• Models enhance and reinforce learning and training.
• Models and solution methods are readily available.
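Several of these benefits, notably time compression and risk estimation under uncertainty, can be seen in a minimal Monte Carlo sketch. Everything below (the profit model, its parameters, and the function names) is a hypothetical example invented for illustration, not something from the text:

```python
import random

def simulate_annual_profit(price, unit_cost, fixed_cost, mean_demand, sd_demand):
    """One simulated year of operations: profit = margin * demand - fixed cost."""
    demand = max(0.0, random.gauss(mean_demand, sd_demand))
    return (price - unit_cost) * demand - fixed_cost

def estimate_risk(n_years=10_000, seed=42):
    """Compress thousands of simulated years into a fraction of a second and
    estimate expected profit plus the probability of a loss."""
    random.seed(seed)
    profits = [simulate_annual_profit(12.0, 7.0, 40_000, 10_000, 2_500)
               for _ in range(n_years)]
    expected = sum(profits) / n_years
    p_loss = sum(p < 0 for p in profits) / n_years
    return expected, p_loss

expected, p_loss = estimate_risk()
print(f"Expected profit: {expected:,.0f}  P(loss): {p_loss:.1%}")
```

Changing a decision variable (say, the price) and rerunning costs nothing, whereas the equivalent experiment on the real system would take years and real money.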
Modeling involves conceptualizing a problem and abstracting it to quantitative
and/or qualitative form (see Chapter 9). For a mathematical model, the variables are
¹Caution: Many students and professionals view models strictly as those of "data modeling" in the context of
systems analysis and design. Here, we consider analytical models such as those of linear programming, simulation,
and forecasting.

identified, and their mutual relationships are established. Simplifications are made,
whenever necessary, through assumptions. For example, a relationship between two
variables may be assumed to be linear even though in reality there may be some nonlinear
effects. A proper balance between the level of model simplification and the representation
of reality must be obtained because of the cost-benefit trade-off. A simpler
model leads to lower development costs, easier manipulation, and a faster solution but
is less representative of the real problem and can produce inaccurate results. However,
a simpler model generally requires fewer data, or the data are aggregated and easier
to obtain.
The process of modeling is a combination of art and science. As a science, there
are many standard model classes available, and, with practice, an analyst can determine
which one is applicable to a given situation. As an art, creativity and finesse are required
when determining what simplifying assumptions can work, how to combine appropriate
features of the model classes, and how to integrate models to obtain valid solutions.
Models have decision variables that describe the alternatives from among which a
manager must choose (e.g., how many cars to deliver to a specific rental agency, how to
advertise at specific times, which Web server to buy or lease), a result variable or a set
of result variables (e.g., profit, revenue, sales) that describes the objective or goal of the
decision-making problem, and uncontrollable variables or parameters (e.g., economic
conditions) that describe the environment. The process of modeling involves determining
the (usually mathematical, sometimes symbolic) relationships among the variables.
These topics are discussed in Chapter 9.
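A toy model can make the three variable types concrete. The following sketch is hypothetical; the function name, prices, and demand response are invented for illustration and are not from the text:

```python
def profit_model(units_to_produce, ad_spend, economic_index=1.0):
    """A toy quantitative model (all numbers hypothetical).

    Decision variables:       units_to_produce, ad_spend
    Uncontrollable parameter: economic_index (state of the economy)
    Result variable:          profit (the returned value)
    """
    price, unit_cost = 20.0, 12.0
    # Demand responds to advertising and the economy; units sold
    # cannot exceed what was produced.
    demand = economic_index * (500 + 0.8 * ad_spend)
    units_sold = min(units_to_produce, demand)
    revenue = price * units_sold
    cost = unit_cost * units_to_produce + ad_spend
    return revenue - cost

# What-if analysis: vary a decision variable, observe the result variable.
for ad in (0, 500, 1000):
    print(f"ad_spend={ad}: profit={profit_model(1500, ad):,.0f}")
```

Holding the decision variables fixed and varying `economic_index` instead would answer a different what-if question: how sensitive the result is to the environment the manager cannot control.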
Selection of a Principle of Choice
A principle of choice is a criterion that describes the acceptability of a solution
approach. In a model, it is a result variable. Selecting a principle of choice is not part
of the choice phase but involves how a person establishes decision-making objective(s)
and incorporates the objective(s) into the model(s). Are we willing to assume high
risk, or do we prefer a low-risk approach? Are we attempting to optimize or satisfice?
It is also important to recognize the difference between a criterion and a constraint
(see Technology Insights 2.1). Among the many principles of choice, normative and
descriptive are of prime importance.
TECHNOLOGY INSIGHTS 2.1 The Difference Between a Criterion
and a Constraint
Many people new to the formal study of decision making inadvertently confuse the concepts of criterion and constraint. Often, this is because a criterion may imply a constraint, either implicit or explicit, thereby adding to the confusion. For example, there may be a distance criterion: the decision maker does not want to travel too far from home. However, there is an implicit constraint that the alternatives from which he selects must be within a certain distance from his home. This constraint effectively says that if the distance from home is greater than a certain amount, then the alternative is not feasible; or, rather, the distance to an alternative must be less than or equal to a certain number (this would be a formal relationship in some models; in the model in this case, it reduces the search, considering fewer alternatives). This is similar to what happens in some cases when selecting a university, where schools beyond a single day's driving distance would not be considered by most people; in fact, the utility function (criterion value) of distance can start out low close to home, peak at about 70 miles (about 100 km), say, the distance between Atlanta (home) and Athens, Georgia, and sharply drop off thereafter.

Chapter 2 • Foundations and Technologies for Decision Making 49
Normative Models
Normative models are models in which the chosen alternative is demonstrably the best of all possible alternatives. To find it, the decision maker should examine all the alternatives and prove that the one selected is indeed the best, which is what the person would normally want. This process is basically optimization, and it is typically the goal of what we call prescriptive analytics (Part IV). In operational terms, optimization can be achieved in one of three ways:
1. Get the highest level of goal attainment from a given set of resources. For example, which alternative will yield the maximum profit from an investment of $10 million?
2. Find the alternative with the highest ratio of goal attainment to cost (e.g., profit per dollar invested), or maximize productivity.
3. Find the alternative with the lowest cost (or smallest amount of other resources) that will meet an acceptable level of goals. For example, if your task is to select hardware for an intranet with a minimum bandwidth, which alternative will accomplish this goal at the least cost?
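The three ways of achieving optimization can be sketched in a few lines of Python; the alternatives, costs, and profit figures below are invented purely for illustration:

```python
# Hypothetical alternatives: each has a cost and an expected profit (goal attainment).
alternatives = {
    "A": {"cost": 10_000_000, "profit": 1_200_000},
    "B": {"cost": 10_000_000, "profit": 1_500_000},
    "C": {"cost": 6_000_000, "profit": 1_100_000},
}

# Way 1: highest goal attainment from a given set of resources (a $10M budget).
budget = 10_000_000
feasible = {k: v for k, v in alternatives.items() if v["cost"] <= budget}
best_attainment = max(feasible, key=lambda k: feasible[k]["profit"])

# Way 2: highest ratio of goal attainment to cost (profit per dollar invested).
best_ratio = max(alternatives,
                 key=lambda k: alternatives[k]["profit"] / alternatives[k]["cost"])

# Way 3: lowest cost that still meets an acceptable goal level ($1M profit).
acceptable_profit = 1_000_000
meets_goal = {k: v for k, v in alternatives.items()
              if v["profit"] >= acceptable_profit}
lowest_cost = min(meets_goal, key=lambda k: meets_goal[k]["cost"])

print(best_attainment, best_ratio, lowest_cost)  # B C C
```

Note that the three principles can recommend different alternatives for the same data, which is exactly why the principle of choice must be selected before the model is solved.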
Normative decision theory is based on the following assumptions of rational decision makers:
• Humans are economic beings whose objective is to maximize the attainment of goals; that is, the decision maker is rational. (More of a good thing [revenue, fun] is better than less; less of a bad thing [cost, pain] is better than more.)
• For a decision-making situation, all viable alternative courses of action and their consequences, or at least the probability and the values of the consequences, are known.
• Decision makers have an order or preference that enables them to rank the desirability of all consequences of the analysis (best to worst).
Are decision makers really rational? Though there may be major anomalies in the presumed rationality of financial and economic behavior, we take the view that they could be caused by incompetence, lack of knowledge, multiple goals being framed inadequately, misunderstanding of a decision maker's true expected utility, and time-pressure impacts. There are other anomalies, often caused by time pressure. For example, Stewart (2002) described a number of researchers working with intuitive decision making. The idea of “thinking with your gut” is obviously a heuristic approach to decision making. It works well for firefighters and military personnel on the battlefield. One critical aspect of decision making in this mode is that many scenarios have been thought through in advance. Even when a situation is new, it can quickly be matched to an existing one on the fly, and a reasonable solution can be obtained (through pattern recognition). Luce et al. (2004) described how emotions affect decision making, and Pauly (2004) discussed inconsistencies in decision making.
We believe that irrationality is caused by the factors listed previously. For example, Tversky et al. (1990) investigated the phenomenon of preference reversal, which is a known problem in applying the AHP to problems. Also, some criterion or preference may be omitted from the analysis. Ratner et al. (1999) investigated how variety can cause individuals to choose less-preferred options, even though they will enjoy them less. But we maintain that variety clearly has value, is part of a decision maker's utility, and is a criterion and/or constraint that should be considered in decision making.
Suboptimization
By definition, optimization requires a decision maker to consider the impact of each alternative course of action on the entire organization because a decision made in one area may have significant effects (positive or negative) on other areas. Consider, for example, a

50 Part I • Decision Making and Analytics: An Overview
marketing department that implements an electronic commerce (e-commerce) site. Within hours, orders far exceed production capacity. The production department, which plans its own schedule, cannot meet demand. It may gear up for as high demand as possible. Ideally and independently, the department should produce only a few products in extremely large quantities to minimize manufacturing costs. However, such a plan might result in large, costly inventories and marketing difficulties caused by the lack of a variety of products, especially if customers start to cancel orders that are not met in a timely way. This situation illustrates the sequential nature of decision making.
A systems point of view assesses the impact of every decision on the entire system. Thus, the marketing department should make its plans in conjunction with other departments. However, such an approach may require a complicated, expensive, time-consuming analysis. In practice, the MSS builder may close the system within narrow boundaries, considering only the part of the organization under study (the marketing and/or production department, in this case). By simplifying, the model then does not incorporate certain complicated relationships that describe interactions with and among the other departments. The other departments can be aggregated into simple model components. Such an approach is called suboptimization.
If a suboptimal decision is made in one part of the organization without considering the details of the rest of the organization, then an optimal solution from the point of view of that part may be inferior for the whole. However, suboptimization may still be a very practical approach to decision making, and many problems are first approached from this perspective. It is possible to reach tentative conclusions (and generally usable results) by analyzing only a portion of a system, without getting bogged down in too many details. After a solution is proposed, its potential effects on the remaining departments of the organization can be tested. If no significant negative effects are found, the solution can be implemented.
Suboptimization may also apply when simplifying assumptions are used in modeling a specific problem. There may be too many details or too many data to incorporate into a specific decision-making situation, and so not all of them are used in the model. If the solution to the model seems reasonable, it may be valid for the problem and thus be adopted. For example, in a production department, parts are often partitioned into A/B/C inventory categories. Generally, A items (e.g., large gears, whole assemblies) are expensive (say, $3,000 or more each), built to order in small batches, and inventoried in low quantities; C items (e.g., nuts, bolts, screws) are very inexpensive (say, less than $2) and ordered and used in very large quantities; and B items fall in between. All A items can be handled by a detailed scheduling model and physically monitored closely by management; B items are generally somewhat aggregated, their groupings are scheduled, and management reviews these parts less frequently; and C items are not scheduled but are simply acquired or built based on a policy defined by management with a simple economic order quantity (EOQ) ordering system that assumes constant annual demand. The policy might be reviewed once a year. This situation applies when determining all criteria or modeling the entire problem becomes prohibitively time-consuming or expensive.
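The EOQ policy for C items is a good example of such a simplifying model: the classic formula is Q* = sqrt(2DS/H), where D is annual demand, S is the cost per order, and H is the annual holding cost per unit. A minimal sketch, with hypothetical figures:

```python
import math

def eoq(annual_demand: float, order_cost: float, holding_cost: float) -> float:
    """Classic economic order quantity: the order size that minimizes the sum of
    annual ordering and holding costs, assuming constant annual demand."""
    return math.sqrt(2 * annual_demand * order_cost / holding_cost)

# Hypothetical C item: 12,000 units/year demand, $50 per order,
# $0.40 per unit per year to hold in inventory.
q = eoq(12_000, 50.0, 0.40)
print(round(q))  # order in batches of about 1,732 units
```

A policy this simple obviously ignores demand seasonality and quantity discounts, which is precisely the suboptimization trade-off the text describes: good enough for cheap, high-volume parts without modeling the whole inventory system.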
Suboptimization may also involve simply bounding the search for an optimum (e.g., by a heuristic) by considering fewer criteria or alternatives or by eliminating large portions of the problem from evaluation. If it takes too long to solve a problem, a good-enough solution found already may be used and the optimization effort terminated.
Descriptive Models
Descriptive models describe things as they are or as they are believed to be. These models are typically mathematically based. Descriptive models are extremely useful in DSS for investigating the consequences of various alternative courses of action under different configurations of inputs and processes. However, because a descriptive analysis checks the performance of the system for a given set of alternatives (rather than for all alternatives), there is no guarantee that an alternative selected with the aid of descriptive analysis is optimal. In many cases, it is only satisfactory.
Simulation is probably the most common descriptive modeling method. Simulation is the imitation of reality and has been applied to many areas of decision making. Computer and video games are a form of simulation: An artificial reality is created, and the game player lives within it. Virtual reality is also a form of simulation because the environment is simulated, not real. A common use of simulation is in manufacturing. Again, consider the production department of a firm with complications caused by the marketing department. The characteristics of each machine in a job shop along the supply chain can be described mathematically. Relationships can be established based on how each machine physically runs and relates to others. Given a trial schedule of batches of parts, it is possible to measure how batches flow through the system and to use the statistics from each machine. Alternative schedules may then be tried and the statistics recorded until a reasonable schedule is found. Marketing can examine access and purchase patterns on its Web site. Simulation can be used to determine how to structure a Web site for improved performance and to estimate future purchases. Both departments can therefore use primarily experimental modeling methods.
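A minimal Monte Carlo sketch of the trial-schedule idea above, assuming a fixed per-batch setup time and randomly varying per-part processing times (all figures are invented for illustration):

```python
import random

random.seed(42)  # reproducible trials

def simulate_schedule(batch_sizes, setup_minutes=30.0,
                      mean_minutes_per_part=2.0, trials=500):
    """Estimate mean total flow time (hours) for a trial schedule of batches.
    Each part's processing time varies randomly to mimic machine variability."""
    totals = []
    for _ in range(trials):
        minutes = 0.0
        for size in batch_sizes:
            minutes += setup_minutes  # changeover before each batch
            minutes += sum(random.uniform(0.5, 1.5) * mean_minutes_per_part
                           for _ in range(size))
        totals.append(minutes / 60.0)
    return sum(totals) / len(totals)

# Compare two alternative schedules for the same 150 parts and record the statistics.
three_batches = simulate_schedule([50, 50, 50])  # more setups, smaller inventories
one_batch = simulate_schedule([150])             # fewer setups, larger inventories
print(round(three_batches, 2), round(one_batch, 2))
```

The single large batch shows a lower mean flow time because it incurs only one setup, echoing the text's point that large batches minimize manufacturing cost even though they create inventory and marketing problems the model does not capture.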
Classes of descriptive models include the following :
• Complex inventory decisions
• Environmental impact analysis
• Financial planning
• Information flow
• Markov analysis (predictions)
• Scenario analysis
• Simulation (alternative types)
• Technological forecasting
• Waiting-line (queuing) management
A number of nonmathematical descriptive models are available for decision making. One is the cognitive map (see Eden and Ackermann, 2002; and Jenkins, 2002). A cognitive map can help a decision maker sketch out the important qualitative factors and their causal relationships in a messy decision-making situation. This helps the decision maker (or decision-making group) focus on what is relevant and what is not, and the map evolves as more is learned about the problem. The map can help the decision maker understand issues better, focus better, and reach closure. One interesting software tool for cognitive mapping is Decision Explorer from Banxia Software Ltd. (banxia.com; try the demo).
Another descriptive decision-making model is the use of narratives to describe a decision-making situation. A narrative is a story that helps a decision maker uncover the important aspects of the situation and leads to better understanding and framing. This is extremely effective when a group is making a decision, and it can lead to a more common viewpoint, also called a frame. Juries in court trials typically use narrative-based approaches in reaching verdicts (see Allan, Frame, and Turney, 2003; Beach, 2005; and Denning, 2000).
Good Enough, or Satisficing
According to Simon (1977), most human decision making, whether organizational or individual, involves a willingness to settle for a satisfactory solution, “something less than the best.” When satisficing, the decision maker sets up an aspiration, a goal, or a desired level of performance and then searches the alternatives until one is found that achieves this level. The usual reasons for satisficing are time pressures (e.g., decisions may lose value over time), the inability to achieve optimization (e.g., solving some models could take a really long time), and recognition that the marginal benefit of a better solution is not worth the marginal cost to obtain it (e.g., in searching the Internet, you can look at only so many Web sites before you run out of time and energy). In such a situation, the decision maker is behaving rationally, though in reality he or she is satisficing. Essentially, satisficing is a form of suboptimization. There may be a best solution, an optimum, but it would be difficult, if not impossible, to attain it. With a normative model, too much computation may be involved; with a descriptive model, it may not be possible to evaluate all the sets of alternatives.
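Satisficing can be sketched as a search that stops at the first alternative meeting the aspiration level, rather than scanning every alternative for the optimum; the supplier names and scores below are hypothetical:

```python
def satisfice(alternatives, evaluate, aspiration):
    """Return the first alternative whose score meets the aspiration level,
    instead of examining all alternatives to find the best one."""
    for alt in alternatives:
        if evaluate(alt) >= aspiration:
            return alt
    return None  # no alternative reached the aspiration level

# Hypothetical scores (higher is better) for five candidate suppliers.
scores = {"S1": 62, "S2": 71, "S3": 90, "S4": 85, "S5": 95}
choice = satisfice(["S1", "S2", "S3", "S4", "S5"], scores.get, aspiration=80)
print(choice)  # S3: good enough, even though S5 would be the optimum
```

The search examines only three of the five alternatives, which is exactly the saving in time and effort that motivates satisficing.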
Related to satisficing is Simon's idea of bounded rationality. Humans have a limited capacity for rational thinking; they generally construct and analyze a simplified model of a real situation by considering fewer alternatives, criteria, and/or constraints than actually exist. Their behavior with respect to the simplified model may be rational. However, the rational solution for the simplified model may not be rational for the real-world problem. Rationality is bounded not only by limitations on human processing capacities, but also by individual differences, such as age, education, knowledge, and attitudes. Bounded rationality is also why many models are descriptive rather than normative. This may also explain why so many good managers rely on intuition, an important aspect of good decision making (see Stewart, 2002; and Pauly, 2004).
Because rationality and the use of normative models lead to good decisions, it is natural to ask why so many bad decisions are made in practice. Intuition is a critical factor that decision makers use in solving unstructured and semistructured problems. The best decision makers recognize the trade-off between the marginal cost of obtaining further information and analysis versus the benefit of making a better decision. But sometimes decisions must be made quickly, and, ideally, the intuition of a seasoned, excellent decision maker is called for. When adequate planning, funding, or information is not available, or when a decision maker is inexperienced or ill trained, disaster can strike.
Developing (Generating) Alternatives
A significant part of the model-building process is generating alternatives. In optimization models (such as linear programming), the alternatives may be generated automatically by the model. In most decision situations, however, it is necessary to generate alternatives manually. This can be a lengthy process that involves searching and creativity, perhaps utilizing electronic brainstorming in a GSS. It takes time and costs money. Issues such as when to stop generating alternatives can be very important. Too many alternatives can be detrimental to the process of decision making. A decision maker may suffer from information overload.
Generating alternatives is heavily dependent on the availability and cost of information and requires expertise in the problem area. This is the least formal aspect of problem solving. Alternatives can be generated and evaluated using heuristics. The generation of alternatives from either individuals or groups can be supported by electronic brainstorming software in a Web-based GSS.
Note that the search for alternatives usually occurs after the criteria for evaluating the alternatives are determined. This sequence can ease the search for alternatives and reduce the effort involved in evaluating them, but identifying potential alternatives can sometimes aid in identifying criteria.

The outcome of every proposed alternative must be established. Depending on whether the decision-making problem is classified as one of certainty, risk, or uncertainty, different modeling approaches may be used (see Drummond, 2001; and Koller, 2000). These are discussed in Chapter 9.
Measuring Outcomes
The value of an alternative is evaluated in terms of goal attainment. Sometimes an outcome is expressed directly in terms of a goal. For example, profit is an outcome, profit maximization is a goal, and both are expressed in dollar terms. An outcome such as customer satisfaction may be measured by the number of complaints, by the level of loyalty to a product, or by ratings found through surveys. Ideally, a decision maker would want to deal with a single goal, but in practice, it is not unusual to have multiple goals (see Barba-Romero, 2001; and Koksalan and Zionts, 2001). When groups make decisions, each group participant may have a different agenda. For example, executives might want to maximize profit, marketing might want to maximize market penetration, operations might want to minimize costs, and stockholders might want to maximize the bottom line. Typically, these goals conflict, so special multiple-criteria methodologies have been developed to handle this. One such method is the AHP. We will study AHP in Chapter 9.
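As a simpler illustration of combining such conflicting goals, a weighted additive score can be computed over the alternatives; the weights and ratings below are invented, and AHP itself (which also derives the weights systematically) is more elaborate:

```python
# Hypothetical ratings (0-10) of three alternatives on three conflicting goals.
ratings = {
    "Alt1": {"profit": 8, "market_share": 5, "cost_control": 6},
    "Alt2": {"profit": 6, "market_share": 9, "cost_control": 4},
    "Alt3": {"profit": 7, "market_share": 6, "cost_control": 8},
}
# Weights express how the group balances the goals; they sum to 1.
weights = {"profit": 0.5, "market_share": 0.3, "cost_control": 0.2}

def score(alt):
    """Weighted additive score: sum of weight * rating across the goals."""
    return sum(weights[g] * ratings[alt][g] for g in weights)

best = max(ratings, key=score)
print(best, round(score(best), 2))  # Alt3 6.9
```

Changing the weights (e.g., letting marketing's view dominate) can change the recommended alternative, which is why agreeing on the weights is often the hardest part of a group decision.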
Risk
All decisions are made in an inherently unstable environment. This is due to the many unpredictable events in both the economic and physical environments. Some risk (measured as probability) may be due to internal organizational events, such as a valued employee quitting or becoming ill, whereas others may be due to natural disasters, such as a hurricane. Aside from the human toll, one economic aspect of Hurricane Katrina was that the price of a gallon of gasoline doubled overnight due to uncertainty in the port capabilities, refining, and pipelines of the southern United States. What can a decision maker do in the face of such instability?
In general, people have a tendency to measure uncertainty and risk badly. Purdy (2005) said that people tend to be overconfident and have an illusion of control in decision making. The results of experiments by Adam Goodie at the University of Georgia indicate that most people are overconfident most of the time (Goodie, 2004). This may explain why people often feel that one more pull of a slot machine will definitely pay off.
However, methodologies for handling extreme uncertainty do exist. For example, Yakov (2001) described a way to make good decisions based on very little information, using an information gap theory and methodology approach. Aside from estimating the potential utility or value of a particular decision's outcome, the best decision makers are capable of accurately estimating the risk associated with the outcomes that result from making each decision. Thus, one important task of a decision maker is to attribute a level of risk to the outcome associated with each potential alternative being considered. Some decisions may lead to unacceptable risks in terms of success and can therefore be discarded or discounted immediately.
In some cases, some decisions are assumed to be made under conditions of certainty simply because the environment is assumed to be stable. Other decisions are made under conditions of uncertainty, where risk is unknown. Still, a good decision maker can make working estimates of risk. Also, the process of developing BI/DSS involves learning more about the situation, which leads to a more accurate assessment of the risks.
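Attributing a level of risk to each alternative can be sketched by computing the expected payoff and a simple dispersion measure over its possible outcomes; the probabilities and payoffs below are hypothetical:

```python
import math

def expected_value(outcomes):
    """Expected payoff of an alternative, given (probability, payoff) pairs."""
    return sum(p * v for p, v in outcomes)

def risk(outcomes):
    """Standard deviation of payoff, used here as a rough risk measure."""
    ev = expected_value(outcomes)
    return math.sqrt(sum(p * (v - ev) ** 2 for p, v in outcomes))

# Hypothetical: a safe alternative vs. a risky one with the same expected payoff.
safe = [(1.0, 100.0)]
risky = [(0.5, 300.0), (0.5, -100.0)]
print(expected_value(safe), risk(safe))    # 100.0 0.0
print(expected_value(risky), risk(risky))  # 100.0 200.0
```

Both alternatives have the same expected value, so the choice between them turns entirely on the decision maker's attitude toward risk, which is one way the principle of choice (risk-seeking vs. risk-averse) enters the model.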

Scenarios
A scenario is a statement of assumptions about the operating environment of a particular system at a given time; that is, it is a narrative description of the decision-situation setting. A scenario describes the decision and uncontrollable variables and parameters for a specific modeling situation. It may also provide the procedures and constraints for the modeling.
Scenarios originated in the theater, and the term was borrowed for war gaming and large-scale simulations. Scenario planning and analysis is a DSS tool that can capture a whole range of possibilities. A manager can construct a series of scenarios (i.e., what-if cases), perform computerized analyses, and learn more about the system and decision-making problem while analyzing it. Ideally, the manager can identify an excellent, possibly optimal, solution to the model of the problem.
Scenarios are especially helpful in simulations and what-if analyses. In both cases, we change scenarios and examine the results. For example, we can change the anticipated demand for hospitalization (an input variable for planning), thus creating a new scenario. Then we can measure the anticipated cash flow of the hospital for each scenario.
Scenarios play an important role in decision making because they:
• Help identify opportunities and problem areas
• Provide flexibility in planning
• Identify the leading edges of changes that management should monitor
• Help validate major modeling assumptions
• Allow the decision maker to explore the behavior of a system through a model
• Help to check the sensitivity of proposed solutions to changes in the environment, as described by the scenario
Possible Scenarios
There may be thousands of possible scenarios for every decision situation. However, the following are especially useful in practice:
• The worst possible scenario
• The best possible scenario
• The most likely scenario
• The average scenario
The scenario determines the context of the analysis to be performed.
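The four scenarios above can be run through a simple what-if loop, using the hospital example from earlier; the demand figures and the linear cash-flow model are invented for illustration:

```python
def cash_flow(admissions):
    """Hypothetical linear model: revenue per admission minus fixed annual costs."""
    revenue_per_admission = 4_000
    fixed_costs = 10_000_000
    return admissions * revenue_per_admission - fixed_costs

# Scenarios for anticipated annual admissions (the input variable for planning).
scenarios = {"worst": 2_000, "most_likely": 3_000, "best": 4_500}
scenarios["average"] = sum(scenarios.values()) // len(scenarios)

# Measure the anticipated cash flow of the hospital for each scenario.
for name, demand in scenarios.items():
    print(f"{name}: cash flow = {cash_flow(demand):,}")
```

Even this toy model surfaces the kind of insight scenario analysis is meant to provide: under the worst scenario the hospital runs a deficit, so management knows which leading-edge changes in demand to monitor.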
Errors in Decision Making
The model is a critical component in the decision-making process, but a decision maker may make a number of errors in its development and use. Validating the model before it is used is critical. Gathering the right amount of information, with the right level of precision and accuracy, to incorporate into the decision-making process is also critical. Sawyer (1999) described “the seven deadly sins of decision making,” most of which are behavior or information related.
SECTION 2.5 REVIEW QUESTIONS
1. Define optimization and contrast it with suboptimization.
2. Compare the normative and descriptive approaches to decision making.
3. Define rational decision making. What does it really mean to be a rational decision maker?
4. Why do people exhibit bounded rationality when solving problems?

5. Define scenario. How is a scenario used in decision making?
6. Some “errors” in decision making can be attributed to the notion of decision making from the gut. Explain what is meant by this and how such errors can happen.
2.6 DECISION MAKING: THE CHOICE PHASE
Choice is the critical act of decision making. The choice phase is the one in which the actual decision and the commitment to follow a certain course of action are made. The boundary between the design and choice phases is often unclear because certain activities can be performed during both of them and because the decision maker can return frequently from choice activities to design activities (e.g., generate new alternatives while performing an evaluation of existing ones). The choice phase includes the search for, evaluation of, and recommendation of an appropriate solution to a model. A solution to a model is a specific set of values for the decision variables in a selected alternative. Choices can be evaluated as to their viability and profitability.
Note that solving a model is not the same as solving the problem the model represents. The solution to the model yields a recommended solution to the problem. The problem is considered solved only if the recommended solution is successfully implemented.
Solving a decision-making model involves searching for an appropriate course of action. Search approaches include analytical techniques (i.e., solving a formula), algorithms (i.e., step-by-step procedures), heuristics (i.e., rules of thumb), and blind searches (i.e., shooting in the dark, ideally in a logical way). These approaches are examined in Chapter 9.
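The contrast between a blind (exhaustive) search and a heuristic can be seen in a toy project-selection problem; the costs, values, and budget below are invented, and the greedy rule of thumb is fast but, as shown, may miss the optimum:

```python
from itertools import combinations

# Hypothetical projects as (cost, value) pairs; the budget limits the selection.
projects = [(4, 40), (3, 25), (2, 24), (5, 35)]
budget = 7

# Blind/exhaustive search: try every subset (guaranteed optimal, but expensive).
best_value = 0
for r in range(len(projects) + 1):
    for combo in combinations(projects, r):
        if sum(c for c, _ in combo) <= budget:
            best_value = max(best_value, sum(v for _, v in combo))

# Heuristic (rule of thumb): greedily take the best value-per-cost project first.
remaining = budget
greedy_value = 0
for cost, value in sorted(projects, key=lambda p: p[1] / p[0], reverse=True):
    if cost <= remaining:
        remaining -= cost
        greedy_value += value

print(best_value, greedy_value)  # 65 64: the heuristic falls just short
```

The exhaustive search scales exponentially with the number of projects, which is why heuristics are used even though they trade away the guarantee of optimality.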
Each alternative must be evaluated. If an alternative has multiple goals, they must all be examined and balanced against each other. Sensitivity analysis is used to determine the robustness of any given alternative; slight changes in the parameters should ideally lead to slight or no changes in the alternative chosen. What-if analysis is used to explore major changes in the parameters. Goal seeking helps a manager determine values of the decision variables to meet a specific objective. All this is discussed in Chapter 9.
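Sensitivity analysis can be sketched by perturbing a parameter slightly and checking whether the recommended alternative changes; the alternatives and the linear profit model below are hypothetical:

```python
def choose(price_per_unit):
    """Pick the alternative with the higher profit under a given price assumption."""
    profit = {
        "in_house": 1000 * price_per_unit - 6_000,   # higher volume, higher fixed cost
        "outsource": 800 * price_per_unit - 2_000,   # lower volume, lower fixed cost
    }
    return max(profit, key=profit.get)

base_price = 25.0
base_choice = choose(base_price)

# Perturb the price assumption by +/-5% and see whether the recommendation holds.
robust = all(choose(base_price * f) == base_choice for f in (0.95, 1.05))
print(base_choice, robust)  # in_house True
```

If small perturbations flipped the recommendation, the alternative would not be robust, and a decision maker would want better estimates of the sensitive parameter before committing.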
SECTION 2.6 REVIEW QUESTIONS
1. Explain the difference between a principle of choice and the actual choice phase of decision making.
2. Why do some people claim that the choice phase is the point in time when a decision is really made?
3. How can sensitivity analysis help in the choice phase?
2.7 DECISION MAKING: THE IMPLEMENTATION PHASE
In The Prince, Machiavelli astutely noted some 500 years ago that there was “nothing more difficult to carry out, nor more doubtful of success, nor more dangerous to handle, than to initiate a new order of things.” The implementation of a proposed solution to a problem is, in effect, the initiation of a new order of things or the introduction of change. And change must be managed. User expectations must be managed as part of change management.
The definition of implementation is somewhat complicated because implementation is a long, involved process with vague boundaries. Simplistically, the implementation phase involves putting a recommended solution to work, not necessarily implementing a computer system. Many generic implementation issues, such as resistance to change, degree of support of top management, and user training, are important in dealing with information system-supported decision making. Indeed, many previous technology-related waves (e.g., business process reengineering (BPR), knowledge management, etc.) have faced mixed results mainly because of change management challenges and issues. Management of change is almost an entire discipline in itself, so we recognize its importance and encourage readers to focus on it independently. Implementation also includes a thorough understanding of project management. The importance of project management goes far beyond analytics, so the last few years have witnessed major growth in certification programs for project managers. A very popular certification now is Project Management Professional (PMP). See pmi.org for more details.
Implementation must also involve collecting and analyzing data to learn from previous decisions and improve the next decision. Although analysis of data is usually conducted to identify the problem and/or the solution, analytics should also be employed in the feedback process. This is especially true for any public policy decisions. We need to be sure that the data being used for problem identification is valid. Sometimes people find this out only after the implementation phase.
The decision-making process, though conducted by people, can be improved with computer support, which is the subject of the next section.
SECTION 2.7 REVIEW QUESTIONS
1. Define implementation.
2. How can DSS support the implementation of a decision?
2.8 HOW DECISIONS ARE SUPPORTED
In Chapter 1, we discussed the need for computerized decision support and briefly described some decision aids. Here we relate specific technologies to the decision-making process (see Figure 2.2). Databases, data marts, and especially data warehouses are important technologies in supporting all phases of decision making. They provide the data that drive decision making.
Support for the Intelligence Phase
The primary requirement of decision support for the intelligence phase is the ability to scan external and internal information sources for opportunities and problems and to interpret what the scanning discovers. Web tools and sources are extremely useful for environmental scanning. Web browsers provide useful front ends for a variety of tools, from OLAP to data mining and data warehouses. Data sources can be internal or external. Internal sources may be accessible via a corporate intranet. External sources are many and varied.

[Figure 2.2 maps each phase of decision making (Intelligence, Design, Choice, Implementation) to supporting technologies, including ANN, MIS, data mining, OLAP, ES, ERP, ESS, SCM, CRM, KMS, management science, and DSS.]
FIGURE 2.2 DSS Support.
Decision support/BI technologies can be very helpful. For example, a data warehouse can support the intelligence phase by continuously monitoring both internal and external information, looking for early signs of problems and opportunities through a Web-based enterprise information portal (also called a dashboard). Similarly, (automatic) data (and Web) mining (which may include expert systems [ES], CRM, genetic algorithms, neural networks, and other analytics systems) and (manual) OLAP also support the intelligence phase by identifying relationships among activities and other factors. Geographic information systems (GIS) can be utilized either as stand-alone systems or integrated with these systems so that a decision maker can determine opportunities and problems in a spatial sense. These relationships can be exploited for competitive advantage (e.g., CRM identifies classes of customers to approach with specific products and services). A KMS can be used to identify similar past situations and how they were handled. GSS can be used to share information and for brainstorming. As seen in Chapter 14, even cell phone and GPS data can be captured to create a micro-view of customers and their habits.
Another aspect of identifying internal problems and capabilities involves monitoring
the current status of operations. When something goes wrong, it can be identified quickly
and the problem can be solved. Tools such as business activity monitoring (BAM), busi-
ness process management (BPM), and product life-cycle management (PLM) provide such
capability to decision makers. Both routine and ad hoc reports can aid in the intelligence
phase. For example, regular reports can be designed to assist in the problem-finding
activity by comparing expectations with current and projected performance. Web-based
OLAP tools are excellent at this task. So are visualization tools and electronic document
management systems.
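As a sketch of such a problem-finding report (the metric names, targets, and tolerance below are invented for illustration), a comparison of expectations with current performance can be expressed in a few lines of Python:

```python
# Sketch of a problem-finding report: flag metrics whose actual
# performance deviates from expectations by more than a tolerance.
# All metric names and numbers are invented for illustration.

def variance_report(expected, actual, tolerance=0.10):
    """Return metrics whose relative deviation exceeds the tolerance."""
    flagged = {}
    for metric, target in expected.items():
        deviation = (actual[metric] - target) / target
        if abs(deviation) > tolerance:
            flagged[metric] = round(deviation, 3)
    return flagged

expected = {"monthly_sales": 100_000, "returns": 2_000, "new_customers": 500}
actual   = {"monthly_sales":  84_000, "returns": 2_100, "new_customers": 510}

print(variance_report(expected, actual))
# monthly_sales is 16% below target, so it alone is flagged for attention
```

A dashboard or OLAP tool performs the same comparison at scale, typically against a data warehouse rather than hard-coded figures.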
Expert systems (ES), in contrast, can render advice regarding the nature of a prob-
lem, its classification, its seriousness, and the like. ES can advise on the suitability of a
solution approach and the likelihood of successfully solving the problem. One of the
primary areas of ES success is interpreting information and diagnosing problems. This
capability can be exploited in the intelligence phase. Even intelligent agents can be used
to identify opportunities.
Much of the information used in seeking new opportunities is qualitative, or soft.
This indicates a high level of unstructuredness in the problems, thus making DSS quite
useful in the intelligence phase.
The Internet and advanced database technologies have created a glut of data and
information available to decision makers, so much that it can detract from the quality
and speed of decision making. It is important to recognize some issues in using data and
analytics tools for decision making. First, to paraphrase baseball great Vin Scully, “data
should be used the way a drunk uses a lamppost. For support, not for illumination.” This
is especially true when the focus is on understanding the problem. We should recognize
that not all the data that may help understand the problem is available. To quote Einstein,
“Not everything that counts can be counted, and not everything that can be counted
counts.” There might be other issues that have to be recognized as well.
Support for the Design Phase
The design phase involves generating alternative courses of action, discussing the criteria
for choices and their relative importance, and forecasting the future consequences of
using various alternatives. Several of these activities can use standard models provided by
a DSS (e.g., financial and forecasting models, available as applets). Alternatives for struc-
tured problems can be generated through the use of either standard or special models.

58 Part I • Decision Making and Analytics: An Overview
However, the generation of alternatives for complex problems requires expertise that can
be provided only by a human, brainstorming software, or an ES. OLAP and data mining
software are quite useful in identifying relationships that can be used in models. Most DSS
have quantitative analysis capabilities, and an internal ES can assist with qualitative meth-
ods as well as with the expertise required in selecting quantitative analysis and forecasting
models. A KMS should certainly be consulted to determine whether such a problem has
been encountered before or whether there are experts on hand who can provide quick
understanding and answers. CRM systems, revenue management systems, ERP, and SCM
systems software are useful in that they provide models of business processes that can test
assumptions and scenarios. If a problem requires brainstorming to help identify important
issues and options, a GSS may prove helpful. Tools that provide cognitive mapping can
also help. Cohen et al. (2001) described several Web-based tools that provide decision
support, mainly in the design phase, by providing models and reporting of alternative
results. Each of their cases has saved millions of dollars annually by utilizing these tools.
Such DSS are helping engineers in product design as well as decision makers solving
business problems.
Support for the Choice Phase
In addition to providing models that rapidly identify a best or good-enough alternative,
a DSS can support the choice phase through what-if and goal-seeking analyses. Different
scenarios can be tested for the selected option to reinforce the final decision. Again, a KMS
helps identify similar past experiences; CRM, ERP, and SCM systems are used to test the
impacts of decisions in establishing their value, leading to an intelligent choice. An ES can
be used to assess the desirability of certain solutions as well as to recommend an appropri-
ate solution. If a group makes a decision, a GSS can provide support to lead to consensus.
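The two analyses can be illustrated with a toy profit model (all parameters are invented; the goal seek is solved in closed form here, whereas DSS tools typically search for the answer iteratively):

```python
# Minimal sketch of what-if and goal-seeking analysis on a toy
# profit model. All parameters are invented for illustration.

def profit(units, price, unit_cost, fixed_cost):
    return units * (price - unit_cost) - fixed_cost

# What-if: test alternative scenarios for the selected option
for price in (9.0, 10.0, 11.0):
    print(price, profit(1000, price, 6.0, 2500))  # profit under each price

# Goal seek: find the sales volume needed to hit a target profit
def goal_seek_units(target, price, unit_cost, fixed_cost):
    return (target + fixed_cost) / (price - unit_cost)

print(goal_seek_units(5000, 10.0, 6.0, 2500))  # units required for $5,000 profit
```

What-if varies an input and observes the output; goal seeking inverts the model, fixing the output and solving for an input.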
Support for the Implementation Phase
This is where “making the decision happen” occurs. The DSS benefits provided during
implementation may be as important as or even more important than those in the earlier
phases. DSS can be used in implementation activities such as decision communication,
explanation, and justification.
Implementation-phase DSS benefits are partly due to the vividness and detail of
analyses and reports. For example, one chief executive officer (CEO) gives employees
and external parties not only the aggregate financial goals and cash needs for the near
term, but also the calculations, intermediate results, and statistics used in determining
the aggregate figures. In addition to communicating the financial goals unambiguously,
the CEO signals other messages. Employees know that the CEO has thought through the
assumptions behind the financial goals and is serious about their importance and attain-
ability. Bankers and directors are shown that the CEO was personally involved in ana-
lyzing cash needs and is aware of and responsible for the implications of the financing
requests prepared by the finance department. Each of these messages improves decision
implementation in some way.
As mentioned earlier, reporting systems and other tools variously labeled as BAM,
BPM, KMS, EIS, ERP, CRM, and SCM are all useful in tracking how well an implementation
is working. GSS is useful for a team to collaborate in establishing implementation effec-
tiveness. For example, a decision might be made to get rid of unprofitable customers. An
effective CRM can identify classes of customers to get rid of, identify the impact of doing
so, and then verify that it really worked that way.
All phases of the decision-making process can be supported by improved communica-
tion through collaborative computing via GSS and KMS. Computerized systems can facilitate
communication by helping people explain and justify their suggestions and opinions.

Decision implementation can also be supported by ES. An ES can be used as an advi-
sory system regarding implementation problems (such as handling resistance to change).
Finally, an ES can provide training that may smooth the course of implementation.
Impacts along the value chain, though reported by an EIS through a Web-based
enterprise information portal, are typically identified by BAM, BPM, SCM, and ERP systems.
CRM systems report and update internal records, based on the impacts of the implementa-
tion. These inputs are then used to identify new problems and opportunities, a return to
the intelligence phase.
SECTION 2.8 REVIEW QUESTIONS
1. Describe how DSS/BI technologies and tools can aid in each phase of decision making.
2. Describe how new technologies can provide decision-making support.
Now that we have studied how technology can assist in decision making, we study some
details of decision support systems (DSS) in the next two sections.
2.9 DECISION SUPPORT SYSTEMS: CAPABILITIES
The early definitions of a DSS identified it as a system intended to support managerial
decision makers in semistructured and unstructured decision situations. DSS were meant
to be adjuncts to decision makers, extending their capabilities but not replacing their judg-
ment. They were aimed at decisions that required judgment or at decisions that could not
be completely supported by algorithms. Not specifically stated but implied in the early
definitions was the notion that the system would be computer based, would operate inter-
actively online, and preferably would have graphical output capabilities, now simplified
via browsers and mobile devices.
A DSS Application
A DSS is typically built to support the solution of a certain problem or to evaluate an
opportunity. This is a key difference between DSS and BI applications. In a very strict
sense, business intelligence (BI) systems monitor situations and identify problems and/
or opportunities, using analytic methods. Reporting plays a major role in BI; the user
generally must identify whether a particular situation warrants attention, and then analyti-
cal methods can be applied. Again, although models and data access (generally through
a data warehouse) are included in BI, DSS typically have their own databases and are
developed to solve a specific problem or set of problems. They are therefore called
DSS applications.
Formally, a DSS is an approach (or methodology) for supporting decision making.
It uses an interactive, flexible, adaptable computer-based information system (CBIS)
especially developed for supporting the solution to a specific unstructured manage-
ment problem. It uses data, provides an easy user interface, and can incorporate the
decision maker’s own insights. In addition, a DSS includes models and is developed
(possibly by end users) through an interactive and iterative process. It can support all
phases of decision making and may include a knowledge component. Finally, a DSS
can be used by a single user or can be Web based for use by many people at several
locations.
Because there is no consensus on exactly what a DSS is, there is obviously no agree-
ment on the standard characteristics and capabilities of DSS. The capabilities in Figure 2.3
constitute an ideal set, some members of which are described in the definitions of DSS
and illustrated in the application cases.
The key characteristics and capabilities of DSS (as shown in Figure 2.3) are:

FIGURE 2.3 Key Characteristics and Capabilities of DSS. [Diagram: fourteen numbered capabilities arranged around a DSS hub; the full list follows in the text.]
1. Support for decision makers, mainly in semistructured and unstructured situations,
by bringing together human judgment and computerized information. Such prob-
lems cannot be solved (or cannot be solved conveniently) by other computerized
systems or through use of standard quantitative methods or tools. Generally, these
problems gain structure as the DSS is developed. Even some structured problems
have been solved by DSS.
2. Support for all managerial levels, ranging from top executives to line managers.
3. Support for individuals as well as groups. Less-structured problems often require the
involvement of individuals from different departments and organizational levels or
even from different organizations. DSS support virtual teams through collaborative
Web tools. DSS have been developed to support individual and group work, as well
as to support individual decision making and groups of decision makers working
somewhat independently.
4. Support for interdependent and/or sequential decisions. The decisions may be made
once, several times, or repeatedly.
5. Support in all phases of the decision-making process: intelligence, design, choice,
and implementation.
6. Support for a variety of decision-making processes and styles.
7. The decision maker should be reactive, able to confront changing conditions quickly,
and able to adapt the DSS to meet these changes. DSS are flexible, so users can add,
delete, combine, change, or rearrange basic elements. They are also flexible in that
they can be readily modified to solve other, similar problems.

8. User-friendliness, strong graphical capabilities, and a natural language interactive
human-machine interface can greatly increase the effectiveness of DSS. Most new
DSS applications use Web-based interfaces or mobile platform interfaces.
9. Improvement of the effectiveness of decision making (e.g., accuracy, timeliness,
quality) rather than its efficiency (e.g., the cost of making decisions). When DSS are
deployed, decision making often takes longer, but the decisions are better.
10. The decision maker has complete control over all steps of the decision-making
process in solving a problem. A DSS specifically aims to support, not to replace, the
decision maker.
11. End users are able to develop and modify simple systems by themselves. Larger
systems can be built with assistance from information system (IS) specialists.
Spreadsheet packages have been utilized in developing simpler systems. OLAP and
data mining software, in conjunction with data warehouses, enable users to build
fairly large, complex DSS.
12. Models are generally utilized to analyze decision-making situations. The mod-
eling capability enables experimentation with different strategies under different
configurations.
13. Access is provided to a variety of data sources, formats, and types, including GIS,
multimedia, and object-oriented data.
14. The DSS can be employed as a stand-alone tool used by an individual decision maker
in one location or distributed throughout an organization and in several organizations
along the supply chain. It can be integrated with other DSS and/or applications, and it
can be distributed internally and externally, using networking and Web technologies.
These key DSS characteristics and capabilities allow decision makers to make
better, more consistent decisions in a timely manner, and they are provided by the major
DSS components, which we will describe after discussing various ways of classifying
DSS (next).
SECTION 2.9 REVIEW QUESTIONS
1. List the key characteristics and capabilities of DSS.
2. Describe how providing support to a workgroup is different from providing support
to group work. Explain why it is important to differentiate these concepts.
3. What kinds of DSS can end users develop in spreadsheets?
4. Why is it so important to include a model in a DSS?
2.10 DSS CLASSIFICATIONS
DSS applications have been classified in several different ways (see Power, 2002; Power
and Sharda, 2009). The design process, as well as the operation and implementation of DSS,
depends in many cases on the type of DSS involved. However, remember that not every
DSS fits neatly into one category. Most fit into the classification provided by the Association
for Information Systems Special Interest Group on Decision Support Systems (AIS SIGDSS).
We discuss this classification but also point out a few other attempts at classifying DSS.
The AIS SIGDSS Classification for DSS
The AIS SIGDSS (ais.site-ym.com/group/SIGDSS) has adopted a concise classification
scheme for DSS that was proposed by Power (2002). It includes the following categories:
• Communications-driven and group DSS (GSS)
• Data-driven DSS

• Document-driven DSS
• Knowledge-driven DSS, data mining, and management ES applications
• Model-driven DSS
There may also be hybrids that combine two or more categories. These are called
compound DSS. We discuss the major categories next.
COMMUNICATIONS-DRIVEN AND GROUP DSS Communications-driven and group DSS
(GSS) include DSS that use computer, collaboration, and communication technologies
to support groups in tasks that may or may not include decision making. Essentially,
all DSS that support any kind of group work fall into this category. They include
those that support meetings, design collaboration, and even supply chain management.
Knowledge management systems (KMS) that are developed around communities that
practice collaborative work also fall into this category. We discuss these in more detail
in later chapters.
DATA-DRIVEN DSS Data-driven DSS are primarily involved with data and processing
them into information and presenting the information to a decision maker. Many DSS
developed in OLAP and reporting analytics software systems fall into this category. There
is minimal emphasis on the use of mathematical models.
In this type of DSS, the database organization, often in a data warehouse, plays
a major role in the DSS structure. Early generations of database-oriented DSS mainly
used the relational database configuration. The information handled by relational
databases tends to be voluminous, descriptive, and rigidly structured. A database-
oriented DSS features strong report generation and query capabilities. Indeed, this
is primarily the current application of the tools marketed under the BI umbrella or
under the label of reporting/business analytics. The chapters on data warehousing and
business performance management (BPM) describe several examples of this category
of DSS.
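A minimal sketch of the data-driven style (the table, columns, and figures are invented; a real system would sit on a data warehouse rather than an in-memory database) shows report generation and querying with no mathematical model:

```python
# Sketch of a data-driven DSS interaction: load records into a
# relational store and answer a report-style aggregation query.
# Table name, column names, and amounts are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("East", 1200.0), ("West", 800.0), ("East", 300.0)])

# Report generation: descriptive aggregation, the heart of this DSS type
for region, total in conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"):
    print(region, total)  # East 1500.0, then West 800.0
```

OLAP front ends generate queries of exactly this kind, then slice and pivot the results for the decision maker.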
DOCUMENT-DRIVEN DSS Document-driven DSS rely on knowledge coding, analysis,
search, and retrieval for decision support. They essentially include all DSS that are text
based. Most KMS fall into this category. These DSS also have minimal emphasis on utiliz-
ing mathematical models. For example, a system that we built for the U.S. Army’s Defense
Ammunitions Center falls in this category. The main objective of document-driven DSS is
to provide support for decision making using documents in various forms: oral, written,
and multimedia.
KNOWLEDGE-DRIVEN DSS, DATA MINING, AND MANAGEMENT EXPERT SYSTEMS
APPLICATIONS These DSS involve the application of knowledge technologies to address
specific decision support needs. Essentially, all artificial intelligence-based DSS fall into
this category. When symbolic storage is utilized in a DSS, it is generally in this category.
ANN and ES are included here. Because the benefits of these intelligent DSS or knowledge-
based DSS can be large, organizations have invested in them. These DSS are utilized in the
creation of automated decision-making systems, as described in Chapter 12. The basic idea
is that rules are used to automate the decision-making process. These rules are basically
either an ES or structured like one. This is important when decisions must be made quickly,
as in many e-commerce situations.
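A hedged sketch of that rule idea follows (the order fields, thresholds, and actions are all invented; a production ES would use an inference engine rather than a hard-coded list):

```python
# Sketch of rule-based automated decision making, as in a simple
# management ES for order screening. Rules and thresholds are invented.

RULES = [
    (lambda o: o["amount"] > 10_000 and o["credit"] == "poor", "reject"),
    (lambda o: o["amount"] > 10_000,                           "manual review"),
    (lambda o: True,                                           "approve"),
]

def decide(order):
    """Fire the first rule whose condition matches the order."""
    for condition, action in RULES:
        if condition(order):
            return action

print(decide({"amount": 500,    "credit": "good"}))   # approve
print(decide({"amount": 20_000, "credit": "poor"}))   # reject
```

Because the rules run without human intervention, decisions return in milliseconds, which is what e-commerce settings demand.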
MODEL-DRIVEN DSS The major emphases of DSS that are primarily developed around
one or more (large-scale/complex) optimization or simulation models typically include
significant activities in model formulation, model maintenance, model management

in distributed computing environments, and what-if analyses. Many large-scale applica-
tions fall into this category. Notable examples include those used by Procter & Gamble
(Farasyn et al., 2008), HP (Olavson and Fry, 2008), and many others.
The focus of such systems is on using the model(s) to optimize one or more objec-
tives (e.g., profit). The most common end-user tool for DSS development is Microsoft
Excel. Excel includes dozens of statistical packages, a linear programming package
(Solver), and many financial and management science models. We will study these in
more detail in Chapter 9. These DSS typically can be grouped under the new label of
prescriptive analytics.
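As an illustration of the model-driven idea (all data invented; a real DSS would hand such a model to a linear programming solver like Excel's Solver, whereas this sketch simply enumerates the small integer space):

```python
# Toy product-mix model in the spirit of a model-driven DSS:
# maximize profit subject to a labor constraint. Data are invented,
# and brute-force enumeration stands in for a proper LP solver.

PROFIT = {"A": 40, "B": 30}          # profit per unit
LABOR  = {"A": 2,  "B": 1}           # labor hours per unit
LABOR_AVAILABLE = 100

best = max(
    ((a, b) for a in range(51) for b in range(101)
     if LABOR["A"] * a + LABOR["B"] * b <= LABOR_AVAILABLE),
    key=lambda mix: PROFIT["A"] * mix[0] + PROFIT["B"] * mix[1],
)
print(best, PROFIT["A"] * best[0] + PROFIT["B"] * best[1])
# Product B yields more profit per labor hour, so the plan is all B
```

The decision maker can then rerun the model under different assumptions (what-if), which is exactly the experimentation such systems are built for.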
COMPOUND DSS A compound, or hybrid, DSS includes two or more of the major cat-
egories described earlier. Often, an ES can benefit by utilizing some optimization, and
clearly a data-driven DSS can feed a large-scale optimization model. Sometimes docu-
ments are critical in understanding how to interpret the results of visualizing data from
a data-driven DSS.
An emerging example of a compound DSS is a product offered by WolframAlpha
(wolframalpha.com). It compiles knowledge from outside databases, models, algo-
rithms, documents, and so on to provide answers to specific questions. For example, it
can find and analyze current data for a stock and compare it with other stocks. It can
also tell you how many calories you will burn when performing a specific exercise or
the side effects of a particular medicine. Although it is in early stages as a collection of
knowledge components from many different areas, it is a good example of a compound
DSS in getting its knowledge from many diverse sources and attempting to synthesize it.
Other DSS Categories
Many other proposals have been made to classify DSS. Perhaps the first formal attempt
was by Alter (1980). Several other important categories of DSS include (1) institutional
and ad hoc DSS; (2) personal, group, and organizational support; (3) individual support
system versus GSS; and (4) custom-made systems versus ready-made systems. We discuss
some of these next.
INSTITUTIONAL AND AD HOC DSS Institutional DSS (see Donovan and Madnick, 1977)
deal with decisions of a recurring nature. A typical example is a portfolio management
system (PMS), which has been used by several large banks for supporting investment
decisions. An institutionalized DSS can be developed and refined as it evolves over a
number of years, because the DSS is used repeatedly to solve identical or similar prob-
lems. It is important to remember that an institutional DSS may not be used by everyone
in an organization; it is the recurring nature of the decision-making problem that deter-
mines whether a DSS is institutional versus ad hoc.
Ad hoc DSS deal with specific problems that are usually neither anticipated nor recur-
ring. Ad hoc decisions often involve strategic planning issues and sometimes management
control problems. Justifying a DSS that will be used only once or twice is a major issue
in DSS development. Countless ad hoc DSS applications have evolved into institutional
DSS. Either the problem recurs and the system is reused or others in the organization have
similar needs that can be handled by the formerly ad hoc DSS.
Custom-Made Systems Versus Ready-Made Systems
Many DSS are custom made for individual users and organizations. However, a com-
parable problem may exist in similar organizations. For example, hospitals, banks,
and universities share many similar problems. Similarly, certain nonroutine problems in
a functional area (e.g., finance, accounting) can repeat themselves in the same functional

area of different organizations. Therefore, it makes sense to build generic DSS
that can be used (sometimes with modifications) in several organizations. Such DSS
are called ready-made and are sold by various vendors (e.g., Cognos, MicroStrategy,
Teradata). Essentially, the database, models, interface, and other support features are
built in: Just add an organization’s data and logo. The major OLAP and analytics vendors
provide DSS templates for a variety of functional areas, including finance, real estate,
marketing, and accounting. The number of ready-made DSS continues to increase
because of their flexibility and low cost. They are typically developed using Internet
technologies for database access and communications, and Web browsers for interfaces.
They also readily incorporate OLAP and other easy-to-use DSS generators.
One complication in terminology results when an organization develops an
institutional system but, because of its structure, uses it in an ad hoc manner. An organi-
zation can build a large data warehouse but then use OLAP tools to query it and perform
ad hoc analysis to solve nonrecurring problems. The DSS exhibits the traits of ad hoc
and institutional systems and also of custom and ready-made systems. Several ERP, CRM,
knowledge management (KM), and SCM companies offer DSS applications online. These
kinds of systems can be viewed as ready-made, although typically they require modifica-
tions (sometimes major) before they can be used effectively.
SECTION 2.10 REVIEW QUESTIONS
1. List the DSS classifications of the AIS SIGDSS.
2. Define document-driven DSS.
3. List the capabilities of institutional DSS and ad hoc DSS.
4. Define the term ready-made DSS.
2.11 COMPONENTS OF DECISION SUPPORT SYSTEMS
A DSS application can be composed of a data management subsystem, a model man-
agement subsystem, a user interface subsystem, and a knowledge-based management
subsystem. We show these in Figure 2.4.
FIGURE 2.4 Schematic View of DSS. [Diagram: data (external and/or internal) and other computer-based systems feed the data management component; model management connects to external models; the knowledge-based subsystems, user interface, and organizational knowledge base are linked via the Internet, intranet, and extranet; the manager (user) interacts through the user interface.]

FIGURE 2.5 Structure of the Data Management Subsystem. [Diagram: internal data sources (finance, production), external sources, private/personal data, and the corporate data warehouse feed the decision support database; the database management system provides retrieval, inquiry, update, report generation, and delete functions, supported by a query facility and data directory, and connects to the organizational knowledge base and to the interface management, model management, and knowledge-based subsystems.]
The Data Management Subsystem
The data management subsystem includes a database that contains relevant data for
the situation and is managed by software called the database management system
(DBMS).² The data management subsystem can be interconnected with the corporate
data warehouse, a repository for corporate relevant decision-making data. Usually, the
data are stored or accessed via a database Web server. The data management subsystem
is composed of the following elements:
• DSS database
• Database management system
• Data directory
• Query facility
These elements are shown schematically in Figure 2.5 (in the shaded area). The figure
also shows the interaction of the data management subsystem with the other parts of the
DSS, as well as its interaction with several data sources. Many of the BI or descriptive
analytics applications derive their strength from the data management side of the subsys-
tems. Application Case 2.2 provides an example of a DSS that focuses on data.
The Model Management Subsystem
The model management subsystem is the component that includes financial, statistical,
management science, or other quantitative models that provide the system’s analytical
capabilities and appropriate software management. Modeling languages for building cus-
tom models are also included. This software is often called a model base management
²DBMS is used as both singular and plural (system and systems), as are many other acronyms in this text.

Application Case 2.2
Station Casinos Wins by Building Customer Relationships Using Its Data
Station Casinos is a major provider of gaming for
Las Vegas-area residents. It owns about 20 proper-
ties in Nevada and other states, employs over 12,000
people, and has revenue of over $1 billion.
Station Casinos wanted to develop an in-depth
view of each customer/guest who visited Station
Casinos properties. This would permit them to bet-
ter understand customer trends as well as enhance
their one-to-one marketing for each guest. The com-
pany employed the Teradata warehouse to develop
the “Total Guest Worth” solution. The project used
Aprimo Relationship Manager, Informatica, and
Cognos to capture, analyze, and segment customers.
Almost 500 different data sources were integrated to
develop the full view of a customer. As a result, the
company was able to realize the following benefits:
• Customer segments were expanded from 14
(originally) to 160 segments so as to be able to
target more specific promotions to each segment.
• A 4 percent to 6 percent increase in monthly
slot profit.
• Slot promotion costs were reduced by $1 million
(from $13 million per month) by better targeting
the customer segments.
• A 14 percent improvement in guest retention.
• Increased new-member acquisition by 160
percent.
• Reduction in data error rates from as high as
80 percent to less than 1 percent.
• Reduced the time to analyze a campaign’s effec-
tiveness from almost 2 weeks to just a few hours.
QUESTIONS FOR DISCUSSION
1. Why is this decision support system classified as
a data-focused DSS?
2. What were some of the benefits from implement-
ing this solution?
Source: Teradata.com, “No Limits: Station Casinos Breaks the
Mold on Customer Relationships,” teradata.com/case-studies/
Station-Casinos-No-Limits-Station-Casinos-Breaks-the-Mold-
on-Customer-Relationships-Executive-Summary-eb6410
(accessed February 2013).
system (MBMS). This component can be connected to corporate or external storage
of models. Model solution methods and management systems are implemented in Web
development systems (such as Java) to run on application servers. The model manage-
ment subsystem of a DSS is composed of the following elements:
• Model base
• MBMS
• Modeling language
• Model directory
• Model execution, integration, and command processor
These elements and their interfaces with other DSS components are shown in Figure 2.6.
At a higher level than building blocks, it is important to consider the different types of
models and solution methods needed in the DSS. Often at the start of development, there is
some sense of the model types to be incorporated, but this may change as more is learned
about the decision problem. Some DSS development systems include a wide variety of com-
ponents (e.g., Analytica from Lumina Decision Systems), whereas others have a single one
(e.g., Lindo). Often, the results of one type of model component (e.g., forecasting) are used
as input to another (e.g., production scheduling). In some cases, a modeling language is a
component that generates input to a solver, whereas in other cases, the two are combined.
Because DSS deal with semistructured or unstructured problems, it is often necessary
to customize models, using programming tools and languages. Some examples of these are
.NET Framework languages, C++, and Java. OLAP software may also be used to work with
models in data analysis. Even languages for simulation such as Arena and statistical pack-
ages such as those of SPSS offer modeling tools developed through the use of a proprietary

FIGURE 2.6 Structure of the Model Management Subsystem. [Diagram: the model base (strategic, tactical, and operational models; statistical, financial, marketing, management science, accounting, and engineering models; model building blocks) is managed by the model base management component (modeling commands for creation, maintenance and update, database interface, modeling language), supported by a model directory and a model execution, integration, and command processor, and connected to the data management, interface management, and knowledge-based subsystems.]
programming language. For small and medium-sized DSS or for less complex ones, a spreadsheet
(e.g., Excel) is usually used. We will use Excel for many key examples in this book.
Application Case 2.3 describes a spreadsheet-based DSS. However, using a spreadsheet
for modeling a problem of any significant size presents problems with documentation and
error diagnosis. It is very difficult to determine or understand nested, complex relationships
in spreadsheets created by someone else. This makes it difficult to modify a model built by
someone else. A related issue is the increased likelihood of errors creeping into the formulas.
With all the equations appearing in the form of cell references, it is challenging to figure
out where an error might be. These issues were addressed in an early generation of DSS
development software that was available on mainframe computers in the 1980s. One such
product was called Interactive Financial Planning System (IFPS). Its developer, Dr. Gerald
Wagner, then released a desktop software called Planners Lab. Planners Lab includes the
following components: (1) an easy-to-use algebraically oriented model-building language
and (2) an easy-to-use state-of-the-art option for visualizing model output, such as answers
to what-if and goal seek questions to analyze results of changes in assumptions. The combination
of these components enables business managers and analysts to build, review, and
challenge the assumptions that underlie decision-making scenarios.
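As a rough illustration of the kind of assumption-driven what-if and goal-seek analysis described above, the following sketch (not Planners Lab itself; all variable names and numbers are hypothetical) writes a model as readable equations rather than cell references:

```python
# Illustrative sketch: a tiny assumption-based planning model with
# what-if analysis and a goal-seek search. Hypothetical data throughout.

def profit(units_sold, price, unit_cost, fixed_cost):
    """Readable, equation-style model: one named assumption per argument."""
    return units_sold * (price - unit_cost) - fixed_cost

# Base-case assumptions
base = dict(units_sold=10_000, price=25.0, unit_cost=15.0, fixed_cost=60_000)
print("Base profit:", profit(**base))        # 40000.0

# What-if: change one assumption and observe the result
what_if = dict(base, price=23.0)
print("What-if profit:", profit(**what_if))  # 20000.0

def goal_seek(target, var, lo, hi, assumptions, tol=1e-6):
    """Bisection: find the value of one assumption that hits a target profit
    (assumes profit increases monotonically in that assumption)."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if profit(**dict(assumptions, **{var: mid})) < target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2

# Goal seek: what price yields a profit of 100,000?
p = goal_seek(100_000, "price", 0.0, 100.0, base)
print(round(p, 2))  # 31.0
```

The point of the named-argument style is exactly the auditability issue noted above: each assumption is visible and challengeable, unlike an equation buried in cell references.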
Planners Lab makes it possible for the decision makers to “play” with assumptions
to reflect alternative views of the future. Every Planners Lab model is an assemblage of
assumptions about the future. Assumptions may come from databases of historical performance,
market research, and the decision makers’ minds, to name a few sources. Most
assumptions about the future come from the decision makers’ accumulated experiences
in the form of opinions.
The resulting collection of equations is a Planners Lab model that tells a readable
story for a particular scenario. Planners Lab lets decision makers describe their plans
in their own words and with their own assumptions. The product’s raison d’être is that
a simulator should facilitate a conversation with the decision maker in the process of

68 Part I • Decision Making and Analytics: An Overview
Application Case 2.3
SNAP DSS Helps OneNet Make Telecommunications Rate Decisions
Telecommunications network services to educational
institutions and government entities are typically
provided by a mix of private and public organizations.
Many states in the United States have one or
more state agencies that are responsible for providing
network services to schools, colleges, and other state
agencies. One example of such an agency is OneNet
in Oklahoma. OneNet is a division of the Oklahoma
State Regents for Higher Education and is operated in
cooperation with the Office of State Finance.
Usually agencies such as OneNet operate as
an enterprise-type fund. They must recover their
costs through billing their clients and/or by justifying
appropriations directly from the state legislatures.
This cost recovery should occur through a pricing
mechanism that is efficient, simple to implement,
and equitable. This pricing model typically needs to
recognize many factors: convergence of voice, data,
and video traffic on the same infrastructure; diversity
of the user base in terms of educational institutions,
state agencies, and so on; diversity of applications
in use by state clients, from e-mail to videoconferences,
IP telephony, and distance learning; recovery
of current costs, as well as planning for upgrades
and future developments; and leverage of the shared
infrastructure to enable further economic development
and collaborative work across the state that
leads to innovative uses of OneNet.
These considerations led to the development of
a spreadsheet-based model. The system, SNAP-DSS,
or Service Network Application and Pricing (SNAP)-
based DSS, was developed in Microsoft Excel 2007
and used the VBA programming language.
The SNAP-DSS offers OneNet the ability to select
the rate card options that best fit the preferred pricing
strategies by providing a real-time, user-friendly,
graphical user interface (GUI). In addition, the SNAP-DSS
not only illustrates the influence of changes in
the pricing factors on each rate card option, but also
allows the user to analyze various rate card options
in different scenarios using different parameters. This
model has been used by OneNet financial planners to
gain insights into their customers and analyze many
what-if scenarios of different rate plan options.
Source: Based on J. Chongwatpol and R. Sharda, “SNAP: A DSS
to Analyze Network Service Pricing for State Networks,” Decision
Support Systems, Vol. 50, No. 1, December 2010, pp. 347–359.
describing business assumptions. All assumptions are described in English equations (or
the user’s native language).
The best way to learn how to use Planners Lab is to launch the software and follow
the tutorials. The software can be downloaded at plannerslab.com.
The User Interface Subsystem
The user communicates with and commands the DSS through the user interface subsystem.
The user is considered part of the system. Researchers assert that some of the
unique contributions of DSS are derived from the intensive interaction between the
computer and the decision maker. The Web browser provides a familiar, consistent
graphical user interface (GUI) structure for most DSS. For locally used DSS, a spreadsheet
also provides a familiar user interface. A difficult user interface is one of the
major reasons managers do not use computers and quantitative analyses as much as
they could, given the availability of these technologies. The Web browser has been
recognized as an effective DSS GUI because it is flexible, user friendly, and a gateway
to almost all sources of necessary information and data. Essentially, Web browsers have
led to the development of portals and dashboards, which front end many DSS.
Explosive growth in portable devices including smartphones and tablets has changed
the DSS user interfaces as well. These devices allow either handwritten input or typed input
from internal or external keyboards. Some DSS user interfaces utilize natural-language input

(i.e., text in a human language) so that the users can easily express themselves in a meaningful
way. Because of the fuzzy nature of human language, it is fairly difficult to develop
software to interpret it. However, these packages increase in accuracy every year, and they
will ultimately lead to accurate input, output, and language translators.
Cell phone inputs through SMS are becoming more common for at least some consumer
DSS-type applications. For example, one can send an SMS request for search on
any topic to GOOGL (46645). It is most useful in locating nearby businesses, addresses,
or phone numbers, but it can also be used for many other decision support tasks. For
example, users can find definitions of words by entering the word “define” followed by a
word, such as “define extenuate.” Some of the other capabilities include:
• Translations: “Translate thanks in Spanish.”
• Price lookups: “Price 32GB iPhone.”
• Calculator: Although you would probably just want to use your phone’s built-in
calculator function, you can send a math expression as an SMS for an answer.
• Currency conversions: “10 usd in euros.”
• Sports scores and game times: Just enter the name of a team (“NYC Giants”), and Google
SMS will send the most recent game’s score and the date and time of the next match.
This type of SMS-based search capability is also available for other search engines, including
Yahoo! and Microsoft’s newer search engine Bing.
With the emergence of smartphones such as Apple’s iPhone and Android smartphones
from many vendors, many companies are developing applications (commonly
called apps) to provide purchasing-decision support. For example, Amazon.com’s app
allows a user to take a picture of any item in a store (or wherever) and send it to Amazon.com.
Amazon.com’s graphics-understanding algorithm tries to match the image to a real
product in its databases and sends the user a page similar to Amazon.com’s product
info pages, allowing users to perform price comparisons in real time. Thousands of
other apps have been developed that provide consumers support for decision making
on finding and selecting stores/restaurants/service providers on the basis of location,
recommendations from others, and especially from their own social circles.
Voice input for these devices and PCs is common and fairly accurate (but not perfect).
When voice input with accompanying speech-recognition software (and readily
available text-to-speech software) is used, verbal instructions with accompanying actions
and outputs can be invoked. These are readily available for DSS and are incorporated into
the portable devices described earlier. Examples of voice input that can be used for
a general-purpose DSS are Apple’s Siri application and Google’s Google Now service. For
example, a user can give her zip code and say “pizza delivery.” These devices provide the
search results and can even place a call to a business.
Recent efforts in business process management (BPM) have led to inputs directly
from physical devices for analysis via DSS. For example, radio-frequency identification
(RFID) chips can record data from sensors in railcars or in-process products in a factory.
Data from these sensors (e.g., recording an item’s status) can be downloaded at key locations
and immediately transmitted to a database or data warehouse, where they can be
analyzed and decisions can be made concerning the status of the items being monitored.
Walmart and Best Buy are developing this technology in their SCM, and such sensor
networks are also being used effectively by other firms.
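A minimal sketch of this sensor-to-decision flow, with invented tag IDs, field names, and a hypothetical temperature threshold (a real RFID/SCM system would feed a data warehouse and apply far richer rules):

```python
# Hypothetical sketch: sensor readings collected at a key location are
# screened by a simple decision rule about item status.
from dataclasses import dataclass

@dataclass
class Reading:
    tag_id: str       # RFID tag on the railcar or in-process item
    location: str     # key location where the data were downloaded
    temp_c: float     # example sensor measurement

def flag_exceptions(readings, max_temp_c=8.0):
    """Decision rule: flag any monitored item outside its allowed range."""
    return [r.tag_id for r in readings if r.temp_c > max_temp_c]

readings = [
    Reading("RC-001", "Yard A", 5.2),
    Reading("RC-002", "Yard A", 9.7),   # too warm: needs attention
    Reading("RC-003", "Dock 3", 7.9),
]
print(flag_exceptions(readings))  # ['RC-002']
```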
The Knowledge-Based Management Subsystem
The knowledge-based management subsystem can support any of the other subsystems or
act as an independent component. It provides intelligence to augment the decision maker’s
own. It can be interconnected with the organization’s knowledge repository (part of

a knowledge management system [KMS]), which is sometimes called the organizational
knowledge base. Knowledge may be provided via Web servers. Many artificial intelligence
methods have been implemented in Web development systems such as Java and are easy
to integrate into the other DSS components. One of the most widely publicized knowledge-based
DSS is IBM’s Watson computer system. It is described in Application Case 2.4.
We conclude the sections on the three major DSS components with information
on some recent technology and methodology developments that affect DSS and decision
making. Technology Insights 2.2 summarizes some emerging developments in user
Application Case 2.4
From a Game Winner to a Doctor!
The television show Jeopardy! inspired an IBM
research team to build a supercomputer named
Watson that successfully took on the challenge of
playing Jeopardy! and beat the human competitors.
Since then, Watson has evolved into a
question-answering computing platform that is now
being used commercially in the medical field and
is expected to find its use in many other areas.
Watson is a cognitive system built on clusters
of powerful processors supported by IBM’s
DeepQA® software. Watson employs a combination
of techniques like natural-language processing,
hypothesis generation and evaluation, and evidence-based
learning to overcome the constraints imposed
by programmatic computing. This enables Watson
to work on massive amounts of real-world, unstructured
Big Data efficiently.
In the medical field, it is estimated that the
amount of medical information doubles every
5 years. This massive growth limits a physician’s
decision-making ability in diagnosis and treatment of
illness using an evidence-based approach. With the
advancements being made in the medical field every
day, physicians do not have enough time to read
every journal that can help them keep up-to-date
with the latest advancements. Patient histories
and electronic medical records contain lots of data. If
this information can be analyzed in combination with
vast amounts of existing medical knowledge, many
useful clues can be provided to the physicians to
help them identify diagnostic and treatment options.
Watson, dubbed Dr. Watson, with its advanced
machine learning capabilities, now finds a new role
as a computer companion that assists physicians by
providing relevant real-time information for critical
decision making in choosing the right diagnostic and
treatment procedures. (Also see the opening vignette
for Chapter 7.)
Memorial Sloan-Kettering Cancer Center
(MSKCC), New York, and WellPoint, a major insurance
provider, have begun using Watson as a treatment
advisor in oncology diagnosis. Watson learned
the process of diagnosis and treatment through its
natural-language processing capabilities, which enabled
it to leverage the unstructured data with an enormous
amount of clinical expertise data, molecular
and genomic data from existing cancer case histories,
journal articles, physicians’ notes, and guidelines
and best practices from the National Comprehensive
Cancer Network. It was then trained by oncologists to
apply the knowledge gained in comparing an individual
patient’s medical information against a wide variety
of treatment guidelines, published research, and
other insights to provide individualized, confidence-scored
recommendations to the physicians.
At MSKCC, Watson facilitates evidence-based
support for every suggestion it makes while analyzing
an individual case by bringing out the facts from
medical literature that point to a particular suggestion.
It also provides a platform for the physicians to
look at the case from multiple directions by doing further
analysis relevant to the individual case. Its voice
recognition capabilities allow physicians to speak to
Watson, enabling it to be a perfect assistant that helps
physicians in critical evidence-based decision making.
WellPoint also trained Watson with a vast history
of medical cases and now relies on Watson’s
hypothesis generation and evidence-based learning
to generate recommendations in providing approval
for medical treatments based on the clinical and
patient data. Watson also assists the insurance providers
in detecting fraudulent claims and protecting
physicians from malpractice claims.
Watson provides an excellent example of
a knowledge-based DSS that employs multiple
advanced technologies.

QUESTIONS FOR DISCUSSION
1. What is a cognitive system? How can it assist in
real-time decision making?
2. What is evidence-based decision making?
3. What is the role played by Watson in the
discussion?
4. Does Watson eliminate the need for human
decision making?
What We Can Learn from This Application Case
Advancements in technology now enable the building
of powerful, cognitive computing platforms combined
with complex analytics. These systems are
impacting the decision-making process radically by
shifting it from an opinion-based process to a
more real-time, evidence-based process, thereby turning
available information intelligence into actionable
wisdom that can be readily employed across many
industrial sectors.
Sources: Ibm.com, “IBM Watson: Ushering In a New Era of
Computing,” www-03.ibm.com/innovation/us/watson (accessed
February 2013); Ibm.com, “IBM Watson Helps Fight Cancer with
Evidence-Based Diagnosis and Treatment Suggestions,”
www-03.ibm.com/innovation/us/watson/pdf/MSK_Case_Study_IMC14794
(accessed February 2013); Ibm.com, “IBM Watson
Enables More Effective Healthcare Preapproval Decisions Using
Evidence-Based Learning,” www-03.ibm.com/innovation/us/
watson/pdf/WellPoint_Case_Study_IMC14792 (accessed
February 2013).
TECHNOLOGY INSIGHTS 2.2
Next Generation of Input Devices
The last few years have seen exciting developments in user interfaces. Perhaps the most common
example of the new user interfaces is the iPhone’s multi-touch interface that allows a user
to zoom, pan, and scroll through a screen just with the use of a finger. The success of the iPhone has
spawned development of similar user interfaces from many other providers including Blackberry,
HTC, LG, Motorola (a part of Google), Microsoft, Nokia, Samsung, and others. The mobile platform
has become the major access mechanism for all decision support applications.
In the last few years, gaming devices have evolved significantly to be able to receive and
process gesture-based inputs. In 2007, Nintendo introduced the Wii game platform, which is
able to process motions and gestures. Microsoft’s Kinect is able to recognize image movements
and use that to discern inputs. The next generation of these technologies is in the form of
mind-reading platforms. A company called Emotiv (en.wikipedia.org/wiki/Emotiv) made
big news in early 2008 with a promise to deliver a game controller that a user would be able
to control by thinking about it. These technologies are to be based on electroencephalography
(EEG), the technique of reading and processing the electrical activity at the scalp level
as a result of specific thoughts in the brain. The technical details are available on Wikipedia
(en.wikipedia.org/wiki/Electroencephalography) and the Web. Although EEG has not
yet been known to be used as a DSS user interface (at least to the authors), its potential is
significant for many other DSS-type applications. Many other companies are developing similar
technologies.
It is also possible to speculate on other developments on the horizon. One major growth
area is likely to be in wearable devices. Google’s wearable glasses that are labeled “augmented
reality” glasses will likely emerge as a new user interface for decision support in both consumer
and corporate decision settings. Similarly, Apple is reported to be working on iOS-based wristwatch-type
computers. These devices will significantly impact how we interact with a system and
use the system for decision support. So it is a safe bet that user interfaces are going to change
significantly in the next few years. Their first use will probably be in gaming and consumer
applications, but the business and DSS applications won’t be far behind.
Sources: Various Wikipedia sites and the company Web sites provided in the feature.

interfaces. Many developments in DSS components are the result of new developments
in hardware and software computer technology, data warehousing, data mining, OLAP,
Web technologies, integration of technologies, and DSS application to various and new
functional areas. There is also a clear link between hardware and software capabilities
and improvements in DSS. Hardware continues to shrink in size while increasing in speed
and other capabilities. The sizes of databases and data warehouses have increased dramatically.
Data warehouses now provide hundreds of petabytes of sales data for retail
organizations and content for major news networks.
We expect to see more seamless integration of DSS components as they adopt Web
technologies, especially XML. These Web-based technologies have become the center of
activity in developing DSS. Web-based DSS have reduced technological barriers and have
made it easier and less costly to make decision-relevant information and model-driven
DSS available to managers and staff users in geographically distributed locations, especially
through mobile devices.
DSS are becoming more embedded in other systems. Similarly, a major area to expect
improvements in DSS is in GSS in supporting collaboration at the enterprise level. This is
true even in the educational arena. Almost every new area of information systems involves
some level of decision-making support. Thus, DSS, either directly or indirectly, has impacts
on CRM, SCM, ERP, KM, PLM, BAM, BPM, and other EIS. As these systems evolve, the
active decision-making component that utilizes mathematical, statistical, or even descriptive
models increases in size and capability, although it may be buried deep within the system.
Finally, different types of DSS components are being integrated more frequently. For
example, GIS are readily integrated with other, more traditional, DSS components and
tools for improved decision making.
By definition, a DSS must include the three major components: DBMS, MBMS, and
user interface. The knowledge-based management subsystem is optional, but it can provide
many benefits by providing intelligence in and to the three major components. As in
any other MIS, the user may be considered a component of DSS.
Chapter Highlights
• Managerial decision making is synonymous with
the whole process of management.
• Human decision styles need to be recognized in
designing systems.
• Individual and group decision making can both
be supported by systems.
• Problem solving is also opportunity evaluation.
• A model is a simplified representation or abstraction
of reality.
• Decision making involves four major phases:
intelligence, design, choice, and implementation.
• In the intelligence phase, the problem (opportunity)
is identified, classified, and decomposed (if
needed), and problem ownership is established.
• In the design phase, a model of the system is
built, criteria for selection are agreed on, alternatives
are generated, results are predicted, and a
decision methodology is created.
• In the choice phase, alternatives are compared, and
a search for the best (or a good-enough) solution is
launched. Many search techniques are available.
• In implementing alternatives, a decision maker
should consider multiple goals and sensitivity-analysis
issues.
• Satisficing is a willingness to settle for a satisfactory
solution. In effect, satisficing is suboptimizing.
Bounded rationality results in decision makers
satisficing.
• Computer systems can support all phases of decision
making by automating many of the required
tasks or by applying artificial intelligence.
• A DSS is designed to support complex managerial
problems that other computerized techniques
cannot. DSS is user oriented, and it uses data and
models.
• DSS are generally developed to solve specific
managerial problems, whereas BI systems typically

report status, and, when a problem is discovered,
their analysis tools are utilized by decision makers.
• DSS can provide support in all phases of the decision-making
process and to all managerial levels
for individuals, groups, and organizations.
• DSS is a user-oriented tool. Many applications
can be developed by end users, often in
spreadsheets.
• DSS can improve the effectiveness of decision
making, decrease the need for training, improve
management control, facilitate communication,
save effort by the users, reduce costs, and allow
for more objective decision making.
• The AIS SIGDSS classification of DSS includes
communications-driven and group DSS (GSS),
data-driven DSS, document-driven DSS, knowledge-driven
DSS, data mining and management
ES applications, and model-driven DSS. Several
other classifications map into this one.
• Several useful classifications of DSS are based
on why they are developed (institutional versus
ad hoc), what level within the organization they
support (personal, group, or organizational),
whether they support individual work or group
work (individual DSS versus GSS), and how they
are developed (custom versus ready-made).
• The major components of a DSS are a database
and its management, a model base and its management,
and a user-friendly interface. An intelligent
(knowledge-based) component can also be
included. The user is also considered to be a
component of a DSS.
• Data warehouses, data mining, and OLAP have
made it possible to develop DSS quickly and easily.
• The data management subsystem usually includes
a DSS database, a DBMS, a data directory, and a
query facility.
• The model base includes standard models and
models specifically written for the DSS.
• Custom-made models can be written in programming
languages, in special modeling languages,
and in Web-based development systems (e.g., Java,
the .NET Framework).
• The user interface (or dialog) is of utmost importance.
It is managed by software that provides the
needed capabilities. Web browsers and smartphones/tablets
commonly provide a friendly, consistent
DSS GUI.
• The user interface capabilities of DSS have moved
into small, portable devices, including smartphones,
tablets, and so forth.
Key Terms
ad hoc DSS; algorithm; analytical techniques; business intelligence (BI); choice phase; data warehouse; database management system (DBMS); decision making; decision style; decision variable; descriptive model; design phase; DSS application; effectiveness; efficiency; implementation phase; institutional DSS; intelligence phase; model base management system (MBMS); normative model; optimization; organizational knowledge base; principle of choice; problem ownership; problem solving; satisficing; scenario; sensitivity analysis; simulation; suboptimization; user interface; what-if analysis
Questions for Discussion
1. Why is intuition still an important aspect of decision making?
2. Define efficiency and effectiveness, and compare and
contrast the two.
3. Why is it important to focus on the effectiveness of a decision,
not necessarily the efficiency of making a decision?
4. What are some of the measures of effectiveness in a
toy manufacturing plant, a restaurant, an educational
institution, and the U.S. Congress?
5. Even though implementation of a decision involves change,
and change management is very difficult, explain how
change management has not changed very much in thousands
of years. Use specific examples throughout history.
6. Your company is considering opening a branch in China.
List typical activities in each phase of the decision (intelligence,
design, choice, implementation) of whether to
open a branch.

7. You are about to buy a car. Using Simon’s four-phase
model, describe your activities at each step.
8. Explain, through an example, the support given to decision
makers by computers in each phase of the decision
process.
9. Some experts believe that the major contribution of DSS
is to the implementation of a decision. Why is this so?
10. Review the major characteristics and capabilities of
DSS. How does each of them relate to the major components
of DSS?
11. List some internal data and external data that could be
found in a DSS for a university’s admissions office.
12. Why does a DSS need a DBMS, a model management
system, and a user interface, but not necessarily a knowledge-based
management system?
13. What are the benefits and the limitations of the AIS
SIGDSS classification for DSS?
14. Search for a ready-made DSS. What type of industry is its
market? Explain why it is a ready-made DSS.
Exercises
Teradata University Network (TUN)
and Other Hands-On Exercises
1. Choose a case at TUN or use the case that your instructor
chooses. Describe in detail what decisions were to be made
in the case and what process was actually followed. Be sure
to describe how technology assisted or hindered the decision-making
process and what the decision’s impacts were.
2. Most companies and organizations have downloadable
demos or trial versions of their software products on the
Web so that you can copy and try them out on your own
computer. Others have online demos. Find one that provides
decision support, try it out, and write a short report
about it. Include details about the intended purpose of
the software, how it works, and how it supports decision
making.
3. Comment on Simon’s (1977) philosophy that managerial
decision making is synonymous with the whole process
of management. Does this make sense? Explain. Use a
real-world example in your explanation.
4. Consider a situation in which you have a preference
about where you go to college: You want to be not too
far away from home and not too close. Why might this
situation arise? Explain how this situation fits with rational
decision-making behavior.
5. Explore teradatauniversitynetwork.com. In a report,
describe at least three interesting DSS applications and
three interesting DSS areas (e.g., CRM, SCM) that you
have discovered there.
6. Examine Daniel Power’s DSS Resources site at
dssresources.com. Take the Decision Support Systems
Web Tour (dssresources.com/tour/index.html).
Explore other areas of the Web site.
End-of-Chapter Application Case
Logistics Optimization in a Major Shipping Company (CSAV)
Introduction
Compañía Sud Americana de Vapores (CSAV) is a shipping
company headquartered in Chile, South America, and is the
sixth largest shipping company in the world. Its operations
in over 100 countries worldwide are managed from seven
regional offices. CSAV operates 700,000 containers valued at
$2 billion. Less than 10 percent of these containers are owned
by CSAV. The rest are acquired from other third-party companies
on lease. At the heart of CSAV’s business operations
is its container fleet, which is second only to vessel fuel in
terms of cost. As part of its strategic planning, the company
recognized that addressing the problem of empty container
logistics would help reduce operational cost. In a typical cycle
of a cargo container, a shipper first acquires an empty container
from a container depot. The container is then loaded
onto a truck and sent to the merchant, who then fills it with
products. Finally, the container is sent by truck to the ship for
onward transport to the destination. Typically, there are transshipments
along the way where a container may be moved
from one vessel to another until it gets to its destination. At the
destination, the container is transported to the consignee. After
emptying the container, it is sent to the nearest CSAV depot,
where maintenance is done on the container.
There were four main challenges recognized by
CSAV in its empty container logistics problem:
• Imbalance. Some geographic regions are net exporters
while others are net importers. Places like China are
net exporters; hence, there are always shortages of containers.
North America is a net importer; it always has a
surplus of containers. This creates an imbalance of containers
as a result of the uneven flow of containers.
• Uncertainty. Factors like demand, date of return of
empty containers, travel times, and the ship’s capacity

for empty containers create uncertainty in the location
and availability of containers.
• Information handling and sharing. Huge loads of
data need to be processed every day. CSAV processes
400,000 container transactions every day. Timely decisions
based on accurate information had to be generated
in order to help reduce safety stocks of empty
containers.
• Coordination of interrelated decisions worldwide.
Previously, decisions were made at the local level.
Consequently, in order to alleviate the empty container
problem, decisions regarding movement of empty containers
at various locations had to be coordinated.
Methodology/Solution

CSAV developed an integrated system called Empty Container Logistics Optimization (ECO) using moving average, trended and seasonal time series, and sales force forecast (CFM) methods. The ECO system comprises a forecasting model, an inventory model, a multi-commodity (MC) network flow model, and a Web interface. The forecasting model draws data from the regional offices, processes it, and feeds the resultant information to the inventory model. The information the forecasting model generates includes the space available in the vessels for empty containers and the container demand. The forecasting module also helps reduce forecast error and, hence, allows CSAV's depots to maintain lower safety stocks. The inventory model calculates the safety stocks and feeds them to the MC network flow model. The MC network flow model is the core of the ECO system. It provides information for optimal decisions to be made regarding inventory levels, container repositioning flows, and the leasing and return of empty containers. The objective function is to minimize empty container logistics cost, which is mostly a result of leasing, repositioning, storage, loading, and discharge operations.
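The forecasting-to-inventory hand-off described above can be sketched in a few lines. This is a minimal illustration, not CSAV's actual ECO code: the demand figures, window size, and service-level factor below are hypothetical, and it uses a plain moving average with a textbook safety-stock formula in place of the richer trended/seasonal methods the case mentions.

```python
# Illustrative sketch of the ECO pipeline's first two stages:
# a moving-average demand forecast feeding a safety-stock calculation.
from statistics import mean, stdev

def moving_average_forecast(history, window=3):
    """Forecast next-period empty-container demand as the mean
    of the last `window` observations."""
    return mean(history[-window:])

def safety_stock(history, z=1.65):
    """Textbook safety stock: z times the standard deviation of
    demand; z = 1.65 corresponds to roughly a 95% service level."""
    return z * stdev(history)

# Hypothetical weekly empty-container demand at one depot
demand = [120, 135, 128, 140, 150, 145]

forecast = moving_average_forecast(demand)   # mean of last 3 weeks = 145.0
buffer = safety_stock(demand)
target_stock = forecast + buffer             # level handed to the flow model

print(f"forecast={forecast:.1f}, safety stock={buffer:.1f}, "
      f"target={target_stock:.1f}")
```

In the real system, the resulting target stock level at each depot would become an input to the multi-commodity network flow model, which trades those targets off against repositioning, leasing, storage, loading, and discharge costs.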
Results/Benefits

The ECO system activities in all regional centers are well coordinated while still maintaining flexibility and creativity in their operations. The system resulted in a 50 percent reduction in inventory stock. The generation of intelligent information from historical transactional data helped increase the efficiency of operations. For instance, the empty time per container cycle decreased from a high of 47.2 days in 2009 to only 27.3 days the following year, resulting in a 60 percent increase in the average empty container turnover. Also, container cycles
increased from a record low of 3.8 cycles in 2009 to 4.8 cycles in 2010. Moreover, when the ECO system was implemented in 2010, the excess cost per full voyage became $35 cheaper than the average cost for the period between 2006 and 2009. This resulted in cost savings of $101 million on all voyages in 2010. It was estimated that ECO's direct contribution to this cost reduction was about 80 percent ($81 million). CSAV projected that ECO will help generate $200 million in profits over the 2 years following its implementation in 2010.

CASE QUESTIONS

1. Explain why solving the empty container logistics problem contributes to cost savings for CSAV.
2. What are some of the qualitative benefits of the optimization model for the empty container movements?
3. What are some of the key benefits of the forecasting model in the ECO system implemented by CSAV?
4. Perform an online search to determine how other shipping companies handle the empty container problem. Do you think the ECO system would directly benefit those companies?
5. Besides shipping logistics, can you think of any other domain where such a system would be useful in reducing cost?

What We Can Learn from This End-of-Chapter Application Case

The empty container problem is faced by most shipping companies. The problem is partly caused by an imbalance in the demand for empty containers between different geographic areas. CSAV used an optimization system to solve the empty container problem. The case demonstrates a situation where a business problem is solved not just by one method or model, but by a combination of different operations research and analytics methods. For instance, we realize that the optimization model used by CSAV consisted of different submodels, such as the forecasting and inventory models. The shipping industry is only one sector among a myriad of sectors where optimization models are used to decrease the cost of business operations. The lessons learned in this case could be explored in other domains such as manufacturing and supply chain.

Source: R. Epstein et al., "A Strategic Empty Container Logistics Optimization in a Major Shipping Company," Interfaces, Vol. 42, No. 1, January-February 2012, pp. 5-16.

References

Allan, N., R. Frame, and I. Turney. (2003). "Trust and Narrative: Experiences of Sustainability." The Corporate Citizen, Vol. 3, No. 2.
Alter, S. L. (1980). Decision Support Systems: Current Practices and Continuing Challenges. Reading, MA: Addison-Wesley.
Baker, J., and M. Cameron. (1996, September). "The Effects of the Service Environment on Affect and Consumer Perception of Waiting Time: An Integrative Review and Research Propositions." Journal of the Academy of Marketing Science, Vol. 24, pp. 338-349.

76 Part I • Decision Making and Analytics: An Overview
Barba-Romero, S. (2001, July/August). "The Spanish Government Uses a Discrete Multicriteria DSS to Determine Data Processing Acquisitions." Interfaces, Vol. 31, No. 4, pp. 123-131.
Beach, L. R. (2005). The Psychology of Decision Making: People in Organizations, 2nd ed. Thousand Oaks, CA: Sage.
Birkman International, Inc., birkman.com; Keirsey Temperament Sorter and Keirsey Temperament Theory-II, keirsey.com.
Chongwatpol, J., and R. Sharda. (2010, December). "SNAP: A DSS to Analyze Network Service Pricing for State Networks." Decision Support Systems, Vol. 50, No. 1, pp. 347-359.
Cohen, M.-D., C. B. Charles, and A. L. Medaglia. (2001, March/April). "Decision Support with Web-Enabled Software." Interfaces, Vol. 31, No. 2, pp. 109-129.
Denning, S. (2000). The Springboard: How Storytelling Ignites Action in Knowledge-Era Organizations. Burlington, MA: Butterworth-Heinemann.
Donovan, J. J., and S. E. Madnick. (1977). "Institutional and Ad Hoc DSS and Their Effective Use." Data Base, Vol. 8, No. 3, pp. 79-88.
Drummond, H. (2001). The Art of Decision Making: Mirrors of Imagination, Masks of Fate. New York: Wiley.
Eden, C., and F. Ackermann. (2002). "Emergent Strategizing." In A. Huff and M. Jenkins (eds.), Mapping Strategic Thinking. Thousand Oaks, CA: Sage Publications.
Epstein, R., et al. (2012, January/February). "A Strategic Empty Container Logistics Optimization in a Major Shipping Company." Interfaces, Vol. 42, No. 1, pp. 5-16.
Farasyn, I., K. Perkoz, and W. Van de Velde. (2008, July/August). "Spreadsheet Models for Inventory Target Setting at Procter and Gamble." Interfaces, Vol. 38, No. 4, pp. 241-250.
Goodie, A. (2004, Fall). "Goodie Studies Pathological Gamblers' Risk-Taking Behavior." The Independent Variable. Athens, GA: The University of Georgia, Institute of Behavioral Research. ibr.uga.edu/publications/fall2004 (accessed February 2013).
Hesse, R., and G. Woolsey. (1975). Applied Management Science: A Quick and Dirty Approach. Chicago: SRA Inc.
Ibm.com. "IBM Watson: Ushering In a New Era of Computing." www-03.ibm.com/innovation/us/watson (accessed February 2013).
Ibm.com. "IBM Watson Helps Fight Cancer with Evidence-Based Diagnosis and Treatment Suggestions." www-03.ibm.com/innovation/us/watson/pdf/MSK_Case_Study_IMC14794 (accessed February 2013).
Ibm.com. "IBM Watson Enables More Effective Healthcare Preapproval Decisions Using Evidence-Based Learning." www-03.ibm.com/innovation/us/watson/pdf/WellPoint_Case_Study_IMC14792 (accessed February 2013).
Jenkins, M. (2002). "Cognitive Mapping." In D. Partington (ed.), Essential Skills for Management Research. Thousand Oaks, CA: Sage Publications.
Kepner, C., and B. Tregoe. (1998). The New Rational Manager. Princeton, NJ: Kepner-Tregoe.
Koksalan, M., and S. Zionts (eds.). (2001). Multiple Criteria Decision Making in the New Millennium. Heidelberg: Springer-Verlag.
Koller, G. R. (2000). Risk Modeling for Determining Value and Decision Making. Boca Raton, FL: CRC Press.
Larson, R. C. (1987, November/December). "Perspectives on Queues: Social Justice and the Psychology of Queueing." Operations Research, Vol. 35, No. 6, pp. 895-905.
Luce, M. F., J. W. Payne, and J. R. Bettman. (2004). "The Emotional Nature of Decision Trade-offs." In S. J. Hoch, H. C. Kunreuther, and R. E. Gunther (eds.), Wharton on Making Decisions. New York: Wiley.
Olavson, T., and C. Fry. (2008, July/August). "Spreadsheet Decision-Support Tools: Lessons Learned at Hewlett-Packard." Interfaces, Vol. 38, No. 4, pp. 300-310.
Pauly, M. V. (2004). "Split Personality: Inconsistencies in Private and Public Decisions." In S. J. Hoch, H. C. Kunreuther, and R. E. Gunther (eds.), Wharton on Making Decisions. New York: Wiley.
Power, D. J. (2002). Decision Making Support Systems: Achievements, Trends and Challenges. Hershey, PA: Idea Group Publishing.
Power, D. J., and R. Sharda. (2009). "Decision Support Systems." In S. Y. Nof (ed.), Springer Handbook of Automation. New York: Springer.
Purdy, J. (2005, Summer). "Decisions, Delusions, & Debacles." UGA Research Magazine.
Ratner, R. K., B. E. Kahn, and D. Kahneman. (1999, June). "Choosing Less-Preferred Experiences for the Sake of Variety." Journal of Consumer Research, Vol. 26, No. 1.
Sawyer, D. C. (1999). Getting It Right: Avoiding the High Cost of Wrong Decisions. Boca Raton, FL: St. Lucie Press.
Simon, H. (1977). The New Science of Management Decision. Englewood Cliffs, NJ: Prentice Hall.
Stewart, T. A. (2002, November). "How to Think with Your Gut." Business 2.0.
Teradata.com. "No Limits: Station Casinos Breaks the Mold on Customer Relationships." teradata.com/case-studies/Station-Casinos-No-Limits-Station-Casinos-Breaks-the-Mold-on-Customer-Relationships-Executive-Summary-eb6410 (accessed February 2013).
Tversky, A., P. Slovic, and D. Kahneman. (1990, March). "The Causes of Preference Reversal." American Economic Review, Vol. 80, No. 1.
Yakov, B.-H. (2001). Information Gap Decision Theory: Decisions Under Severe Uncertainty. New York: Academic Press.

PART II
Descriptive Analytics
LEARNING OBJECTIVES FOR PART II
• Learn the role of descriptive analytics (DA) in solving business problems
• Learn the basic definitions, concepts, and architectures of data warehousing (DW)
• Learn the role of data warehouses in managerial decision support
• Learn the capabilities of business reporting and visualization as enablers of DA
• Learn the importance of information visualization in managerial decision support
• Learn the foundations of the emerging field of visual analytics
• Learn the capabilities and limitations of dashboards and scorecards
• Learn the fundamentals of business performance management (BPM)

Descriptive analytics, often referred to as business intelligence, uses data and models to answer the "what happened?" and "why did it happen?" questions in business settings. It is perhaps the most fundamental echelon in the three-step analytics continuum upon which predictive and prescriptive analytics capabilities are built. As you will see in the following chapters, the key enablers of descriptive analytics include data warehousing, business reporting, decision dashboards/scorecards, and visual analytics.

CHAPTER 3
Data Warehousing
LEARNING OBJECTIVES
• Understand the basic definitions and concepts of data warehouses
• Explain the role of data warehouses in decision support
• Understand data warehousing architectures
• Explain data integration and the extraction, transformation, and load (ETL) processes
• Describe the processes used in developing and managing data warehouses
• Describe real-time (active) data warehousing
• Explain data warehousing operations
• Understand data warehouse administration and security issues
The concept of data warehousing has been around since the late 1980s. This chapter provides the foundation for an important type of database, called a data warehouse, which is primarily used for decision support and provides improved analytical capabilities. We discuss data warehousing in the following sections:
3.1 Opening Vignette: Isle of Capri Casinos Is Winning with Enterprise Data Warehouse 79
3.2 Data Warehousing Definitions and Concepts 81
3.3 Data Warehousing Process Overview 87
3.4 Data Warehousing Architectures 90
3.5 Data Integration and the Extraction, Transformation, and Load (ETL) Processes 97
3.6 Data Warehouse Development 102
3.7 Data Warehousing Implementation Issues 113
3.8 Real-Time Data Warehousing 117
3.9 Data Warehouse Administration, Security Issues, and Future Trends 121
3.10 Resources, Links, and the Teradata University Network Connection 126

Chapter 3 • Data Warehousing 79
3.1 OPENING VIGNETTE: Isle of Capri Casinos Is Winning
with Enterprise Data Warehouse
Isle of Capri is a unique and innovative player in the gaming industry. After entering the market in Biloxi, Mississippi, in 1992, Isle has grown into one of the country's largest publicly traded gaming companies, mostly by establishing properties in the southeastern United States and in the country's heartland. Isle of Capri Casinos, Inc., is currently operating 18 casinos in seven states, serving nearly 2 million visitors each year.
CHALLENGE
Even though they seem to have a differentiating edge, compared to others in the highly competitive gaming industry, Isle is not entirely unique. Like any gaming company, Isle's success depends largely on its relationship with its customers: its ability to create a gaming, entertainment, and hospitality atmosphere that anticipates customers' needs and exceeds their expectations. Meeting such a goal is impossible without two important components: a company culture that is laser-focused on making the customer experience an enjoyable one, and a data and technology architecture that enables Isle to constantly deepen its understanding of its customers, as well as the various ways customer needs can be efficiently met.
SOLUTION
After an initial data warehouse implementation was derailed in 2005, in part by Hurricane Katrina, Isle decided to reboot the project with entirely new components and Teradata as the core solution and key partner, along with IBM Cognos for Business Intelligence. Shortly after that choice was made, Isle brought on a management team that clearly understood how the Teradata and Cognos solution could enable key decision makers throughout the operation to easily frame their own initial queries, as well as timely follow-up questions, thus opening up a wealth of possibilities to enhance the business.
RESULTS
Thanks to its successful implementation of a comprehensive data warehousing and business intelligence solution, Isle has achieved some deeply satisfying results. The company has dramatically accelerated and expanded the process of information gathering and dispersal, producing about 150 reports on a daily basis, 100 weekly, and 50 monthly, in addition to ad hoc queries, completed within minutes, all day every day. Prior to an enterprise data warehouse (EDW) from Teradata, Isle produced about 5 monthly reports per property, but because they took a week or more to produce, properties could not begin to analyze monthly activity until the second week of the following month. Moreover, none of the reports analyzed anything less than an entire month at a time; today, reports using up-to-the-minute data on specific customer segments at particular properties are available, often the same day, enabling the company to react much more quickly to a wide range of customer needs.

Isle has cut in half the time needed to construct its core monthly direct-mail campaigns and can generate less involved campaigns practically on the spot. In addition to moving faster, Isle has honed the process of segmentation and now can cross-reference a wide range of attributes, such as overall customer value, gaming behaviors, and hotel preferences. This enables them to produce more targeted campaigns aimed at particular customer segments and particular behaviors.
Isle also has enabled its management and employees to further deepen their understanding of customer behaviors by connecting data from its hotel systems and data from

80 Part II • Descriptive Analytics
its customer-tracking systems, and to act on that understanding through improved marketing campaigns and heightened levels of customer service. For example, the addition of hotel data offered new insights about the increased gaming local patrons do when they stay at a hotel. This, in turn, enabled new incentive programs (such as a free hotel night) that have pleased locals and increased Isle's customer loyalty.

The hotel data also has enhanced Isle's customer hosting program. By automatically notifying hosts when a high-value guest arrives at a hotel, hosts have forged deeper relationships with their most important clients. "This is by far the best tool we've had since I've been at the company," wrote one of the hosts.
Isle of Capri can now do more accurate property-to-property comparisons and analyses, largely because Teradata consolidated disparate data housed at individual properties and centralized it in one location. One result: A centralized intranet site posts daily figures for each individual property, so they can compare such things as performance of revenue from slot machines and table games, as well as complimentary redemption values. In addition, the IBM Cognos Business Intelligence tool enables additional comparisons, such as direct-mail redemption values, specific direct-mail program response rates, direct-mail-incented gaming revenue, hotel-incented gaming revenue, noncomplimentary (cash) revenue from hotel room reservations, and hotel room occupancy. One clear benefit is that it holds individual properties accountable for constantly raising the bar.
Beginning with an important change in marketing strategy that shifted the focus to customer days, time and again the Teradata/IBM Cognos BI implementation has demonstrated the value of extending the power of data throughout Isle's enterprise. This includes immediate analysis of response rates to marketing campaigns and the addition of profit and loss data that has successfully connected customer value and total property value. One example of the power of this integration: By joining customer value and total property value, Isle gains a better understanding of its retail customers, a population invisible to them before, enabling them to more effectively target marketing efforts, such as radio ads.

Perhaps most significantly, Isle has begun to add slot machine data to the mix. The most important and immediate impact will be the way in which customer value will inform purchasing of new machines and product placement on the casino floor. Down the road, the addition of this data also might position Isle to take advantage of server-based gaming, where slot machines on the casino floor will essentially be computer terminals that enable the casino to switch a game to a new one in a matter of seconds.

In short, as Isle constructs its solutions for regularly funneling slot machine data into the warehouse, its ability to use data to re-imagine the floor and forge ever deeper and more lasting relationships will exceed anything it might have expected when it embarked on this project.
QUESTIONS FOR THE OPENING VIGNETTE
1. Why is it important for Isle to have an EDW?
2. What were the business challenges or opportunities that Isle was facing?
3. What was the process Isle followed to realize EDW? Comment on the potential challenges Isle might have had going through the process of EDW development.
4. What were the benefits of implementing an EDW at Isle? Can you think of other potential benefits that were not listed in the case?
5. Why do you think large enterprises like Isle in the gaming industry can succeed without having a capable data warehouse/business intelligence infrastructure?

WHAT WE CAN LEARN FROM THIS VIGNETTE
The opening vignette illustrates the strategic value of implementing an enterprise data warehouse, along with its supporting BI methods. Isle of Capri Casinos was able to leverage its data assets spread throughout the enterprise to be used by knowledge workers (wherever and whenever they are needed) to make accurate and timely decisions. The data warehouse integrated various databases throughout the organization into a single, in-house enterprise unit to generate a single version of the truth for the company, putting all decision makers, from planning to marketing, on the same page. Furthermore, by regularly funneling slot machine data into the warehouse, combined with customer-specific rich data that comes from a variety of sources, Isle significantly improved its ability to discover patterns to re-imagine/reinvent the gaming floor operations and forge ever deeper and more lasting relationships with its customers. The key lesson here is that an enterprise-level data warehouse combined with a strategy for its use in decision support can result in significant benefits (financial and otherwise) for an organization.
Sources: Teradata, Customer Success Stories, teradata.com/t/case-studies/Isle-of-Capri-Casinos-Executive-Summary-EB6277 (accessed February 2013); www-01.ibm.com/software/analytics/cognos.
3.2 DATA WAREHOUSING DEFINITIONS AND CONCEPTS
Using real-time data warehousing in conjunction with DSS and BI tools is an important way to conduct business processes. The opening vignette demonstrates a scenario in which a real-time active data warehouse supported decision making by analyzing large amounts of data from various sources to provide rapid results to support critical processes. The single version of the truth stored in the data warehouse and provided in an easily digestible form expands the boundaries of Isle of Capri's innovative business processes. With real-time data flows, Isle can view the current state of its business and quickly identify problems, which is the first and foremost step toward solving them analytically.

Decision makers require concise, dependable information about current operations, trends, and changes. Data are often fragmented in distinct operational systems, so managers often make decisions with partial information, at best. Data warehousing cuts through this obstacle by accessing, integrating, and organizing key operational data in a form that is consistent, reliable, timely, and readily available, wherever and whenever needed.
What Is a Data Warehouse?
In simple terms, a data warehouse (DW) is a pool of data produced to support decision making; it is also a repository of current and historical data of potential interest to managers throughout the organization. Data are usually structured to be available in a form ready for analytical processing activities (i.e., online analytical processing [OLAP], data mining, querying, reporting, and other decision support applications). A data warehouse is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management's decision-making process.
A Historical Perspective to Data Warehousing
Even though data warehousing is a relatively new term in information technology, its roots can be traced way back in time, even before computers were widely used. In the early 1900s, people were using data (though mostly via manual methods) to formulate trends to help business users make informed decisions, which is the most prevailing purpose of data warehousing.

The motivations that led to developing data warehousing technologies go back to the 1970s, when the computing world was dominated by mainframes. Real business data-processing applications, the ones run on the corporate mainframes, had complicated file structures using early-generation databases (not the table-oriented relational databases most applications use today) in which they stored data. Although these applications did a decent job of performing routine transactional data-processing functions, the data created as a result of these functions (such as information about customers, the products they ordered, and how much money they spent) was locked away in the depths of the files and databases. When aggregated information such as sales trends by region and by product type was needed, one had to formally request it from the data-processing department, where it was put on a waiting list with a couple hundred other report requests (Hammergren and Simon, 2009). Even though the need for information and the data that could be used to generate it existed, the database technology was not there to satisfy it. Figure 3.1 shows a timeline of some of the significant events that led to the development of data warehousing.
Later in this decade, commercial hardware and software companies began to emerge with solutions to this problem. Between 1976 and 1979, the concept for a new company, Teradata, grew out of research at the California Institute of Technology (Caltech), driven from discussions with Citibank's advanced technology group. Founders worked to design a database management system for parallel processing with multiple microprocessors, targeted specifically for decision support. Teradata was incorporated on July 13, 1979, and started in a garage in Brentwood, California. The name Teradata was chosen to symbolize the ability to manage terabytes (trillions of bytes) of data.
The 1980s were the decade of personal computers and minicomputers. Before anyone knew it, real computer applications were no longer only on mainframes; they were all over the place, everywhere you looked in an organization. That led to a portentous problem called islands of data. The solution to this problem led to a new type of software, called a distributed database management system, which would magically pull the requested data from databases across the organization, bring all the data back to the same place, and then consolidate it, sort it, and do whatever else was necessary to answer the user's question. Although the concept was a good one and early results from research were promising, the results were plain and simple: They just didn't work efficiently in the real world, and the islands-of-data problem still existed.
• 1970s: Mainframe computers; simple data entry; routine reporting; primitive database structures; centralized data storage; Teradata incorporated
• 1980s: Mini/personal computers (PCs); business applications for PCs; distributed DBMS; relational DBMS; Teradata ships commercial databases; "business data warehouse" coined
• 1990s: Data warehousing was born; Inmon, Building the Data Warehouse; Kimball, The Data Warehouse Toolkit; EDW architecture design
• 2000s: Exponentially growing Web data; consolidation of DW/BI industry; data warehouse appliances emerged; business intelligence popularized; data mining and predictive modeling; open source software; SaaS, PaaS, cloud computing
• 2010s: Big Data analytics; social media analytics; text and Web analytics; Hadoop, MapReduce, NoSQL; in-memory, in-database

FIGURE 3.1 A List of Events That Led to Data Warehousing Development.

Meanwhile, Teradata began shipping commercial products to solve this problem. Wells Fargo Bank received the first Teradata test system in 1983, a parallel RDBMS (relational database management system) for decision support, the world's first. By 1984, Teradata released a production version of their product, and in 1986, Fortune magazine named Teradata Product of the Year. Teradata, still in existence today, built the first data warehousing appliance, a combination of hardware and software to solve the data warehousing needs of many. Other companies began to formulate their strategies, as well. During this decade several other events happened, collectively making it the decade of data warehousing innovation. For instance, Ralph Kimball founded Red Brick Systems in 1986. Red Brick began to emerge as a visionary software company by discussing how to improve data access; in 1988, Barry Devlin and Paul Murphy of IBM Ireland introduced the term business data warehouse as a key component of business information systems.
In the 1990s a new approach to solving the islands-of-data problem surfaced. If the 1980s approach of reaching out and accessing data directly from the files and databases didn't work, the 1990s philosophy involved going back to the 1970s method, in which data from those places was copied to another location, only doing it right this time; hence, data warehousing was born. In 1993, Bill Inmon wrote the seminal book Building the Data Warehouse. Many people recognize Bill as the father of data warehousing. Additional publications emerged, including the 1996 book by Ralph Kimball, The Data Warehouse Toolkit, which discussed general-purpose dimensional design techniques to improve the data architecture for query-centered decision support systems.
In the 2000s, in the world of data warehousing, both popularity and the amount of data continued to grow. The vendor community and options have begun to consolidate. In 2006, Microsoft acquired ProClarity, jumping into the data warehousing market. In 2007, Oracle purchased Hyperion, SAP acquired Business Objects, and IBM merged with Cognos. The data warehousing leaders of the 1990s have been swallowed by some of the largest providers of information system solutions in the world. During this time, other innovations have emerged, including data warehouse appliances from vendors such as Netezza (acquired by IBM), Greenplum (acquired by EMC), DATAllegro (acquired by Microsoft), and performance management appliances that enable real-time performance monitoring. These innovative solutions provided cost savings because they were plug-compatible with legacy data warehouse solutions.

In the 2010s the big buzz has been Big Data. Many believe that Big Data is going to make an impact on data warehousing as we know it. Either they will find a way to coexist (which seems to be the most likely case, at least for several years) or Big Data (and the technologies that come with it) will make traditional data warehousing obsolete. The technologies that came with Big Data include Hadoop, MapReduce, NoSQL, Hive, and so forth. Maybe we will see a new term coined in the world of data that combines the needs and capabilities of traditional data warehousing and the Big Data phenomenon.
Characteristics of Data Warehousing

A common way of introducing data warehousing is to refer to its fundamental characteristics (see Inmon, 2005):

• Subject oriented. Data are organized by detailed subject, such as sales, products, or customers, containing only information relevant for decision support. Subject orientation enables users to determine not only how their business is performing, but why. A data warehouse differs from an operational database in that most operational databases have a product orientation and are tuned to handle transactions that update the database. Subject orientation provides a more comprehensive view of the organization.
• Integrated. Integration is closely related to subject orientation. Data warehouses must place data from different sources into a consistent format. To do so, they must deal with naming conflicts and discrepancies among units of measure. A data warehouse is presumed to be totally integrated.
• Time variant (time series). A warehouse maintains historical data. The data
do not necessarily provide current status (except in real-time systems). They detect
trends, deviations, and long-term relationships for forecasting and comparisons, lead-
ing to decision making. Every data warehouse has a temporal quality. Time is the one
important dimension that all data warehouses must support. Data for analysis from
multiple sources contains multiple time points (e.g., daily, weekly, monthly views).
• Nonvolatile. After data are entered into a data warehouse, users cannot change or
update the data. Obsolete data are discarded, and changes are recorded as new data.
These characteristics enable data warehouses to be tuned almost exclusively for data
access. Some additional characteristics may include the following:
• Web based. Data warehouses are typically designed to provide an efficient
computing environment for Web-based applications.
• Relational/multidimensional. A data warehouse uses either a relational structure
or a multidimensional structure. A recent survey on multidimensional structures
can be found in Romero and Abelló (2009).
• Client/server. A data warehouse uses the client/server architecture to provide
easy access for end users.
• Real time. Newer data warehouses provide real-time, or active, data-access and
analysis capabilities (see Basu, 2003; and Bonde and Kuckuk, 2004).
• Include metadata. A data warehouse contains metadata (data about data) about
how the data are organized and how to effectively use them.
Whereas a data warehouse is a repository of data, data warehousing is literally the
entire process (see Watson, 2002). Data warehousing is a discipline that results in applications
that provide decision support capability, allows ready access to business information,
and creates business insight. The three main types of data warehouses are data
marts, operational data stores (ODS), and enterprise data warehouses (EDW). In addition
to discussing these three types of warehouses next, we also discuss metadata.
Data Marts
Whereas a data warehouse combines databases across an entire enterprise, a data mart
is usually smaller and focuses on a particular subject or department. A data mart is a
subset of a data warehouse, typically consisting of a single subject area (e.g., marketing,
operations). A data mart can be either dependent or independent. A dependent data
mart is a subset that is created directly from the data warehouse. It has the advantages
of using a consistent data model and providing quality data. Dependent data marts support
the concept of a single enterprise-wide data model, but the data warehouse must be
constructed first. A dependent data mart ensures that the end user is viewing the same
version of the data that is accessed by all other data warehouse users. The high cost of
data warehouses limits their use to large companies. As an alternative, many firms use a
lower-cost, scaled-down version of a data warehouse referred to as an independent data
mart. An independent data mart is a small warehouse designed for a strategic business
unit (SBU) or a department, but its source is not an EDW.
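To make the idea of a dependent data mart concrete, the following minimal Python sketch carves a single-subject-area mart out of an enterprise warehouse using an in-memory SQLite database. All table names, columns, and data here are hypothetical illustrations, not from the text.

```python
import sqlite3

# Illustrative sketch: a dependent data mart as a consistent subset of an EDW.
# All table/column names and data are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# The enterprise data warehouse holds all subject areas.
cur.execute("""CREATE TABLE edw_sales (
    sale_id INTEGER, region TEXT, product TEXT, amount REAL)""")
cur.executemany(
    "INSERT INTO edw_sales VALUES (?, ?, ?, ?)",
    [(1, "East", "Widget", 100.0),
     (2, "West", "Widget", 250.0),
     (3, "East", "Gadget", 75.0)])

# A dependent data mart is created directly from the warehouse, so its data
# model and data quality stay consistent with what all other users see.
cur.execute("""CREATE TABLE mart_marketing AS
    SELECT region, product, SUM(amount) AS total_sales
    FROM edw_sales
    WHERE region = 'East'
    GROUP BY region, product""")

for row in cur.execute("SELECT * FROM mart_marketing ORDER BY product"):
    print(row)
```

An independent data mart, by contrast, would be loaded from its own source systems rather than from `edw_sales`, which is exactly why its definitions can drift from the enterprise model.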
Operational Data Stores
An operational data store (ODS) provides a fairly recent form of customer information
file (CIF). This type of database is often used as an interim staging area for a data warehouse.
Unlike the static contents of a data warehouse, the contents of an ODS are updated
throughout the course of business operations. An ODS is used for short-term decisions

Chapter 3 • Data Warehousing 85
involving mission-critical applications rather than for the medium- and long-term decisions
associated with an EDW. An ODS is similar to short-term memory in that it stores only very
recent information. In comparison, a data warehouse is like long-term memory because it
stores permanent information. An ODS consolidates data from multiple source systems and
provides a near-real-time, integrated view of volatile, current data. The extraction, transformation,
and load (ETL) processes (discussed later in this chapter) for an ODS are identical to those
for a data warehouse. Finally, oper marts (see Imhoff, 2001) are created when operational
data needs to be analyzed multidimensionally. The data for an oper mart come from an ODS.
Enterprise Data Warehouses (EDW)
An enterprise data warehouse (EDW) is a large-scale data warehouse that is used
across the enterprise for decision support. It is the type of data warehouse that Isle of
Capri developed, as described in the opening vignette. The large-scale nature provides
integration of data from many sources into a standard format for effective BI and decision
support applications. EDW are used to provide data for many types of DSS, including
CRM, supply chain management (SCM), business performance management (BPM), business
activity monitoring (BAM), product life-cycle management (PLM), revenue management,
and sometimes even knowledge management systems (KMS). Application Case 3.1
shows the variety of benefits that telecommunication companies leverage from implementing
data warehouse-driven analytics solutions.
Metadata
Metadata are data about data (e.g., see Sen, 2004; and Zhao, 2005). Metadata describe
the structure of and some meaning about data, thereby contributing to their effective or
Application Case 3.1
A Better Data Plan: Well-Established TELCOs Leverage Data Warehousing and Analytics
to Stay on Top in a Competitive Industry
Mobile service providers (i.e., Telecommunication
Companies, or TELCOs in short) that helped trigger
the explosive growth of the industry in the mid- to
late-1990s have long reaped the benefits of being first
to market. But to stay competitive, these companies
must continuously refine everything from customer
service to plan pricing. In fact, veteran carriers face
many of the same challenges that up-and-coming
carriers do: retaining customers, decreasing costs,
fine-tuning pricing models, improving customer satisfaction,
acquiring new customers, and understanding
the role of social media in customer loyalty.
Customer Retention
It's no secret that the speed and success with which
a provider handles service requests directly affects
customer satisfaction and, in turn, the propensity to
churn. But getting down to which factors have the
greatest impact is a challenge.
Highly targeted data analytics play an ever-more-critical
role in helping carriers secure or
improve their standing in an increasingly competitive
marketplace. Here's how some of the world's
leading providers are creating a strong future based
on solid business and customer intelligence.
"If we could trace the steps involved with each
process, we could understand points of failure and
acceleration," notes Roxanne Garcia, manager of
the Commercial Operations Center for Telefonica
de Argentina. "We could measure workflows both
within and across functions, anticipate rather than
react to performance indicators, and improve the
overall satisfaction with onboarding new customers."
The company's solution was its traceability project,
which began with 10 dashboards in 2009. It has
since realized US$2.4 million in annualized revenues

and cost savings, shortened customer provisioning
times and reduced customer defections by 30%.
Cost Reduction
Staying ahead of the game in any industry depends,
in large part, on keeping costs in line. For France’s
Bouygues Telecom, cost reduction came in the form
of automation. Aladin, the company’s Teradata-based
marketing operations management system, auto-
mates marketing/communications collateral produc-
tion. It delivered more than US$1 million in savings
in a single year while tripling email campaign and
content production.
"The goal is to be more productive and responsive,
to simplify teamwork, [and] to standardize and
protect our expertise," notes Catherine Corrado, the
company's project lead and retail communications
manager. "[Aladin lets] team members focus on value-added
work by reducing low-value tasks. The end
result is more quality and more creative [output]."
An unintended but very welcome benefit of
Aladin is that other departments have been inspired to
begin deploying similar projects for everything from
call center support to product/offer launch processes.
Customer Acquisition
With market penetration near or above 100% in
many countries, thanks to consumers who own
multiple devices, the issue of new customer acquisition
is no small challenge. Pakistan's largest carrier,
Mobilink, also faces the difficulty of operating in a
market where 98% of users have a pre-paid plan that
requires regular purchases of additional minutes.
"Topping up, in particular, keeps the revenues
strong and is critical to our company's growth," says
Umer Afzal, senior manager, BI. "Previously we
lacked the ability to enhance this aspect of incremental
growth. Our sales information model gave us that
ability because it helped the distribution team plan
sales tactics based on smarter data-driven strategies
that keep our suppliers [of SIM cards, scratch cards
and electronic top-up capability] fully stocked.”
As a result, Mobilink has not only grown sub-
scriber recharges by 2% but also expanded new cus-
tomer acquisition by 4% and improved the profitability
of those sales by 4%.
Social Networking
The expanding use of social networks is changing
how many organizations approach everything
from customer service to sales and marketing. More
carriers are turning their attention to social networks
to better understand and influence customer
behavior.
Mobilink has initiated a social network analysis
project that will enable the company to explore
the concept of viral marketing and identify key
influencers who can act as brand ambassadors to
cross-sell products. Velcom is looking for similar
key influencers as well as low-value customers
whose social value can be leveraged to improve
existing relationships. Meanwhile, Swisscom is
looking to combine the social network aspect of
customer behavior with the rest of its analysis over
the next several months.
Rise to the Challenge
While each market presents its own unique challenges,
most mobile carriers spend a great deal of
time and resources creating, deploying and refining
plans to address each of the challenges outlined
here. The good news is that just as the industry and
mobile technology have expanded and improved
over the years, so also have the data analytics solutions
that have been created to meet these challenges
head on.
Sound data analysis uses existing customer,
business and market intelligence to predict and influence
future behaviors and outcomes. The end result
is a smarter, more agile and more successful approach
to gaining market share and improving profitability.
QUESTIONS FOR DISCUSSION
1. What are the main challenges for TELCOs?
2. How can data warehousing and data analytics
help TELCOs in overcoming their challenges?
3. Why do you think TELCOs are well suited to take
full advantage of data analytics?
Source: Teradata Magazine, Case Study by Colleen Marble, "A
Better Data Plan: Well-Established Telcos Leverage Analytics
to Stay on Top in a Competitive Industry,"
http://www.teradatamagazine.com/v13n01/Features/A-Better-Data-Plan/
(accessed September 2013).

ineffective use. Mehra (2005) indicated that few organizations really understand metadata,
and fewer understand how to design and implement a metadata strategy. Metadata are
generally defined in terms of usage as technical or business metadata. Pattern is another
way to view metadata. According to the pattern view, we can differentiate between syntactic
metadata (i.e., data describing the syntax of data), structural metadata (i.e., data
describing the structure of the data), and semantic metadata (i.e., data describing the
meaning of the data in a specific domain).
We next explain traditional metadata patterns and insights into how to implement
an effective metadata strategy via a holistic approach to enterprise metadata integration.
The approach includes ontology and metadata registries; enterprise information integration
(EII); extraction, transformation, and load (ETL); and service-oriented architectures (SOA).
Effectiveness, extensibility, reusability, interoperability, efficiency and performance, evolution,
entitlement, flexibility, segregation, user interface, versioning, versatility, and low maintenance
cost are some of the key requirements for building a successful metadata-driven enterprise.
According to Kassam (2002), business metadata comprise information that increases
our understanding of traditional (i.e., structured) data. The primary purpose of metadata
should be to provide context to the reported data; that is, it provides enriching information
that leads to the creation of knowledge. Business metadata, though difficult to provide
efficiently, release more of the potential of structured data. The context need not
be the same for all users. In many ways, metadata assist in the conversion of data and
information into knowledge. Metadata form a foundation for a metabusiness architecture
(see Bell, 2001). Tannenbaum (2002) described how to identify metadata requirements.
Vaduva and Vetterli (2001) provided an overview of metadata management for data warehousing.
Zhao (2005) described five levels of metadata management maturity: (1) ad
hoc, (2) discovered, (3) managed, (4) optimized, and (5) automated. These levels help in
understanding where an organization is in terms of how and how well it uses its metadata.
The design, creation, and use of metadata (descriptive or summary data about
data) and its accompanying standards may involve ethical issues. There are ethical
considerations involved in the collection and ownership of the information contained
in metadata, including privacy and intellectual property issues that arise in the design,
collection, and dissemination stages (for more, see Brody, 2003).
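The distinction between technical and business metadata discussed above can be illustrated with a small Python sketch that describes a single warehouse column. Every field name and value here is an assumption made for illustration only.

```python
# Illustrative sketch: one warehouse column described by the two usage-based
# metadata categories from the text. All names/values are hypothetical.
column_metadata = {
    "technical": {                       # how the data are stored (for IT)
        "table": "sales_fact",
        "column": "revenue_usd",
        "type": "DECIMAL(12,2)",         # syntactic: syntax/format of the data
        "source_system": "order_entry",  # structural: where it sits in the schema
    },
    "business": {                        # what the data mean (for users)
        "definition": "Recognized revenue in U.S. dollars, net of returns",
        "owner": "Finance",              # semantic: meaning in the business domain
        "refresh": "nightly ETL load",
    },
}

def describe(meta):
    """Combine technical and business metadata into a context line --
    the 'enriching information' business metadata are meant to provide."""
    t, b = meta["technical"], meta["business"]
    return f"{t['table']}.{t['column']} ({t['type']}): {b['definition']}"

print(describe(column_metadata))
```

The technical half alone tells an analyst nothing about whether returns are netted out; adding the business half is what turns the reported number into usable knowledge.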
SECTION 3.2 REVIEW QUESTIONS
1. What is a data warehouse?
2. How does a data warehouse differ from a database?
3. What is an ODS?
4. Differentiate among a data mart, an ODS, and an EDW.
5. Explain the importance of metadata.
3.3 DATA WAREHOUSING PROCESS OVERVIEW
Organizations, private and public, continuously collect data, information, and knowledge
at an increasingly accelerated rate and store them in computerized systems. Maintaining
and using these data and information becomes extremely complex, especially as
scalability issues arise. In addition, the number of users needing to access the information
continues to increase as a result of improved reliability and availability of network
access, especially the Internet. Working with multiple databases, either integrated in a
data warehouse or not, has become an extremely difficult task requiring considerable
expertise, but it can provide immense benefits far exceeding its cost. As an illustrative
example, Figure 3.2 shows business benefits of the enterprise data warehouse built by
Teradata for a major automobile manufacturer.

[Figure 3.2 depicts an enterprise data warehouse serving as one management and analytical
platform for product configuration, warranty, and diagnostic readout data. Its benefits include
reduced infrastructure expense (2/3 cost reduction through data mart consolidation), reduced
warranty expense (improved reimbursement accuracy through improved claim data quality),
improved cost of quality (faster identification, prioritization, and resolution of quality issues),
accurate environmental performance reporting, and IT architecture standardization (one
strategic platform for business intelligence and compliance reporting).]
FIGURE 3.2 Data-Driven Decision Making: Business Benefits of an Enterprise Data Warehouse.
Application Case 3.2
Data Warehousing Helps MultiCare Save More Lives
In the spring of 2012, leadership at MultiCare Health
System (MultiCare), a Tacoma, Washington-based
health system, realized the results of a 12-month
journey to reduce septicemia.
The effort was supported by the system's
top leadership, who participated in a data-driven
approach to prioritize care improvement based on
an analysis of resources consumed and variation in
care outcomes. Reducing septicemia (mortality rates)
was a top priority for MultiCare as a result of three
hospitals performing below, and one that was performing
well below, national mortality averages.
In September 2010, MultiCare implemented
Health Catalyst’s Adaptive Data Warehouse, a
healthcare-specific data model, and subsequent clin-
ical and process improvement services to measure
and effect care through organizational and process
improvements. Two major factors contributed to the
rapid reduction in septicemia mortality.
Clinical Data to Drive Improvement
The Adaptive Data Warehouse™ organized and simplified
data from multiple data sources across the
continuum of care. It became the single source of
truth requisite to see care improvement opportunities
and to measure change. It also proved to be an
important means to unify clinical, IT, and financial
leaders and to drive accountability for performance
improvement.
Because it proved difficult to define sepsis due
to the complex comorbidity factors leading to septicemia,
MultiCare partnered with Health Catalyst
to refine the clinical definition of sepsis. Health
Catalyst's data work allowed MultiCare to explore
around the boundaries of the definition and to ultimately
settle on an algorithm that defined a septic
patient. The iterative work resulted in increased confidence
in the severe sepsis cohort.
System-Wide Critical Care Collaborative
The establishment and collaborative efforts of permanent,
integrated teams consisting of clinicians,
technologists, analysts, and quality personnel were
essential for accelerating MultiCare's efforts to
reduce septicemia mortality. Together the collaborative
addressed three key bodies of work: standard
of care definition, early identification, and efficient
delivery of defined-care standard.
Standard of Care: Severe Sepsis Order Set
The Critical Care Collaborative streamlined several
sepsis order sets from across the organization into
one system-wide standard for the care of severely

septic patients. Adult patients presenting with sepsis
receive the same care, no matter at which MultiCare
hospital they present.
Early Identification: Modified Early
Warning System (MEWS)
MultiCare developed a modified early warning system
(MEWS) dashboard that leveraged the cohort
definition and the clinical EMR to quickly identify
patients who were trending toward a sudden downturn.
Hospital staff constantly monitor MEWS, which
serves as an early detection tool for caregivers to
provide preemptive interventions.
Efficient Delivery: Code Sepsis
("Time Is Tissue")
The final key piece of clinical work undertaken by
the Collaborative was to ensure timely implementation
of the defined standard of care to patients who
are more efficiently identified. That model already
exists in healthcare and is known as the "code" process.
Similar to other "code" processes (code trauma,
code neuro, code STEMI), code sepsis at MultiCare
is designed to bring together essential caregivers in
order to efficiently deliver time-sensitive, life-saving
treatments to the patient presenting with severe
sepsis.
In just 12 months, MultiCare was able to
reduce septicemia mortality rates by an average of
22 percent, leading to more than $1.3 million in
validated cost savings during that same period. The
sepsis cost reductions and quality of care improvements
have raised the expectation that similar
results can be realized in other areas of MultiCare,
including heart failure, emergency department
performance, and inpatient throughput.
QUESTIONS FOR DISCUSSION
1. What do you think is the role of data warehousing
in healthcare systems?
2. How did MultiCare use data warehousing to
improve health outcomes?
Source: healthcatalyst.com/success_stories/multicare-2
(accessed February 2013).
Many organizations need to create data warehouses: massive data stores of time-series
data for decision support. Data are imported from various external and internal
resources and are cleansed and organized in a manner consistent with the organization's
needs. After the data are populated in the data warehouse, data marts can be loaded for a
specific area or department. Alternatively, data marts can be created first, as needed, and
then integrated into an EDW. Often, though, data marts are not developed, but data are
simply loaded onto PCs or left in their original state for direct manipulation using BI tools.
In Figure 3.3, we show the data warehouse concept. The following are the major
components of the data warehousing process:
• Data sources. Data are sourced from multiple independent operational "legacy"
systems and possibly from external data providers (such as the U.S. Census). Data
may also come from an OLTP or ERP system. Web data in the form of Web logs may
also feed a data warehouse.
• Data extraction and transformation. Data are extracted and properly transformed
using custom-written or commercial software called ETL.
• Data loading. Data are loaded into a staging area, where they are transformed
and cleansed. The data are then ready to load into the data warehouse and/or data
marts.
• Comprehensive database. Essentially, this is the EDW to support all decision
analysis by providing relevant summarized and detailed information originating
from many different sources.
• Metadata. Metadata are maintained so that they can be assessed by IT personnel
and users. Metadata include software programs about data and rules for organizing
data summaries that are easy to index and search, especially with Web tools.

[Figure 3.3 shows the data warehouse framework: data sources feed an ETL process
(select, extract, transform, integrate, load), which populates an enterprise data warehouse
with its metadata; the warehouse replicates data to data marts (or, in the no-data-marts
option, is accessed directly) and serves access applications such as data/text mining, OLAP,
dashboards, visualization, and the Web.]
FIGURE 3.3 A Data Warehouse Framework and Views.
• Middleware tools. Middleware tools enable access to the data warehouse. Power
users such as analysts may write their own SQL queries. Others may employ a managed
query environment, such as Business Objects, to access data. There are many
front-end applications that business users can use to interact with data stored in the
data repositories, including data mining, OLAP, reporting tools, and data visualization
tools.
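The major components listed above can be sketched end to end in a few lines of Python. This is a minimal, hedged illustration of the process only: the source systems, field names, and data are all hypothetical, and a real pipeline would use commercial or custom ETL software as the text notes.

```python
# Minimal ETL sketch of the components above: extract from "legacy" sources,
# transform/cleanse in a staging step, load into the warehouse.
# All source names, fields, and data are hypothetical.

# Two operational sources with inconsistent naming and types.
crm_source = [{"cust": "ACME ", "rev": "1200"}, {"cust": "Globex", "rev": "800"}]
erp_source = [{"customer_name": "acme", "revenue_usd": 300}]

def extract():
    # Pull raw records from each independent source system.
    for rec in crm_source:
        yield {"name": rec["cust"], "revenue": rec["rev"]}
    for rec in erp_source:
        yield {"name": rec["customer_name"], "revenue": rec["revenue_usd"]}

def transform(records):
    # Staging area: resolve naming conflicts and type discrepancies so the
    # warehouse stays integrated and subject oriented.
    for rec in records:
        yield {"name": rec["name"].strip().title(),
               "revenue": float(rec["revenue"])}

warehouse = {}  # the comprehensive database, keyed by the cleansed subject

def load(records):
    # Accumulate cleansed records by subject (customer).
    for rec in records:
        warehouse[rec["name"]] = warehouse.get(rec["name"], 0.0) + rec["revenue"]

load(transform(extract()))
print(warehouse)  # {'Acme': 1500.0, 'Globex': 800.0}
```

Note how "ACME ", "acme", and their differently typed revenue fields are reconciled in the transform step; that reconciliation is exactly the integration work the chapter attributes to ETL.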
SECTION 3.3 REVIEW QUESTIONS
1. Describe the data warehousing process.
2. Describe the major components of a data warehouse.
3. Identify and discuss the role of middleware tools.
3.4 DATA WAREHOUSING ARCHITECTURES
There are several basic information system architectures that can be used for data warehousing.
Generally speaking, these architectures are commonly called client/server or
n-tier architectures, of which two-tier and three-tier architectures are the most common
(see Figures 3.4 and 3.5), but sometimes there is simply one tier. These types of multi-tiered
[Figure 3.4 shows three tiers: Tier 1, the client workstation; Tier 2, the application server;
and Tier 3, the database server.]
FIGURE 3.4 Architecture of a Three-Tier Data Warehouse.

[Figure 3.5 shows two tiers: Tier 1, the client workstation; and Tier 2, a combined
application and database server.]
FIGURE 3.5 Architecture of a Two-Tier Data Warehouse.
architectures are known to be capable of serving the needs of large-scale, performance-demanding
information systems such as data warehouses. Referring to the use of n-tiered
architectures for data warehousing, Hoffer et al. (2007) distinguished among these architectures
by dividing the data warehouse into three parts:
1. The data warehouse itself, which contains the data and associated software
2. Data acquisition (back-end) software, which extracts data from legacy systems and
external sources, consolidates and summarizes them, and loads them into the data
warehouse
3. Client (front-end) software, which allows users to access and analyze data from the
warehouse (a DSS/BI/business analytics [BA] engine)
In a three-tier architecture, operational systems contain the data and the software for
data acquisition in one tier (i.e., the server), the data warehouse is another tier, and the
third tier includes the DSS/BI/BA engine (i.e., the application server) and the client (see
Figure 3.4). Data from the warehouse are processed twice and deposited in an additional
multidimensional database, organized for easy multidimensional analysis and presentation,
or replicated in data marts. The advantage of the three-tier architecture is its separation
of the functions of the data warehouse, which eliminates resource constraints and
makes it possible to easily create data marts.
In a two-tier architecture, the DSS engine physically runs on the same hardware
platform as the data warehouse (see Figure 3.5). Therefore, it is more economical than
the three-tier structure. The two-tier architecture can have performance problems for large
data warehouses that work with data-intensive applications for decision support.
Much of the common wisdom assumes an absolutist approach, maintaining that
one solution is better than the other, despite the organization's circumstances and unique
needs. To further complicate these architectural decisions, many consultants and software
vendors focus on one portion of the architecture, therefore limiting their capacity and
motivation to assist an organization through the options based on its needs. But these
aspects are being questioned and analyzed. For example, Ball (2005) provided decision
criteria for organizations that plan to implement a BI application and have already
determined their need for multidimensional data marts but need help determining the
appropriate tiered architecture. His criteria revolve around forecasting needs for space
and speed of access (see Ball, 2005, for details).
Data warehousing and the Internet are two key technologies that offer important
solutions for managing corporate data. The integration of these two technologies produces
Web-based data warehousing. In Figure 3.6, we show the architecture of Web-based
data warehousing. The architecture is three tiered and includes the PC client, Web
server, and application server. On the client side, the user needs an Internet connection
and a Web browser (preferably Java enabled) through the familiar graphical user interface
(GUI). The Internet/intranet/extranet is the communication medium between client
[Figure 3.6 shows the client (Web browser) communicating over the Internet/intranet/extranet
with a Web server, which serves Web pages and is backed by an application server and a
data warehouse.]
FIGURE 3.6 Architecture of Web-Based Data Warehousing.
and servers. On the server side, a Web server is used to manage the inflow and outflow
of information between client and server. It is backed by both a data warehouse and an
application server. Web-based data warehousing offers several compelling advantages,
including ease of access, platform independence, and lower cost.
The Vanguard Group moved to a Web-based, three-tier architecture for its enterprise
architecture to integrate all its data and provide customers with the same views of data
as internal users (Dragoon, 2003). Likewise, Hilton migrated all its independent client/server
systems to a three-tier data warehouse, using a Web design enterprise system. This
change involved an investment of $3.8 million (excluding labor) and affected 1,500 users.
It increased processing efficiency (speed) by a factor of six. When it was deployed, Hilton
expected to save $4.5 to $5 million annually. Finally, Hilton experimented with Dell's clustering
(i.e., parallel computing) technology to enhance scalability and speed (see Anthes,
2003).
Web architectures for data warehousing are similar in structure to other data warehousing
architectures, requiring a design choice for housing the Web data warehouse
with the transaction server or as a separate server(s). Page-loading speed is an important
consideration in designing Web-based applications; therefore, server capacity must be
planned carefully.
Several issues must be considered when deciding which architecture to use. Among
them are the following:
• Which database management system (DBMS) should be used? Most data
warehouses are built using relational database management systems (RDBMS). Oracle
(Oracle Corporation, oracle.com), SQL Server (Microsoft Corporation, microsoft.com/sql),
and DB2 (IBM Corporation, http://www-01.ibm.com/software/data/db2/)
are the ones most commonly used. Each of these products supports both
client/server and Web-based architectures.
• Will parallel processing and/or partitioning be used? Parallel processing
enables multiple CPUs to process data warehouse query requests simultaneously
and provides scalability. Data warehouse designers need to decide whether the database
tables will be partitioned (i.e., split into smaller tables) for access efficiency and
what the criteria will be. This is an important consideration that is necessitated by

the large amounts of data contained in a typical data warehouse. A recent survey on
parallel and distributed data warehouses can be found in Furtado (2009). Teradata
(teradata.com) has successfully adopted, and is often commended for, its novel implementation
of this approach.
• Will data migration tools be used to load the data warehouse? Moving
data from an existing system into a data warehouse is a tedious and laborious task.
Depending on the diversity and the location of the data assets, migration may be
a relatively simple procedure or (in contrast) a months-long project. The results
of a thorough assessment of the existing data assets should be used to determine
whether to use migration tools and, if so, what capabilities to seek in those commercial
tools.
• What tools will be used to support data retrieval and analysis? Often it
is necessary to use specialized tools to periodically locate, access, analyze, extract,
transform, and load necessary data into a data warehouse. A decision has to be
made on (1) developing the migration tools in-house, (2) purchasing them from a
third-party provider, or (3) using the ones provided with the data warehouse system.
Overly complex, real-time migrations warrant specialized third-party ETL tools.
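The partitioning decision above can be made concrete with a small sketch. The following Python fragment mimics horizontal partitioning of a large fact table into smaller per-year tables so that queries scan only the partition they need; the criterion (year), field names, and data are illustrative assumptions, and a real DBMS would handle this declaratively.

```python
from datetime import date

# Hedged sketch of horizontal partitioning: one large fact table split into
# smaller tables by a chosen criterion (here, the sale year).
# All field names and data are hypothetical.
partitions = {}  # year -> list of rows in that partition

def insert(row):
    # Route each row to the partition matching its sale date's year.
    partitions.setdefault(row["sale_date"].year, []).append(row)

def query_year(year):
    # "Partition pruning": scan only the one relevant smaller table
    # instead of the full fact table.
    return partitions.get(year, [])

for r in [{"sale_date": date(2012, 3, 1), "amount": 100},
          {"sale_date": date(2013, 7, 9), "amount": 250},
          {"sale_date": date(2013, 1, 2), "amount": 75}]:
    insert(r)

print(sorted(partitions))     # partitions created: [2012, 2013]
print(len(query_year(2013)))  # rows scanned for a 2013 query: 2
```

The access-efficiency argument in the bullet is visible even in this toy: a query for 2013 touches two rows rather than all three, and the saving grows with the size of the warehouse.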
Alternative Data Warehousing Architectures
At the highest level, data warehouse architecture design viewpoints can be categorized
into enterprise-wide data warehouse (EDW) design and data mart (DM) design (Golfarelli
and Rizzi, 2009). In Figure 3.7 (parts a-e), we show some alternatives to the basic architectural
design types that are neither pure EDW nor pure DM, but in between or beyond
the traditional architectural structures. Notable new ones include hub-and-spoke and
federated architectures. The five architectures shown in Figure 3.7 (parts a-e) are proposed
by Ariyachandra and Watson (2005, 2006a, and 2006b). Previously, in an extensive
study, Sen and Sinha (2005) identified 15 different data warehousing methodologies. The
sources of these methodologies are classified into three broad categories: core-technology
vendors, infrastructure vendors, and information-modeling companies.
a. Independent data marts. This is arguably the simplest and the least costly architecture
alternative. The data marts are developed to operate independently of each
other to serve the needs of individual organizational units. Because of their independence,
they may have inconsistent data definitions and different dimensions and
measures, making it difficult to analyze data across the data marts (i.e., it is difficult,
if not impossible, to get to the "one version of the truth").
b. Data mart bus architecture. This architecture is a viable alternative to the independent
data marts where the individual marts are linked to each other via some
kind of middleware. Because the data are linked among the individual marts, there
is a better chance of maintaining data consistency across the enterprise (at least at
the metadata level). Even though it allows for complex data queries across data
marts, the performance of these types of analysis may not be at a satisfactory level.
c. Hub-and-spoke architecture. This is perhaps the most famous data warehousing
architecture today. Here the attention is focused on building a scalable and
maintainable infrastructure (often developed in an iterative way, subject area by
subject area) that includes a centralized data warehouse and several dependent data
marts (each for an organizational unit). This architecture allows for easy customization
of user interfaces and reports. On the negative side, this architecture lacks the
holistic enterprise view, and may lead to data redundancy and data latency.
d. Centralized data warehouse. The centralized data warehouse architecture is
similar to the hub-and-spoke architecture except that there are no dependent data
marts; instead, there is a gigantic enterprise data warehouse that serves the needs

94 Part II • Descriptive Analytics
[Figure 3.7 diagrams five alternatives, each flowing from source systems through ETL and a staging area to end-user access and applications: (a) Independent Data Marts Architecture (independent data marts holding atomic/summarized data); (b) Data Mart Bus Architecture with Linked Dimensional Data Marts (dimensionalized data marts linked by conformed dimensions, atomic/summarized data); (c) Hub-and-Spoke Architecture (Corporate Information Factory: a normalized relational warehouse of atomic data feeding dependent data marts with summarized/some atomic data); (d) Centralized Data Warehouse Architecture (a normalized relational warehouse with atomic/some summarized data); (e) Federated Architecture (existing data warehouses, data marts, and legacy systems combined through data mapping/metadata and logical/physical integration of common data elements).]
FIGURE 3.7 Alternative Data Warehouse Architectures. Source: Adapted from T. Ariyachandra and
H. Watson, “Which Data Warehouse Architecture Is Most Successful?” Business Intelligence Journal,
Vol. 11, No. 1, First Quarter, 2006, pp. 4–6.
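The conformed dimensions that link the marts in Figure 3.7b are what make cross-mart analysis possible: the marts share dimension tables with identical keys and meanings. A minimal sketch using Python's built-in sqlite3 module (the table and column names are invented for illustration, not taken from the text):

```python
import sqlite3

# Two marts share one conformed product dimension, so a query can
# combine their facts on a common key: the essence of the bus architecture.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product  (product_key INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sales_mart   (product_key INTEGER, revenue REAL);
CREATE TABLE returns_mart (product_key INTEGER, refunds REAL);
INSERT INTO dim_product  VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO sales_mart   VALUES (1, 500.0), (2, 300.0);
INSERT INTO returns_mart VALUES (1, 50.0);
""")
# Cross-mart analysis works only because both marts conform
# to the same dimension table (same keys, same meanings).
rows = conn.execute("""
    SELECT d.name, s.revenue, COALESCE(r.refunds, 0) AS refunds
    FROM dim_product d
    JOIN sales_mart s       ON s.product_key = d.product_key
    LEFT JOIN returns_mart r ON r.product_key = d.product_key
    ORDER BY d.name
""").fetchall()
for name, revenue, refunds in rows:
    print(name, revenue, refunds)
```

With independent data marts (Figure 3.7a), the two marts would typically carry their own, inconsistent product keys, and this join would be impossible without a reconciliation step.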
of all organizational units. This centralized approach provides users with access to
all data in the data warehouse instead of limiting them to data marts. In addition,
it reduces the amount of data the technical team has to transfer or change, therefore
simplifying data management and administration. If designed and implemented
properly, this architecture provides a timely and holistic view of the enterprise to

Chapter 3 • Data Warehousing 95
whomever, whenever, and wherever they may be within the organization. The centralized data warehouse architecture, which is advocated mainly by Teradata Corp., advises using data warehouses without any data marts (see Figure 3.8).

[FIGURE 3.8 Teradata Corporation's Enterprise Data Warehouse: transactional users and transactional data feed a data transformation layer that loads the operational data store (ODS) and the "enterprise" data warehouse; data replication then populates data marts serving decision users, including strategic users, tactical users, reporting/OLAP users, data miners, and event-driven/closed-loop applications. Source: Teradata Corporation (teradata.com). Used with permission.]

e. Federated data warehouse. The federated approach is a concession to the natural forces that undermine the best plans for developing a perfect system. It uses all possible means to integrate analytical resources from multiple sources to meet changing needs or business conditions. Essentially, the federated approach involves integrating disparate systems. In a federated architecture, existing decision support structures are left in place, and data are accessed from those sources as needed. The federated approach is supported by middleware vendors that propose distributed query and join capabilities. These eXtensible Markup Language (XML)-based tools offer users a global view of distributed data sources, including data warehouses, data marts, Web sites, documents, and operational systems. When users choose query objects from this view and press the submit button, the tool automatically queries the distributed sources, joins the results, and presents them to the user. Because of performance and data quality issues, most experts agree that federated approaches work well to supplement data warehouses, not replace them (see Eckerson, 2005).

Ariyachandra and Watson (2005) identified 10 factors that potentially affect the architecture selection decision:

1.
Information interdependence between organizational units
2. Upper management's information needs
3. Urgency of need for a data warehouse
4. Nature of end-user tasks
5. Constraints on resources
6. Strategic view of the data warehouse prior to implementation
7. Compatibility with existing systems
8. Perceived ability of the in-house IT staff
9. Technical issues
10. Social/political factors

These factors are similar to many success factors described in the literature for information systems projects and DSS and BI projects. Technical issues, beyond providing technology that is feasibly ready for use, are important, but often not as important as behavioral issues, such as meeting upper management's information needs and user involvement in the development process (a social/political factor).

Each data warehousing architecture has specific applications for which it is most (and least) effective and thus provides maximal benefits to the organization. However, overall, the data mart structure seems to be the least effective in practice. See Ariyachandra and Watson (2006a) for some additional details.

Which Architecture Is the Best?

Ever since data warehousing became a critical part of modern enterprises, the question of which data warehouse architecture is the best has been a topic of regular discussion. The two gurus of the data warehousing field, Bill Inmon and Ralph Kimball, are at the heart of this discussion. Inmon advocates the hub-and-spoke architecture (e.g., the Corporate Information Factory), whereas Kimball promotes the data mart bus architecture with conformed dimensions. Other architectures are possible, but these two options are fundamentally different approaches, and each has strong advocates. To shed light on this controversial question, Ariyachandra and Watson (2006b) conducted an empirical study.
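Before turning to the survey results, note that a choice among these architectures is often formalized by weighting factors such as the 10 above in a simple scoring matrix. A hypothetical sketch (the weights, candidate scores, and factor subset are invented purely for illustration):

```python
# Hypothetical weighted-scoring sketch for architecture selection.
# Weights and candidate scores are invented; a real study would elicit
# them from stakeholders for all 10 factors.
weights = {
    "info_interdependence": 0.3,
    "mgmt_info_needs": 0.25,
    "urgency": 0.2,
    "resource_constraints": 0.25,
}
candidates = {
    "hub_and_spoke": {"info_interdependence": 5, "mgmt_info_needs": 5,
                      "urgency": 2, "resource_constraints": 2},
    "bus":           {"info_interdependence": 4, "mgmt_info_needs": 4,
                      "urgency": 4, "resource_constraints": 4},
    "independent_marts": {"info_interdependence": 1, "mgmt_info_needs": 2,
                          "urgency": 5, "resource_constraints": 5},
}

def total(scores):
    # Weighted sum of a candidate's factor scores.
    return sum(weights[f] * s for f, s in scores.items())

best = max(candidates, key=lambda name: total(candidates[name]))
print(best)
```

Under these made-up numbers the bus architecture wins; shifting weight toward information interdependence and upper management's needs tips the decision toward hub-and-spoke, which mirrors the trade-offs the 10 factors are meant to capture.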
To collect the data, they used a Web-based survey targeted at individuals involved in data warehouse implementations. Their survey included questions about the respondent, the respondent's company, the company's data warehouse, and the success of the data warehouse architecture. In total, 454 respondents provided usable information. Surveyed companies ranged from small (less than $10 million in revenue) to large (in excess of $10 billion). Most of the companies were located in the United States (60%) and represented a variety of industries, with the financial services industry (15%) providing the most responses. The predominant architecture was the hub-and-spoke architecture (39%), followed by the bus architecture (26%), the centralized architecture (17%), independent data marts (12%), and the federated architecture (4%). The most common platform for hosting the data warehouses was Oracle (41%), followed by Microsoft (19%) and IBM (18%). The average (mean) gross revenue varied from $3.7 billion for independent data marts to $6 billion for the federated architecture.

They used four measures to assess the success of the architectures: (1) information quality, (2) system quality, (3) individual impacts, and (4) organizational impacts. The questions used a seven-point scale, with the higher score indicating a more successful architecture. Table 3.1 shows the average scores for the measures across the architectures.

As the results of the study indicate, independent data marts scored the lowest on all measures. This finding confirms the conventional wisdom that independent data marts are a poor architectural solution. Next lowest on all measures was the federated architecture. Firms sometimes have disparate decision support platforms resulting from mergers and acquisitions, and they may choose a federated approach, at least in the short run. The findings suggest that the federated architecture is not an optimal long-term solution.
What is interesting, however, is the similarity of the averages for the bus, hub-and-spoke, and centralized architectures. The differences are sufficiently small that no claims can be made for a particular architecture's superiority over the others, at least based on a simple comparison of these success measures.

TABLE 3.1 Average Assessment Scores for the Success of the Architectures

Measure                | Independent Data Marts | Bus Architecture | Hub-and-Spoke Architecture | Centralized (No Dependent Data Marts) | Federated Architecture
Information Quality    | 4.42 | 5.16 | 5.35 | 5.23 | 4.73
System Quality         | 4.59 | 5.60 | 5.56 | 5.41 | 4.69
Individual Impacts     | 5.08 | 5.80 | 5.62 | 5.64 | 5.15
Organizational Impacts | 4.66 | 5.34 | 5.24 | 5.30 | 4.77

They also collected data on the domain (e.g., varying from a subunit to company-wide) and the size (i.e., amount of data stored) of the warehouses. They found that the hub-and-spoke architecture is typically used with more enterprise-wide implementations and larger warehouses. They also investigated the cost and time required to implement the different architectures. Overall, the hub-and-spoke architecture was the most expensive and time-consuming to implement.

SECTION 3.4 REVIEW QUESTIONS
1. What are the key similarities and differences between a two-tiered architecture and a three-tiered architecture?
2. How has the Web influenced data warehouse design?
3. List the alternative data warehousing architectures discussed in this section.
4. What issues should be considered when deciding which architecture to use in developing a data warehouse? List the 10 most important factors.
5. Which data warehousing architecture is the best? Why?

3.5 DATA INTEGRATION AND THE EXTRACTION, TRANSFORMATION, AND LOAD (ETL) PROCESSES

Global competitive pressures, demand for return on investment (ROI), management and investor inquiry, and government regulations are forcing business managers to rethink how they integrate and manage their businesses. A decision maker typically needs access to multiple sources of data that must be integrated. Before data warehouses, data marts, and BI software, providing access to data sources was a major, laborious process. Even with modern Web-based data management tools, recognizing what data to access and providing them to the decision maker is a nontrivial task that requires database specialists. As data warehouses grow in size, the issues of integrating data grow as well.

The business analysis needs continue to evolve. Mergers and acquisitions, regulatory requirements, and the introduction of new channels can drive changes in BI requirements. In addition to historical, cleansed, consolidated, and point-in-time data, business users increasingly demand access to real-time, unstructured, and/or remote data. And everything must be integrated with the contents of an existing data warehouse. Moreover, access via PDAs and through speech recognition and synthesis is becoming more commonplace, further complicating integration issues (Edwards, 2003). Many integration projects involve enterprise-wide systems. Orovic (2003) provided a checklist of what works and what does not work when attempting such a project. Properly integrating data from various databases and other disparate sources is difficult. But when it is not done properly, it can lead to disaster in enterprise-wide systems such as CRM, ERP, and supply chain projects (Nash, 2002).

Data Integration

Data integration comprises three major processes that, when correctly implemented, permit data to be accessed and made accessible to an array of ETL and analysis tools and the data warehousing environment: data access (i.e., the ability to access and extract data from any data source), data federation (i.e.
, the integration of business views across multiple data stores), and change capture (based on the identification, capture, and delivery of the changes made to enterprise data sources). See Application Case 3.3 for an example of how BP Lubricants benefits from implementing a data warehouse that integrates data

Application Case 3.3
BP Lubricants Achieves BIGS Success

BP Lubricants established the BIGS program following recent merger activity to deliver globally consistent and transparent management information. As well as timely business intelligence, BIGS provides detailed, consistent views of performance across functions such as finance, marketing, sales, and supply and logistics.

BP is one of the world's largest oil and petrochemicals groups. Part of the BP plc group, BP Lubricants is an established leader in the global automotive lubricants market. Perhaps best known for its Castrol brand of oils, the business operates in over 100 countries and employs 10,000 people. Strategically, BP Lubricants is concentrating on further improving its customer focus and increasing its effectiveness in automotive markets. Following recent merger activity, the company is undergoing transformation to become more effective and agile and to seize opportunities for rapid growth.

Challenge

Following recent merger activity, BP Lubricants wanted to improve the consistency, transparency, and accessibility of management information and business intelligence. In order to do so, it needed to integrate data held in disparate source systems, without the delay of introducing a standardized ERP system.

Solution

BP Lubricants implemented the pilot for its Business Intelligence and Global Standards (BIGS) program, a strategic initiative for management information and business intelligence. At the heart of BIGS is Kalido, an adaptive enterprise data warehousing solution for preparing, implementing, operating, and managing data warehouses.
Kalido's federated enterprise data warehousing solution supported the pilot program's complex data integration and diverse reporting requirements. To adapt to the program's evolving reporting requirements, the software also enabled the underlying information architecture to be easily modified at high speed while preserving all information. The system integrates and stores information from multiple source systems to provide consolidated views for:

• Marketing. Customer proceeds and margins for market segments with drill down to invoice-level detail
• Sales. Sales invoice reporting augmented with both detailed tariff costs and actual payments
• Finance. Globally standard profit and loss, balance sheet, and cash flow statements with auditability; customer debt management; supply and logistics; consolidated view of order and movement processing across multiple ERP platforms

Benefits

By improving the visibility of consistent, timely data, BIGS provides the information needed to assist the business in identifying a multitude of business opportunities to maximize margins and/or manage associated costs. Typical responses to the benefits of consistent data resulting from the BIGS pilot include:

• Improved consistency and transparency of business data
• Easier, faster, and more flexible reporting
• Accommodation of both global and local standards
• Fast, cost-effective, and flexible implementation cycle
• Minimal disruption of existing business processes and the day-to-day business
• Identifies data quality issues and encourages their resolution
• Improved ability to respond intelligently to new business opportunities

QUESTIONS FOR DISCUSSION
1. What is BIGS at BP Lubricants?
2. What were the challenges, the proposed solution, and the obtained results with BIGS?
Sources: Kalido, "BP Lubricants Achieves BIGS, Key IT Solutions," http://www.kalido.com/customer-stories/bp-plc.htm (accessed August 2013); Kalido, "BP Lubricants Achieves BIGS Success," kalido.com/collateral/Documents/English-US/CS-BP%20BIGS (accessed August 2013); and BP Lubricants home page, bp.com/lubricanthome.do (accessed August 2013).

from many sources. Some vendors, such as SAS Institute, Inc., have developed strong data integration tools. The SAS enterprise data integration server includes customer data integration tools that improve data quality in the integration process. The Oracle Business Intelligence Suite assists in integrating data as well.

A major purpose of a data warehouse is to integrate data from multiple systems. Various integration technologies enable data and metadata integration:

• Enterprise application integration (EAI)
• Service-oriented architecture (SOA)
• Enterprise information integration (EII)
• Extraction, transformation, and load (ETL)

Enterprise application integration (EAI) provides a vehicle for pushing data from source systems into the data warehouse. It involves integrating application functionality and is focused on sharing functionality (rather than data) across systems, thereby enabling flexibility and reuse. Traditionally, EAI solutions have focused on enabling application reuse at the application programming interface (API) level. Recently, EAI is accomplished by using SOA coarse-grained services (a collection of business processes or functions) that are well defined and documented. Using Web services is a specialized way of implementing an SOA. EAI can be used to facilitate data acquisition directly into a near-real-time data warehouse or to deliver decisions to the OLTP systems. There are many different approaches to and tools for EAI implementation.
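The push style of integration that EAI enables can be pictured as source applications publishing change events that a warehouse-side consumer applies as they arrive; a minimal, library-free sketch (the event shape and function names are invented for illustration):

```python
from queue import Queue

# Invented event shape: source table, primary key, and changed columns.
events = Queue()

def publish_change(table, key, changes):
    """A source application pushes a change event (EAI style) instead of
    waiting for a periodic batch extract to pick the change up."""
    events.put({"table": table, "key": key, "changes": changes})

warehouse = {}  # stand-in for near-real-time warehouse storage

def apply_events():
    """The warehouse-side consumer applies queued events as they arrive."""
    while not events.empty():
        e = events.get()
        row = warehouse.setdefault((e["table"], e["key"]), {})
        row.update(e["changes"])

publish_change("customer", 42, {"city": "Tulsa"})
publish_change("customer", 42, {"status": "gold"})
apply_events()
print(warehouse[("customer", 42)])
```

In a real deployment the queue would be a message broker or a Web-service endpoint, but the division of labor is the same: sources push, the warehouse side consumes.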
Enterprise information integration (EII) is an evolving tool space that promises real-time data integration from a variety of sources, such as relational databases, Web services, and multidimensional databases. It is a mechanism for pulling data from source systems to satisfy a request for information. EII tools use predefined metadata to populate views that make integrated data appear relational to end users. XML may be the most important aspect of EII because XML allows data to be tagged either at creation time or later. These tags can be extended and modified to accommodate almost any area of knowledge (see Kay, 2005).

Physical data integration has conventionally been the main mechanism for creating an integrated view with data warehouses and data marts. With the advent of EII tools (see Kay, 2005), new virtual data integration patterns are feasible. Manglik and Mehra (2005) discussed the benefits and constraints of new data integration patterns that can expand traditional physical methodologies to present a comprehensive view for the enterprise.

We next turn to the approach for loading data into the warehouse: ETL.

Extraction, Transformation, and Load

At the heart of the technical side of the data warehousing process is extraction, transformation, and load (ETL). ETL technologies, which have existed for some time, are instrumental in the process and use of data warehouses. The ETL process is an integral component in any data-centric project. IT managers are often faced with challenges because the ETL process typically consumes 70 percent of the time in a data-centric project.

The ETL process consists of extraction (i.e., reading data from one or more databases), transformation (i.e., converting the extracted data from its previous form into the form in which it needs to be so that it can be placed into a data warehouse or simply another database), and load (i.
e., putting the data into the data warehouse). Transformation occurs by using rules or lookup tables or by combining the data with other data. The three database functions are integrated into one tool to pull data out of one or more databases and place them into another, consolidated database or a data warehouse.

ETL tools also transport data between sources and targets, document how data elements (e.g., metadata) change as they move between source and target, exchange metadata with other applications as needed, and administer all runtime processes and operations (e.g., scheduling, error management, audit logs, statistics). ETL is extremely important for data integration as well as for data warehousing. The purpose of the ETL process is to load the warehouse with integrated and cleansed data. The data used in ETL processes can come from any source: a mainframe application, an ERP application, a CRM tool, a flat file, an Excel spreadsheet, or even a message queue. In Figure 3.9, we outline the ETL process.

The process of migrating data to a data warehouse involves the extraction of data from all relevant sources. Data sources may consist of files extracted from OLTP databases, spreadsheets, personal databases (e.g., Microsoft Access), or external files. Typically, all the input files are written to a set of staging tables, which are designed to facilitate the load process. A data warehouse contains numerous business rules that define such things as how the data will be used, summarization rules, standardization of encoded attributes, and calculation rules. Any data quality issues pertaining to the source files need to be corrected before the data are loaded into the data warehouse.

[FIGURE 3.9 The ETL Process: data flows from packaged applications, legacy systems, other internal applications, and transient data sources through Extract, Transform/Cleanse, and Load stages into the data warehouse and data marts.]

One of the benefits of a well-designed data warehouse is that these rules can be stored in a metadata repository and applied to the data warehouse centrally. This differs from an OLTP approach, which typically has data and business rules scattered throughout the system.

The process of loading data into a data warehouse can be performed either through data transformation tools that provide a GUI to aid in the development and maintenance of business rules or through more traditional methods, such as developing programs or utilities to load the data warehouse, using programming languages such as PL/SQL, C++, Java, or .NET Framework languages. This decision is not easy for organizations. Several issues affect whether an organization will purchase data transformation tools or build the transformation process itself:

• Data transformation tools are expensive.
• Data transformation tools may have a long learning curve.
• It is difficult to measure how the IT organization is doing until it has learned to use the data transformation tools.

In the long run, a transformation-tool approach should simplify the maintenance of an organization's data warehouse. Transformation tools can also be effective in detecting and scrubbing (i.e., removing any anomalies in the data). OLAP and data mining tools rely on how well the data are transformed.

As an example of effective ETL, Motorola, Inc., uses ETL to feed its data warehouses. Motorola collects information from 30 different procurement systems and sends it to its global SCM data warehouse for analysis of aggregate company spending (see Songini, 2004).

Solomon (2005) classified ETL technologies into four categories: sophisticated, enabler, simple, and rudimentary. It is generally acknowledged that tools in the sophisticated category will result in the ETL process being better documented and more accurately managed as the data warehouse project evolves.
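The extract, transform, and load steps described above can be sketched end to end in a few lines. The source rows, lookup table, and quality rule below are invented for illustration; a real pipeline would route rejected rows to staging tables for correction rather than silently dropping them:

```python
import sqlite3

# --- Extract: rows as they might arrive from an invented source system.
source_rows = [
    {"cust": "A01", "country": "US",     "amount": "150.25"},
    {"cust": "A02", "country": "U.S.A.", "amount": "99.50"},
    {"cust": "A03", "country": "usa",    "amount": "bad"},  # quality problem
]

# --- Transform: standardize an encoded attribute via a lookup table
# (one of the business rules the text mentions) and enforce a quality rule.
country_lookup = {"US": "USA", "U.S.A.": "USA", "USA": "USA"}

def transform(row):
    try:
        amount = float(row["amount"])
    except ValueError:
        return None  # send to a reject/staging area in a real system
    return (row["cust"], country_lookup[row["country"].upper()], amount)

clean = [t for t in (transform(r) for r in source_rows) if t is not None]

# --- Load: put the cleansed, standardized rows into the warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (cust TEXT, country TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", clean)
loaded = conn.execute("SELECT COUNT(*), SUM(amount) FROM sales").fetchone()
print(loaded)
```

Two of the three rows load (the row with a non-numeric amount is rejected), and all three country spellings collapse to one standard code, which is exactly the kind of centrally applied rule the metadata repository is meant to hold.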
Even though it is possible for programmers to develop software for ETL, it is simpler to use an existing ETL tool. The following are some of the important criteria in selecting an ETL tool (see Brown, 2004):

• Ability to read from and write to an unlimited number of data source architectures
• Automatic capturing and delivery of metadata
• A history of conforming to open standards
• An easy-to-use interface for the developer and the functional user

Performing extensive ETL may be a sign of poorly managed data and a fundamental lack of a coherent data management strategy. Karacsony (2006) indicated that there is a direct correlation between the extent of redundant data and the number of ETL processes. When data are managed correctly as an enterprise asset, ETL efforts are significantly reduced, and redundant data are completely eliminated. This leads to huge savings in maintenance and greater efficiency in new development while also improving data quality. Poorly designed ETL processes are costly to maintain, change, and update. Consequently, it is crucial to make the proper choices in terms of the technology and tools to use for developing and maintaining the ETL process.

A number of packaged ETL tools are available. Database vendors currently offer ETL capabilities that both enhance and compete with independent ETL tools. SAS acknowledges the importance of data quality and offers the industry's first fully integrated solution that merges ETL and data quality to transform data into strategic valuable assets. Other ETL software providers include Microsoft, Oracle, IBM, Informatica, Embarcadero, and Tibco. For additional information on ETL, see Golfarelli and Rizzi (2009), Karacsony (2006), and Songini (2004).

SECTION 3.5 REVIEW QUESTIONS
1. Describe data integration.
2. Describe the three steps of the ETL process.
3. Why is the ETL process so important for data warehousing efforts?
3.6 DATA WAREHOUSE DEVELOPMENT

A data warehousing project is a major undertaking for any organization and is more complicated than a simple mainframe selection and implementation project because it comprises and influences many departments and many input and output interfaces, and it can be part of a CRM business strategy. A data warehouse provides several benefits that can be classified as direct and indirect. Direct benefits include the following:

• End users can perform extensive analysis in numerous ways.
• A consolidated view of corporate data (i.e., a single version of the truth) is possible.
• Better and more timely information is possible. A data warehouse permits information processing to be relieved from costly operational systems onto low-cost servers; therefore, many more end-user information requests can be processed more quickly.
• Enhanced system performance can result. A data warehouse frees production processing because some operational system reporting requirements are moved to DSS.
• Data access is simplified.

Indirect benefits result from end users using these direct benefits. On the whole, these benefits enhance business knowledge, present competitive advantage, improve customer service and satisfaction, facilitate decision making, and help in reforming business processes; therefore, they are the strongest contributions to competitive advantage. (For a discussion of how to create a competitive advantage through data warehousing, see Parzinger and Fralick, 2001.) For a detailed discussion of how organizations can obtain exceptional levels of payoffs, see Watson et al. (2002). Given the potential benefits that a data warehouse can provide and the substantial investments in time and money that such a project requires, it is critical that an organization structure its data warehouse project to maximize the chances of success. In addition, the organization must, obviously, take costs into consideration.
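Weighing these benefits against hardware, software, development, and support costs is commonly framed as a net-present-value calculation over the warehouse's expected life. A minimal sketch, with all cash-flow figures invented for illustration:

```python
def npv(rate, cash_flows):
    """Net present value: cash_flows[0] is the up-front (usually negative)
    investment; later entries are net benefits in each subsequent year."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

# Invented figures: a $2M build cost followed by five years of net benefits,
# discounted at an assumed 8 percent rate.
flows = [-2_000_000, 500_000, 700_000, 800_000, 800_000, 800_000]
value = npv(0.08, flows)
print(round(value, 2))
```

A positive result means the discounted benefits over the warehouse's life exceed the investment; the realism of the estimate rests entirely on how honestly the yearly benefits are quantified.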
Kelly (2001) described a ROI approach that considers benefits in the categories of keepers (i.e., money saved by improving traditional decision support functions); gatherers (i.e., money saved due to automated collection and dissemination of information); and users (i.e., money saved or gained from decisions made using the data warehouse). Costs include those related to hardware, software, network bandwidth, internal development, internal support, training, and external consulting. The net present value (NPV) is calculated over the expected life of the data warehouse. Because the benefits are broken down approximately as 20 percent for keepers, 30 percent for gatherers, and 50 percent for users, Kelly indicated that users should be involved in the development process, a success factor typically mentioned as critical for systems that imply change in an organization.

Application Case 3.4 provides an example of a data warehouse that was developed and delivered intense competitive advantage for the Hokuriku (Japan) Coca-Cola Bottling Company. The system was so successful that plans are underway to expand it to encompass the more than 1 million Coca-Cola vending machines in Japan.

Clearly defining the business objective, gathering project support from management and end users, setting reasonable time frames and budgets, and managing expectations are critical to a successful data warehousing project. A data warehousing strategy is a

Application Case 3.4
Things Go Better with Coke's Data Warehouse

In the face of competitive pressures and consumer demand, how does a successful bottling company ensure that its vending machines are profitable? The answer for Hokuriku Coca-Cola Bottling Company (HCCBC) is a data warehouse and analytical software implemented by Teradata Corp. HCCBC built the system in response to a data warehousing system developed by its rival, Mikuni.
The data warehouse collects not only historical data but also near-real-time data from each vending machine (viewed as a store) that could be transmitted via wireless connection to headquarters. The initial phase of the project was deployed in 2001. The data warehouse approach provides detailed product information, such as time and date of each sale, when a product sells out, whether someone was short-changed, and whether the machine is malfunctioning. In each case, an alert is triggered, and the vending machine immediately reports it to the data center over a wireless transmission system. (Note that Coca-Cola in the United States has used modems to link vending machines to distributors for over a decade.)

In 2002, HCCBC conducted a pilot test and put all its Nagano vending machines on a wireless network to gather near-real-time point-of-sale (POS) data from each one. The results were astounding because they accurately forecasted demand and identified problems quickly. Total sales immediately increased 10 percent. In addition, due to the more accurate machine servicing, overtime and other costs decreased 46 percent. In addition, each salesperson was able to service up to 42 percent more vending machines.

The test was so successful that planning began to expand it to encompass the entire enterprise (60,000 machines), using an active data warehouse. Eventually, the data warehousing solution will ideally expand across corporate boundaries into the entire Coca-Cola Bottlers network so that the more than 1 million vending machines in Japan will be networked, leading to immense cost savings and higher revenue.

QUESTIONS FOR DISCUSSION
1. How did Coca-Cola in Japan use data warehousing to improve its business processes?
2. What were the results of their enterprise active data warehouse implementation?

Sources: Adapted from K. D.
Schwartz, "Decisions at the Touch of a Button," Teradata Magazine, teradata.com/t/page/117774/index.html (accessed June 2009); K. D. Schwartz, "Decisions at the Touch of a Button," DSS Resources, March 2004, pp. 28–31, dssresources.com/cases/coca-colajapan/index.html (accessed April 2006); and Teradata Corp., "Coca-Cola Japan Puts the Fizz Back in Vending Machine Sales," teradata.com/t/page/118866/index.html (accessed June 2009).

blueprint for the successful introduction of the data warehouse. The strategy should describe where the company wants to go, why it wants to go there, and what it will do when it gets there. It needs to take into consideration the organization's vision, structure, and culture. See Matney (2003) for the steps that can help in developing a flexible and efficient support strategy. When the plan and support for a data warehouse are established, the organization needs to examine data warehouse vendors. (See Table 3.2 for a sample list of vendors; also see The Data Warehousing Institute [tdwi.org] and DM Review [information-management.com].) Many vendors provide software demos of their data warehousing and BI products.

Data Warehouse Development Approaches

Many organizations need to create the data warehouses used for decision support. Two competing approaches are employed. The first approach is that of Bill Inmon, who is often called "the father of data warehousing." Inmon supports a top-down development approach that adapts traditional relational database tools to the development needs of an enterprise-wide data warehouse, also known as the EDW approach. The second approach is that of Ralph Kimball, who proposes a bottom-up approach that employs dimensional modeling, also known as the data mart approach. Knowing how these two models are alike and how they differ helps us understand the basic data warehouse concepts (e.g., see Breslin, 2004). Table 3.3 compares the two approaches. We describe these approaches in detail next.

TABLE 3.2 Sample List of Data Warehousing Vendors

Vendor | Product Offerings
Business Objects (businessobjects.com) | A comprehensive set of business intelligence and data visualization software (now owned by SAP)
Computer Associates (cai.com) | Comprehensive set of data warehouse (DW) tools and products
DataMirror (datamirror.com) | DW administration, management, and performance products
Data Advantage Group (dataadvantagegroup.com) | Metadata software
Dell (dell.com) | DW servers
Embarcadero Technologies (embarcadero.com) | DW administration, management, and performance products
Greenplum (greenplum.com) | Data warehousing and data appliance solution provider (now owned by EMC)
Harte-Hanks (harte-hanks.com) | Customer relationship management (CRM) products and services
HP (hp.com) | DW servers
Hummingbird Ltd. (hummingbird.com, now a subsidiary of Open Text) | DW engines and exploration warehouses
Hyperion Solutions (hyperion.com, now an Oracle company) | Comprehensive set of DW tools, products, and applications
IBM InfoSphere (www-01.ibm.com/software/data/infosphere/) | Data integration, DW, master data management, and big data products
Informatica (informatica.com) | DW administration, management, and performance products
Microsoft (microsoft.com) | DW tools and products
Netezza | DW software and hardware (DW appliance) provider (now owned by IBM)
Oracle (including PeopleSoft and Siebel) (oracle.com) | DW, ERP, and CRM tools, products, and applications
SAS Institute (sas.com) | DW tools, products, and applications
Siemens (siemens.com) | DW servers
Sybase (sybase.com) | Comprehensive set of DW tools and applications
Teradata (teradata.com) | DW tools, DW appliances, DW consultancy, and applications

THE INMON MODEL: THE EDW APPROACH Inmon's approach emphasizes top-down development, employing established database development methodologies and tools, such as entity-relationship diagrams (ERD), and an adjustment of the spiral development approach.
The EDW approach does not preclude the creation of data marts. The EDW is the ideal in this approach because it provides a consistent and comprehensive view of the enterprise. Murtaza (1998) presented a framework for developing EDW.

THE KIMBALL MODEL: THE DATA MART APPROACH Kimball's data mart strategy is a "plan big, build small" approach. A data mart is a subject-oriented or department-oriented data warehouse. It is a scaled-down version of a data warehouse that focuses on the requests

TABLE 3.3 Contrasts Between the Data Mart and EDW Development Approaches

Characteristic | Data Mart Approach | EDW Approach
Effort
  Scope | One subject area | Several subject areas
  Development time | Months | Years
  Development cost | $10,000 to $100,000+ | $1,000,000+
  Development difficulty | Low to medium | High
  Data prerequisite for sharing | Common (within business area) | Common (across enterprise)
  Sources | Only some operational and external systems | Many operational and external systems
  Size | Megabytes to several gigabytes | Gigabytes to petabytes
  Time horizon | Near-current and historical data | Historical data
  Data transformations | Low to medium | High
  Update frequency | Hourly, daily, weekly | Weekly, monthly
Technology
  Hardware | Workstations and departmental servers | Enterprise servers and mainframe computers
  Operating system | Windows and Linux | Unix, z/OS, OS/390
  Databases | Workgroup or standard database servers | Enterprise database servers
Usage
  Number of simultaneous users | 10s | 100s to 1,000s
  User types | Business area analysts and managers | Enterprise analysts and senior executives
  Business spotlight | Optimizing activities within the business area | Cross-functional optimization and decision making

Sources: Adapted from J. Van den Hoven, "Data Marts: Plan Big, Build Small," in IS Management Handbook, 8th ed., CRC Press, Boca Raton, FL, 2003; and T. Ariyachandra and H. Watson, "Which Data Warehouse Architecture Is Most Successful?" Business Intelligence Journal, Vol. 11, No.
1, First Quarter 2006, pp. 4–6.

of a specific department, such as marketing or sales. This model applies dimensional data modeling, which starts with tables. Kimball advocated a development methodology that entails a bottom-up approach, which in the case of data warehouses means building one data mart at a time.

WHICH MODEL IS BEST? There is no one-size-fits-all strategy to data warehousing. An enterprise's data warehousing strategy can evolve from a simple data mart to a complex data warehouse in response to user demands, the enterprise's business requirements, and the enterprise's maturity in managing its data resources. For many enterprises, a data mart is frequently a convenient first step to acquiring experience in constructing and managing a data warehouse while presenting business users with the benefits of better access to their data; in addition, a data mart commonly indicates the business value of data warehousing. Ultimately, engineering an EDW that consolidates old data marts and data warehouses is the ideal solution (see Application Case 3.5). However, the development of individual data marts can often provide many benefits along the way toward developing an EDW, especially if the organization is unable or unwilling to invest in a large-scale project. Data marts can also demonstrate feasibility and success in providing benefits, which could potentially lead to an investment in an EDW. Table 3.4 summarizes the most essential characteristic differences between the two models.

Application Case 3.5 Starwood Hotels & Resorts Manages Hotel Profitability with Data Warehousing

Starwood Hotels & Resorts Worldwide, Inc., is one of the leading hotel and leisure companies in the world, with 1,112 properties in nearly 100 countries and 154,000 employees at its owned and managed properties.
Starwood is a fully integrated owner, operator, and franchisor of hotels, resorts, and residences with the following internationally renowned brands: St. Regis®, The Luxury Collection®, W®, Westin®, Le Meridien®, Sheraton®, Four Points® by Sheraton, Aloft®, and ElementSM. The company boasts one of the industry's leading loyalty programs, Starwood Preferred Guest (SPG), allowing members to earn and redeem points for room stays, room upgrades, and flights, with no blackout dates. Starwood also owns Starwood Vacation Ownership Inc., a premier provider of world-class vacation experiences through villa-style resorts and privileged access to Starwood brands.

Challenge

Starwood Hotels has significantly increased the number of hotels it operates over the past few years through global corporate expansion, particularly in the Asia/Pacific region. This has resulted in a dramatic rise in the need for business-critical information about Starwood's hotels and customers. All Starwood hotels globally use a single enterprise data warehouse to retrieve information critical to efficient hotel management, such as that regarding revenue, central reservations, and rate plan reports. In addition, Starwood Hotels' management runs important daily operating reports from the data warehouse for a wide range of business functions. Starwood's enterprise data warehouse spans almost all areas within the company, so it is essential not only for central-reservation and consumption information, but also for Starwood's loyalty program, which relies on all guest information, sales information, corporate sales information, customer service, and other data that managers, analysts, and executives depend on to make operational decisions.
The company is committed to knowing and servicing its guests, yet, "as data growth and demands grew too great for the company's legacy system, it was falling short in delivering the information hotel managers and administrators required on a daily basis, since central reservation system (CRS) reports could take as long as 18 hours," said Richard Chung, Starwood Hotels' director of data integration. Chung added that hotel managers would receive the transient pace report (which presents market-segmented information on reservations) 5 hours later than it was needed. Such delays prevented managers from adjusting rates appropriately, which could result in lost revenue.

Solution and Results

After reviewing several vendor offerings, Starwood Hotels selected Oracle Exadata Database Machine X2-2 HC Full Rack and Oracle Exadata Database Machine X2-2 HP Full Rack, running on Oracle Linux. "With the implementation of Exadata, Starwood Hotels can complete extract, transform, and load (ETL) operations for operational reports in 4 to 6 hours, as opposed to 18 to 24 hours previously, a six-fold improvement," Chung said. Real-time feeds, which were not possible before, now allow transactions to be posted immediately to the data warehouse, and users can access the changes in 5 to 10 minutes instead of 24 hours, making the process up to 288 times faster. Accelerated data access allows all Starwood properties to get the same, up-to-date data needed for their reports, globally. Previously, hotel managers in some areas could not do same-day or next-day analyses; some locations got fresh data while others got older data. Hotel managers worldwide now have up-to-date data for their hotels, increasing efficiency and profitability, improving customer service by making sure rooms are available for premier customers, and improving the company's ability to manage room occupancy rates.
Additional reporting tools, such as those used for CRM and sales and catering, also benefited from the improved processing. Other critical reporting has benefited as well. Marketing campaign management is also more efficient now that managers can analyze results in days or weeks instead of months.

"Oracle Exadata Database Machine enables us to move forward with an environment that provides our hotel managers and corporate executives with near-real-time information to make optimal business decisions and provide ideal amenities for our guests."
- Gordon Light, Business Relationship Manager, Starwood Hotels & Resorts Worldwide, Inc.

QUESTIONS FOR DISCUSSION

1. How big and complex are the business operations of Starwood Hotels & Resorts?
2. How did Starwood Hotels & Resorts use data warehousing for better profitability?
3. What were the challenges, the proposed solution, and the obtained results?

Source: Oracle customer success story, www.oracle.com/us/corporate/customers/customersearch/starwood-hotels-1-exadata-sl-1855106.html; Starwood Hotels and Resorts, starwoodhotels.com (accessed July 2013).

Additional Data Warehouse Development Considerations

Some organizations want to completely outsource their data warehousing efforts. They simply do not want to deal with software and hardware acquisitions, and they do not want to manage their information systems. One alternative is to use hosted data warehouses.
In this scenario, another firm, ideally one that has a lot of experience and expertise, develops and maintains the data warehouse. However, there are security and privacy concerns with this approach.

TABLE 3.4 Essential Differences Between Inmon's and Kimball's Approaches

Characteristic | Inmon | Kimball
Methodology and Architecture
  Overall approach | Top-down | Bottom-up
  Architecture structure | Enterprise-wide (atomic) data warehouse "feeds" departmental databases | Data marts model a single business process, and enterprise consistency is achieved through a data bus and conformed dimensions
  Complexity of the method | Quite complex | Fairly simple
  Comparison with established development methodologies | Derived from the spiral methodology | Four-step process; a departure from relational database management system (RDBMS) methods
  Discussion of physical design | Fairly thorough | Fairly light
Data Modeling
  Data orientation | Subject or data driven | Process oriented
  Tools | Traditional (entity-relationship diagrams [ERD], data flow diagrams [DFD]) | Dimensional modeling; a departure from relational modeling
  End-user accessibility | Low | High
Philosophy
  Primary audience | IT professionals | End users
  Place in the organization | Integral part of the corporate information factory | Transformer and retainer of operational data
  Objective | Deliver a sound technical solution based on proven database methods and technologies | Deliver a solution that makes it easy for end users to directly query the data and still get reasonable response times

Sources: Adapted from M. Breslin, "Data Warehousing Battle of the Giants: Comparing the Basics of Kimball and Inmon Models," Business Intelligence Journal, Vol. 9, No. 1, Winter 2004, pp. 6–20; and T. Ariyachandra and H. Watson, "Which Data Warehouse Architecture Is Most Successful?" Business Intelligence Journal, Vol. 11, No. 1, First Quarter 2006.

TECHNOLOGY INSIGHTS 3.1 Hosted Data Warehouses

A hosted data warehouse has nearly the same, if not more, functionality as an on-site data warehouse, but it does not consume computer resources on client premises. A hosted data warehouse offers the benefits of BI minus the cost of computer upgrades, network upgrades, software licenses, in-house development, and in-house support and maintenance. A hosted data warehouse offers the following benefits:

• Requires minimal investment in infrastructure
• Frees up capacity on in-house systems
• Frees up cash flow
• Makes powerful solutions affordable
• Enables powerful solutions that provide for growth
• Offers better-quality equipment and software
• Provides faster connections
• Enables users to access data from remote locations
• Allows a company to focus on core business
• Meets storage needs for large volumes of data

Despite its benefits, a hosted data warehouse is not necessarily a good fit for every organization. Large companies with revenue upwards of $500 million could lose money if they already have underused internal infrastructure and IT staff. Furthermore, companies that see the paradigm shift of outsourcing applications as loss of control of their data are not likely to use a business intelligence service provider (BISP). Finally, the most significant and common argument against implementing a hosted data warehouse is that it may be unwise to outsource sensitive applications for reasons of security and privacy.

Sources: Compiled from M. Thornton and M. Lampa, "Hosted Data Warehouse," Journal of Data Warehousing, Vol. 7, No. 2, 2002, pp. 27–34; and M. Thornton, "What About Security? The Most Common, but Unwarranted, Objection to Hosted Data Warehouses," DM Review, Vol. 12, No. 3, March 18, 2002, pp. 30–43.
See Technology Insights 3.1 for some details.

Representation of Data in Data Warehouse

A typical data warehouse structure is shown in Figure 3.3. Many variations of data warehouse architecture are possible (see Figure 3.7). No matter what the architecture is, the design of data representation in the data warehouse has always been based on the concept of dimensional modeling. Dimensional modeling is a retrieval-based system that supports high-volume query access. Representation and storage of data in a data warehouse should be designed in such a way that it not only accommodates but also boosts the processing of complex multidimensional queries. Often, the star schema and the snowflake schema are the means by which dimensional modeling is implemented in data warehouses.

The star schema (sometimes referenced as star join schema) is the most commonly used and the simplest style of dimensional modeling. A star schema contains a central fact table surrounded by and connected to several dimension tables (Adamson, 2009). The fact table contains a large number of rows that correspond to observed facts and external links (i.e., foreign keys). A fact table contains the descriptive attributes needed to perform decision analysis and query reporting, and foreign keys are used to link to dimension tables. The decision analysis attributes consist of performance measures, operational metrics, aggregated measures (e.g., sales volumes, customer retention rates, profit margins, production costs, scrap rates, and so forth), and all the other metrics needed to analyze the organization's performance. In other words, the fact table primarily addresses what the data warehouse supports for decision analysis. Surrounding the central fact tables (and linked via foreign keys) are dimension tables. The dimension tables contain classification and aggregation information about the central fact rows.
Dimension tables contain attributes that describe the data contained within the fact table; they address how data will be analyzed and summarized. Dimension tables have a one-to-many relationship with rows in the central fact table. In querying, the dimensions are used to slice and dice the numerical values in the fact table to address the requirements of an ad hoc information need. The star schema is designed to provide fast query-response time, simplicity, and ease of maintenance for read-only database structures. A simple star schema is shown in Figure 3.10a.

The star schema is considered a special case of the snowflake schema. The snowflake schema is a logical arrangement of tables in a multidimensional database in such a way that the entity-relationship diagram resembles a snowflake in shape. Closely related to the star schema, the snowflake schema is represented by centralized fact tables (usually only one) that are connected to multiple dimensions. In the snowflake schema, however, dimensions are normalized into multiple related tables, whereas the star schema's dimensions are denormalized, with each dimension being represented by a single table. A simple snowflake schema is shown in Figure 3.10b.

Analysis of Data in the Data Warehouse

Once the data is properly stored in a data warehouse, it can be used in various ways to support organizational decision making. OLAP (online analytical processing) is arguably the most commonly used data analysis technique in data warehouses, and it has been growing in popularity due to the exponential increase in data volumes and the recognition of the business value of data-driven analytics. Simply put, OLAP is an approach to quickly answering ad hoc questions by executing multidimensional analytical queries against organizational data repositories (i.e., data warehouses, data marts).
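To make the star schema concrete, the following sketch builds a tiny star schema in an in-memory SQLite database: a central fact table holds the measures and foreign keys, and the dimension tables supply the attributes used to group and aggregate the facts. All table names, columns, and figures here are invented for illustration; they are not taken from the chapter's Figure 3.10.

```python
import sqlite3

# A minimal, hypothetical star schema: one fact table (fact_sales) linked by
# foreign keys to two dimension tables (dim_product, dim_store).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, brand TEXT, category TEXT);
    CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city  TEXT, country  TEXT);
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        store_id   INTEGER REFERENCES dim_store(store_id),
        units_sold INTEGER
    );
    INSERT INTO dim_product VALUES (1, 'Acme', 'Soda'), (2, 'Acme', 'Juice');
    INSERT INTO dim_store   VALUES (10, 'Nagano', 'Japan'), (20, 'Osaka', 'Japan');
    INSERT INTO fact_sales  VALUES (1, 10, 100), (1, 20, 150), (2, 10, 40);
""")

# Dimension attributes drive the analysis: total units sold by product category.
rows = conn.execute("""
    SELECT p.category, SUM(f.units_sold)
    FROM fact_sales f JOIN dim_product p ON f.product_id = p.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(rows)  # [('Juice', 40), ('Soda', 250)]
```

Note how the query never filters on the fact table directly: the descriptive attribute (category) lives in a dimension table and reaches the facts through the foreign key, which is exactly the "slice and dice via dimensions" pattern described above.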
FIGURE 3.10 (a) The Star Schema, and (b) the Snowflake Schema. (In the star schema, a central sales fact table with a UnitsSold measure is linked to date, product, people, and geography dimension tables; in the snowflake schema, the same sales fact table has its time, product, people, and store dimensions further normalized into related tables such as quarter, brand, category, and location.)

OLAP Versus OLTP

OLTP (online transaction processing system) is a term used for a transaction system that is primarily responsible for capturing and storing data related to day-to-day business functions such as ERP, CRM, SCM, point of sale, and so forth. The OLTP system addresses a critical business need, automating daily business transactions and running real-time reports and routine analyses. But these systems are not designed for ad hoc analysis and complex queries that deal with a number of data items. OLAP, on the other hand, is designed to address this need by providing ad hoc analysis of organizational data much more effectively and efficiently. OLAP and OLTP rely heavily on each other: OLAP uses the data captured by OLTP, and OLTP automates the business processes that are managed by decisions supported by OLAP. Table 3.5 provides a multi-criteria comparison between OLTP and OLAP.

OLAP Operations

The main operational structure in OLAP is based on a concept called a cube. A cube in OLAP is a multidimensional data structure (actual or virtual) that allows fast analysis of data. It can also be defined as the capability of efficiently manipulating and analyzing data from multiple perspectives.
The arrangement of data into cubes aims to overcome a limitation of relational databases: Relational databases are not well suited for near-instantaneous analysis of large amounts of data. Instead, they are better suited for manipulating records (adding, deleting, and updating data) that represent a series of transactions. Although many report-writing tools exist for relational databases, these tools are slow when a multidimensional query that encompasses many database tables needs to be executed.

Using OLAP, an analyst can navigate through the database and screen for a particular subset of the data (and its progression over time) by changing the data's orientations and defining analytical calculations. These types of user-initiated navigation of data through the specification of slices (via rotations) and drill down/up (via aggregation and disaggregation) are sometimes called "slice and dice." Commonly used OLAP operations include slice and dice, drill down, roll up, and pivot.

• Slice. A slice is a subset of a multidimensional array (usually a two-dimensional representation) corresponding to a single value set for one (or more) of the dimensions not in the subset. A simple slicing operation on a three-dimensional cube is shown in Figure 3.11.
TABLE 3.5 A Comparison Between OLTP and OLAP

Criteria | OLTP | OLAP
Purpose | To carry out day-to-day business functions | To support decision making and provide answers to business and management queries
Data source | Transaction database (a normalized data repository primarily focused on efficiency and consistency) | Data warehouse or data mart (a nonnormalized data repository primarily focused on accuracy and completeness)
Reporting | Routine, periodic, narrowly focused reports | Ad hoc, multidimensional, broadly focused reports and queries
Resource requirements | Ordinary relational databases | Multiprocessor, large-capacity, specialized databases
Execution speed | Fast (recording of business transactions and routine reports) | Slow (resource-intensive, complex, large-scale queries)
.c
a.
[‘:’
Ol
a
QJ
(!)
Sales volumes of
a specific product
on variable time
Sales volumes of
a specific region
on variable time
and products
Sales volumes of
a specific time on
variable region
and products
FIGURE 3.11 Slicing Operations on a Simple Three-Dimensional Data Cube.
• Dice. The dice operation is a slice on more than two dimensions of a data cube.
• Drill Down/Up. Drilling down or up is a specific OLAP technique whereby the user navigates among levels of data ranging from the most summarized (up) to the most detailed (down).
• Roll-up. A roll-up involves computing all of the data relationships for one or more dimensions. To do this, a computational relationship or formula might be defined.
• Pivot. A pivot is a means of changing the dimensional orientation of a report or ad hoc query-page display.
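The operations above can be imitated on a toy cube held as a plain Python dictionary keyed by (product, region, quarter). The dimension values and sales figures below are invented for illustration; a real OLAP engine would use optimized multidimensional storage rather than a dict, but the slice, dice, and roll-up semantics are the same.

```python
from collections import defaultdict

# A toy three-dimensional cube: (product, region, quarter) -> sales volume.
cube = {
    ("Soda",  "East", "Q1"): 100, ("Soda",  "East", "Q2"): 120,
    ("Soda",  "West", "Q1"):  80, ("Soda",  "West", "Q2"):  90,
    ("Juice", "East", "Q1"):  40, ("Juice", "West", "Q2"):  60,
}

# Slice: fix one dimension (product = "Soda") and keep the remaining two.
soda_slice = {(r, q): v for (p, r, q), v in cube.items() if p == "Soda"}

# Dice: fix values on more than one dimension (product and region).
soda_east = {q: v for (p, r, q), v in cube.items() if p == "Soda" and r == "East"}

# Roll up: aggregate away a dimension (sum quarters up to the year level).
rollup = defaultdict(int)
for (p, r, q), v in cube.items():
    rollup[(p, r)] += v

print(sum(soda_slice.values()))   # 390
print(soda_east)                  # {'Q1': 100, 'Q2': 120}
print(rollup[("Soda", "East")])   # 220
```

Drilling down is simply the inverse of the roll-up: moving from the aggregated (product, region) view back to the per-quarter detail held in the original cube.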
VARIATIONS OF OLAP OLAP has a few variations; among them, ROLAP, MOLAP, and HOLAP are the most common ones.

ROLAP stands for Relational Online Analytical Processing. ROLAP is an alternative to the MOLAP (Multidimensional OLAP) technology. Although both ROLAP and MOLAP analytic tools are designed to allow analysis of data through the use of a multidimensional data model, ROLAP differs significantly in that it does not require the precomputation and storage of information. Instead, ROLAP tools access the data in a relational database and generate SQL queries to calculate information at the appropriate level when an end user requests it. With ROLAP, it is possible to create additional database tables (summary tables or aggregations) that summarize the data at any desired combination of dimensions. While ROLAP uses a relational database source, generally the database must be carefully designed for ROLAP use. A database that was designed for OLTP will not function well as a ROLAP database. Therefore, ROLAP still involves creating an additional copy of the data.
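The ROLAP idea of generating SQL at request time can be sketched as follows. The schema, the data, and the tiny query generator are all hypothetical, standing in for what a commercial ROLAP tool would produce; the point is that nothing is precomputed, and each user request is translated into an aggregate query against the relational source.

```python
import sqlite3

# A hypothetical relational source table; a real ROLAP deployment would sit on
# a carefully designed warehouse schema, not a single table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, product TEXT, units INTEGER);
    INSERT INTO sales VALUES
        ('East', 'Soda', 100), ('East', 'Juice', 40),
        ('West', 'Soda', 170), ('West', 'Juice', 60);
""")

def rolap_query(dimension: str, measure: str) -> list:
    """Translate a multidimensional request into SQL at query time."""
    # Whitelist the identifiers: a real tool must never interpolate raw user input.
    assert dimension in {"region", "product"} and measure in {"units"}
    sql = (f"SELECT {dimension}, SUM({measure}) FROM sales "
           f"GROUP BY {dimension} ORDER BY {dimension}")
    return conn.execute(sql).fetchall()

print(rolap_query("region", "units"))   # [('East', 140), ('West', 230)]
print(rolap_query("product", "units"))  # [('Juice', 100), ('Soda', 270)]
```

Each call pays the full aggregation cost at request time; this is the trade-off that MOLAP's preprocessing, discussed next, is designed to avoid.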

MOLAP is an alternative to the ROLAP technology. MOLAP differs from ROLAP significantly in that it requires the precomputation and storage of information in the cube, the operation known as preprocessing. MOLAP stores this data in an optimized multidimensional array storage, rather than in a relational database (which is often the case for ROLAP).
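The precomputation idea can be sketched minimally as follows, with all names and numbers invented for illustration: every aggregate cell, including the "All" totals, is materialized once up front, so that answering a query later reduces to a lookup rather than a scan.

```python
from itertools import product as cartesian

# Hypothetical base data: (region, product) -> units sold.
regions, products = ["East", "West"], ["Soda", "Juice"]
raw = {("East", "Soda"): 100, ("East", "Juice"): 40,
       ("West", "Soda"): 170, ("West", "Juice"): 60}

# Preprocessing step: materialize every cell, including the "All" totals.
cube = {}
for r, p in cartesian(regions + ["All"], products + ["All"]):
    cube[(r, p)] = sum(v for (rr, pp), v in raw.items()
                       if r in (rr, "All") and p in (pp, "All"))

# Query time: no joins or scans, just a dictionary lookup.
print(cube[("East", "Soda")])  # 100
print(cube[("All", "Soda")])   # 270
print(cube[("All", "All")])    # 370
```

The cost, of course, is that the materialized cube must be rebuilt (or incrementally maintained) whenever the base data changes, which is the ETL overhead that motivates the hybrid HOLAP approach described next.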
The undesirable trade-off between ROLAP and MOLAP with regard to additional ETL (extract, transform, and load) cost and slow query performance has led to a search for better approaches in which the pros and cons of the two are balanced. The result is HOLAP (Hybrid Online Analytical Processing), which is a combination of ROLAP and MOLAP. HOLAP allows storing part of the data in a MOLAP store and another part of the data in a ROLAP store. The degree of control that the cube designer has over this partitioning varies from product to product. Technology Insights 3.2 provides an opportunity for conducting a simple hands-on analysis with the MicroStrategy BI tool.
TECHNOLOGY INSIGHTS 3.2 Hands-On Data Warehousing with MicroStrategy

MicroStrategy is the leading independent provider of business intelligence, data warehousing, performance management, and business reporting solutions. The other big players in this market were recently acquired by large IT firms: Hyperion was acquired by Oracle; Cognos was acquired by IBM; and Business Objects was acquired by SAP. Despite these recent acquisitions, the business intelligence and data warehousing market remains active, vibrant, and full of opportunities.
Following is a step-by-step approach to using MicroStrategy software to analyze a hypothetical business situation. A more comprehensive version of this hands-on exercise can be found at the TUN Web site. According to this hypothetical scenario, you (the vice president of sales at a global telecommunications company) are planning a business visit to the European region. Before meeting with the regional salespeople on Monday, you want to know the sales representatives' activities for the last quarter (Quarter 4 of 2004). You are to create such an ad hoc report using MicroStrategy's Web access. In order to create this and many other OLAP reports, you will need the access code for the TeradataUniversityNetwork.com Web site. It is free of charge for educational use, and only your professor will be able to get the necessary access code for you to utilize not only MicroStrategy software but also a large collection of other business intelligence resources at this site.

Once you are in TeradataUniversityNetwork, you need to go to "APPLY & DO" and select "MicroStrategy BI" from the "Software" section. On the "MicroStrategy/BI" Web page, follow these steps:
1. Click on the link for "MicroStrategy Application Modules." This will lead you to a page that shows a list of previously built MicroStrategy applications.
2. Select the "Sales Force Analysis Module." This module is designed to provide you with in-depth insight into the entire sales process. This insight in turn allows you to increase lead conversions, optimize product lines, take advantage of your organization's most successful sales practices, and improve your sales organization's effectiveness.
3. On the "Sales Force Analysis Module" site you will see three sections: View, Create, and Tools. In the View section, click on the link for "Shared Reports." This link will take you to a place where a number of previously created sales reports are listed for everybody's use.
4. On the "Shared Reports" page, click on the folder named "Pipeline Analysis." Pipeline Analysis reports provide insight into all open opportunities and deals in the sales pipeline. These reports measure the current status of the sales pipeline, detect changing trends and key events, and identify key open opportunities. You want to review what is in the pipeline for each sales rep, as well as whether or not they hit their sales quota last quarter.
5. On the "Pipeline Analysis" page, click on the report named "Current Pipeline vs. Quota by Sales Region and District." This report presents the current pipeline status for each sales district within a sales region. It also projects whether target quotas can be achieved for the current quarter.
6. On the "Current Pipeline vs. Quota by Sales Region and District" page, select (with a single click) "2004 Q4" as the report parameter, indicating that you want to see how the representatives performed against their quotas for the last quarter.
7. Run the report by clicking on the "Run Report" button at the bottom of the page. This will lead you to a sales report page where the values for each metric are calculated for all three European sales regions. In this interactive report, you can easily change the region from Europe to the United States or Canada using the pull-down combo box, or you can drill into one of the three European regions by simply clicking on the appropriate region's heading to see a more detailed analysis of the selected region.
SECTION 3.6 REVIEW QUESTIONS

1. List the benefits of data warehouses.
2. List several criteria for selecting a data warehouse vendor, and describe why they are important.
3. What is OLAP, and how does it differ from OLTP?
4. What is a cube? What do drill down, roll up, and slice and dice mean?
5. What are ROLAP, MOLAP, and HOLAP? How do they differ from OLAP?
3.7 DATA WAREHOUSING IMPLEMENTATION ISSUES
Implementing a data warehouse is generally a massive effort that must be planned and executed according to established methods. However, the project life cycle has many facets, and no single person can be an expert in each area. Here we discuss specific ideas and issues as they relate to data warehousing.
People want to know how successful their BI and data warehousing initiatives are in comparison to those of other companies. Ariyachandra and Watson (2006a) proposed some benchmarks for BI and data warehousing success. Watson et al. (1999) researched data warehouse failures. Their results showed that people define a "failure" in different ways, and this was confirmed by Ariyachandra and Watson (2006a). The Data Warehousing Institute (tdwi.org) has developed a data warehousing maturity model that an enterprise can apply in order to benchmark its evolution. The model offers a fast means to gauge where the organization's data warehousing initiative is now and where it needs to go next. The maturity model consists of six stages: prenatal, infant, child, teenager, adult, and sage. Business value rises as the data warehouse progresses through each succeeding stage. The stages are identified by a number of characteristics, including scope, analytic structure, executive perceptions, types of analytics, stewardship, funding, technology platform, change management, and administration. See Eckerson et al. (2009) and Eckerson (2003) for more details.
Data warehouse projects have many risks. Most of them are also found in other IT projects, but data warehousing risks are more serious because data warehouses are expensive, time-and-resource-demanding, large-scale projects. Each risk should be assessed at the inception of the project. When developing a successful data warehouse, it is important to carefully consider various risks and avoid the following issues:
• Starting with the wrong sponsorship chain. You need an executive sponsor who has influence over the necessary resources to support and invest in the data warehouse. You also need an executive project driver, someone who has earned

114 Part II • Descriptive Analytics
the respect of other executives, has a healthy skepticism about technology, and is decisive but flexible. You also need an IS/IT manager to head up the project.
• Setting expectations that you cannot meet. You do not want to frustrate executives at the moment of truth. Every data warehousing project has two phases: Phase 1 is the selling phase, in which you internally market the project by selling the benefits to those who have access to needed resources. Phase 2 is the struggle to meet the expectations described in Phase 1. For a mere $1 to $7 million, hopefully, you can deliver.
• Engaging in politically naive behavior. Do not simply state that a data warehouse will help managers make better decisions. This may imply that you feel they have been making bad decisions until now. Sell the idea that they will be able to get the information they need to help in decision making.
• Loading the warehouse with information just because it is available. Do not let the data warehouse become a data landfill. This would unnecessarily slow the use of the system. There is a trend toward real-time computing and analysis. Data warehouses must be shut down to load data in a timely way.
• Believing that data warehousing database design is the same as transactional database design. In general, it is not. The goal of data warehousing is to access aggregates rather than a single or a few records, as in transaction-processing systems. Content is also different, as is evident in how data are organized. DBMS tend to be nonredundant, normalized, and relational, whereas data warehouses are redundant, not normalized, and multidimensional.
• Choosing a data warehouse manager who is technology oriented rather than user oriented. One key to data warehouse success is to understand that the users must get what they need, not advanced technology for technology's sake.
• Focusing on traditional internal record-oriented data and ignoring the value of external data and of text, images, and, perhaps, sound and video. Data come in many formats and must be made accessible to the right people at the right time and in the right format. They must be cataloged properly.
• Delivering data with overlapping and confusing definitions. Data cleansing is a critical aspect of data warehousing. It includes reconciling conflicting data definitions and formats organization-wide. Politically, this may be difficult because it involves change, typically at the executive level.
• Believing promises of performance, capacity, and scalability. Data warehouses generally require more capacity and speed than is originally budgeted for. Plan ahead to scale up.
• Believing that your problems are over when the data warehouse is up and running. DSS/BI projects tend to evolve continually. Each deployment is an iteration of the prototyping process. There will always be a need to add more and different data sets to the data warehouse, as well as additional analytic tools for existing and additional groups of decision makers. High energy and annual budgets must be planned for because success breeds success. Data warehousing is a continuous process.
• Focusing on ad hoc data mining and periodic reporting instead of alerts. The natural progression of information in a data warehouse is (1) extract the data from legacy systems, cleanse them, and feed them to the warehouse; (2) support ad hoc reporting until you learn what people want; and (3) convert the ad hoc reports into regularly scheduled reports. This process of learning what people want in order to provide it seems natural, but it is not optimal or even practical. Managers are busy and need time to read reports. Alert systems are better than periodic reporting systems and can make a data warehouse mission critical. Alert systems monitor the data flowing into the warehouse and inform all key people who have a need to know as soon as a critical event occurs.
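The alert idea in the last bullet can be sketched as a load step that evaluates rules inline, so key people are notified as records arrive rather than when a periodic report runs. This is a minimal sketch, not a vendor implementation; the rule names, thresholds, and record fields are invented for illustration.

```python
def make_threshold_rule(metric, limit, recipients):
    """Return a rule that fires when a record's metric exceeds a limit."""
    def rule(record):
        if record.get(metric, 0) > limit:
            return f"ALERT to {recipients}: {metric}={record[metric]} exceeds {limit}"
        return None
    return rule

def load_with_alerts(records, warehouse, rules):
    """Load records into the warehouse, evaluating alert rules inline."""
    alerts = []
    for record in records:
        warehouse.append(record)      # stand-in for the real load step
        for rule in rules:
            msg = rule(record)
            if msg:
                alerts.append(msg)    # in practice: e-mail, page, or dashboard
    return alerts

warehouse = []
rules = [make_threshold_rule("returns", 100, "ops-team")]
alerts = load_with_alerts(
    [{"store": "A", "returns": 42}, {"store": "B", "returns": 250}],
    warehouse, rules)
```

Because the rules run during loading, the second record triggers an alert immediately instead of surfacing in next week's report.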

In many organizations, a data warehouse will be successful only if there is strong
senior management support for its development and if there is a project champion who
is high up in the organizational chart. Although this would likely be true for any large-scale IT project, it is especially important for a data warehouse realization. The successful
implementation of a data warehouse results in the establishment of an architectural frame-
work that may allow for decision analysis throughout an organization and in some cases
also provides comprehensive SCM by granting access to information on an organization’s
customers and suppliers. The implementation of Web-based data warehouses (sometimes called Webhousing) has facilitated ease of access to vast amounts of data, but it is difficult to determine the hard benefits associated with a data warehouse. Hard benefits are
defined as benefits to an organization that can be expressed in monetary terms. Many
organizations have limited IT resources and must prioritize projects. Management support
and a strong project champion can help ensure that a data warehouse project will receive
the resources necessary for successful implementation. Data warehouse resources can
be a significant cost, in some cases requiring high-end processors and large increases in
direct-access storage devices (DASD). Web-based data warehouses may also have special security requirements to ensure that only authorized users have access to the data.
User participation in the development of data and access modeling is a critical success factor in data warehouse development. During data modeling, expertise is required to determine what data are needed, define business rules associated with the data, and decide what aggregations and other calculations may be necessary. Access modeling is needed to determine how data are to be retrieved from a data warehouse, and it assists in the physical definition of the warehouse by helping to define which data require indexing. It may also indicate whether dependent data marts are needed to facilitate information retrieval. The team skills needed to develop and implement a data warehouse include in-depth knowledge of the database technology and development tools used. Source systems and development technology, as mentioned previously, reference the many inputs and the processes used to load and maintain a data warehouse.
Application Case 3.6 presents an excellent example of a large-scale implementation of an integrated data warehouse by a state government.
Application Case 3.6
EDW Helps Connect State Agencies in Michigan
Through customer service, resource optimization, and the innovative use of information and technology, the Michigan Department of Technology, Management & Budget (DTMB) impacts every area of government. Nearly 10,000 users in five major departments, 20 agencies, and more than 100 bureaus rely on the EDW to do their jobs more effectively and better serve Michigan residents. The EDW achieves $1 million per business day in financial benefits.
The EDW helped Michigan achieve $200 million in annual financial benefits within the Department of Community Health alone, plus another $75 million per year within the Department of Human Services (DHS). These savings include program integrity benefits, cost avoidance due to improved outcomes, sanction avoidance, operational efficiencies, and the recovery of inappropriate payments within its Medicaid program.
The Michigan DHS data warehouse (DW) provides unique and innovative information critical to the efficient operation of the agency from both a strategic and tactical level. Over the last 10 years, the DW has yielded a 15:1 cost-effectiveness ratio. Consolidated information from the DW now contributes to nearly every function of DHS, including

accurate delivery of and accounting for benefits delivered to almost 2.5 million DHS public assistance clients.
Michigan has been ambitious in its attempts to solve real-life problems through the innovative sharing and comprehensive analyses of data. Its approach to BI/DW has always been "enterprise" (statewide) in nature, rather than having separate BI/DW platforms for each business area or state agency. By removing barriers to sharing enterprise data across business units, Michigan has leveraged massive amounts of data to create innovative approaches to the use of BI/DW, delivering efficient, reliable enterprise solutions using multiple channels.
QUESTIONS FOR DISCUSSION
1. Why would a state invest in a large and expensive IT infrastructure (such as an EDW)?
2. What are the size and complexity of the EDW used by state agencies in Michigan?
3. What were the challenges, the proposed solution, and the obtained results of the EDW?
Source: Compiled from TDWI Best Practices Awards 2012 Winner, Enterprise Data Warehousing, Government and Non-Profit Category, "Michigan Departments of Technology, Management & Budget (DTMB), Community Health (DCH), and Human Services (DHS)," featured in TDWI What Works, Vol. 34, p. 22; and michigan.michigan.gov.
Massive Data Warehouses and Scalability
In addition to flexibility, a data warehouse needs to support scalability. The main issues pertaining to scalability are the amount of data in the warehouse, how quickly the warehouse is expected to grow, the number of concurrent users, and the complexity of user queries. A data warehouse must scale both horizontally and vertically. The warehouse will grow as a function of data growth and the need to expand the warehouse to support new business functionality. Data growth may be a result of the addition of current cycle data (e.g., this month's results) and/or historical data.
Hicks (2001) described huge databases and data warehouses. Walmart is continually increasing the size of its massive data warehouse. Walmart is believed to use a warehouse with hundreds of terabytes of data to study sales trends, track inventory, and perform other tasks. IBM recently publicized its 50-terabyte warehouse benchmark (IBM, 2009). The U.S. Department of Defense is using a 5-petabyte data warehouse and repository to hold medical records for 9 million military personnel. Because of the storage required to archive its news footage, CNN also has a petabyte-sized data warehouse.
Given that the size of data warehouses is expanding at an exponential rate, scalability is an important issue. Good scalability means that queries and other data-access functions will grow (ideally) linearly with the size of the warehouse. See Rosenberg (2006) for approaches to improve query performance. In practice, specialized methods have been developed to create scalable data warehouses. Scalability is difficult when managing hundreds of terabytes or more. Terabytes of data have considerable inertia, occupy a lot of physical space, and require powerful computers. Some firms use parallel processing, and others use clever indexing and search schemes to manage their data. Some spread their data across different physical data stores. As more data warehouses approach the petabyte size, better and better solutions to scalability continue to be developed.
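One of the techniques just mentioned, spreading data across different physical stores, can be sketched as simple hash partitioning. This is an illustrative sketch only; the partition key, store count, and hash choice are assumptions, not details from the text.

```python
# Hash partitioning: each record's key is hashed to pick one of N stores,
# so each store holds roughly 1/N of the data and can be queried in parallel.
import hashlib

def store_for(key, n_stores):
    """Pick a store deterministically from a record key."""
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return int(digest, 16) % n_stores

stores = [[] for _ in range(4)]
for customer_id in range(1000):
    stores[store_for(customer_id, 4)].append(customer_id)
# A query router later uses the same hash to find the store for a given key,
# while aggregate queries fan out across all four stores.
```

Because the hash is deterministic, loads and lookups agree on where each key lives without any central directory.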
Hall (2002) also addressed scalability issues. AT&T is an industry leader in deploying and using massive data warehouses. With its 26-terabyte data warehouse, AT&T can detect fraudulent use of calling cards and investigate calls related to kidnappings and other crimes. It can also compute millions of call-in votes from television viewers selecting the next American Idol.

For a sample of successful data warehousing implementations, see Edwards (2003). Jukic and Lang (2004) examined the trends and specific issues related to the use of offshore resources in the development and support of data warehousing and BI applications. Davison (2003) indicated that IT-related offshore outsourcing had been growing at 20 to 25 percent per year. When considering offshoring data warehousing projects, careful consideration must be given to culture and security (for details, see Jukic and Lang, 2004).
SECTION 3.7 REVIEW QUESTIONS
1. What are the major DW implementation tasks that can be performed in parallel?
2. List and discuss the most pronounced DW implementation guidelines.
3. When developing a successful data warehouse, what are the most important risks and issues to consider and potentially avoid?
4. What is scalability? How does it apply to DW?
3.8 REAL-TIME DATA WAREHOUSING
Data warehousing and BI tools traditionally focus on assisting managers in making strategic and tactical decisions. Increased data volumes and accelerating update speeds are fundamentally changing the role of the data warehouse in modern business. For many businesses, making fast and consistent decisions across the enterprise requires more than a traditional data warehouse or data mart. Traditional data warehouses are not business critical. Data are commonly updated on a weekly basis, and this does not allow for responding to transactions in near-real-time.
More data, coming in faster and requiring immediate conversion into decisions, means that organizations are confronting the need for real-time data warehousing. This is because decision support has become operational, integrated BI requires closed-loop analytics, and yesterday's ODS will not support existing requirements.
In 2003, with the advent of real-time data warehousing, there was a shift toward using these technologies for operational decisions. Real-time data warehousing (RDW), also known as active data warehousing (ADW), is the process of loading and providing data via the data warehouse as they become available. It evolved from the EDW concept. The active traits of an RDW/ADW supplement and expand traditional data warehouse functions into the realm of tactical decision making. People throughout the organization who interact directly with customers and suppliers will be empowered with information-based decision making at their fingertips. Even further leverage results when an ADW provides information directly to customers and suppliers. The reach and impact of information access for decision making can positively affect almost all aspects of customer service, SCM, logistics, and beyond. E-business has become a major catalyst in the demand for active data warehousing (see Armstrong, 2000). For example, online retailer Overstock.com, Inc. (overstock.com) connected data users to a real-time data warehouse. At Egg plc, the world's largest purely online bank, a customer data warehouse is refreshed in near-real-time. See Application Case 3.7.
As business needs evolve, so do the requirements of the data warehouse. At the most basic level, a data warehouse simply reports what happened. At the next level, some analysis occurs. As the system evolves, it provides prediction capabilities, which lead to the next level of operationalization. At its highest evolution, the ADW is capable of making events happen (e.g., activities such as creating sales and marketing campaigns or identifying and exploiting opportunities). See Figure 3.12 for a graphic description of this evolutionary process. A recent survey on managing the evolution of data warehouses can be found in Wrembel (2009).

Application Case 3.7
Egg plc Fries the Competition in Near Real Time
Egg plc, now a part of Yorkshire Building Society (egg.com), is the world's largest online bank. It provides banking, insurance, investments, and mortgages to more than 3.6 million customers through its Internet site. In 1998, Egg selected Sun Microsystems to create a reliable, scalable, secure infrastructure to support its more than 2.5 million daily transactions.
In 2001, the system was upgraded to eliminate latency problems. This new customer data warehouse (CDW) used Sun, Oracle, and SAS software products. The initial data warehouse had about 10 terabytes of data and used a 16-CPU server. The system provides near-real-time data access. It provides data warehouse and data mining services to internal users, and it provides a requisite set of customer data to the customers themselves. Hundreds of sales and marketing campaigns are constructed using near-real-time data (within several minutes). Even better, the system enables faster decision making about specific customers and customer classes.
QUESTIONS FOR DISCUSSION
1. What kind of business is Egg plc in? What is the competitive landscape?
2. How did Egg plc use near-real-time data warehousing for competitive advantage?
Sources: Compiled from "Egg's Customer Data Warehouse Hits the Mark," DM Review, Vol. 15, No. 10, October 2005, pp. 24-28; Sun Microsystems, "Egg Banks on Sun to Hit the Mark with Customers," September 19, 2005, sun.com/smi/Press/sunflash/2005-09/sunflash.20050919.1.xml (accessed April 2006); and ZDNet UK, "Sun Case Study: Egg's Customer Data Warehouse," whitepapers.zdnet.co.uk/0,39025945,60159401p-39000449q,00.htm (accessed June 2009).
FIGURE 3.12 Enterprise Decision Evolution. Source: Courtesy of Teradata Corporation. Used with permission. The figure plots workload complexity against data sophistication across five stages: REPORTING (What happened? Primarily batch and some ad hoc reports), ANALYZING (Why did it happen? Segmentation and profiles; increase in ad hoc analysis), PREDICTING (What will happen? Analytical modeling grows; predictive models), OPERATIONALIZING (What is happening now? Continuous update and time-sensitive queries become important; real-time decisioning applications), and ACTIVATING (Make it happen! Event-based triggering takes hold; enterprise decisioning management).

FIGURE 3.13 The Teradata Active EDW. Source: Courtesy of Teradata Corporation. Used with permission. The figure identifies six active traits:
• Active Load: intra-day data acquisition; mini-batch to near-real-time (NRT) trickle data feeds measured in minutes or seconds.
• Active Access: front-line operational decisions or services supported by NRT access; service-level agreements of 5 seconds or less.
• Active Events: proactive monitoring of business activity, initiating intelligent actions based on rules and context, delivered to systems or users supporting an operational business process.
• Active Workload Management: dynamically managing system resources for optimum performance and resource utilization in a mixed-workload environment.
• Active Enterprise Integration: integration into the enterprise architecture for delivery of intelligent decisioning services.
• Active Availability: business continuity to support the requirements of the business (up to 7 × 24).
Teradata Corporation provides the baseline requirements to support an EDW. It also provides the new traits of active data warehousing required to deliver data freshness, performance, and availability and to enable enterprise decision management (see Figure 3.13 for an example).
An ADW offers an integrated information repository to drive strategic and tactical decision support within an organization. With real-time data warehousing, instead of extracting operational data from an OLTP system in nightly batches into an ODS, data are assembled from OLTP systems as and when events happen and are moved at once into the data warehouse. This permits the instant updating of the data warehouse and the elimination of an ODS. At this point, tactical and strategic queries can be made against the RDW to use immediate as well as historical data.
According to Basu (2003), the most distinctive difference between a traditional data warehouse and an RDW is the shift in the data acquisition paradigm. Some of the business cases and enterprise requirements that led to the need for data in real time include the following:
• A business often cannot afford to wait a whole day for its operational data to load into the data warehouse for analysis.
• Until now, data warehouses have captured snapshots of an organization's fixed states instead of incremental real-time data showing every state change and almost analogous patterns over time.
• With a traditional hub-and-spoke architecture, keeping the metadata in sync is difficult. It is also costly to develop, maintain, and secure many systems as opposed to one huge data warehouse so that data are centralized for BI/BA tools.
• In cases of huge nightly batch loads, the necessary ETL setup and processing power for large nightly data warehouse loading might be very high, and the processes might take too long. An EAI with real-time data collection can reduce or eliminate the nightly batch processes.
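The contrast in the last bullet above, huge nightly batches versus EAI-style real-time collection, can be sketched as a mini-batch ("trickle") loader that flushes to the warehouse every few events or seconds. The batch-size and wait-time thresholds below are invented for illustration, not figures from the text.

```python
import time

class TrickleLoader:
    """Buffer operational events and flush them in small, frequent batches."""

    def __init__(self, warehouse, max_batch=100, max_wait_s=5.0):
        self.warehouse = warehouse
        self.max_batch = max_batch
        self.max_wait_s = max_wait_s
        self.buffer = []
        self.last_flush = time.monotonic()

    def on_event(self, event):
        """Called by the EAI layer as each operational event occurs."""
        self.buffer.append(event)
        if (len(self.buffer) >= self.max_batch or
                time.monotonic() - self.last_flush >= self.max_wait_s):
            self.flush()

    def flush(self):
        self.warehouse.extend(self.buffer)   # stand-in for the ETL load step
        self.buffer.clear()
        self.last_flush = time.monotonic()

warehouse = []
loader = TrickleLoader(warehouse, max_batch=3)
for e in range(7):
    loader.on_event({"event_id": e})
loader.flush()   # final drain of any partial batch
```

Compared with one giant nightly load, each flush is small and cheap, and warehouse data are at most seconds to minutes behind the operational systems.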

120 Pan II • Descrip tive Analytics
Despite the benefits of an RDW, developing one can create its own set of issues. These problems relate to architecture, data modeling, physical database design, storage and scalability, and maintainability. In addition, depending on exactly when data are accessed, even down to the microsecond, different versions of the truth may be extracted and created, which can confuse team members. For details, refer to Basu (2003) and Terr (2004).
Real-time solutions present a remarkable set of challenges to BI activities. Although it is not ideal for all solutions, real-time data warehousing may be successful if the organization develops a sound methodology to handle project risks, incorporates proper planning, and focuses on quality-assurance activities. Understanding the common challenges and applying best practices can reduce the extent of the problems that are often a part of implementing complex data warehousing systems that incorporate BI/BA methods. Details and real implementations are discussed by Burdett and Singh (2004) and Wilk (2003). Also see Akbay (2006) and Ericson (2006).
See Technology Insights 3.3 for some details on how the real-time concept evolved. The flight management dashboard application at Continental Airlines (see the End-of-Chapter Application Case) illustrates the power of real-time BI in accessing a data warehouse for use in face-to-face customer interaction situations. The operations staff uses the real-time system to identify issues in the Continental flight network. As another example, UPS invested $600 million so it could use real-time data and processes. The investment was expected to cut 100 million delivery miles and save 14 million gallons of fuel annually by managing its real-time package-flow technologies (see Malykhina, 2003). Table 3.6 compares traditional and active data warehousing environments.
Real-time data warehousing, near-real-time data warehousing, zero-latency warehousing, and active data warehousing are different names used in practice to describe the same concept. Gonzales (2005) presented different definitions for ADW. According to Gonzales, ADW is only one option that provides blended tactical and strategic data on demand. The architecture to build an ADW is very similar to the corporate information factory architecture developed by Bill Inmon. The only difference between a corporate information factory and an ADW is the implementation of both data stores in a single
TECHNOLOGY INSIGHTS 3.3 The Real-Time Realities of Active Data Warehousing
By 2003, the role of data warehousing in practice was growing rapidly. Real-time systems, though a novelty, were the latest buzz, along with the major complications of providing data and information instantaneously to those who need them. Many experts, including Peter Coffee, eWeek's technology editor, believe that real-time systems must feed a real-time decision-making process. Stephen Brobst, CTO of the Teradata division of NCR, indicated that active data warehousing is a process of evolution in how an enterprise uses data. Active means that the data warehouse is also used as an operational and tactical tool. Brobst provided a five-stage model that fits Coffee's experience (2003) of how organizations "grow" in their data utilization (see Brobst et al., 2005). These stages (and the questions they purport to answer) are reporting (What happened?), analysis (Why did it happen?), prediction (What will happen?), operationalizing (What is happening?), and active warehousing (What do I want to happen?). The last stage, active warehousing, is where the greatest benefits may be obtained. Many organizations are enhancing centralized data warehouses to serve both operational and strategic decision making.
Sources: Adapted from P. Coffee, "'Active' Warehousing," eWeek, Vol. 20, No. 25, June 23, 2003, p. 36; and Teradata Corp., "Active Data Warehousing," teradata.com/active-data-warehousing/ (accessed August 2013).

TABLE 3.6 Comparison Between Traditional and Active Data Warehousing Environments
Traditional Data Warehouse Environment vs. Active Data Warehouse Environment:
• Strategic decisions only vs. strategic and tactical decisions.
• Results sometimes hard to measure vs. results measured with operations.
• Daily, weekly, monthly data currency acceptable, with summaries often appropriate, vs. only comprehensive detailed data available within minutes is acceptable.
• Moderate user concurrency vs. a high number (1,000 or more) of users accessing and querying the system simultaneously.
• Highly restrictive reporting used to confirm or check existing processes and patterns, often using predeveloped summary tables or data marts, vs. flexible ad hoc reporting, as well as machine-assisted modeling (e.g., data mining) to discover new hypotheses and relationships.
• Power users, knowledge workers, and internal users vs. operational staffs, call centers, and external users.
Sources: Adapted from P. Coffee, "'Active' Warehousing," eWeek, Vol. 20, No. 25, June 23, 2003, p. 36; and Teradata Corp., "Active Data Warehousing," teradata.com/active-data-warehousing/ (accessed August 2013).
environment. However, an SOA based on XML and Web services provides another option for blending tactical and strategic data on demand.
One critical issue in real-time data warehousing is that not all data should be updated continuously. This may certainly cause problems when reports are generated in real time, because one person's results may not match another person's. For example, a company using Business Objects Web Intelligence noticed a significant problem with real-time intelligence. Real-time reports produced at slightly different times differ (see Peterson, 2003). Also, it may not be necessary to update certain data continuously (e.g., course grades that are 3 or more years old).
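One common remedy for the mismatched-reports problem just described is to stamp each loaded row with a load time and have every report in a run query "as of" the same watermark, so continuously arriving rows cannot change results mid-run. This is a minimal sketch of that idea, with hypothetical field names, not a description of the Business Objects product.

```python
def report_total(rows, as_of):
    """Sum amounts over rows loaded at or before the watermark."""
    return sum(r["amount"] for r in rows if r["loaded_at"] <= as_of)

rows = [
    {"amount": 10.0, "loaded_at": 100},
    {"amount": 5.0,  "loaded_at": 101},
]
watermark = 101                                   # fixed for the whole report run
first = report_total(rows, watermark)
rows.append({"amount": 7.0, "loaded_at": 102})    # a row arrives mid-run
second = report_total(rows, watermark)            # queried against the same watermark
# first == second: both reports agree despite the newly loaded row.
```

Two people running the "same" report seconds apart now see identical numbers, because both queries are pinned to the same load watermark.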
Real-time requirements change the way we view the design of databases, data warehouses, OLAP, and data mining tools because they are literally updated concurrently while queries are active. But the substantial business value in doing so has been demonstrated, so it is crucial that organizations adopt these methods in their business processes. Careful planning is critical in such implementations.
SECTION 3.8 REVIEW QUESTIONS
1. What is an RDW?
2. List the benefits of an RDW.
3. What are the major differences between a traditional data warehouse and an RDW?
4. List some of the drivers for RDW.
3.9 DATA WAREHOUSE ADMINISTRATION, SECURITY ISSUES,
AND FUTURE TRENDS
Data warehouses provide a distinct competitive edge to enterprises that effectively create and use them. Due to its huge size and its intrinsic nature, a data warehouse requires especially strong monitoring in order to sustain satisfactory efficiency and productivity. The successful administration and management of a data warehouse entails skills and proficiency that go past what is required of a traditional database administrator (DBA).

A data warehouse administrator (DWA) should be familiar with high-performance software, hardware, and networking technologies. He or she should also possess solid business insight. Because data warehouses feed BI systems and DSS that help managers with their decision-making activities, the DWA should be familiar with the decision-making processes so as to suitably design and maintain the data warehouse structure. It is particularly significant for a DWA to keep the existing requirements and capabilities of the data warehouse stable while simultaneously providing flexibility for rapid improvements. Finally, a DWA must possess excellent communications skills. See Benander et al. (2000) for a description of the key differences between a DBA and a DWA.
Security and privacy of information are primary concerns for a data warehouse professional. The U.S. government has passed regulations (e.g., the Gramm-Leach-Bliley privacy and safeguards rules, the Health Insurance Portability and Accountability Act of 1996 [HIPAA]), instituting obligatory requirements in the management of customer information. Hence, companies must create security procedures that are effective yet flexible enough to conform to numerous privacy regulations. According to Elson and Leclerc (2005), effective security in a data warehouse should focus on four main areas:
1. Establishing effective corporate and security policies and procedures. An effective security policy should start at the top, with executive management, and should be communicated to all individuals within the organization.
2. Implementing logical security procedures and techniques to restrict access. This includes user authentication, access controls, and encryption technology.
3. Limiting physical access to the data center environment.
4. Establishing an effective internal control review process with an emphasis on security and privacy.
See Technology Insights 3.4 for a description of Ambeo's important software tool that monitors security and privacy of data warehouses. Finally, keep in mind that accessing a data warehouse via a mobile device should always be performed cautiously. In this instance, data should only be accessed as read-only.
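As a concrete sketch of that read-only guidance, the snippet below opens a connection that rejects writes. It uses SQLite from the Python standard library purely as a stand-in, and the warehouse.db file name is made up for the demo; enterprise warehouses achieve the same effect with SELECT-only database accounts.

```python
# Illustrative only: enforce read-only access at the connection level,
# so an accidental write from a mobile or ad hoc client fails fast.
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "warehouse.db")  # hypothetical warehouse file

# Set up a tiny "warehouse" with one fact table.
rw = sqlite3.connect(path)
rw.execute("CREATE TABLE sales (amount REAL)")
rw.execute("INSERT INTO sales VALUES (99.0)")
rw.commit()
rw.close()

# Open read-only via a URI: queries work, writes are rejected.
ro = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
print(ro.execute("SELECT amount FROM sales").fetchone())  # (99.0,)
try:
    ro.execute("INSERT INTO sales VALUES (1.0)")
except sqlite3.OperationalError as e:
    print("write rejected:", e)
ro.close()
```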
TECHNOLOGY INSIGHTS 3.4 Ambeo Delivers Proven Data-Access
Auditing Solution
Since 1997, Ambeo (ambeo.com; now Embarcadero Technologies, Inc.) has deployed technology that provides performance management, data usage tracking, data privacy auditing, and monitoring to Fortune 1000 companies. These firms have some of the largest database environments in existence. Ambeo data-access auditing solutions play a major role in an enterprise information security infrastructure.
The Ambeo technology is a relatively easy solution that records everything that happens in the databases, with low or zero overhead. In addition, it provides data-access auditing that identifies exactly who is looking at data, when they are looking, and what they are doing with the data. This real-time monitoring helps quickly and effectively identify security breaches.
Sources: Adapted from "Ambeo Delivers Proven Data Access Auditing Solution," Database Trends and Applications, Vol. 19, No. 7, July 2005; and Ambeo, "Keeping Data Private (and Knowing It): Moving Beyond Conventional Safeguards to Ensure Data Privacy," am-beo.com/why_ambeo_white_papers.html (accessed May 2009).

Chapter 3 • Data Warehousing 123

In the near term, data warehousing developments will be determined by noticeable factors (e.g., data volumes, increased intolerance for latency, the diversity and complexity of data types) and less noticeable factors (e.g., unmet end-user requirements for dashboards, balanced scorecards, master data management, information quality). Given these drivers, Moseley (2009) and Agosta (2006) suggested that data warehousing trends will lean toward simplicity, value, and performance.
The Future of Data Warehousing
The field of data warehousing h as been a vibrant area in information technology in the
last couple of decades, and the evide nce in the BI/BA and Big Data world sh ows that
the importan ce of the field w ill only get even more inte resting. Following are some of th e
recently popularized concepts a nd techno logies that w ill play a sig nificant role in defining
the future of data wareho using .
Sourcing (mechanisms for acquisition of data from diverse and dispersed sources):
• Web, social media, and Big Data. The recent upsurge in the use of the Web for personal as well as business purposes, coupled with the tremendous interest in social media, creates opportunities for analysts to tap into very rich data sources. Because of the sheer volume, velocity, and variety of the data, a new term, Big Data, has been coined to name the phenomenon. Taking advantage of Big Data requires development of new and dramatically improved BI/BA technologies, which will result in a revolutionized data warehousing world.
• Open source software. Use of open source software tools is increasing at an unprecedented level in warehousing, business intelligence, and data integration. There are good reasons for the upswing of open source software used in data warehousing (Russom, 2009): (1) The recession has driven up interest in low-cost open source software; (2) open source tools are coming into a new level of maturity; and (3) open source software augments traditional enterprise software without replacing it.
• SaaS (software as a service), "the extended ASP model." SaaS is a creative way of deploying information system applications where the provider licenses its applications to customers for use as a service on demand (usually over the Internet). SaaS software vendors may host the application on their own servers or upload the application to the consumer site. In essence, SaaS is the new and improved version of the ASP model. For data warehouse customers, finding SaaS-based software applications and resources that meet specific needs and requirements can be challenging. As these software offerings become more agile, the appeal and the actual use of SaaS as the data warehousing platform of choice will also increase.
• Cloud computing. Cloud computing is perhaps the newest and most innovative platform choice to come along in years. Numerous hardware and software resources are pooled and virtualized, so that they can be freely allocated to applications and software platforms as resources are needed. This enables information system applications to dynamically scale up as workloads increase. Although cloud computing and similar virtualization techniques are fairly well established for operational applications today, they are just now starting to be used as data warehouse platforms of choice. The dynamic allocation of a cloud is particularly useful when the data volume of the warehouse varies unpredictably, making capacity planning difficult.
Infrastructure (architectural hardware and software enhancements):
• Columnar (a new way to store and access data in the database). A column-oriented database management system (also commonly called a columnar database) is a system that stores data tables as sections of columns of data rather than as rows of data (which is the way most relational database management systems do it). That is, columnar databases store data by columns instead of rows (all values of a single column are stored consecutively on disk). Such a structure gives a much finer grain of control to the relational database management system: it can access only the columns required for the query, as opposed to being forced to access all columns of the row. It performs significantly better for queries that need a small percentage of the columns in the tables, but significantly worse when most of the columns are needed, due to the overhead of stitching the columns back together to form the result sets. Comparisons between row-oriented and column-oriented data layouts are typically concerned with the efficiency of hard-disk access for a given workload, which happens to be one of the most time-consuming operations in a computer. Depending on the task at hand, one layout can be significantly more advantageous than the other. Column-oriented organizations are more efficient when (1) an aggregate needs to be computed over many rows but only for a notably smaller subset of all columns of data, because reading that smaller subset can be faster than reading all data, and (2) new values of a column are supplied for all rows at once, because that column data can be written efficiently and replace old column data without touching any other columns for the rows. Row-oriented organizations are more efficient when (1) many columns of a single row are required at the same time, and when row size is relatively small, as the entire row can be retrieved with a single disk seek, and (2) writing a new row when all of the column data is supplied at the same time, as the entire row can be written with a single disk seek. Additionally, since the data stored in a column is of uniform type, it lends itself better to compression. That is, significant storage size optimization is available in column-oriented data that is not available in row-oriented data. Such optimal compression of data reduces storage size, making it more economically justifiable to pursue in-memory or solid state storage alternatives.
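The row-versus-column trade-off described above can be sketched in a few lines. This is a toy illustration, not any vendor's implementation; the table names and data are invented for the demo.

```python
# Row-oriented layout: each record is stored (and must be read) whole.
rows = [
    {"customer": "A", "region": "East", "sales": 120.0},
    {"customer": "B", "region": "West", "sales": 340.5},
    {"customer": "C", "region": "East", "sales": 75.25},
]

# Column-oriented layout: each attribute is stored contiguously.
columns = {
    "customer": ["A", "B", "C"],
    "region":   ["East", "West", "East"],
    "sales":    [120.0, 340.5, 75.25],
}

# Aggregating one column: the row layout touches every field of every
# record, while the columnar layout reads only the "sales" vector.
total_from_rows = sum(r["sales"] for r in rows)
total_from_cols = sum(columns["sales"])
assert total_from_rows == total_from_cols == 535.75

# Uniform type per column also compresses well; e.g., run-length
# encoding a low-cardinality column like "region":
def run_length_encode(values):
    out = []
    for v in values:
        if out and out[-1][0] == v:
            out[-1][1] += 1   # extend the current run
        else:
            out.append([v, 1])  # start a new run
    return out

print(run_length_encode(sorted(columns["region"])))  # [['East', 2], ['West', 1]]
```

In a real engine the two layouts differ in disk pages read, not Python objects touched, but the asymmetry is the same: the columnar scan for `SUM(sales)` never fetches customer names or regions at all.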
• Real-time data warehousing. Real-time data warehousing implies that the refresh cycle of an existing data warehouse updates the data more frequently (almost at the same time as the data becomes available at operational databases). These real-time data warehouse systems can achieve near-real-time update of data, where the data latency typically is in the range of minutes to hours. As the latency gets smaller, the cost of data update seems to increase exponentially. Future advancements on many technological fronts (ranging from automatic data acquisition to intelligent software agents) are needed to make real-time data warehousing a reality with an affordable price tag.
• Data warehouse appliances (all-in-one solutions to DW). A data warehouse appliance consists of an integrated set of servers, storage, operating system(s), database management systems, and software specifically preinstalled and preoptimized for data warehousing. In practice, data warehouse appliances provide solutions for the mid-to-big data warehouse market, offering low-cost performance on data volumes in the terabyte to petabyte range. In order to improve performance, most data warehouse appliance vendors use massively parallel processing architectures. Even though most database and data warehouse vendors provide appliances nowadays, many believe that Teradata was the first to provide a commercial data warehouse appliance product. What is often observed now is the emergence of data warehouse bundles, where vendors combine their hardware and database software as a data warehouse platform. From a benefits standpoint, data warehouse appliances have significantly low total cost of ownership, which includes initial purchase costs, ongoing maintenance costs, and the cost of changing capacity as the data grows. The resource cost for monitoring and tuning the data warehouse makes up a large part of the total cost of ownership, often as much as 80 percent. DW appliances reduce administration for day-to-day operations, setup, and integration. Since data warehouse appliances provide a single-vendor solution, they tend to better optimize the hardware and software within the appliance. Such unified integration maximizes the chances of successful integration and testing of the DBMS, storage, and operating system by avoiding some of the compatibility issues that arise from multi-vendor solutions. A data warehouse appliance also provides a single point of contact for problem resolution and a much simpler upgrade path for both software and hardware.
• Data management technologies and practices. Some of the most pressing needs for a next-generation data warehouse platform involve technologies and practices that we generally don't think of as part of the platform. In particular, many users need to update the data management tools that process data for use through data warehousing. The future holds strong growth for master data management (MDM). This relatively new, but extremely important, concept is gaining popularity for many reasons, including the following: (1) Tighter integration with operational systems demands MDM; (2) most data warehouses still lack MDM and data quality functions; and (3) regulatory and financial reports must be perfectly clean and accurate.
• In-database processing technology (putting the algorithms where the data is). In-database processing (also called in-database analytics) refers to the integration of the algorithmic extent of data analytics into the data warehouse. By doing so, the data and the analytics that work off the data live within the same environment. Having the two in close proximity increases the efficiency of computationally intensive analytics procedures. Today, many large database-driven decision support systems, such as those used for credit card fraud detection and investment risk management, use this technology because it provides significant performance improvements over traditional methods in a decision environment where time is of the essence. In-database processing is a complex endeavor compared to the traditional way of conducting analytics, where the data is moved out of the database (often in a flat file format that consists of rows and columns) into a separate analytics environment (such as SAS Enterprise Miner, Statistica Data Miner, or IBM SPSS Modeler) for processing. In-database processing makes more sense for high-throughput, real-time application environments, including fraud detection, credit scoring, risk management, transaction processing, pricing and margin analysis, usage-based micro-segmenting, behavioral ad targeting, and recommendation engines, such as those used by customer service organizations to determine next-best actions. In-database processing is performed and promoted as a feature by many of the major data warehousing vendors, including Teradata (integrating SAS analytics capabilities into the data warehouse appliances), IBM Netezza, EMC Greenplum, and Sybase, among others.
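The contrast between the traditional extract-then-analyze route and the in-database route can be sketched with SQLite from the Python standard library. The transaction table and the averaging "analytic" are hypothetical stand-ins; commercial engines push far richer algorithms into the warehouse, but the shape of the trade is the same.

```python
# Minimal sketch of the in-database idea: ship the computation to the
# data instead of shipping the data to the computation.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE txn (account INTEGER, amount REAL)")
con.executemany("INSERT INTO txn VALUES (?, ?)",
                [(1, 50.0), (1, 900.0), (2, 20.0), (2, 25.0)])

# Traditional route: extract every row into the analytics environment,
# then compute there (all the data crosses the boundary).
extracted_rows = con.execute("SELECT account, amount FROM txn").fetchall()
groups = {}
for acct, amt in extracted_rows:
    groups.setdefault(acct, []).append(amt)
extracted = {a: sum(v) / len(v) for a, v in groups.items()}

# In-database route: the engine computes the aggregate; only the
# small result set crosses the boundary.
in_db = dict(con.execute(
    "SELECT account, AVG(amount) FROM txn GROUP BY account"))

assert extracted == in_db  # same answer, far less data movement
print(in_db)  # {1: 475.0, 2: 22.5}
```

Only the two-row result set leaves the database in the second query, which is the whole point when the underlying table holds billions of rows.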
• In-memory storage technology (moving the data into memory for faster processing). Conventional database systems, such as relational database management systems, typically use physical hard drives to store data for an extended period of time. When a data-related process is requested by an application, the database management system loads the data (or parts of the data) into the main memory, processes it, and responds back to the application. Although data (or parts of the data) is temporarily cached in the main memory in a database management system, the primary storage location remains a magnetic hard disk. In contrast, an in-memory database system keeps the data permanently in the main memory. When a data-related process is requested by an application, the database management system directly accesses the data, which is already in the main memory, processes it, and responds back to the requesting application. This direct access to data in main memory makes the processing of data orders of magnitude faster than the traditional method. The main benefit of in-memory technology (maybe its only benefit) is the incredible speed at which it accesses the data. The disadvantages include the cost of a very large main memory (even though it is getting cheaper, it still costs a great deal to have a main memory large enough to hold all of a company's data) and the need for sophisticated data recovery strategies (since main memory is volatile and can be wiped out accidentally).
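The volatility trade-off noted above can be demonstrated in miniature with SQLite's in-memory mode from the Python standard library. This is illustrative only; dedicated in-memory DBMSs pair RAM speed with their own snapshot and write-ahead-log recovery schemes rather than relying on RAM alone.

```python
# In-memory vs. disk-backed storage: speed vs. durability.
import os
import sqlite3
import tempfile

# In-memory: direct access, but the data lives only as long as
# the connection does.
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE t (v INTEGER)")
mem.execute("INSERT INTO t VALUES (42)")
assert mem.execute("SELECT v FROM t").fetchone() == (42,)
mem.close()  # everything held in ':memory:' is gone now

# Disk-backed: a slower I/O path, but the data survives the connection.
path = os.path.join(tempfile.mkdtemp(), "warehouse.db")  # hypothetical file
disk = sqlite3.connect(path)
disk.execute("CREATE TABLE t (v INTEGER)")
disk.execute("INSERT INTO t VALUES (42)")
disk.commit()
disk.close()

reopened = sqlite3.connect(path)
assert reopened.execute("SELECT v FROM t").fetchone() == (42,)  # still there
reopened.close()
```

This is why in-memory warehouse platforms must invest in recovery machinery: the speed comes from skipping exactly the persistence step that makes the disk-backed copy survive.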
• New database management systems. A data warehouse platform consists of several basic components, of which the most critical is the database management system (DBMS). This is only natural, given that the DBMS is the component of the platform where the most work must be done to implement a data model and optimize it for query performance. Therefore, the DBMS is where many next-generation innovations are expected to happen.
• Advanced analytics. Users can choose different analytic methods as they move beyond basic OLAP-based methods and into advanced analytics. Some users choose advanced analytic methods based on data mining, predictive analytics, statistics, artificial intelligence, and so on. Still, the majority of users seem to be choosing SQL-based methods. SQL-based or not, advanced analytics seem to be among the most important promises of next-generation data warehousing.
The future of data warehousing seems to be full of promise and significant challenges. As the world of business becomes more global and complex, the need for business intelligence and data warehousing tools will become more prominent. The fast-improving information technology tools and techniques seem to be moving in the right direction to address the needs of future business intelligence systems.
SECTION 3.9 REVIEW QUESTIONS
1. What steps can an organization take to ensure the security and confidentiality of customer data in its data warehouse?
2. What skills should a DWA possess? Why?
3. What recent technologies may shape the future of data warehousing? Why?
3.10 RESOURCES, LINKS, AND THE TERADATA UNIVERSITY
NETWORK CONNECTION
The use of this chapter and most other chapters in this book can be enhanced by the tools described in the following sections.
Resources and Links
We recommend looking at the following resources and links for further reading and explanations:
• The Data Warehouse Institute (tdwi.org)
• DM Review (information-management.com)
• DSS Resources (dssresources.com)
Cases
All major MSS vendors (e.g., MicroStrategy, Microsoft, Oracle, IBM, Hyperion, Cognos, Exsys, Fair Isaac, SAP, Information Builders) provide interesting customer success stories. Academic-oriented cases are available at the Harvard Business School Case Collection (harvardbusinessonline.hbsp.harvard.edu), Business Performance Improvement Resource (bpir.com), IGI Global Disseminator of Knowledge (igi-global.com), Ivy League Publishing (ivylp.com), ICFAI Center for Management Research (icmr.icfai.org/casestudies/icmr_case_studies.htm), KnowledgeStorm (knowledgestorm.com), and other sites. For additional case resources, see Teradata University Network (teradatauniversitynetwork.com). For data warehousing cases, we specifically recommend the following from the Teradata University Network (teradatauniversitynetwork.com): "Continental Airlines Flies High with Real-Time Business Intelligence," "Data Warehouse Governance at Blue Cross and Blue Shield of North Carolina," "3M Moves to a Customer Focus Using a Global Data Warehouse," "Data Warehousing Supports Corporate Strategy at First American Corporation," "Harrah's High Payoff from Customer Information," and "Whirlpool." We also recommend the Data Warehousing Failures Assignment, which consists of eight short cases on data warehousing failures.
Vendors, Products, and Demos
A comprehensive list of vendors, products, and demos is available at DM Review (dmreview.com). Vendors are listed in Table 3.2. Also see technologyevaluation.com.
Periodicals
We recommend the following periodicals:
• Baseline (baselinemag.com)
• Business Intelligence Journal (tdwi.org)
• CIO (cio.com)
• CIO Insight (cioinsight.com)
• Computerworld (computerworld.com)
• Decision Support Systems (elsevier.com)
• DM Review (dmreview.com)
• eWeek (eweek.com)
• Info Week (infoweek.com)
• InfoWorld (infoworld.com)
• InternetWeek (internetweek.com)
• Management Information Systems Quarterly (MIS Quarterly; misq.org)
• Technology Evaluation (technologyevaluation.com)
• Teradata Magazine (teradata.com)
Additional References
For additional information on data warehousing, see the following:
• C. Imhoff, N. Galemmo, and J. G. Geiger. (2003). Mastering Data Warehouse Design: Relational and Dimensional Techniques. New York: Wiley.
• D. Marco and M. Jennings. (2004). Universal Meta Data Models. New York: Wiley.
• J. Wang. (2005). Encyclopedia of Data Warehousing and Mining. Hershey, PA: Idea Group Publishing.
For more on databases, the structure on which data warehouses are developed, see the following:
• R. T. Watson. (2006). Data Management, 5th ed. New York: Wiley.
The Teradata University Network (TUN) Connection
TUN (teradatauniversitynetwork.com) provides a wealth of information and cases on data warehousing. One of the best is the Continental Airlines case, which we require you to solve in a later exercise. Other recommended cases are mentioned earlier in this chapter. At TUN, if you click the Courses tab and select Data Warehousing, you will see links to many relevant articles, assignments, book chapters, course Web sites, PowerPoint presentations, projects, research reports, syllabi, and Web seminars. You will also find links to active data warehousing software demonstrations. Finally, you will see links to Teradata (teradata.com), where you can find additional information, including excellent data warehousing success stories, white papers, Web-based courses, and the online version of Teradata Magazine.
Chapter Highlights
• A data warehouse is a specially constructed data repository where data are organized so that they can be easily accessed by end users for several applications.
• Data marts contain data on one topic (e.g., marketing). A data mart can be a replication of a subset of data in the data warehouse. Data marts are a less expensive solution that can be replaced by or can supplement a data warehouse. Data marts can be independent of or dependent on a data warehouse.
• An ODS is a type of customer-information-file database that is often used as a staging area for a data warehouse.
• Data integration comprises three major processes: data access, data federation, and change capture. When these three processes are correctly implemented, data can be accessed and made accessible to an array of ETL and analysis tools and data warehousing environments.
• ETL technologies pull data from many sources, cleanse them, and load them into a data warehouse. ETL is an integral process in any data-centric project.
• Real-time or active data warehousing supplements and expands traditional data warehousing, moving into the realm of operational and tactical decision making by loading data in real time and providing data to users for active decision making.
• The security and privacy of data and information are critical issues for a data warehouse professional.
Key Terms
active data warehousing (ADW)
cube
data integration
data mart
data warehouse (DW)
data warehouse administrator (DWA)
dependent data mart
dimensional modeling
dimension table
drill down
enterprise application integration (EAI)
enterprise data warehouse (EDW)
enterprise information integration (EII)
extraction, transformation, and load (ETL)
independent data mart
metadata
OLTP
oper mart
operational data store (ODS)
real-time data warehousing (RDW)
snowflake schema
star schema
Questions for Discussion
1. Compare data integration and ETL. How are they related?
2. What is a data warehouse, and what are its benefits? Why is Web accessibility important with a data warehouse?
3. A data mart can replace a data warehouse or complement it. Compare and discuss these options.
4. Discuss the major drivers and benefits of data warehousing to end users.
5. List the differences and/or similarities between the roles of a database administrator and a data warehouse administrator.

6. Describe how data integration can lead to higher levels of data quality.
7. Compare the Kimball and Inmon approaches toward data warehouse development. Identify when each one is most effective.
8. Discuss security concerns involved in building a data warehouse.
Exercises
Teradata University and Other Hands-On Exercises
1. Consider the case describing the development and application of a data warehouse for Coca-Cola Japan (a summary appears in Application Case 3.4), available at the DSS Resources Web site, http://dssresources.com/cases/coca-colajapan/. Read the case and answer the nine questions for further analysis and discussion.
2. Read the Ball (2005) article and rank-order the criteria (ideally for a real organization). In a report, explain how important each criterion is and why.
3. Explain when you should implement a two- or three-tiered architecture when considering developing a data warehouse.
4. Read the full Continental Airlines case (summarized in the End-of-Chapter Application Case) at teradatauniversitynetwork.com and answer the questions.
5. At teradatauniversitynetwork.com, read and answer the questions to the case "Harrah's High Payoff from Customer Information." Relate Harrah's results to how airlines and other casinos use their customer data.
6. At teradatauniversitynetwork.com, read and answer the questions of the assignment "Data Warehousing Failures." Because eight cases are described in that assignment, the class may be divided into eight groups, with one case assigned per group. In addition, read Ariyachandra and Watson (2006a), and for each case identify how the failure occurred as related to not focusing on one or more of the reference's success factor(s).
7. At teradatauniversitynetwork.com, read and answer the questions with the assignment "Ad-Vent Technology: Using the MicroStrategy Sales Analytic Model." The MicroStrategy software is accessible from the TUN site. Also, you might want to use Barbara Wixom's PowerPoint presentation about the MicroStrategy software ("Demo Slides for MicroStrategy Tutorial Script"), which is also available at the TUN site.
8. At teradatauniversitynetwork.com, watch the Web seminars titled "Real-Time Data Warehousing: The Next Generation of Decision Support Data Management" and "Building the Real-Time Enterprise." Read the article "Teradata's Real-Time Enterprise Reference Architecture: A Blueprint for the Future of IT," also available at this site. Describe how real-time concepts and technologies work and how they can be used to extend existing data warehousing and BI architectures to support day-to-day decision making. Write a report indicating how real-time data warehousing is specifically providing competitive advantage for organizations. Describe in detail the difficulties in such implementations and operations and describe how they are being addressed in practice.
9. At teradatauniversitynetwork.com, watch the Web seminars "Data Integration Renaissance: New Drivers and Emerging Approaches," "In Search of a Single Version of the Truth: Strategies for Consolidating Analytic Silos," and "Data Integration: Using ETL, EAI, and EII Tools to Create an Integrated Enterprise." Also read the "Data Integration" research report. Compare and contrast the presentations. What is the most important issue described in these seminars? What is the best way to handle the strategies and challenges of consolidating data marts and spreadsheets into a unified data warehousing architecture? Perform a Web search to identify the latest developments in the field. Compare the presentation to the material in the text and the new material that you found.
10. Consider the future of data warehousing. Perform a Web search on this topic. Also, read these two articles: L. Agosta, "Data Warehousing in a Flat World: Trends for 2006," DM Direct Newsletter, March 31, 2006; and J. G. Geiger, "CIFe: Evolving with the Times," DM Review, November 2005, pp. 38-41. Compare and contrast your findings.
11. Access teradatauniversitynetwork.com. Identify the latest articles, research reports, and cases on data warehousing. Describe recent developments in the field. Include in your report how data warehousing is used in BI and DSS.
12. Investigate current data warehouse development implementation through offshoring. Write a report about it. In class, debate the issue in terms of the benefits and costs, as well as social factors.
Team Assignments and Role-Playing Projects
1. Kathryn Avery has been a DBA with a nationwide retail chain (Big Chain) for the past 6 years. She has recently been asked to lead the development of Big Chain's first data warehouse. The project has the sponsorship of senior management and the CIO. The rationale for developing the data warehouse is to advance the reporting systems, particularly in sales and marketing, and, in the longer term, to improve Big Chain's CRM. Kathryn has been to a Data Warehousing Institute conference and has been doing some reading, but she is still mystified about development methodologies. She knows there are two groups, EDW (Inmon) and architected data marts (Kimball), that have robust features.
Initially, she believed that the two methodologies were extremely dissimilar, but as she has examined them more carefully, she isn't so certain. Kathryn has a number of questions that she would like answered:
a. What are the real differences between the methodologies?
b. What factors are important in selecting a particular methodology?
c. What should be her next steps in thinking about a methodology?
Help Kathryn answer these questions. (This exercise was adapted from K. Duncan, L. Reeves, and J. Griffin, "BI Experts' Perspective," Business Intelligence Journal, Vol. 8, No. 4, Fall 2003, pp. 14-19.)
2. Jeet Kumar is the administrator of data warehousing at a big regional bank. He was appointed 5 years ago to implement a data warehouse to support the bank's CRM business strategy. Using the data warehouse, the bank has been successful in integrating customer information, understanding customer profitability, attracting customers, enhancing customer relationships, and retaining customers.
Over the years, the bank's data warehouse has moved closer to real time through more frequent refreshes of the data warehouse. Now, the bank wants to implement customer self-service and call center applications that require even fresher data than is currently available in the warehouse.
Jeet wants some support in considering the possibilities for presenting fresher data. One alternative is to commit entirely to implementing real-time data warehousing. His ETL vendor is prepared to assist him in making this change. Nevertheless, Jeet has been informed about EAI and EII technologies and wonders how they might fit into his plans.
In particular, he has the following questions:
a. What exactly are EAI and EII technologies?
b. How are EAI and EII related to ETL?
c. How are EAI and EII related to real-time data warehousing?
d. Are EAI and EII required, complementary, or alternatives to real-time data warehousing?
Help Jeet answer these questions. (This exercise was adapted from S. Brobst, E. Levy, and C. Muzilla, "Enterprise Application Integration and Enterprise Information Integration," Business Intelligence Journal, Vol. 10, No. 2, Spring 2005, pp. 27-33.)
3. Interview administrators in your college o r executives
in your organization to determine how data warehous-
ing could assist them in their work. Write a proposal
describing your findings. Include cost estimates and ben-
efits in you r report.
4. Go through the list of data warehousing risks described
in this chapter and find two examples of each in practice.
5. Access teradata.com and read the white papers "Measuring Data Warehouse ROI" and "Realizing ROI: Projecting and Harvesting the Business Value of an Enterprise Data Warehouse." Also, watch the Web-based course "The ROI Factor: How Leading Practitioners Deal with the Tough Issue of Measuring DW ROI." Describe the most important issues discussed in them. Compare these issues to the success factors described in Ariyachandra and Watson (2006a).
6. Read the article by K. Liddell Avery and Hugh J. Watson, "Training Data Warehouse End Users," Business Intelligence Journal, Vol. 9, No. 4, Fall 2004, pp. 40-51 (available at teradatauniversitynetwork.com). Consider the different classes of end users, describe their difficulties, and discuss the benefits of appropriate training for each group. Have each member of the group take on one of the roles and discuss how an appropriate type of data warehousing training would benefit each of you.
Internet Exercises
1. Search the Internet to find information about data warehousing. Identify some newsgroups that have an interest in this concept. Explore ABI/INFORM in your library, e-library, and Google for recent articles on the topic. Begin with tdwi.org, technologyevaluation.com, and the major vendors: teradata.com, sas.com, oracle.com, and ncr.com. Also check do.com, information-management.com, dssresources.com, and db2mag.com.
2. Survey some ETL tools and vendors. Start with fairisaac.com and egain.com. Also consult information-management.com.
3. Contact some data warehouse vendors and obtain information about their products. Give special attention to vendors that provide tools for multiple purposes, such as Cognos, Software AG, SAS Institute, and Oracle. Free online demos are available from some of these vendors. Download a demo or two and try them. Write a report describing your experience.
4. Explore teradata.com for developments and success stories about data warehousing. Write a report about what you have discovered.
5. Explore teradata.com for white papers and Web-based courses on data warehousing. Read the former and watch the latter. (Divide the class so that all the sources are covered.) Write what you have discovered in a report.
6. Find recent cases of successful data warehousing applications. Go to data warehouse vendors' sites and look for cases or success stories. Select one and write a brief summary to present to your class.

Chapter 3 • Data Warehousing 131
End-of-Chapter Application Case
Continental Airlines Flies High with Its Real-Time Data Warehouse
As business intelligence (BI) becomes a critical component of daily operations, real-time data warehouses that provide end users with rapid updates and alerts generated from transactional systems are increasingly being deployed. Real-time data warehousing and BI, supporting its aggressive Go Forward business plan, have helped Continental Airlines alter its industry status from "worst to first" and then from "first to favorite." Continental Airlines (now a part of United Airlines) is a leader in real-time DW and BI. In 2004, Continental won the Data Warehousing Institute's Best Practices and Leadership Award. Even though it has been a while since Continental Airlines deployed its hugely successful real-time DW and BI infrastructure, it is still regarded as one of the best examples and a seminal success story for real-time active data warehousing.
Problem(s)
Continental Airlines was founded in 1934, with a single-engine Lockheed aircraft in the southwestern United States. As of 2006, Continental was the fifth largest airline in the United States and the seventh largest in the world. Continental had the broadest global route network of any U.S. airline, with more than 2,300 daily departures to more than 227 destinations.
Back in 1994, Continental was in deep financial trouble. It had filed for Chapter 11 bankruptcy protection twice and was heading for its third, and probably final, bankruptcy. Ticket sales were hurting because performance on factors that are important to customers was dismal, including a low percentage of on-time departures, frequent baggage arrival problems, and too many customers turned away due to overbooking.
Solution
The revival of Continental began in 1994, when Gordon Bethune became CEO and initiated the Go Forward plan, which consisted of four interrelated parts to be implemented simultaneously. Bethune targeted the need to improve customer-valued performance measures by better understanding customer needs as well as customer perceptions of the value of services that were and could be offered. Financial management practices were also targeted for a significant overhaul. As early as 1998, the airline had separate databases for marketing and operations, all hosted and managed by outside vendors. Processing queries and instigating marketing programs to its high-value customers were time-consuming and ineffective. In addition, information that the workforce needed to make quick decisions was simply not available. In 1999, Continental chose to integrate its marketing, IT, revenue, and operational data sources into a single, in-house EDW. The data warehouse provided a variety of early, major benefits.
As soon as Continental returned to profitability and ranked first in the airline industry in many performance metrics, Bethune and his management team raised the bar by escalating the vision. Instead of just performing best, they wanted Continental to be their customers' favorite airline. The Go Forward plan established more actionable ways to move from first to favorite among customers. Technology became increasingly critical for supporting these new initiatives. In the early days, having access to historical, integrated information was sufficient. This produced substantial strategic value. But it became increasingly imperative for the data warehouse to provide real-time, actionable information to support enterprise-wide tactical decision making and business processes.
Luckily, the warehouse team had expected and arranged for the real-time shift. From the very beginning, the team had created an architecture to handle real-time data feeds into the warehouse, extracts of data from legacy systems into the warehouse, and tactical queries to the warehouse that required almost immediate response times. In 2001, real-time data became available from the warehouse, and the amount stored grew rapidly. Continental moves real-time data (ranging from to-the-minute to hourly) about customers, reservations, check-ins, operations, and flights from its main operational systems to the warehouse. Continental's real-time applications include the following:
• Revenue management and accounting
• Customer relationship management (CRM)
• Crew operations and payroll
• Security and fraud
• Flight operations
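The case does not describe Continental's actual feed code, but the "to-the-minute to hourly" range it mentions is characteristic of micro-batch loading: copy only the rows that arrived since the last load, tracked by a high-water mark. The sketch below illustrates that general pattern; the check-in data and all names are invented.

```python
import datetime

# Illustrative micro-batch feed: move only rows newer than the last load.
operational_checkins = []            # stand-in for an operational check-in system
warehouse_checkins = []              # stand-in for the warehouse table
last_loaded = datetime.datetime.min  # high-water mark of the previous batch

def record_checkin(passenger, when):
    operational_checkins.append({"passenger": passenger, "ts": when})

def micro_batch_load(now):
    """Copy rows that arrived since the last batch, then advance the high-water mark."""
    global last_loaded
    fresh = [r for r in operational_checkins if last_loaded < r["ts"] <= now]
    warehouse_checkins.extend(fresh)
    last_loaded = now
    return len(fresh)

t0 = datetime.datetime(2005, 1, 1, 9, 0)
record_checkin("A. Smith", t0)
micro_batch_load(t0)                                          # first batch picks up 1 row
record_checkin("B. Jones", t0 + datetime.timedelta(minutes=5))
print(micro_batch_load(t0 + datetime.timedelta(minutes=10)))  # 1 (only the new row)
```

Running the batch every few minutes gives near-real-time freshness without the cost of streaming every individual transaction into the warehouse.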
Results
In the first year alone, after the data warehouse project was deployed, Continental identified and eliminated over $7 million in fraud and reduced costs by $41 million. With a $30 million investment in hardware and software over 6 years, Continental has reached over $500 million in increased revenues and cost savings in marketing, fraud detection, demand forecasting and tracking, and improved data center management. The single, integrated, trusted view of the business (i.e., the single version of the truth) has led to better, faster decision making.
Because of its tremendous success, Continental's DW implementation has been recognized as an excellent example of real-time BI, based on its scalable and extensible architecture, practical decisions on what data are captured in real time, strong relationships with end users, a small and highly competent data warehouse staff, sensible weighing of strategic and tactical decision support requirements, understanding of the synergies between decision support and operations, and changed business processes that use real-time data.
QUESTIONS FOR THE END-OF-CHAPTER
APPLICATION CASE
1. Describe the benefits of implementing the Continental Go Forward strategy.
2. Explain why it is important for an airline to use a real-time data warehouse.

132 Part II • Descriptive Analytics
3. Identify the major differences between the traditional data warehouse and a real-time data warehouse, as was implemented at Continental.
4. What strategic advantage can Continental derive from the real-time system as opposed to a traditional information system?
Sources: Adapted from H. Wixom, J. Hoffer, R. Anderson-Lehman, and A. Reynolds, "Real-Time Business Intelligence: Best Practices at Continental Airlines," Information Systems Management Journal, Winter 2006, pp. 7-18; R. Anderson-Lehman, H. Watson, B. Wixom, and J. Hoffer, "Continental Airlines Flies High with Real-Time Business
Intelligence," MIS Quarterly Executive, Vol. 3, No. 4, December 2004, pp. 163-176 (available at teradatauniversitynetwork.com); H. Watson, "Real Time: The Next Generation of Decision-Support Data Management," Business Intelligence Journal, Vol. 10, No. 3, 2005, pp. 4-6; M. Edwards, "2003 Best Practices Awards Winners: Innovators in Business Intelligence and Data Warehousing," Business Intelligence Journal, Fall 2003, pp. 57-64; R. Westervelt, "Continental Airlines Builds Real-Time Data Warehouse," August 20, 2003, searchoracle.techtarget.com; R. Clayton, "Enterprise Business Performance Management: Business Intelligence + Data Warehouse = Optimal Business Performance," Teradata Magazine, September 2005; and The Data Warehousing Institute, "2003 Best Practices Summaries: Enterprise Data Warehouse," 2003.
References
Adamson, C. (2009). The Star Schema Handbook: The Complete Reference to Dimensional Data Warehouse Design. Hoboken, NJ: Wiley.
Adelman, S., and L. Moss. (2001, Winter). "Data Warehouse Risks." Journal of Data Warehousing, Vol. 6, No. 1.
Agosta, L. (2006, January). "The Data Strategy Adviser: The Year Ahead-Data Warehousing Trends 2006." DM Review, Vol. 16, No. 1.
Akbay, S. (2006, Quarter 1). "Data Warehousing in Real Time." Business Intelligence Journal, Vol. 11, No. 1.
Ambeo. (2005, July). "Ambeo Delivers Proven Data Access Auditing Solution." Database Trends and Applications, Vol. 19, No. 7.
Anthes, G. H. (2003, June 30). "Hilton Checks into New Suite." Computerworld, Vol. 37, No. 26.
Ariyachandra, T., and H. Watson. (2005). "Key Factors in Selecting a Data Warehouse Architecture." Business Intelligence Journal, Vol. 10, No. 3.
Ariyachandra, T., and H. Watson. (2006a, January). "Benchmarks for BI and Data Warehousing Success." DM Review, Vol. 16, No. 1.
Ariyachandra, T., and H. Watson. (2006b). "Which Data Warehouse Architecture Is Most Successful?" Business Intelligence Journal, Vol. 11, No. 1.
Armstrong, R. (2000, Quarter 3). "E-nalysis for the E-business." Teradata Magazine Online, teradata.com.
Ball, S. K. (2005, November 14). "Do You Need a Data Warehouse Layer in Your Business Intelligence Architecture?" datawarehouse.ittoolbox.com/documents/industry-articles/do-you-need-a-data-warehouse-layer-in-your-business-intelligencearchitecture-2729 (accessed June 2009).
Barquin, R., A. Faller, and H. Edelstein. (1997). "Ten Mistakes to Avoid for Data Warehousing Managers." In R. Barquin and H. Edelstein (eds.), Building, Using, and Managing the Data Warehouse. Upper Saddle River, NJ: Prentice Hall.
Basu, R. (2003, November). "Challenges of Real-Time Data Warehousing." DM Review.
Bell, L. D. (2001, Spring). "MetaBusiness Meta Data for the Masses: Administering Knowledge Sharing for Your Data Warehouse." Journal of Data Warehousing, Vol. 6, No. 3.
Benander, A., B. Benander, A. Fadlalla, and G. James. (2000, Winter). "Data Warehouse Administration and Management." Information Systems Management, Vol. 17, No. 1.
Bonde, A., and M. Kuckuk. (2004, April). "Real World Business Intelligence: The Implementation Perspective." DM Review, Vol. 14, No. 4.
Breslin, M. (2004, Winter). "Data Warehousing Battle of the Giants: Comparing the Basics of Kimball and Inmon Models." Business Intelligence Journal, Vol. 9, No. 1.
Brobst, S., E. Levy, and C. Muzilla. (2005, Spring). "Enterprise Application Integration and Enterprise Information Integration." Business Intelligence Journal, Vol. 10, No. 3.
Brody, R. (2003, Summer). "Information Ethics in the Design and Use of Metadata." IEEE Technology and Society Magazine, Vol. 22, No. 3.
Brown, M. (2004, May 9-12). "8 Characteristics of a Successful Data Warehouse." Proceedings of the Twenty-Ninth Annual SAS Users Group International Conference (SUGI 29). Montreal, Canada.
Burdett, J., and S. Singh. (2004). "Challenges and Lessons Learned from Real-Time Data Warehousing." Business Intelligence Journal, Vol. 9, No. 4.
Coffee, P. (2003, June 23). "'Active' Warehousing." eWeek, Vol. 20, No. 25.
Cooper, B. L., H. J. Watson, B. H. Wixom, and D. L. Goodhue. (1999, August 15-19). "Data Warehousing Supports Corporate Strategy at First American Corporation." SIM International Conference, Atlanta.
Cooper, B. L., H. J. Watson, B. H. Wixom, and D. L. Goodhue. (2000). "Data Warehousing Supports Corporate Strategy at First American Corporation." MIS Quarterly, Vol. 24, No. 4, pp. 547-567.
Dasu, T., and T. Johnson. (2003). Exploratory Data Mining and Data Cleaning. New York: Wiley.
Davison, D. (2003, November 14). "Top 10 Risks of Offshore Outsourcing." META Group Research Report, now Gartner, Inc., Stamford, CT.
Devlin, B. (2003, Quarter 2). "Solving the Data Warehouse Puzzle." DB2 Magazine.
Dragoon, A. (2003, July 1). "All for One View." CIO.
Eckerson, W. (2003, Fall). "The Evolution of ETL." Business Intelligence Journal, Vol. 8, No. 4.
Eckerson, W. (2005, April 1). "Data Warehouse Builders Advocate for Different Architectures." Application Development Trends.
Eckerson, W., R. Hackathorn, M. McGivern, C. Twogood, and G. Watson. (2009). "Data Warehousing Appliances." Business Intelligence Journal, Vol. 14, No. 1, pp. 40-48.
Edwards, M. (2003, Fall). "2003 Best Practices Awards Winners: Innovators in Business Intelligence and Data Warehousing." Business Intelligence Journal, Vol. 8, No. 4.
"Egg's Customer Data Warehouse Hits the Mark." (2005, October). DM Review, Vol. 15, No. 10, pp. 24-28.
Elson, R., and R. Leclerc. (2005). "Security and Privacy Concerns in the Data Warehouse Environment." Business Intelligence Journal, Vol. 10, No. 3.
Ericson, J. (2006, March). "Real-Time Realities." BI Review.
Furtado, P. (2009). "A Survey of Parallel and Distributed Data Warehouses." International Journal of Data Warehousing and Mining, Vol. 5, No. 2, pp. 57-78.
Golfarelli, M., and Rizzi, S. (2009). Data Warehouse Design: Modern Principles and Methodologies. San Francisco: McGraw-Hill Osborne Media.
Gonzales, M. (2005, Quarter 1). "Active Data Warehouses Are Just One Approach for Combining Strategic and Technical Data." DB2 Magazine.
Hall, M. (2002, April 15). "Seeding for Data Growth." Computerworld, Vol. 36, No. 16.
Hammergren, T. C., and A. R. Simon. (2009). Data Warehousing for Dummies, 2nd ed. Hoboken, NJ: Wiley.
Hicks, M. (2001, November 26). "Getting Pricing Just Right." eWeek, Vol. 18, No. 46.
Hoffer, J. A., M. B. Prescott, and F. R. McFadden. (2007). Modern Database Management, 8th ed. Upper Saddle River, NJ: Prentice Hall.
Hwang, M., and H. Xu. (2005, Fall). "A Survey of Data Warehousing Success Issues." Business Intelligence Journal, Vol. 10, No. 4.
IBM. (2009). 50 TB Data Warehouse Benchmark on IBM System z. Armonk, NY: IBM Redbooks.
Imhoff, C. (2001, May). "Power Up Your Enterprise Portal." E-Business Advice.
Inmon, W. H. (2005). Building the Data Warehouse, 4th ed. New York: Wiley.
Inmon, W. H. (2006, January). "Information Management: How Do You Tune a Data Warehouse?" DM Review, Vol. 16, No. 1.
Jukic, N., and C. Lang. (2004, Summer). "Using Offshore Resources to Develop and Support Data Warehousing Applications." Business Intelligence Journal, Vol. 9, No. 3.
Kalido. "BP Lubricants Achieves BIGS Success." kalido.com/collateral/Documents/English-US/CS-BP%20BIGS.pdf (accessed August 2009).
Karacsony, K. (2006, January). "ETL Is a Symptom of the Problem, Not the Solution." DM Review, Vol. 16, No. 1.
Kassam, S. (2002, April 16). "Freedom of Information." Intelligent Enterprise, Vol. 5, No. 7.
Kay, R. (2005, September 19). "EII." Computerworld, Vol. 39, No. 38.
Kelly, C. (2001, June 14). "Calculating Data Warehousing ROI." SearchSQLServer.com Tips.
Malykhina, E. (2003, January 3). "The Real-Time Imperative." InformationWeek, Issue 1020.
Manglik, A., and V. Mehra. (2005, Winter). "Extending Enterprise BI Capabilities: New Patterns for Data Integration." Business Intelligence Journal, Vol. 10, No. 1.
Martins, C. (2005, December 13). "HP to Consolidate Data Marts into Single Warehouse." Computerworld.
Matney, D. (2003, Spring). "End-User Support Strategy." Business Intelligence Journal, Vol. 8, No. 3.
McCloskey, D. W. (2002). Choosing Vendors and Products to Maximize Data Warehousing Success. New York: Auerbach Publications.
Mehra, V. (2005, Summer). "Building a Metadata-Driven Enterprise: A Holistic Approach." Business Intelligence Journal, Vol. 10, No. 3.
Moseley, M. (2009). "Eliminating Data Warehouse Pressures with Master Data Services and SOA." Business Intelligence Journal, Vol. 14, No. 2, pp. 33-43.
Murtaza, A. (1998, Fall). "A Framework for Developing Enterprise Data Warehouses." Information Systems Management, Vol. 15, No. 4.
Nash, K. S. (2002, July). "Chemical Reaction." Baseline.
Orovic, V. (2003, June). "To Do & Not to Do." eAI Journal.
Parzinger, M. J., and M. N. Fralick. (2001, July). "Creating Competitive Advantage Through Data Warehousing." Information Strategy, Vol. 17, No. 4.
Peterson, T. (2003, April 21). "Getting Real About Real Time." Computerworld, Vol. 37, No. 16.
Raden, N. (2003, June 30). "Real Time: Get Real, Part II." Intelligent Enterprise.
Reeves, L. (2009). Manager's Guide to Data Warehousing. Hoboken, NJ: Wiley.
Romero, O., and A. Abelló. (2009). "A Survey of Multidimensional Modeling Methodologies." International Journal of Data Warehousing and Mining, Vol. 5, No. 2, pp. 1-24.
Rosenberg, A. (2006, Quarter 1). "Improving Query Performance in Data Warehouses." Business Intelligence Journal, Vol. 11, No. 1.
Russom, P. (2009). Next Generation Data Warehouse Platforms. TDWI Best Practices Report, available at www.tdwi.org (accessed January 2010).
Sammon, D., and P. Finnegan. (2000, Fall). "The Ten Commandments of Data Warehousing." Database for Advances in Information Systems, Vol. 31, No. 4.
Sapir, D. (2005, May). "Data Integration: A Tutorial." DM Review, Vol. 15, No. 5.
Saunders, T. (2009). "Cooking Up a Data Warehouse." Business Intelligence Journal, Vol. 14, No. 2, pp. 16-23.
Schwartz, K. D. "Decisions at the Touch of a Button." Teradata Magazine (accessed June 2009).

Schwartz, K. D. (2004, March). "Decisions at the Touch of a Button." DSS Resources, pp. 28-31. dssresources.com/cases/coca-colajapan/index.html (accessed April 2006).
Sen, A. (2004, April). "Metadata Management: Past, Present and Future." Decision Support Systems, Vol. 37, No. 1.
Sen, A., and P. Sinha. (2005). "A Comparison of Data Warehousing Methodologies." Communications of the ACM, Vol. 48, No. 3.
Solomon, M. (2005, Winter). "Ensuring a Successful Data Warehouse Initiative." Information Systems Management Journal.
Songini, M. L. (2004, February 2). "ETL Quickstudy." Computerworld, Vol. 38, No. 5.
Sun Microsystems. (2005, September 19). "Egg Banks on Sun to Hit the Mark with Customers." sun.com/smi/Press/sunflash/2005-09/sunflash.20050919.1.xml (accessed April 2006; no longer available online).
Tannenbaum, A. (2002, Spring). "Identifying Meta Data Requirements." Journal of Data Warehousing, Vol. 7, No. 3.
Tennant, R. (2002, May 15). "The Importance of Being Granular." Library Journal, Vol. 127, No. 9.
Teradata Corp. "A Large US-Based Insurance Company Masters Its Finance Data." (accessed July 2009).
Teradata Corp. "Active Data Warehousing." teradata.com/active-data-warehousing/ (accessed April 2006).
Teradata Corp. "Coca-Cola Japan Puts the Fizz Back in Vending Machine Sales." (accessed June 2009).
Teradata. "Enterprise Data Warehouse Delivers Cost Savings and Process Efficiencies." teradata.com/t/resources/case-studies/NCR-Corporation-eb4455 (accessed June 2009).
Terr, S. (2004, February). "Real-Time Data Warehousing: Hardware and Software." DM Review, Vol. 14, No. 3.
Thornton, M. (2002, March 18). "What About Security? The Most Common, but Unwarranted, Objection to Hosted Data Warehouses." DM Review, Vol. 12, No. 3, pp. 30-43.
Thornton, M., and M. Lampa. (2002). "Hosted Data Warehouse." Journal of Data Warehousing, Vol. 7, No. 2, pp. 27-34.
Turban, E., D. Leidner, E. McLean, and J. Wetherbe. (2006). Information Technology for Management, 5th ed. New York: Wiley.
Vaduva, A., and T. Vetterli. (2001, September). "Metadata Management for Data Warehousing: An Overview." International Journal of Cooperative Information Systems, Vol. 10, No. 3.
Van den Hoven, J. (1998). "Data Marts: Plan Big, Build Small." Information Systems Management, Vol. 15, No. 1.
Watson, H. J. (2002). "Recent Developments in Data Warehousing." Communications of the ACM, Vol. 8, No. 1.
Watson, H. J., D. L. Goodhue, and B. H. Wixom. (2002). "The Benefits of Data Warehousing: Why Some Organizations Realize Exceptional Payoffs." Information & Management, Vol. 39.
Watson, H., J. Gerard, L. Gonzalez, M. Haywood, and D. Fenton. (1999). "Data Warehouse Failures: Case Studies and Findings." Journal of Data Warehousing, Vol. 4, No. 1.
Weir, R. (2002, Winter). "Best Practices for Implementing a Data Warehouse." Journal of Data Warehousing, Vol. 7, No. 1.
Wilk, L. (2003, Spring). "Data Warehousing and Real-Time Computing." Business Intelligence Journal, Vol. 8, No. 3.
Wixom, B., and H. Watson. (2001, March). "An Empirical Investigation of the Factors Affecting Data Warehousing Success." MIS Quarterly, Vol. 25, No. 1.
Wrembel, R. (2009). "A Survey of Managing the Evolution of Data Warehouses." International Journal of Data Warehousing and Mining, Vol. 5, No. 2, pp. 24-56.
ZD Net UK. "Sun Case Study: Egg's Customer Data Warehouse." whitepapers.zdnet.co.uk/0,39025945,60159401p-39000449q,00.htm (accessed June 2009).
Zhao, X. (2005, October 7). "Meta Data Management Maturity Model." DM Direct Newsletter.

CHAPTER 4
Business Reporting, Visual Analytics, and Business Performance Management
LEARNING OBJECTIVES
• Define business reporting and understand its historical evolution
• Recognize the need for and the power of business reporting
• Understand the importance of data/information visualization
• Learn different types of visualization techniques
• Appreciate the value that visual analytics brings to BI/BA
• Know the capabilities and limitations of dashboards
• Understand the nature of business performance management (BPM)
• Learn the closed-loop BPM methodology
• Describe the basic elements of the balanced scorecard
A report is a communication artifact prepared with the specific intention of relaying information in a presentable form. If it concerns business matters, then it is called a business report. Business reporting is an essential part of the business intelligence movement toward improving managerial decision making. Nowadays, these reports are more visually oriented, often using colors and graphical icons that collectively look like a dashboard to enhance the information content. Business reporting and business performance management (BPM) are both enablers of business intelligence and analytics. As a decision support tool, BPM is more than just a reporting technology. It is an integrated set of processes, methodologies, metrics, and applications designed to drive the overall financial and operational performance of an enterprise. It helps enterprises translate their strategies and objectives into plans, monitor performance against those plans, analyze variations between actual results and planned results, and adjust their objectives and actions in response to this analysis.
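The monitor-analyze-adjust part of that loop reduces to a small calculation at its core: compare actuals against plan and flag the metrics whose variance exceeds some tolerance. The sketch below illustrates this; the metric names, figures, and 5% threshold are invented for the example.

```python
# Minimal BPM variance check: flag metrics whose actual result deviates from
# plan by more than a tolerance, so objectives or actions can be adjusted.
plan =   {"revenue": 1_000_000, "on_time_rate": 0.90, "cost": 750_000}
actual = {"revenue":   940_000, "on_time_rate": 0.93, "cost": 790_000}

def variances(plan, actual, tolerance=0.05):
    """Return {metric: fractional deviation} for metrics outside the tolerance band."""
    flagged = {}
    for metric, planned in plan.items():
        pct = (actual[metric] - planned) / planned
        if abs(pct) > tolerance:
            flagged[metric] = round(pct, 3)
    return flagged

print(variances(plan, actual))  # {'revenue': -0.06, 'cost': 0.053}
```

Here revenue (6% under plan) and cost (5.3% over) would be surfaced for management attention, while the on-time rate, within 5% of plan, would not.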
This chapter starts by examining the need for and the power of business reporting. With the emergence of analytics, business reporting has evolved into dashboards and visual analytics, which, compared to traditional descriptive reporting, are much more predictive and prescriptive. Coverage of dashboards and visual analytics is followed by a

comprehensive introduction to BPM. As you will see and appreciate, BPM and visual
analytics have a symbiotic relationship (over scorecards and dashboards) where they
benefit from each other’s strengths.
4.1 Opening Vignette: Self-Service Reporting Environment Saves Millions for Corporate Customers 136
4.2 Business Reporting Definitions and Concepts 139
4.3 Data and Information Visualization 145
4.4 Different Types of Charts and Graphs 150
4.5 The Emergence of Data Visualization and Visual Analytics 154
4.6 Performance Dashboards 160
4.7 Business Performance Management 166
4.8 Performance Measurement 170
4.9 Balanced Scorecards 172
4.10 Six Sigma as a Performance Measurement System 175
4.1 OPENING VIGNETTE: Self-Service Reporting Environment
Saves Millions for Corporate Customers
Headquartered in Omaha, Nebraska, Travel and Transport, Inc., is the sixth largest travel management company in the United States, with more than 700 employee-owners located nationwide. The company has extensive experience in multiple verticals, including travel management, loyalty solutions programs, meeting and incentive planning, and leisure travel services.
CHALLENGE
In the field of employee travel services, the ability to effectively communicate a value proposition to existing and potential customers is critical to winning and retaining business. With travel arrangements often made on an ad hoc basis, customers find it difficult to analyze costs or instate optimal purchase agreements. Travel and Transport wanted to overcome these challenges by implementing an integrated reporting and analysis system to enhance relationships with existing clients, while providing the kind of value-added services that would attract new prospects.
SOLUTION
Travel and Transport implemented Information Builders' WebFOCUS business intelligence (BI) platform (called eTTek Review) as the foundation of a dynamic customer self-service BI environment. This dashboard-driven expense-management application helps more than 800 external clients like Robert W. Baird & Co., MetLife, and American Family Insurance to plan, track, analyze, and budget their travel expenses more efficiently and to benchmark them against similar companies, saving them millions of dollars. More than 200 internal employees, including customer service specialists, also have access to the system, using it to generate more precise forecasts for clients and to streamline and accelerate other key support processes such as quarterly reviews.
Thanks to WebFOCUS, Travel and Transport doesn't just tell its clients how much they are saving by using its services: it shows them. This has helped the company to differentiate itself in a market defined by aggressive competition. Additionally, WebFOCUS

Chapter 4 • Business Reporting, Visual Analytics, and Business Performance Management 137
eliminates manual report compilation for client service specialists, saving the company
close to $200,000 in lost time each year.
AN INTUITIVE, GRAPHICAL WAY TO MANAGE TRAVEL DATA
Using stunning graphics created with WebFOCUS and Adobe Flex, the business intelligence system provides access to thousands of reports that show individual client metrics, benchmarked information against aggregated market data, and even ad hoc reports that users can specify as needed. "For most of our corporate customers, we thoroughly manage their travel from planning and reservations to billing, fulfillment, and ongoing analysis," says Mike Kubasik, senior vice president and CIO at Travel and Transport. "WebFOCUS is important to our business. It helps our customers monitor employee spending, book travel with preferred vendors, and negotiate corporate purchasing agreements that can save them millions of dollars per year."
Clients love it, and it's giving Travel and Transport a competitive edge in a crowded marketplace. "I use Travel and Transport's eTTek Review to automatically e-mail reports throughout the company for a variety of reasons, such as monitoring travel trends and company expenditures and assisting with airline expense reconciliation and allocations," says Cathy Moulton, vice president and travel manager at Robert W. Baird & Co., a prominent financial services company. What she loves about the WebFOCUS-enabled Web portal is that it makes all of the company's travel information available in just a few clicks. "I have the data at my fingertips," she adds. "I don't have to wait for someone to go in and do it for me. I can set up the reports on my own. Then we can go to the hotels and preferred vendors armed with detailed information that gives us leverage to negotiate our rates."
Robert W. Baird & Co. isn't the only firm benefiting from this advanced access to reporting. Many of Travel and Transport's other clients are also happy with the technology. "With Travel and Transport's state-of-the-art reporting technology, MetLife is able to measure its travel program through data analysis, standard reporting, and the ability to create ad hoc reports dynamically," says Tom Molesky, director of travel services at MetLife. "Metrics derived from actionable data provide direction and drive us toward our goals. This is key to helping us negotiate with our suppliers, enforce our travel policy, and save our company money. Travel and Transport's leading-edge product has helped us to meet and, in some cases, exceed our travel goals."
READY FOR TAKEOFF
Travel and Transport used WebFOCUS to create an online system that allows clients to access information directly, so they won't have to rely on the IT department to run reports for them. Its objective was to give customers online tools to monitor corporate travel expenditures throughout their companies. By giving clients access to the right data, Travel and Transport can help make sure its customers are getting the best pricing from airlines, hotels, car rental companies, and other vendors. "We needed more than just pretty reports," Kubasik recalls, looking back on the early phases of the BI project. "We wanted to build a reporting environment that was powerful enough to handle transaction-intensive operations, yet simple enough to deploy over the Web." It was a winning formula. Clients and customer service specialists continue to use eTTek Review to create forecasts for the coming year and to target specific areas of business travel expenditures. These users can choose from dozens of management reports. Popular reports include travel summary, airline compliance, hotel analysis, and car analysis. Travel managers at about 700 corporations use these reports to analyze corporate travel spending on a daily, weekly, monthly, quarterly, and annual basis. About 160 standard reports and more than 3,000 custom reports are currently set up in eTTek Review,

including everything from noncompliance reports that reveal why an employee did not obtain the lowest airfare for a particular flight to executive overviews that summarize spending patterns. Most reports are parameter driven with Information Builders' unique guided ad hoc reporting technology.
PEER REVIEW SYSTEM KEEPS EXPENSES ON TRACK
Users can also run reports that compare their own travel metrics with aggregated travel
data from other Travel and Transport clients. This benchmarking service lets them gauge
whether their expenditures, preferred rates, and other metrics are in line with those of
other companies of a similar size or within the same industry. By pooling the data, Travel
and Transport helps protect individual clients’ information while also enabling its entire
customer base to achieve lower rates by giving them leverage for their negotiations.
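The pooling idea described above can be sketched in a few lines: individual client figures are rolled up into segment averages, so a client sees how it compares without any peer's raw data being exposed. This is only an illustrative sketch; the client names, metric, and figures below are invented, not Travel and Transport's.

```python
# Hypothetical sketch of pooled benchmarking: aggregate client metrics by
# industry segment so clients compare against averages, not named peers.
from collections import defaultdict

def benchmark(clients, metric):
    """Return {segment: average of `metric` across clients in that segment}."""
    totals = defaultdict(lambda: [0.0, 0])
    for c in clients:
        seg = c["industry"]
        totals[seg][0] += c[metric]
        totals[seg][1] += 1
    return {seg: s / n for seg, (s, n) in totals.items()}

clients = [
    {"name": "A", "industry": "retail", "avg_airfare": 410.0},
    {"name": "B", "industry": "retail", "avg_airfare": 390.0},
    {"name": "C", "industry": "finance", "avg_airfare": 520.0},
]

peers = benchmark(clients, "avg_airfare")
# Client A compares its own figure to the pooled retail average.
gap = clients[0]["avg_airfare"] - peers["retail"]
```

Because only the aggregate leaves the pool, each client gains negotiating leverage from the combined data while the individual inputs stay private.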
Reports can be run interactively or in batch mode, with results displayed on the
screen, stored in a library, saved to a PDF file, loaded into an Excel spreadsheet, or
sent as an Active Report that permits additional analysis. “Our clients love the visual
metaphors provided by Information Builders’ graphical displays, including Adobe Flex
and WebFOCUS Active PDF files,” explains Steve Cords, IT manager at Travel and
Transport and team leader for the eTTek Review project. “Most summary reports have
drill-down capability to a detailed report. All reports can be run for a particular hierarchy
structure, and more than one hierarchy can be selected.”
Of course, users never see the code that makes all of this possible. They operate
in an intuitive dashboard environment with drop-down menus and drillable graphs, all
accessible through a browser-based interface that requires no client-side software. This
architecture makes it easy and cost-effective for users to tap into eTTek Review from any
location. Collectively, customers run an estimated 50,000 reports per month. About 20,000
of those reports are automatically generated and distributed via WebFOCUS ReportCaster.
AN EFFICIENT ARCHITECTURE THAT YIELDS SOARING RESULTS
Travel and Transport captures travel information from reservation systems known as
Global Distribution Systems (GDS) via a proprietary back-office system that resides in
a DB2 database on an IBM iSeries computer. They use SQL tables to store user IDs
and passwords, and use other databases to store the information. “The database can be
sorted according to a specific hierarchy to match the breakdown of reports required by
each company,” continues Cords. “If they want to see just marketing and accounting
information, we can deliver it. If they want to see the particular level of detail reflecting a
given cost center, we can deliver that, too.”
Because all data is securely stored for three years, clients can generate trend reports
to compare current travel to previous years. They can also use the BI system to monitor
where employees are traveling at any point in time. The reports are so easy to use that
Cords and his team have started replacing outdated processes with new automated ones
using the same WebFOCUS technology. The company also uses WebFOCUS to streamline
their quarterly review process. In the past, client service managers had to manually create
these quarterly reports by aggregating data from a variety of clients. The 80-page report
took one week to create at the end of every quarter.
Travel and Transport has completely automated the quarterly review system using
WebFOCUS so the managers can select the pages, percentages, and specific data they
want to include. This gives them more time to do further analysis and make better use of
the information. Cords estimates that the time savings add up to about $200,000 every year
for this project alone. “Metrics derived from actionable data are key to helping us negotiate
with our suppliers, enforce our travel policy, and save our company money,” continues
Cords. “During the recession, the travel industry was hit particularly hard, but Travel and

Chapter 4 • Business Reporting, Visual Analytics, and Business Performance Management 139
Transport managed to add new multimillion-dollar accounts even in the worst of times. We
attribute a lot of this growth to the cutting-edge reporting technology we offer to clients.”
QUESTIONS FOR THE OPENING VIGNETTE
1. What does Travel and Transport, Inc., do?
2. Describe the complexity and the competitive nature of the business environment in
which Travel and Transport, Inc., functions.
3. What were the main business challenges?
4. What was the solution? How was it implemented?
5. Why do you think a multi-vendor, multi-tool solution was implemented?
6. List and comment on at least three main benefits of the implemented system. Can
you think of other potential benefits that are not mentioned in the case?
WHAT WE CAN LEARN FROM THIS VIGNETTE
Trying to survive (and thrive) in a highly competitive industry, Travel and Transport,
Inc., was aware of the need to create and effectively communicate a value proposition
to its existing and potential customers. As is the case in many industries, in the travel
business, success or mere survival depends on continuously winning new customers
while retaining the existing ones. The key was to provide value-added services to the
client so that they can efficiently analyze costs and other options to quickly instate
optimal purchase agreements. Using WebFOCUS (an integrated reporting and information
visualization environment by Information Builders), Travel and Transport empowered
their clients to access information whenever and wherever they need it. Information is
the power that decision makers need the most to make better and faster decisions. When
economic conditions are tight, every managerial decision and every business transaction
counts. Travel and Transport used a variety of reputable vendors/products (hardware
and software) to create a cutting-edge reporting technology so that their clients can make
better, faster decisions to improve their financial well-being.
Source: Information Builders, Customer Success Story, informationbuilders.com/applications/travel-and-
transport (accessed February 2013).
4.2 BUSINESS REPORTING DEFINITIONS AND CONCEPTS
Decision makers are in need of information to make accurate and timely decisions.
Information is essentially the contextualization of data. Information is often provided in
the form of a written report (digital or on paper), although it can also be provided orally.
Simply put, a report is any communication artifact prepared with the specific intention of
conveying information in a presentable form to whoever needs it, whenever and wherever
they may need it. It is usually a document that contains information (usually driven from
data and personal experiences) organized in a narrative, graphic, and/or tabular form,
prepared periodically (recurring) or on an as-required (ad hoc) basis, referring to specific
time periods, events, occurrences, or subjects.
In business settings, types of reports include memos, minutes, lab reports, sales
reports, progress reports, justification reports, compliance reports, annual reports, and
policies and procedures. Reports can fulfill many different (but often related) functions.
Here are a few of the most prevailing ones:
• To ensure that all departments are functioning properly
• To provide information

• To provide the results of an analysis
• To persuade others to act
• To create an organizational memory (as part of a knowledge management system)
Reports can be lengthy at times. For those reports, there usually is an executive
summary for those who do not have the time and interest to go through it all. The
summary (or abstract, or more commonly called executive brief) should be crafted
carefully, expressing only the important points in a very concise and precise manner, and
lasting no more than a page or two.
In addition to business reports, examples of other types of reports include crime
scene reports, police reports, credit reports, scientific reports, recommendation reports,
white papers, annual reports, auditor’s reports, workplace reports, census reports, trip
reports, progress reports, investigative reports, budget reports, policy reports, demographic
reports, appraisal reports, inspection reports, and military reports, among
others. In this chapter we are particularly interested in business reports.
What Is a Business Report?
A business report is a written document that contains information regarding business
matters. Business reporting (also called enterprise reporting) is an essential part of the
larger drive toward improved managerial decision making and organizational knowledge
management. The foundation of these reports is various sources of data coming from
both inside and outside the organization. Creation of these reports involves ETL (extract,
transform, and load) procedures in coordination with a data warehouse and then using
one or more reporting tools. While reports can be distributed in print form or via e-mail,
they are typically accessed via a corporate intranet.
Due to the expansion of information technology coupled with the need for improved
competitiveness in businesses, there has been an increase in the use of computing power
to produce unified reports that join different views of the enterprise in one place. Usually,
this reporting process involves querying structured data sources, most of which are created
by using different logical data models and data dictionaries, to produce a human-readable,
easily digestible report. These types of business reports allow managers and coworkers
to stay informed and involved, review options and alternatives, and make informed
decisions. Figure 4.1 shows the continuous cycle of data acquisition → information
generation → decision making → business process management. Perhaps the most critical
task in this cyclic process is the reporting (i.e., information generation): converting data
from different sources into actionable information.
The key to any successful report is clarity, brevity, completeness, and correctness.
In terms of content and format, there are only a few categories of business report: informal,
formal, and short. Informal reports are usually up to 10 pages long; are routine and
internal; follow a letter or memo format; and use personal pronouns and contractions.
Formal reports are 10 to 100 pages long; do not use personal pronouns or contractions;
include a title page, table of contents, and an executive summary; are based on deep
research or an analytic study; and are distributed to external or internal people with a
need-to-know designation. Short reports are to inform people about events or system
status changes and are often periodic, investigative, compliance, and situation focused.
FIGURE 4.1 The Role of Information Reporting in Managerial Decision Making. (The
original diagram shows a cycle: transactional records from business functions flow into
data repositories; reporting turns this data into information; decisions turn information
into action, which feeds back into the business functions.)
The nature of the report also changes significantly based on whom the report is
created for. Most of the research in effective reporting is dedicated to internal reports
that inform stakeholders and decision makers within the organization. There are also
external reports between businesses and the government (e.g., for tax purposes or for
regular filings to the Securities and Exchange Commission). These formal reports are
mostly standardized and periodically filed either nationally or internationally. Standard
Business Reporting, which is a collection of international programs instigated by a
number of governments, aims to reduce the regulatory burden for business by simplifying
and standardizing reporting requirements. The idea is to make business the e picenter
when it comes to managing business-to-government reporting obligations. Businesses
conduct their own financial administration; the facts they record and decisions they make
should drive their reporting. The governme nt should be able to receive and process
this information without imposing undue constraints on how businesses administer
the ir finances. Application Case 4.1 illustrates an excellent example for ove rcoming the
challenges of financial reporting.
Application Case 4.1
Delta Lloyd Group Ensures Accuracy and Efficiency in Financial Reporting
Delta Lloyd Group is a financial services provider
based in the Netherlands. It offers insurance, pensions,
investing, and banking services to its private
and corporate clients through its three strong brands:
Delta Lloyd, OHRA, and ABN AMRO Insurance.
Since its founding in 1807, the company has grown
in the Netherlands, Germany, and Belgium, and
now employs around 5,400 permanent staff. Its 2011
full-year financial reports show €5.5 billion in gross
written premiums, with shareholders’ funds amounting
to €3.9 billion and investments under management
worth nearly €74 billion.
Challenges
Since Delta Lloyd Group is publicly listed on the
NYSE Euronext Amsterdam, it is obliged to produce
annual and half-year reports. Various subsidiaries in
Delta Lloyd Group must also produce reports to fulfill
local legal requirements: for example, banking and
(Continued)

Application Case 4.1 (Continued)
insurance reports are obligatory in the Netherlands.
In addition, Delta Lloyd Group must provide reports
to meet international requirements, such as the
IFRS (International Financial Reporting Standards)
for accounting and the EU Solvency I Directive for
insurance companies. The data for these reports is
gathered by the group’s finance department, which
is divided into small teams in several locations, and
then converted into XML so that it can be published
on the corporate Web site.
Importance of Accuracy
The most challenging part of the reporting process is
the “last mile”: the stage at which the consolidated
figures are cited, formatted, and described to form
the final text of the report. Delta Lloyd Group was
using Microsoft Excel for the last-mile stage of the
reporting process. To minimize the risk of errors,
the finance team needed to manually check all
the data in its reports for accuracy. These manual
checks were very time-consuming. Arnold Honig,
team leader for reporting at Delta Lloyd Group,
comments: “Accuracy is essential in financial
reporting, since errors could lead to penalties,
reputational damage, and even a negative impact
on the company’s stock price. We needed a new
solution that would automate some of the last-mile
processes and reduce the risk of manual error.”
Solution
The group decided to implement IBM Cognos
Financial Statement Reporting (FSR). The implementation
of the software was completed in just 6 weeks
during the late summer. This rapid implementation
gave the finance department enough time to prepare
a trial draft of the annual report in FSR, based on
figures from the third financial quarter. The successful
creation of this draft gave Delta Lloyd Group
enough confidence to use Cognos FSR for the final
version of the annual report, which was published
shortly after the end of the year.
Results
Employees are delighted with the IBM Cognos FSR
solution. Delta Lloyd Group has divided the annual
report into chapters, and each member of the reporting
team is responsible for one chapter. Arnold Honig
says, “Since employees can work on documents
simultaneously, they can share the huge workload
involved in report generation. Before, the reporting
process was inefficient, because only one person
could work on the report at a time.”
Since the workload can be divided up, staff can
complete the report with less overtime. Arnold Honig
comments, “Previously, employees were putting in
2 weeks of overtime during the 8 weeks required to
generate a report. This year, the 10 members of staff
involved in the report generation process worked
25 percent less overtime, even though they were still
getting used to the new software. This is a big win
for Delta Lloyd Group and its staff.” The group is
expecting further reductions in employee overtime
in the future as staff becomes more familiar with
the software.
Accurate Reports
The IBM Cognos FSR solution automates key stages
in the report-writing process by populating the
final report with accurate, up-to-date financial data.
Wherever the text of the report needs to mention
a specific financial figure, the finance team simply
inserts a “variable”: a tag that is linked to an underlying
data source. Wherever the variable appears
in the document, FSR will pull the figure through
from the source into the report. If the value of the
figure needs to be changed, the team can simply
update it in the source, and the new value will
automatically flow through into the text, maintaining
accuracy and consistency of data throughout
the report.
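The variable mechanism described above can be sketched as a simple template substitution: every tag in the report text is resolved against a single data source, so correcting a figure once updates every occurrence. This is a minimal illustrative sketch, not Cognos FSR's actual implementation; the `{{tag}}` syntax and the figures are invented for the example.

```python
# Hypothetical sketch of tag-based report population: each {{tag}} in the
# report text is replaced by the current value from one underlying source.
import re

source = {"gross_premiums": "5.5", "shareholder_funds": "3.9"}

template = ("Gross written premiums were EUR {{gross_premiums}} billion, "
            "with shareholders' funds of EUR {{shareholder_funds}} billion.")

def render(template, source):
    # Substitute every {{tag}} with the figure currently held in the source.
    return re.sub(r"\{\{(\w+)\}\}", lambda m: source[m.group(1)], template)

report = render(template, source)

# Updating the figure once in the source flows through the whole document,
# keeping every mention of the number consistent.
source["gross_premiums"] = "5.6"
revised = render(template, source)
```

The design point is the single source of truth: because the text never hard-codes a number, the spreadsheet-era risk of one stale figure surviving a late correction disappears.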
Arnold Honig comments, “The ability to
update figures automatically across the whole report
reduces the scope for manual error inherent in
spreadsheet-based processes and activities. Since
we have full control of our reporting processes,
we can produce better quality reports more efficiently
and reduce our business risk.” IBM Cognos
FSR also provides a comparison feature, which
highlights any changes made to reports. This feature
makes it quicker and easier for users to review new
versions of documents and ensure the accuracy of
their reports.

Adhering to Industry Regulations
In the future, Delta Lloyd Group is planning to extend
its use of IBM Cognos FSR to generate internal management
reports. It will also help Delta Lloyd Group
to meet industry regulatory standards, which are
becoming stricter. Arnold Honig comments, “The EU
Solvency II Directive will come into effect soon, and
our Solvency II reports will need to be tagged with
eXtensible Business Reporting Language [XBRL]. By
implementing IBM Cognos FSR, which fully supports
XBRL tagging, we have equipped ourselves to meet
both current and future regulatory requirements.”
QUESTIONS FOR DISCUSSION
1. How did Delta Lloyd Group improve accuracy
and efficiency in financial reporting?
2. What were the challenges, the proposed solution,
and the obtained results?
3. Why is it important for Delta Lloyd Group to
comply with industry regulations?
Source: IBM, Customer Success Story, “Delta Lloyd Group
Ensures Accuracy in Financial Reporting,” public.dhe.ibm.com/
common/ssi/ecm/en/ytc03561nlen/YTC03561NLEN.PDF
(accessed February 2013); and www.deltalloydgroep.com.
Even though there are a wide variety of business reports, the ones that are often
used for managerial purposes can be grouped into three major categories (Hill, 2013).
METRIC MANAGEMENT REPORTS In many organizations, business performance is
managed through outcome-oriented metrics. For external groups, these are service-level
agreements (SLAs). For internal management, they are key performance indicators (KPIs).
Typically, there are enterprise-wide agreed targets to be tracked over a period of time.
They may be used as part of other management strategies such as Six Sigma or Total
Quality Management (TQM).
DASHBOARD-TYPE REPORTS A popular idea in business reporting in recent years has
been to present a range of different performance indicators on one page, like a dashboard
in a car. Typically, dashboard vendors provide a set of predefined reports with
static elements and a fixed structure, but also allow users to customize the dashboard
widgets and views and to set targets for various metrics. It is common to have color-coded
traffic lights defined for performance (red, orange, green) to draw management attention
to particular areas. More details on dashboards are given later in this chapter.
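The traffic-light convention above boils down to comparing each metric against its target and a warning threshold. The sketch below illustrates the idea under assumed thresholds; the KPI names, figures, and the 90 percent warning band are invented for the example, not taken from any vendor's product.

```python
# Illustrative traffic-light scoring for KPIs: green if the target is met,
# orange if actual is within 90% of target, red otherwise (thresholds assumed).

def traffic_light(actual, target, warn=0.9):
    """Map an (actual, target) pair to a dashboard color."""
    if actual >= target:
        return "green"
    if actual >= warn * target:
        return "orange"
    return "red"

# Hypothetical KPIs as (actual, target) pairs.
kpis = {
    "on_time_delivery": (0.97, 0.95),
    "customer_satisfaction": (0.88, 0.95),
    "first_call_resolution": (0.60, 0.80),
}

dashboard = {name: traffic_light(a, t) for name, (a, t) in kpis.items()}
```

Rendering the resulting colors next to each metric is what lets a manager scan one page and see immediately where attention is needed.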
BALANCED SCORECARD-TYPE REPORTS This is a method developed by Kaplan and
Norton that attempts to present an integrated view of success in an organization. In addition
to financial performance, balanced scorecard-type reports also include customer,
business process, and learning and growth perspectives. More details on balanced
scorecards are provided later in this chapter.
Components of the Business Reporting System
Although each business reporting system has its unique characteristics, there seems to
be a generic pattern that is common across organizations and technology architectures.
Think of this generic pattern as having the business user on one end of the reporting
continuum and the data sources on the other end. Based on the needs and requirements
of the business user, the data is captured, stored, consolidated, and converted to
desired reports using a set of predefined business rules. To be successful, such a
system needs an overarching assurance process that covers the entire value chain and
moves back and forth, ensuring that reporting requirements and information delivery

are properly aligned (Hill, 2008). Following are the most common components of a
business reporting system.
• OLTP (online transaction processing). A system that measures some aspect
of the real world as events (e.g., transactions) and records them into enterprise
databases. Examples include ERP systems, POS systems, Web servers, RFID readers,
handheld inventory readers, card readers, and so forth.
• Data supply. A system that takes recorded events/transactions and delivers them
reliably to the reporting system. The data access can be push or pull, depending
on whether or not it is responsible for initiating the delivery process. It can also be
polled (or batched) if the data are transferred periodically, or triggered (or online)
if data are transferred in case of a specific event.
• ETL (extract, transform, and load). This is the intermediate step where these
recorded transactions/events are checked for quality, put into the appropriate
format, and inserted into the desired data format.
• Data storage. This is the storage area for the data and metadata. It could be a
flat file or a spreadsheet, but it is usually a relational database management system
(RDBMS) set up as a data mart, data warehouse, or operational data store (ODS); it
often employs online analytical processing (OLAP) functions like cubes.
• Business logic. The explicit steps for how the recorded transactions/events are
to be converted into metrics, scorecards, and dashboards.
• Publication. The system that builds the various reports and hosts them (for
users) or disseminates them (to users). These systems may also provide notification,
annotation, collaboration, and other services.
• Assurance. A good business reporting system is expected to offer a quality
service to its users. This includes determining if and when the right information is to
be delivered to the right people in the right way/format.
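The chain of components above can be illustrated end to end with a toy pipeline: raw events arrive from a data supply, ETL applies a quality check and format conversion, business logic turns the stored records into a metric, and publication renders a report. Everything here is a deliberately simplified sketch using in-memory structures in place of real OLTP systems and databases; the field names and figures are invented.

```python
# Toy end-to-end reporting pipeline (data supply -> ETL -> storage ->
# business logic -> publication), with in-memory stand-ins throughout.

# Data supply: raw transaction events as delivered from an OLTP source.
raw_events = [
    {"store": "A", "amount": "125.50", "status": "ok"},
    {"store": "A", "amount": "bad-value", "status": "ok"},  # fails quality check
    {"store": "B", "amount": "80.00", "status": "ok"},
]

# ETL: check quality, convert to the appropriate format, load into storage.
storage = []
for e in raw_events:
    try:
        storage.append({"store": e["store"], "amount": float(e["amount"])})
    except ValueError:
        pass  # reject records that fail the quality check

# Business logic: convert the stored records into a metric per store.
sales_by_store = {}
for rec in storage:
    sales_by_store[rec["store"]] = sales_by_store.get(rec["store"], 0.0) + rec["amount"]

# Publication: build a human-readable report for users.
report = "\n".join(f"{store}: {total:.2f}"
                   for store, total in sorted(sales_by_store.items()))
```

A production system would replace each stage with the corresponding component (a GDS feed or POS system, an ETL tool, an RDBMS, a reporting server), but the division of responsibilities stays the same.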
Application Case 4.2 is an excellent example to illustrate the power and the utility
of automated report generation for a large (and, at a time of natural crisis, somewhat
chaotic) organization like FEMA.
Application Case 4.2
Flood of Paper Ends at FEMA
Staff at the Federal Emergency Management Agency
(FEMA), a U.S. federal agency that coordinates
disaster response when the President declares a
national disaster, always got two floods at once.
First, water covered the land. Next, a flood of paper,
required to administer the National Flood Insurance
Program (NFIP), covered their desks: pallets
and pallets of green-striped reports poured off a
mainframe printer and into their offices. Individual
reports were sometimes 18 inches thick, with a
nugget of information about insurance claims,
premiums, or payments buried in them somewhere.
Bill Barton and Mike Miles don’t claim to be
able to do anything about the weather, but the
project manager and computer scientist, respectively,
from Computer Sciences Corporation (CSC) have
used WebFOCUS software from Information
Builders to turn back the flood of paper generated
by the NFIP. The program allows the government
to work together with national insurance companies
to collect flood insurance premiums and pay claims
for flooding in communities that adopt flood control
measures. As a result of CSC’s work, FEMA staff no
longer leaf through paper reports to find the data
they need. Instead, they browse insurance data
posted on NFIP’s BureauNet intranet site, select just
the information they want to see, and get an on-screen
report or download the data as a spreadsheet.

And that is only the start of the savings that
WebFOCUS has provided. The number of times that
NFIP staff asks CSC for special reports has dropped
in half, because NFIP staff can generate many of the
special reports they need without calling on a programmer
to develop them. Then there is the cost
of creating BureauNet in the first place. Barton estimates
that using conventional Web and database
software to export data from FEMA’s mainframe,
store it in a new database, and link that to a Web
server would have cost about 100 times as much
(more than $500,000) and taken about two years
to complete, compared with the few months Miles
spent on the WebFOCUS solution.
When Tropical Storm Allison, a huge slug of
sodden, swirling clouds, moved out of the Gulf of
Mexico onto the Texas and Louisiana coastline in June
2001, it killed 34 people, most from drowning; damaged
or destroyed 16,000 homes and businesses; and
displaced more than 10,000 families. President George
W. Bush declared 28 Texas counties disaster areas,
and FEMA moved in to help. This was the first serious
test for BureauNet, and it delivered. This first comprehensive
use of BureauNet resulted in FEMA field staff
readily accessing what they needed when they
needed it, and asking for many new types of reports.
Fortunately, Miles and WebFOCUS were up to the
task. In some cases, Barton says, “FEMA would ask for
a new type of report one day, and Miles would have it
on BureauNet the next day, thanks to the speed with
which he could create new reports in WebFOCUS.”
The sudden demand on the system had little
impact on its performance, notes Barton. “It handled
the demand just fine,” he says. “We had no problems
with it at all.” “And it made a huge difference
to FEMA and the job they had to do. They had never
had that level of access before, never had been able
to just click on their desktop and generate such
detailed and specific reports.”
QUESTIONS FOR DISCUSSION
1. What is FEMA and what does it do?
2. What are the main challenges that FEMA faces?
3. How did FEMA improve its inefficient reporting
practices?
Sources: Information Builders, Customer Success Story, “Useful
Information Flows at Disaster Response Agency,”
informationbuilders.com/applications/fema (accessed January
2013); and fema.gov.
SECTION 4.2 REVIEW QUESTIONS
1. What is a report? What is it used for?
2. What is a business report? What are the main characteristics of a good business
report?
3. Describe the cyclic process of management and comment on the role of business
reports.
4. List and describe the three major categories of business reports.
5. What are the main components of a business reporting system?
4.3 DATA AND INFORMATION VISUALIZATION
Data visualization (or more appropriately, information visualization) has been defined
as “the use of visual representations to explore, make sense of, and communicate data”
(Few, 2008). Although the name that is commonly used is data visualization, usually
what is meant by this is information visualization. Since information is the aggregation,
summarization, and contextualization of data (raw facts), what is portrayed in
visualizations is the information and not the data. However, since the two terms data
visualization and information visualization are used interchangeably and synonymously,
in this chapter we will follow suit.
Data visualization is closely related to the fields of information graphics, information
visualization, scientific visualization, and statistical graphics. Until recently, the major

forms of data visualization available in both business intelligence applications have
included charts and graphs, as well as the other types of visual elements used to create
scorecards and dashboards. Application Case 4.3 shows how visual reporting tools can
help facilitate cost-effective business information creation and sharing.
Application Case 4.3
Tableau Saves Blastrac Thousands of Dollars with Simplified Information Sharing
Blastrac, a self-proclaimed global leader in portable
surface preparation technologies and equipment
(e.g., shot blasting, grinding, polishing, scarifying,
scraping, milling, and cutting equipment), depended
on the creation and distribution of reports across the
organization to make business decisions. However,
the company did not have a consistent reporting
method in place and, consequently, preparation of
reports for the company’s various needs (sales data,
working capital, inventory, purchase analysis, etc.)
was tedious. Blastrac’s analysts each spent nearly
one whole day per week (a total of 20 to 30 hours)
extracting data from the multiple enterprise resource
planning (ERP) systems, loading it into several Excel
spreadsheets, creating filtering capabilities and
establishing predefined pivot tables.
Not only were these massive spreadsheets
often inaccurate and consistently hard to under-
stand, but also they were virtually useless for the
sales team, which couldn’t work with the complex
format. In addition, each consumer of the reports
had different needs.
Blastrac Vice President and CIO Dan Murray
began looking for a solution to the company’s reporting
troubles. He quickly ruled out the rollout of a
single ERP system, a multimillion-dollar proposition.
He also eliminated the possibility of an enterprise-wide
business intelligence (BI) platform deployment
because of cost: quotes from five different vendors
ranged from $130,000 to over $500,000. What
Murray needed was a solution that was affordable,
could deploy quickly without disrupting current
systems, and was able to represent data consistently
regardless of the multiple currencies Blastrac
operates in.
The Solution and the Results
Working with IT services consulting firm Interworks,
Inc., out of Oklahoma, Murray and team finessed
the data sources. Murray then deployed two data
visualization tools from Tableau Software: Tableau
Desktop, a visual data analysis solution that allowed
Blastrac analysts to quickly and easily create intuitive
and visually compelling reports, and Tableau
Reader, a free application that enabled everyone
across the company to directly interact with the
reports, filtering, sorting, extracting, and printing
data as it fit their needs, at a total cost of less
than one-third of the lowest competing BI quote.
With only one hour per week now required to
create reports (a 95 percent increase in productivity)
and updates to these reports happening automatically
through Tableau, Murray and his team are
able to proactively identify major business events
reflected in company data, such as an exceptionally
large sale, instead of reacting to incoming
questions from employees as they had been forced
to do previously.
“Prior to deploying Tableau, I spent countless
hours customizing and creating new reports based
on individual requests, which was not efficient or
productive for me,” said Murray. “With Tableau,
we create one report for each business area, and,
with very little training, they can explore the data
themselves. By deploying Tableau, I not only
saved thousands of dollars and endless months
of deployment, but I’m also now able to create a
product that is infinitely more valuable for people
across the organization.”
QUESTIONS FOR DISCUSSION
1. How did Blastrac achieve significant cost savings in reporting and information sharing?
2. What were the challenge, the proposed solution, and the obtained results?
Sources: tableausoftware.com/learn/stories/spotlight-blastric;
blastrac.com/about-us; and interworks.com.

Chapter 4 • Business Reporting, Visual Analytics, and Business Performance Management 147
To better understand the current and future trends in the field of data visualization,
it helps to begin with some historical context.
A Brief History of Data Visualization
Despite the fact that predecessors to data visualization date back to the second century AD, most developments have occurred in the last two and a half centuries, predominantly during the last 30 years (Few, 2007). Although visualization has not been widely recognized as a discipline until fairly recently, today's most popular visual forms date back a few centuries. Geographical exploration, mathematics, and popularized history spurred the creation of early maps, graphs, and timelines as far back as the 1600s, but William Playfair is widely credited as the inventor of the modern chart, having created the first widely distributed line and bar charts in his Commercial and Political Atlas of 1786 and what is generally considered to be the first pie chart in his Statistical Breviary, published in 1801 (see Figure 4.2).
Perhaps the most notable innovator of information graphics during this period was Charles Joseph Minard, who graphically portrayed the losses suffered by Napoleon's army in the Russian campaign of 1812 (see Figure 4.3). Beginning at the Polish-Russian border, the thick band shows the size of the army at each position. The path of Napoleon's retreat from Moscow in the bitterly cold winter is depicted by the dark lower band, which
is tied to temperature and time scales. Popular visualization expert, author, and critic Edward Tufte says that this "may well be the best statistical graphic ever drawn." In this graphic Minard managed to simultaneously represent several data dimensions (the size of the army, direction of movement, geographic locations, outside temperature, etc.) in an artistic and informative manner. Many more great visualizations were created in the 1800s, and most of them are chronicled on Tufte's Web site (edwardtufte.com) and in his visualization books.

FIGURE 4.2 The First Pie Chart Created by William Playfair in 1801. Source: en.wikipedia.org.

FIGURE 4.3 Decimation of Napoleon's Army During the 1812 Russian Campaign. Source: en.wikipedia.org.

The 1900s saw the rise of a more formal, empirical attitude toward visualization, which tended to focus on aspects such as color, value scales, and labeling. In the mid-1900s, cartographer and theorist Jacques Bertin published his Semiologie Graphique, which some say serves as the theoretical foundation of modern information visualization. While most of his patterns are either outdated by more recent research or completely inapplicable to digital media, many are still very relevant.

In the 2000s the Internet emerged as a new medium for visualization and brought with it a whole host of new tricks and capabilities.
Not only has the worldwide, digital distribution of both data and visualization made them more accessible to a broader audience (raising visual literacy along the way), but it has also spurred the design of new forms that incorporate interaction, animation, graphics-rendering technology unique to screen media, and real-time data feeds to create immersive environments for communicating and consuming data.

Companies and individuals are, seemingly all of a sudden, interested in data; that interest has, in turn, sparked a need for visual tools that help them understand it. Cheap hardware sensors and do-it-yourself frameworks for building your own system are driving down the costs of collecting and processing data. Countless other applications, software tools, and low-level code libraries are springing up to help people collect, organize, manipulate, visualize, and understand data from practically any source. The Internet has also served as a fantastic distribution channel for visualizations; a diverse community of designers, programmers, cartographers, tinkerers, and data wonks has assembled to disseminate all sorts of new ideas and tools for working with data in both visual and nonvisual forms.

Google Maps has also single-handedly democratized both the interface conventions (click to pan, double-click to zoom) and the technology (256-pixel square map tiles with predictable file names) for displaying interactive geography online, to the extent that most people just know what to do when they're presented with a map online.
Flash has served well as a cross-browser platform on which to design and develop rich, beautiful Internet applications incorporating interactive data visualization and maps; now, new browser-native technologies such as canvas and SVG (sometimes collectively included under the umbrella of HTML5) are emerging to challenge Flash's supremacy and extend the reach of dynamic visualization interfaces to mobile devices.

The future of data/information visualization is very hard to predict. We can only extrapolate from what has already been invented: more three-dimensional visualization, more immersive experience with multidimensional data in a virtual reality environment, and holographic visualization of information. There is a pretty good chance that we will see something invented before the end of this decade that we have never seen in the information visualization realm. Application Case 4.4 shows how Dana-Farber Cancer Institute used information visualization to better understand the cancer vaccine clinical trials.

Application Case 4.4
TIBCO Spotfire Provides Dana-Farber Cancer Institute with Unprecedented Insight into Cancer Vaccine Clinical Trials

When Karen Maloney, business development manager of the Cancer Vaccine Center (CVC) at Dana-Farber Cancer Institute in Boston, decided to investigate the competitive landscape of the cancer vaccine field, she looked to a strategic planning and marketing MBA class at Babson College in Wellesley, Massachusetts, for help with the research project. There she met Xiaohong Cao, whose bioinformatics background led to the decision to focus on clinical vaccine trials as representative of potential competition. This became Dana-Farber CVC's first organized attempt to assess the cancer vaccine market in depth. Cao focused on the analysis of 645 clinical trials related to cancer vaccines.
The data was extracted in XML from the ClinicalTrials.gov Web site, and included categories such as "Summary of Purpose," "Trial Sponsor," "Phase of the Trial," "Recruiting Status," and "Location." Additional statistics on cancer types, including incidence and survival rates, were retrieved from the National Cancer Institute Surveillance data.

Challenge and Solution

Although information from clinical vaccine trials is organized fairly well into categories and can be downloaded, there is great inconsistency and redundancy inherent in the data registry. To gain a good understanding of the landscape, both an overview and an in-depth analytic capability were required simultaneously. It would have been very difficult, not to mention incredibly time-consuming, to analyze information from the multiple data sources separately in order to understand the relationships underlying the data or identify trends and patterns using spreadsheets. And to attempt to use a traditional business intelligence tool would have required significant IT resources. Cao proposed using the TIBCO Spotfire DXP (Spotfire) computational and visual analysis tool for data exploration and discovery.

Results

With the help of Cao and Spotfire software, Dana-Farber's CVC developed a first-of-its-kind analysis approach to rapidly extract complex data specifically for cancer vaccines from the major clinical trial repository. Summarization and visualization of these data represents a cost-effective means of making informed decisions about future cancer vaccine clinical trials. The findings are helping the CVC at Dana-Farber understand its competition and the diseases they are working on to help shape its strategy in the marketplace.
Spotfire software's visual and computational analysis approach provides the CVC at Dana-Farber and the research community at large with a better understanding of the cancer vaccine clinical trials landscape and enables rapid insight into the hotspots of cancer vaccine activity, as well as into the identification of neglected cancers.

"The whole field of medical research is going through an enormous transformation, in part driven by information technology," adds Brusic. "Using a tool like Spotfire for analysis is a promising area in this field because it helps integrate information from multiple sources, ask specific questions, and rapidly extract new knowledge from the data that was previously not easily attainable."

QUESTIONS FOR DISCUSSION
1. How did Dana-Farber Cancer Institute use TIBCO Spotfire to enhance information reporting and visualization?
2. What were the challenge, the proposed solution, and the obtained results?

Sources: TIBCO Spotfire, Customer Success Story, "TIBCO Spotfire Provides Dana-Farber Cancer Institute with Unprecedented Insight into Cancer Vaccine Clinical Trials," spotfire.tibco.com/-/media/content-center/case-studies/dana-farber.ashx (accessed March 2013); and Dana-Farber Cancer Institute, dana-farber.org.

SECTION 4.3 REVIEW QUESTIONS
1. What is data visualization? Why is it needed?
2. What are the historical roots of data visualization?
3. Carefully analyze Charles Joseph Minard's graphical portrayal of Napoleon's march. Identify and comment on all of the information dimensions captured in this classic diagram.
4. Who is Edward Tufte? Why do you think we should know about his work?
5. What do you think the "next big thing" is in data visualization?

4.4 DIFFERENT TYPES OF CHARTS AND GRAPHS

Often end users of business analytics systems are not sure what type of chart or graph to use for a specific purpose.
Some charts and/or graphs are better at answering certain types of questions. What follows is a short description of the types of charts and/or graphs commonly found in most business analytics tools and the types of questions they are best suited to answer.

Basic Charts and Graphs

What follows are the basic charts and graphs that are commonly used for information visualization.

LINE CHART Line charts are perhaps the most frequently used graphical visuals for time-series data. Line charts (or line graphs) show the relationship between two variables; they most often are used to track changes or trends over time (having one of the variables set to time on the x-axis). Line charts sequentially connect individual data points to help infer changing trends over a period of time. Line charts are often used to show time-dependent changes in the values of some measure, such as changes in a specific stock price over a 5-year period or changes in the number of daily customer service calls over a month.

BAR CHART Bar charts are among the most basic visuals used for data representation. Bar charts are effective when you have nominal data or numerical data that splits nicely into different categories, so you can quickly see comparative results and trends within your data. Bar charts are often used to compare data across multiple categories, such as percent advertising spending by department or by product category. Bar charts can be vertically or horizontally oriented. They can also be stacked on top of each other to show multiple dimensions in a single chart.

PIE CHART Pie charts are visually appealing, as the name implies, pie-looking charts. Because they are so visually attractive, they are often incorrectly used. Pie charts should only be used to illustrate relative proportions of a specific measure.
For instance, they can be used to show the relative percentage of an advertising budget spent on different product lines, or they can show the relative proportions of majors declared by college students in their sophomore year. If the number of categories to show is more than just a few (say, more than four), one should seriously consider using a bar chart instead of a pie chart.

SCATTER PLOT Scatter plots are often used to explore relationships between two or three variables (in 2D or 3D visuals). Since they are visual exploration tools, having more than three variables, translating into more than three dimensions, is not easily achievable. Scatter plots are an effective way to explore the existence of trends, concentrations, and outliers. For instance, in a two-variable (two-axis) graph, a scatter plot can be used to illustrate the co-relationship between age and weight of heart disease patients, or it can illustrate the relationship between the number of customer care representatives and the number of open customer service claims. Often, a trend line is superimposed on a two-dimensional scatter plot to illustrate the nature of the relationship.

BUBBLE CHART Bubble charts are often enhanced versions of scatter plots. The bubble chart is not a new visualization type; instead, it should be viewed as a technique to enrich data illustrated in scatter plots (or even geographic maps). By varying the size and/or color of the circles, one can add additional data dimensions, offering more enriched meaning about the data. For instance, it can be used to show a competitive view of college-level class attendance by major and by time of the day, or it can be used to show profit margin by product type and by geographic region.
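The basic chart types above map directly onto most plotting libraries. The following is a minimal sketch, not part of the original text, assuming Python with the matplotlib library installed; all data values are invented for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display window needed
import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Line chart: a measure tracked over time (x-axis = time)
months = list(range(1, 13))
calls = [310, 295, 340, 360, 355, 400, 420, 410, 390, 430, 445, 460]
axes[0, 0].plot(months, calls, marker="o")
axes[0, 0].set_title("Daily service calls per month (line)")

# Bar chart: comparing a measure across nominal categories
depts = ["Sales", "Marketing", "R&D", "Support"]
spend = [120, 95, 60, 40]
axes[0, 1].bar(depts, spend)
axes[0, 1].set_title("Ad spending by department (bar)")

# Pie chart: relative proportions of one measure (few categories only)
axes[1, 0].pie(spend, labels=depts, autopct="%1.0f%%")
axes[1, 0].set_title("Share of ad budget (pie)")

# Scatter/bubble: two variables per point; bubble size adds a third
age = [34, 45, 52, 61, 48, 57]
weight = [70, 82, 88, 95, 79, 90]
risk = [20, 60, 120, 200, 90, 160]  # third dimension -> bubble size
axes[1, 1].scatter(age, weight, s=risk, alpha=0.5)
axes[1, 1].set_title("Age vs. weight, sized by risk (bubble)")

fig.tight_layout()
fig.savefig("basic_charts.png")
```

Note how the bubble chart is literally the scatter plot call with one extra argument (the size array), which matches the book's point that it enriches a scatter plot rather than being a new chart type.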
Specialized Charts and Graphs

The graphs and charts that we review in this section are either derived from the basic charts as special cases or they are relatively new and specific to a problem type and/or an application area.

HISTOGRAM Graphically speaking, a histogram looks just like a bar chart. The difference between histograms and generic bar charts is the information that is portrayed in them. Histograms are used to show the frequency distribution of a variable, or several variables. In a histogram, the x-axis is often used to show the categories or ranges, and the y-axis is used to show the measures/values/frequencies. Histograms show the distributional shape of the data. That way, one can visually examine whether the data is distributed normally, exponentially, and so on. For instance, one can use a histogram to illustrate the exam performance of a class, where the distribution of the grades as well as a comparative analysis of individual results can be shown; or one can use a histogram to show the age distribution of a customer base.

GANTT CHART Gantt charts are a special case of horizontal bar charts that are used to portray project timelines, project tasks/activity durations, and overlap among the tasks/activities. By showing start and end dates/times of tasks/activities and the overlapping relationships, Gantt charts make an invaluable aid for management and control of projects. For instance, Gantt charts are often used to show project timelines, task overlaps, relative task completions (a partial bar illustrating the completion percentage inside a bar that shows the actual task duration), resources assigned to each task, milestones, and deliverables.

PERT CHART PERT charts (also called network diagrams) are developed primarily to simplify the planning and scheduling of large and complex projects.
A PERT chart shows precedence relationships among the project activities/tasks. It is composed of nodes (represented as circles or rectangles) and edges (represented with directed arrows). Based on the selected PERT chart convention, either the nodes or the edges may be used to represent the project activities/tasks (activity-on-node versus activity-on-arrow representation schema).

GEOGRAPHIC MAP When the data set includes any kind of location data (e.g., physical addresses, postal codes, state names or abbreviations, country names, latitude/longitude, or some type of custom geographic encoding), it is better and more informative to see the data on a map. Maps usually are used in conjunction with other charts and graphs, as opposed to by themselves. For instance, one can use maps to show the distribution of customer service requests by product type (depicted in pie charts) by geographic location. Often a large variety of information (e.g., age distribution, income distribution, education, economic growth, population changes, etc.) can be portrayed in a geographic map to help decide where to open a new restaurant or a new service station. These types of systems are often called geographic information systems (GIS).

BULLET Bullet graphs are often used to show progress toward a goal. A bullet graph is essentially a variation of a bar chart. Often they are used in place of gauges, meters, and thermometers in dashboards to more intuitively convey the meaning within a much smaller space. Bullet graphs compare a primary measure (e.g., year-to-date revenue) to one or more other measures (e.g., annual revenue target) and present this in the context of defined performance metrics (e.g., sales quota). A bullet graph can intuitively illustrate how the primary measure is performing against overall goals (e.g., how close a sales representative is to achieving his/her annual quota).
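The histogram and Gantt chart described above can both be sketched with a few lines of matplotlib. This is an illustrative sketch only (not from the original text), using invented exam scores and project tasks:

```python
import random

import matplotlib
matplotlib.use("Agg")  # render off-screen; no display window needed
import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(11, 4))

# Histogram: frequency distribution of one variable
# (x-axis = score ranges, y-axis = counts)
random.seed(42)
grades = [random.gauss(75, 10) for _ in range(200)]  # invented exam scores
counts, bins, _ = ax1.hist(grades, bins=10, edgecolor="black")
ax1.set_xlabel("Exam score")
ax1.set_ylabel("Number of students")
ax1.set_title("Grade distribution (histogram)")

# Gantt chart: horizontal bars offset by start time to show task overlap
tasks = ["Design", "Build", "Test", "Deploy"]
starts = [0, 2, 5, 7]       # week each task begins (invented)
durations = [3, 4, 3, 1]    # weeks each task runs
ax2.barh(tasks, durations, left=starts)
ax2.set_xlabel("Week")
ax2.invert_yaxis()          # first task on top, as Gantt charts are read
ax2.set_title("Project timeline (Gantt)")

fig.tight_layout()
fig.savefig("hist_gantt.png")
```

The `left=` offset in `barh` is the whole trick behind a basic Gantt chart: it turns an ordinary horizontal bar chart into a timeline where overlapping weeks are immediately visible.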
HEAT MAP Heat maps are great visuals for comparing continuous values across two categories using color. The goal is to help the user quickly see where the intersection of the categories is strongest and weakest in terms of numerical values of the measure being analyzed. For instance, a heat map can be used to show a segmentation analysis of a target market where the color gradient encodes the measure (say, purchase amount) and the two dimensions are age and income bands.

HIGHLIGHT TABLE Highlight tables are intended to take heat maps one step further. In addition to showing how data intersects by using color, highlight tables add a number on top to provide additional detail. That is, a highlight table is a two-dimensional table with cells populated with numerical values and gradients of colors. For instance, one can show sales representative performance by product type and by sales volume.

TREE MAP Tree maps display hierarchical (tree-structured) data as a set of nested rectangles. Each branch of the tree is given a rectangle, which is then tiled with
Nowadays, one can find many other specialized graphs and charts that serve a specific purpose. Furthermore, current trends are to combine/ hybridize and animate these charts for better looking and more intuitive visualization of today's complex and volatile data sources. For instance , the interactive, animated, bubble charts available at the Gapminder Web site (gapminder. org) provide an intriguing way of exploring world health, wealth, and population data from a multidimensional perspective. Figure 4.4 depicts the sorts of displays available at the site. In this graph, population size, life expectancy, and per capita income at the continent level a re shown ; also given is a time-varying animation that shows how these variables changed over time . Chart How 10 use j [ 181 Share gr8')h j [ :•: Full screen j Color .,,.., 85 80 75 70 65 f 60 "' ~ I 55 g 50 ~ m 45 0. )C GI 40 ~ :J 35 30 25 ~ 20 :u 0 • 0 0 AJbania O AJgeria O Ango1a O A,genlina 0 Arrnenia 0 AN.ba 0 Australi.a O Austria O Azl!lbaijan 0 Bahamas Q Bamain 400 1 000 2 000 4 000 10 000 20 000 40 000 100 000 S ize Income per person (GDP/capita, PPP$ inffatio~djusted) ---- --,--- • log .. Population. tdal Play ~ ~ 1800 1820 1940 1860 1880 1000 1920 1940 1960 1980 20 1 00 0:36 B .., Tniib ~ FIGURE 4.4 A Gapminder Chart That Shows Wealth and Health of Nations. Source: gapminder.org. 154 Pan II • Descriptive Analytics SECTION 4.4 REVIEW QUESTIONS 1. Why do you think there are large numbers of different types of charts and graphs? 2. What are the main differences among line, bar, and pie ch arts? When should you choose to use one over the other? 3. Why would you use a geographic map? What other types of charts can be combined with a geographic map? 4. Find two more charts that are not covered in this section, and comment on their usability. 
4.5 THE EMERGENCE OF DATA VISUALIZATION AND VISUAL ANALYTICS

As Seth Grimes (2009) has noted, there is a "growing palette" of data visualization techniques and tools that enable the users of business analytics and business intelligence systems to better "communicate relationships, add historical context, uncover hidden correlations and tell persuasive stories that clarify and call to action." The latest Magic Quadrant on Business Intelligence and Analytics Platforms, released by Gartner in February 2013, further emphasizes the importance of visualization in business intelligence. As the chart shows, most of the solution providers in the Leaders quadrant are either relatively recently founded information visualization companies (e.g., Tableau Software, QlikTech, Tibco Spotfire) or well-established, large analytics companies (e.g., SAS, IBM, Microsoft, SAP, MicroStrategy) that are increasingly focusing their efforts on information visualization and visual analytics. Details on Gartner's latest Magic Quadrant are given in Technology Insights 4.1.

TECHNOLOGY INSIGHTS 4.1 Gartner Magic Quadrant for Business Intelligence and Analytics Platforms

Gartner, Inc., the creator of Magic Quadrants, is a leading information technology research and advisory company. Founded in 1979, Gartner has 5,300 associates, including 1,280 research analysts and consultants, and numerous clients in 85 countries. Magic Quadrant is a research method designed and implemented by Gartner to monitor and evaluate the progress and positions of companies in a specific, technology-based market. By applying a graphical treatment and a uniform set of evaluation criteria, Magic Quadrant helps users to understand how technology providers are positioned within a market.
Gartner changed the name of this Magic Quadrant from "Business Intelligence Platforms" to "Business Intelligence and Analytics Platforms" in 2012 to emphasize the growing importance of analytics capabilities to the information systems that organizations are now building. Gartner defines the business intelligence and analytics platform market as a software platform that delivers 15 capabilities across three categories: integration, information delivery, and analysis. These capabilities enable organizations to build precise systems of classification and measurement to support decision making and improve performance. Figure 4.5 illustrates the latest Magic Quadrant for Business Intelligence and Analytics Platforms. Magic Quadrant places providers in four groups (niche players, challengers, visionaries, and leaders) along two dimensions: completeness of vision (x-axis) and ability to execute (y-axis). As the quadrant clearly shows, most of the well-known BI/BA providers are positioned in the "leaders" category, while many of the lesser known, relatively new, emerging providers are positioned in the "niche players" category. Right now, most of the activity in the business intelligence and analytics platform market is from organizations that are trying to mature their visualization capabilities and to move from descriptive to diagnostic (i.e., predictive and prescriptive) analytics. The vendors in the market have overwhelmingly concentrated on meeting this user demand.
FIGURE 4.5 Magic Quadrant for Business Intelligence and Analytics Platforms (as of February 2013). Source: gartner.com.

If there were a single market
theme in 2012, it would be that data discovery/visualization became a mainstream architecture. For years, data discovery/visualization vendors, such as QlikTech, Salient Management Company, Tableau Software, and Tibco Spotfire, received more positive feedback than vendors offering OLAP cube and semantic-layer-based architectures. In 2012, the market responded:
• MicroStrategy significantly improved Visual Insight.
• SAP launched Visual Intelligence.
• SAS launched Visual Analytics.
• Microsoft bolstered PowerPivot with Power View.
• IBM launched Cognos Insight.
• Oracle acquired Endeca.
• Actuate acquired Quiterian.
This emphasis on data discovery/visualization from most of the leaders in the market, which are now promoting tools with business-user-friendly data integration, coupled with embedded storage and computing layers (typically in-memory/columnar) and unfettered drilling, accelerates the trend toward decentralization and user empowerment of BI and analytics, and greatly enables organizations' ability to perform diagnostic analytics.

Source: Gartner Magic Quadrant, released on February 5, 2013, gartner.com (accessed February 2013).
In business intelligence and analytics, the key challenges for visualization have revolved around the intuitive representation of large, complex data sets with multiple dimensions and measures. For the most part, the typical charts, graphs, and other visual elements used in these applications usually involve two dimensions, sometimes three, and fairly small subsets of data sets. In contrast, the data in these systems reside in a data warehouse. At a minimum, these warehouses involve a range of dimensions (e.g., product, location, organizational structure, time), a range of measures, and millions of cells of data. In an effort to address these challenges, a number of researchers have developed a variety of new visualization techniques.
Visual Analytics

Visual analytics is a recently coined term that is often used loosely to mean nothing more than information visualization. What is meant by visual analytics is the combination of visualization and predictive analytics. While information visualization is aimed at answering "what happened" and "what is happening" and is closely associated with business intelligence (routine reports, scorecards, and dashboards), visual analytics is aimed at answering "why is it happening" and "what is more likely to happen," and is usually associated with business analytics (forecasting, segmentation, correlation analysis). Many information visualization vendors are adding capabilities that let them call themselves visual analytics solution providers. One of the top, long-time analytics solution providers, SAS Institute, is approaching it from another direction: it is embedding its analytics capabilities into a high-performance data visualization environment that it calls visual analytics.
Visual or not visual, automated or manual, online or paper based, business reporting is not much different than telling a story. Technology Insights 4.2 provides a different, unorthodox viewpoint on better business reporting.
TECHNOLOGY INSIGHTS 4.2 Telling Great Stories with Data and Visualization
Everyone who has data to analyze has stories to tell, whether it's diagnosing the reasons for manufacturing defects, selling a new idea in a way that captures the imagination of your target audience, or informing colleagues about a particular customer service improvement program. And when it's telling the story behind a big strategic choice so that you and your senior management team can make a solid decision, providing a fact-based story can be especially challenging. In all cases, it's a big job. You want to be interesting and memorable; you know you need to keep it simple for your busy executives and colleagues. Yet you also know you have to be factual, detail oriented, and data driven, especially in today's metric-centric world.

It's tempting to present just the data and facts, but when colleagues and senior management are overwhelmed by data and facts without context, you lose. We have all experienced presentations with large slide decks, only to find that the audience is so overwhelmed with data that they don't know what to think, or they are so completely tuned out that they take away only a fraction of the key points.

Start engaging your executive team and explaining your strategies and results more powerfully by approaching your assignment as a story. You will need the "what" of your story (the facts and data) but you also need the "who?," the "how?," the "why?," and the often missed "so what?" It's these story elements that will make your data relevant and tangible for your audience. Creating a good story can aid you and senior management in focusing on what is important.
Why Story?
Stories bring life to data and facts. They can help you make sense and order out of a disparate collection of facts. They make it easier to remember key points and can paint a vivid picture of what the future can look like. Stories also create interactivity; people put themselves into stories and can relate to the situation.

Chapter 4 • Business Reporting, Visual Analytics, and Business Performance Management 157
Cultures have long used storytelling to pass on knowledge and content. In some cultures, storytelling is critical to their identity. For example, in New Zealand, some of the Maori people tattoo their faces with mokus. A moku is a facial tattoo containing a story about ancestors, the family tribe. A man may have a tattoo design on his face that shows features of a hammerhead to highlight unique qualities about his lineage. The design he chooses signifies what is part of his "true self" and his ancestral home.
Likewise, when we are trying to understand a story, the storyteller navigates to finding the "true north." If senior management is looking to discuss how they will respond to a competitive change, a good story can make sense and order out of a lot of noise. For example, you may have facts and data from two studies, one including results from an advertising study and one from a product satisfaction study. Developing a story for what you measured across both studies can help people see the whole where there were disparate parts. For rallying your distributors around a new product, you can employ a story to give vision to what the future can look like. Most importantly, storytelling is interactive: typically the presenter uses words and pictures that audience members can put themselves into. As a result, they become more engaged and better understand the information.
So What Is a Good Story?
Most people can easily rattle off their favorite film or book. Or they remember a funny story that a colleague recently shared. Why do people remember these stories? Because they contain certain characteristics. First, a good story has great characters. In some cases, the reader or viewer has a vicarious experience where they become involved with the character. The character then has to be faced with a challenge that is difficult but believable. There must be hurdles that the character overcomes. And finally, the outcome or prognosis is clear by the end of the story. The situation may not be resolved, but the story has a clear endpoint.
Think of Your Analysis as a Story-Use a Story Structure
When crafting a data-rich story, the first objective is to find the story. Who are the characters? What is the drama or challenge? What hurdles have to be overcome? And at the end of your story, what do you want your audience to do as a result?
Once you know the core story, craft your other story elements: define your characters, understand the challenge, identify the hurdles, and crystallize the outcome or decision question. Make sure you are clear about what you want people to do as a result. This will shape how your audience will recall your story. With the story elements in place, write out the storyboard, which represents the structure and form of your story. Although it's tempting to skip this step, it is better first to understand the story you are telling and then to focus on the presentation structure and form. Once the storyboard is in place, the other elements will fall into place. The storyboard will help you to think about the best analogies or metaphors, to clearly set up the challenge or opportunity, and to finally see the flow and transitions needed. The storyboard also helps you focus on key visuals (graphs, charts, and graphics) that you need your executives to recall.
In summary, don't be afraid to use data to tell great stories. Being factual, detail oriented, and data driven is critical in today's metric-centric world, but it does not have to mean being boring and lengthy. In fact, by finding the real stories in your data and following the best practices, you can get people to focus on your message, and thus on what's important. Here are those best practices:
1. Think of your analysis as a story: use a story structure.
2. Be authentic: your story will flow.
3. Be visual: think of yourself as a film editor.
4. Make it easy for your audience and you.
5. Invite and direct discussion.
Source: Elissa Fink and Susan J. Moore, "Five Best Practices for Telling Great Stories with Data," 2012, white paper by Tableau Software, Inc., tableausoftware.com/whitepapers/telling-stories-with-data (accessed February 2013).

158 Part II • Descriptive Analytics
High-Powered Visual Analytics Environments
Due to the increasing demand for visual analytics coupled with fast-growing data volumes, there is an exponential movement toward investing in highly efficient visualization systems. With its latest move into visual analytics, the statistical software giant SAS Institute is now among those leading this wave. Its new product, SAS Visual Analytics, is a very high-performance, in-memory solution for exploring massive amounts of data in a very short time (almost instantaneously). It empowers users to spot patterns, identify opportunities for further analysis, and convey visual results via Web reports or mobile platforms such as tablets and smartphones. Figure 4.6 shows the high-level architecture of the SAS Visual Analytics platform. On one end of the architecture are universal Data Builder and Administrator capabilities, leading into Explorer, Report Designer, and Mobile BI modules, collectively providing an end-to-end visual analytics solution.
Some of the key benefits, as proposed by SAS, are:
• Empower all users with data exploration techniques and approachable analytics to drive improved decision making. SAS Visual Analytics enables different types of users to conduct fast, thorough explorations on all available data. Subsetting or sampling of data is not required. Easy-to-use, interactive Web interfaces broaden the audience for analytics, enabling everyone to glean new insights. Users can look at more options, make more precise decisions, and drive success even faster than before.
• Answer complex questions faster, enhancing the contributions from your analytic talent. SAS Visual Analytics augments the data discovery and exploration process by providing extremely fast results to enable better, more focused analysis. Analytically savvy users can identify areas of opportunity or concern from vast amounts of data so further investigation can take place quickly.
• Improve information sharing and collaboration. Large numbers of users, including those with limited analytical skills, can quickly view and interact with reports and charts via the Web, Adobe PDF files, and iPad mobile devices, while IT maintains control of the underlying data and security. SAS Visual Analytics provides the right information to the right person at the right time to improve productivity and organizational knowledge.
FIGURE 4.6 An Overview of SAS Visual Analytics Architecture. Source: SAS.com.

FIGURE 4.7 A Screenshot from SAS Visual Analytics. Source: SAS.com.
• Liberate IT by giving users a new way to access the information they need. Free IT from the constant barrage of demands from users who need access to different amounts of data, different data views, ad hoc reports, and one-off requests for information. SAS Visual Analytics enables IT to easily load and prepare data for multiple users. Once data is loaded and available, users can dynamically explore data, create reports, and share information on their own.
• Provide room to grow at a self-determined pace. SAS Visual Analytics provides the option of using commodity hardware or database appliances from EMC Greenplum and Teradata. It is designed from the ground up for performance optimization and scalability to meet the needs of any size organization.
Figure 4.7 shows a screenshot of the SAS Visual Analytics platform where time-series forecasting and confidence intervals around the forecast are depicted. A wealth of information on SAS Visual Analytics, along with access to the tool itself for teaching and learning purposes, can be found at teradatauniversitynetwork.com.
SECTION 4.5 REVIEW QUESTIONS
1. What are the reasons for the recent emergence of visual analytics?
2. Look at Gartner's Magic Quadrant for Business Intelligence and Analytics Platforms. What do you see? Discuss and justify your observations.
3. What is the difference between information visualization and visual analytics?
4. Why should storytelling be a part of your reporting and data visualization?
5. What is a high-powered visual analytics environment? Why do we need it?

4.6 PERFORMANCE DASHBOARDS
Performance dashboards are common components of most, if not all, performance management systems, performance measurement systems, BPM software suites, and BI platforms. Dashboards provide visual displays of important information that is consolidated and arranged on a single screen so that information can be digested at a single glance and easily drilled in and further explored. A typical dashboard is shown in Figure 4.8. This particular executive dashboard displays a variety of KPIs for a hypothetical software company called Sonatica (selling audio tools). This executive dashboard shows a high-level view of the different functional groups surrounding the products, starting from a general overview to the marketing efforts, sales, finance, and support departments. All of this is intended to give executive decision makers a quick and accurate idea of what is going on within the organization. On the left side of the dashboard, we can see (in a time-series fashion) the quarterly changes in revenues, expenses, and margins, as well as the comparison of those figures to previous years' monthly numbers. On the upper-right side we see two dials with color-coded regions showing the amount of monthly expenses for support services (dial on the left) and the amount of other expenses (dial on the right).
FIGURE 4.8 A Sample Executive Dashboard. Source: dundas.com.

As the color coding indicates, while the monthly support expenses are well within the normal ranges, the other expenses are in the red (or darker) region, indicating excessive values. The geographic map on the bottom right shows the distribution of sales at the country level throughout the world. Behind these graphical icons there are a variety of mathematical functions aggregating numerous data points into their most meaningful figures. By clicking on these graphical icons, the consumer of this information can drill down to more granular levels of information and data.
Dashboards are used in a wide variety of businesses for a wide variety of reasons. For instance, in Application Case 4.5, you will find the summary of a successful implementation of information dashboards by the Dallas Cowboys football team.
Application Case 4.5
Dallas Cowboys Score Big with Tableau and Teknion
Founded in 1960, the Dallas Cowboys are a professional American football team headquartered in Irving, Texas. The team has a large national following, which is perhaps best represented by the NFL record for number of consecutive games at sold-out stadiums.
Challenge
Bill Priakos, COO of the Dallas Cowboys Merchandising Division, and his team needed more visibility into their data so they could run the division more profitably. Microsoft was selected as the baseline platform for this upgrade, as well as for a number of other sales, logistics, and e-commerce applications. The Cowboys expected that this new information architecture would provide the needed analytics and reporting. Unfortunately, this was not the case, and the search began for a robust dashboarding, analytics, and reporting tool to fill this gap.
Solution and Results
Tableau and Teknion together provided real-time reporting and dashboard capabilities that exceeded the Cowboys' requirements. Systematically and methodically, the Teknion team worked side by side with data owners and data users within the Dallas Cowboys to deliver all required functionality, on time and under budget. "Early in the process, we were able to get a clear understanding of what it would take to run a more profitable operation for the Cowboys," said Teknion Vice President Bill Luisi. "This process step is a key step in Teknion's approach with any client, and it always pays huge dividends as the implementation plan progresses."
Added Luisi, "Of course, Tableau worked very closely with us and the Cowboys during the entire project. Together, we made sure that the Cowboys could achieve their reporting and analytical goals in record time."
Now, for the first time, the Dallas Cowboys are able to monitor their complete merchandising activities from manufacture to end customer and see not only what is happening across the life cycle, but drill down even further into why it is happening.
Today, this BI solution is used to report and analyze the business activities of the Merchandising Division, which is responsible for all of the Dallas Cowboys' brand sales. Industry estimates say that the Cowboys generate 20 percent of all NFL merchandise sales, which reflects the fact that they are the most recognized sports franchise in the world.
According to Eric Lai, a ComputerWorld reporter, Tony Romo and the rest of the Dallas Cowboys may have been only average on the football field in the last few years, but off the field, especially in the merchandising arena, they remain America's team.
QUESTIONS FOR DISCUSSION
1. How did the Dallas Cowboys use information visualization?
2. What were the challenge, the proposed solution, and the obtained results?
Sources: Tableau, Case Study, tableausoftware.com/learn/stories/tableau-and-teknion-exceed-cowboys-requirements (accessed February 2013); and E. Lai, "BI Visualization Tool Helps Dallas Cowboys Sell More Tony Romo Jerseys," ComputerWorld, October 8, 2009.

Dashboard Design
Dashboards are not a new concept. Their roots can be traced at least to the EIS of the 1980s. Today, dashboards are ubiquitous. For example, a few years back, Forrester Research estimated that over 40 percent of the largest 2,000 companies in the world use the technology (Ante and McGregor, 2006). Since then, one can safely assume that this number has gone up quite significantly. In fact, nowadays it would be rather unusual to see a large company using a BI system that does not employ some sort of performance dashboards. The Dashboard Spy Web site (dashboardspy.com/about) provides further evidence of their ubiquity. The site contains descriptions and screenshots of thousands of BI dashboards, scorecards, and BI interfaces used by businesses of all sizes and industries, nonprofits, and government agencies.
According to Eckerson (2006), a well-known expert on BI in general and dashboards in particular, the most distinctive feature of a dashboard is its three layers of information:
1. Monitoring. Graphical, abstracted data to monitor key performance metrics.
2. Analysis. Summarized dimensional data to analyze the root cause of problems.
3. Management. Detailed operational data that identify what actions to take to resolve a problem.
Because of these layers, dashboards pack a lot of information into a single screen. According to Few (2005), "The fundamental challenge of dashboard design is to display all the required information on a single screen, clearly and without distraction, in a manner that can be assimilated quickly." To speed assimilation of the numbers, the numbers need to be placed in context. This can be done by comparing the numbers of interest to other baseline or target numbers, by indicating whether the numbers are good or bad, by denoting whether a trend is better or worse, and by using specialized display widgets or components to set the comparative and evaluative context.
Some of the common comparisons that are typically made in business intelligence systems include comparisons against past values, forecasted values, targeted values, benchmark or average values, multiple instances of the same measure, and the values of other measures (e.g., revenues versus costs). In Figure 4.8, the various KPIs are set in context by comparing them with targeted values, the revenue figure is set in context by comparing it with marketing costs, and the figures for the various stages of the sales pipeline are set in context by comparing one stage with another.
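To make the comparative and evaluative context concrete, consider a minimal code sketch. The KPI names, values, targets, and the warning threshold below are all hypothetical, chosen only to illustrate how a dashboard might map a measure and its target onto a traffic-light status:

```python
def kpi_status(actual, target, warn_ratio=0.9):
    """Assign a traffic-light status by comparing a KPI to its target.

    Values at or above the target are 'green', values within
    warn_ratio of the target are 'yellow', and lower values are 'red'.
    """
    if actual >= target:
        return "green"
    if actual >= warn_ratio * target:
        return "yellow"
    return "red"

# Hypothetical KPIs: (name, actual value, target value)
kpis = [("Revenue", 120_000, 100_000),
        ("New Customers", 92, 100),
        ("Support Tickets Closed", 70, 100)]

for name, actual, target in kpis:
    # Revenue: green, New Customers: yellow, Support Tickets Closed: red
    print(f"{name}: {kpi_status(actual, target)}")
```

A production dashboard would pull the actual and target values from the underlying data store, but the evaluative logic is exactly this kind of thresholding.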
Even with comparative measures, it is important to specifically point out whether a particular number is good or bad and whether it is trending in the right direction. Without these sorts of evaluative designations, it can be time-consuming to determine the status of a particular number or result. Typically, either specialized visual objects (e.g., traffic lights) or visual attributes (e.g., color coding) are used to set the evaluative context. Again, for the dashboard in Figure 4.8, color coding (or varying gray tones) is used with the gauges to designate whether the KPI is good or bad, and green up arrows are used with the various stages of the sales pipeline to indicate whether the results for those stages are trending up or down and whether up or down is good or bad. Although not used in this particular example, additional colors (red and orange, for instance) could be used to represent other states on the various gauges. An interesting and informative dashboard-driven reporting solution built specifically for a very large telecommunication company is featured in Application Case 4.6.

Application Case 4.6
Saudi Telecom Company Excels with Information Visualization
Supplying Internet and mobile services to over 160 million customers across the Middle East, Saudi Telecom Company (STC) is one of the largest providers in the region, extending as far as Africa and South Asia. With millions of customers contacting STC daily for billing, payment, network usage, and support, all of this information has to be monitored somewhere. Located in the headquarters of STC is a data center that features a soccer-field-sized wall of monitors, all displaying information regarding network statistics, service analytics, and customer calls.
The Problem
When you have acres of information in front of you, prioritizing and contextualizing the data are paramount in understanding it. STC needed to identify the relevant metrics, properly visualize them, and provide them to the right people, often with time-sensitive information. "The executives didn't have the ability to see key performance indicators," said Waleed Al Eshaiwy, manager of the data center at STC. "They would have to contact the technical teams to get status reports. By that time, it would often be too late and we would be reacting to problems rather than preventing them."
The Solution
After carefully evaluating several vendors, STC made the decision to go with Dundas because of its rich data visualization alternatives. Dundas business intelligence consultants worked on-site in STC's headquarters in Riyadh to refine the telecommunication dashboards so they functioned properly. "Even if someone were to show you what was in the database, line by line, without visualizing it, it would be difficult to know what was going on," said Waleed, who worked closely with Dundas consultants. The success that STC experienced led to engagement on an enterprise-wide, mission-critical project to transform their data center and create a more proactive monitoring environment. This project culminated with the monitoring systems in STC's data center finally transforming from reactive to proactive. Figure 4.9 shows a sample dashboard for call center management.
The Benefits
"Dundas' information visualization tools allowed us to see trends and correct issues before they became problems," said Mr. Eshaiwy. He added, "We decreased the amount of service tickets by 55 percent the year that we started using the information visualization tools and dashboards. The availability of the system increased, which meant customer satisfaction levels increased, which led to an increased customer base, which of course led to increased revenues." With new, custom KPIs becoming visually available to the STC team, Dundas' dashboards currently occupy nearly a quarter of the soccer-field-sized monitor wall. "Everything is on my screen, and I can drill down and find whatever I need to know," explained Waleed. He added, "Because of the design and structure of the dashboards, we can very quickly recognize the root cause of the problems and take appropriate action." According to Mr. Eshaiwy, Dundas is a success: "The adoption rates are excellent, it's easy to use, and it's one of the most successful projects that we have implemented. Even visitors who stop by my office are grabbed right away by the look of the dashboard!"
QUESTIONS FOR DISCUSSION
1. Why do you think telecommunications companies are among the prime users of information visualization tools?
2. How did Saudi Telecom use information visualization?
3. What were their challenges, the proposed solution, and the obtained results?
Source: Dundas, Customer Success Story, "Saudi Telecom Company Used Dundas' Information Visualization Solution," dundas.com/wp-content/uploads/Saudi-Telecom-Company-Case-Studyl (accessed February 2013).

FIGURE 4.9 A Sample Call Center Dashboard. Source: Dundas Data Visualization Inc.

216 Part III • Predictive Analytics
TABLE 5.2 Common Accuracy Metrics for Classification Models

True Positive Rate = TP / (TP + FN)
    The ratio of correctly classified positives divided by the total positive count (i.e., hit rate or recall).

True Negative Rate = TN / (TN + FP)
    The ratio of correctly classified negatives divided by the total negative count (i.e., specificity; its complement, FP / (TN + FP), is the false alarm rate).

Accuracy = (TP + TN) / (TP + TN + FP + FN)
    The ratio of correctly classified instances (positives and negatives) divided by the total number of instances.

Precision = TP / (TP + FP)
    The ratio of correctly classified positives divided by the sum of correctly classified positives and incorrectly classified positives.

Recall = TP / (TP + FN)
    The ratio of correctly classified positives divided by the sum of correctly classified positives and incorrectly classified negatives.

FIGURE 5.9 Simple Random Data Splitting.
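The formulas in Table 5.2 can all be computed directly from the four confusion-matrix counts. The following sketch uses made-up counts, chosen only to exercise the formulas:

```python
# Confusion-matrix counts (hypothetical illustration values)
TP, TN, FP, FN = 80, 90, 10, 20

true_positive_rate = TP / (TP + FN)           # hit rate / recall
true_negative_rate = TN / (TN + FP)           # specificity
accuracy = (TP + TN) / (TP + TN + FP + FN)
precision = TP / (TP + FP)
recall = TP / (TP + FN)                       # same quantity as the true positive rate

print(true_positive_rate, true_negative_rate, accuracy, precision)
# prints 0.8 0.9 0.85 0.8888888888888888
```

Note how accuracy alone can be misleading on skewed data: with very few positives, a model that predicts "negative" for everything scores high accuracy but has a recall of zero, which is why the rate-based metrics above are reported alongside it.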
The validation set is used during model building to prevent overfitting (more on artificial neural networks can be found in Chapter 6). Figure 5.9 shows the simple split methodology.
The main criticism of this method is that it makes the assumption that the data in the two subsets are of the same kind (i.e., have the exact same properties). Because this is a simple random partitioning, in most realistic data sets where the data are skewed on the classification variable, such an assumption may not hold true. In order to improve this situation, stratified sampling is suggested, where the strata become the output variable. Even though this is an improvement over the simple split, it still carries a bias associated with the single random partitioning.
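A stratified split along these lines can be sketched in a few lines of code. The function and the skewed example data below are illustrative, not taken from the text; the class label serves as the stratum, so each class contributes the same fraction of its members to the test set:

```python
import random
from collections import defaultdict

def stratified_split(labels, test_fraction=1/3, seed=42):
    """Return (train_indices, test_indices) such that each class keeps
    roughly the same proportion in both subsets (stratified sampling,
    with the class label as the stratum)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = round(len(indices) * test_fraction)
        test_idx.extend(indices[:cut])    # hold out test_fraction of this class
        train_idx.extend(indices[cut:])   # rest of this class goes to training
    return train_idx, test_idx

# Hypothetical skewed data set: 90 negatives, 10 positives (10% positive)
labels = [0] * 90 + [1] * 10
train_idx, test_idx = stratified_split(labels)
# Both subsets preserve the roughly 10% positive rate of the full data set
```

A plain random split of the same data could easily put most of the 10 positives in one subset; stratification guarantees the 10 percent positive rate in both.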
k-FOLD CROSS-VALIDATION In order to minimize the bias associated with the random sampling of the training and holdout data samples in comparing the predictive accuracy of two or more methods, one can use a methodology called k-fold cross-validation. In k-fold cross-validation, also called rotation estimation, the complete data set is randomly split into k mutually exclusive subsets of approximately equal size. The classification model is trained and tested k times. Each time it is trained on all but one fold and then tested on the remaining single fold.

Chapter 5 • Data Mining 217

The cross-validation estimate of the overall accuracy of a model is calculated by simply averaging the k individual accuracy measures, as shown in the following equation:

CVA = (1/k) * (A_1 + A_2 + ... + A_k)

where CVA stands for cross-validation accuracy, k is the number of folds used, and A_i is the accuracy measure (e.g., hit rate, sensitivity, specificity) of fold i.
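The k-fold procedure can be sketched compactly. The stand-in scoring function below merely reports the held-out fold's share of the data; it replaces actual model training and testing, which the text leaves unspecified:

```python
def cross_validation_accuracy(n, k, train_and_score):
    """k-fold cross-validation (rotation estimation): split n examples
    into k mutually exclusive folds, train on k-1 folds and test on the
    held-out fold each time, then average the k accuracy measures."""
    folds = [list(range(i, n, k)) for i in range(k)]  # k disjoint index sets
    accuracies = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in range(n) if i not in held]
        accuracies.append(train_and_score(train_idx, held_out))
    return sum(accuracies) / k  # CVA: the mean of the k fold accuracies

# Stand-in scorer: reports the held-out fold's share of the data as its
# "accuracy", purely to exercise the mechanics of the loop.
cva = cross_validation_accuracy(100, 10, lambda train, test: len(test) / 100)
# Each of the 10 folds holds out 10 of 100 examples, so cva == 0.1
```

In practice, `train_and_score` would fit the classifier on `train_idx` and return its accuracy on `held_out`; everything else stays the same.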
ADDITIONAL CLASSIFICATION ASSESSMENT METHODOLOGIES Other popular assessment methodologies include the following:
• Leave-one-out. The leave-one-out method is the extreme case of k-fold cross-validation where k equals the number of data points; that is, every data point is used for testing exactly once, and as many models are developed as there are data points. This is a time-consuming methodology, but sometimes for small data sets it is a viable option.
• Bootstrapping. With bootstrapping, a fixed number of instances from the original data is sampled (with replacement) for training and the rest of the data set is used for testing. This process is repeated as many times as desired.
• Jackknifing. Similar to the leave-one-out methodology, with jackknifing the accuracy is calculated by leaving one sample out at each iteration of the estimation process.
• Area under the ROC curve. The area under the ROC curve is a graphical assessment technique where the true positive rate is plotted on the y-axis and the false positive rate is plotted on the x-axis. The area under the ROC curve determines the accuracy measure of a classifier: A value of 1 indicates a perfect classifier, whereas 0.5 indicates no better than random chance; in reality, the values range between these two extreme cases. For example, in Figure 5.10, A has a better classification performance than B, while C is no better than the random chance of flipping a coin.
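The area under the ROC curve need not be read off a plot: it equals the probability that a randomly chosen positive example is scored above a randomly chosen negative one (with ties counting as half). A small sketch with made-up classifier scores:

```python
def auc_from_scores(pos_scores, neg_scores):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen positive example receives a higher score than a
    randomly chosen negative one (ties count as half a win)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical classifier scores
perfect = auc_from_scores([0.9, 0.8], [0.2, 0.1])    # every positive outranks every negative: 1.0
coin_flip = auc_from_scores([0.5, 0.5], [0.5, 0.5])  # indistinguishable scores: 0.5
```

The pairwise comparison above is quadratic in the number of examples; production implementations sort the scores once and use ranks, but the quantity computed is the same.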
FIGURE 5.10 A Sample ROC Curve.

CLASSIFICATION TECHNIQUES A number of techniques (or algorithms) are used for classification modeling, including the following:
• Decision tree analysis. Decision tree analysis (a machine-learning technique) is arguably the most popular classification technique in the data mining arena. A detailed description of this technique is given in the following section.
• Statistical analysis. Statistical techniques were the primary classification algorithm for many years until the emergence of machine-learning techniques. Statistical classification techniques include logistic regression and discriminant analysis, both of which make the assumptions that the relationships between the input and output variables are linear in nature, the data is normally distributed, and the variables are not correlated and are independent of each other. The questionable nature of these assumptions has led to the shift toward machine-learning techniques.
• Neural networks. These are among the most popular machine-learning techniques that can be used for classification-type problems. A detailed description of this technique is provided in Chapter 6.
