Developing a Modern Data Architecture Overview White Paper
For this homework assignment, you are assuming the role of a consultant at a “Big 4” firm (KPMG, EY, Deloitte, PwC). Your client, Farmer Consulting, is asking for a white paper discussing the key points, benefits, and components of a modern data architecture. Farmer Consulting is “behind the times” in its infrastructure and needs to move towards a modern data architecture. You recently attended a conference and saw a great presentation on this topic, and you have a copy of the deck (attached below) that you believe should be the basis of your white paper. Your assignment is to use the lecture videos, notes, and presentation below to write a persuasive, informative, and action-oriented white paper for your client. The white paper should include an executive summary highlighting the key takeaways and focus points that will keep your client excited about reading the paper, followed by a structured, well-flowing paper that will inform their opinion on how to build a modern data architecture.
DEVELOPING A
MODERN ENTERPRISE
DATA STRATEGY
Edd Wilder-James, Scott Kurth
March 2017
© 2017 SILICON VALLEY DATA SCIENCE LLC. ALL RIGHTS RESERVED. @SVDataScience
TODAY’S SCHEDULE
Introduction
Why Have a Data Strategy?
Connecting Data with the Business
Understanding Data Gaps
The Data Platform Architecture
Break
Identifying Strategic Workloads
The Chief Data Officer
The Experimental Enterprise
INTRODUCTION
To view SVDS speakers and scheduling,
or to receive a copy of our slides, go to:
www.svds.com/StrataCA2017
Silicon Valley Data Science is a boutique
consulting firm focused on transforming
your business through data science and
engineering.
WE DO DATA RIGHT
• We work in cross-functional teams made up of data
scientists, engineers, and solutions architects.
• We combine enterprise know-how with custom
methods derived from Silicon Valley best practices.
• We use an Agile Software Development approach to
make rapid progress against difficult problems that
require flexibility.
• We focus on delivering business value as early as
possible, then iterating toward the larger goal.
OUR SERVICES
DATA
STRATEGY
AGILE
ENGINEERING
AGILE
DATA SCIENCE
ARCHITECTURE
Supports investigative work and
builds a solid layer for production.
Conducts experiments and responds
to the changing environment.
Makes foundational infrastructure
readily accessible.
THE EXPERIMENTAL ENTERPRISE
THE DATA VALUE CHAIN
DRAW VALUE FROM YOUR STRATEGIC DATA
ASSETS
DISCOVER INGEST PROCESS PERSIST INTEGRATE ANALYZE EXPOSE
WHAT’S ON YOUR MIND?
What is preventing your organization from
realizing its vision?
WHY HAVE A
DATA STRATEGY?
DATA STRATEGY
is not for the faint of heart*
* Creating an Enterprise Data Strategy by Wayne Eckerson
http://www.enterprisemanagement360.com/white_paper/creating-an-enterprise-data-strategy/
The alternative is to treat
data as a cost of business, to
be minimized.
Data must serve the
strategic imperatives of a
business: the key strategic
aspirations that define the
future vision for an
organization.
IS THERE AN
ALTERNATIVE?
A modern data strategy is a
roadmap to enable data-
driven decision-making and
applications that helps an
enterprise achieve its
strategic imperatives.
An effective data strategy
helps an enterprise make
technology choices,
grounded in business
priorities, to get the most
value from their data.
IS THERE AN
ALTERNATIVE?
CONNECTING TECHNOLOGY AND
BUSINESS VALUE
If you find that:
• you can’t articulate how the cost of your data
systems relates to the benefits to your business, or
• you can’t articulate how your technology philosophy
enables your business aspirations
then your organization would almost certainly benefit
from a data strategy.
Poll:
• Does the technology
leadership in your
organization prioritize
investments to meet the
ambitions of the business?
• Can your organization
clearly articulate the
business impact of the
data and technology
investments it makes?
ARTICULATING THE
BUSINESS IMPACT OF
DATA & TECHNOLOGY
CONNECTING DATA
WITH THE BUSINESS
CLEAN VALIDATE CONTROL PROTECT
CONVENTIONAL DATA STRATEGY
“WHAT YOU DO TO DATA”
CONVENTIONAL WISDOM:
10 THINGS A DATA STRATEGY SHOULD INCLUDE*
1. What data should be collected?
2. How long should data be kept?
3. Where should the data be
stored?
4. How will data privacy and
security be managed?
5. From where can data be
accessed?
6. What data can be displayed?
7. What level of detail should be
retained?
8. Who is responsible for the data
(governance)?
9. How is data integrated?
10. How will data be distributed
(virtualization?)
* 10 Key Elements of your Data Strategy by Mike Schiff
http://www.tdwi.org//Articles/2012/01/17/10-Elements-Data-Strategy.aspx?Page=1
MODERN DATA STRATEGY
“WHAT YOU DO WITH DATA”
TARGET VIP CUSTOMERS ATTRACT NEW CUSTOMERS
AUTOMATE
A NEW ORTHODOXY?
FOUR PRINCIPLES OF A
SUCCESSFUL DATA STRATEGY*
1. How does data generate value?
2. What are our critical data assets?
3. What is our data ecosystem?
4. How do we govern data?
* The 4 Principles of a Successful Data Strategy by Paul Barth
http://www.cioupdate.com/insights/article.php/3936706/The-4-Principles-of-a-Successful-Data-Strategy.htm
EDW
Governance
Security
NOT ALL DATA IS EQUAL
Conventional data strategy
EDW
Governance
Security
NOT ALL DATA IS EQUAL
Modern data strategy
WHAT IS A DATA STRATEGY?
Existing data &
technology
Possible data &
technology
Business
strategic
ambitions
Constraints Priorities
Roadmap of
investments
Tools to update
and assess
roadmap
Plan to update
capabilities
Modern Role of Data:
Represents the new role data
and analytics play in the
enterprise.
Outcomes, not Operations: A
strategic notion of maturity
should begin with value
creation before addressing
underlying operational
processes.
Transforming Pragmatically:
Changes are grounded in the
holistic view of the future
state of your enterprise.
A NEW NOTION OF
MATURITY
An organization’s ability to
derive value from its data
defines its maturity.
NEW STAGES, NEW DIMENSIONS
ASSETS
CULTURE
DECISIONS
OUTCOMES
Illustrative
Not just the technology!
• People
• Processes
• Systems
DIMENSIONS OF
DATA MATURITY
CURIOUS WHERE YOU FALL?
ASSETS
CULTURE
DECISIONS
OUTCOMES
Illustrative
Maturity Mini-Assessment
• 20Q survey (5-10 min)
• Identifies your stage and provides
general recommendations
• Creates baseline for future
performance and growth
dmm.svds.com
• Infrastructure is holding
back growth
• Infrastructure is holding
back development
• Analog to digital
transformation
• Changing business models
• Unifying fragmented
offerings
YOU NEED A DATA
STRATEGY WHEN…
BEGIN WITH THE BUSINESS
• First understand what drives your business
• Then make the leap from strategy to tactics
Technologists: This can’t be done without the business
leaders in the room
Business Leaders: This can’t be done without the
technologists in the room
Understand the strategic
imperatives of your
organization:
• Annual report
• Investor updates
• Talk to leadership
STRATEGIC
IMPERATIVES
Break down the strategic
imperatives to make them
tangible, achievable, and
measurable. These become
your business objectives.
Business objectives provide
the guide for many other
analyses in building your data
strategy.
BUSINESS
OBJECTIVES
REAL ESTATE MARKETPLACE: ZILLOW
Business Objectives
• Build and maintain best algorithms for pricing
• Use the hedonic pricing method to incorporate multiple attributes
and ‘nearest neighbors’ to create an accurate Zestimate®
• Deploy sophisticated and adaptive models at scale (over 110
million homes) and at a timely interval (3 times / week)
• Use scalable infrastructure (cloud) for rapid analysis
• Build industry’s best real estate data sets
• Increase completeness of data by including public data sets such
as construction listings, foreclosure listings, and market context
• Capture unique data with customer reviews and feedback from
real-estate firms
• Manage scale of 110 million properties
and growing
Strategic Imperatives
• Provide products and
services to help
consumers with every
stage of home ownership
– buying, selling, renting,
borrowing, and
remodeling
• Generate more
subscription and ad
revenue
• Drive more unique users
to marketplace
• Become leading real
estate and home-related
information marketplace
on mobile and web
NOTE: Zillow is not an SVDS client.
HEALTH PROVIDER:
KAISER PERMANENTE
Business Objectives
• Increase data sharing with extended care teams
through secure electronic health record access
• Provide quicker, better diagnoses through evidence-
based medicine techniques
• Provide mobile access to scheduling, pharmacy
interactions, and other related services
• Improve member satisfaction by analyzing web and
mobile user interactions, behavior, and feedback
data
• Share access to knowledge, innovation, and
population data with the public and other health care
leaders
Strategic Imperatives
• Provide seamless,
personalized care
through an integrated
team of care providers
• Enable members to
manage their own care
through easy-to-use
channels
• Transform care and
improve outcomes
through investments in
research and innovation
NOTE: Kaiser Permanente is not an SVDS client.
REAL ESTATE MARKETPLACE: ZILLOW
STRATEGIC IMPERATIVES
• Provide products and services to help consumers
with every stage of home ownership – buying, selling,
renting, borrowing, and remodeling
• Generate more subscription and ad revenue
• Drive more unique users to marketplace
• Become leading real estate and home-related
information marketplace on mobile and web
NOTE: Zillow is not an SVDS client.
REAL ESTATE MARKETPLACE: ZILLOW
BUSINESS OBJECTIVES
1. Build and maintain best algorithms for pricing
• Use the hedonic pricing method to incorporate multiple
attributes and ‘nearest neighbors’ to create an accurate
Zestimate®
• Deploy sophisticated and adaptive models at scale
(over 110 million homes) and at a timely interval (3
times / week)
• Use scalable infrastructure (cloud) for rapid analysis
NOTE: Zillow is not an SVDS client.
REAL ESTATE MARKETPLACE: ZILLOW
BUSINESS OBJECTIVES
2. Build industry’s best real estate data sets
• Increase completeness of data by including public
data sets such as construction listings, foreclosure
listings, and market context
• Capture unique data with customer reviews and
feedback from real-estate firms
• Manage scale of 110 million properties
and growing
NOTE: Zillow is not an SVDS client.
HEALTH PROVIDER: KAISER PERMANENTE
STRATEGIC IMPERATIVES
• Provide seamless, personalized care through an
integrated team of care providers
• Enable members to manage their own care through
easy-to-use channels
• Transform care and improve outcomes through
investments in research and innovation
NOTE: Kaiser Permanente is not an SVDS client.
HEALTH PROVIDER: KAISER PERMANENTE
BUSINESS OBJECTIVES
• Increase data sharing with extended care teams
through secure electronic health record access
• Provide quicker, better diagnoses through evidence-
based medicine techniques
• Provide mobile access to scheduling, pharmacy
interactions, and other related services
NOTE: Kaiser Permanente is not an SVDS client.
HEALTH PROVIDER: KAISER PERMANENTE
BUSINESS OBJECTIVES
• Improve member satisfaction by analyzing web and
mobile user interactions, behavior, and feedback
data
• Share access to knowledge, innovation, and
population data with the public and other health
care leaders
NOTE: Kaiser Permanente is not an SVDS client.
UNDERSTANDING
DATA GAPS
None of these questions
make sense unless you ask:
For what?
Commonly-asked questions:
• Do I have gaps in my data?
• How good is my data?
• Is my data clean enough?
NO ONE’S DATA IS
PERFECT
FOR WHAT?
• Do I have gaps in my data?
• How good is my data?
• Is my data clean enough?
• Do I have gaps in my data?
…for understanding customer purchase behavior
• How good is my data?
…for predicting quarterly sales
• Is my data clean enough?
…for automating production
• What are you trying to
achieve as a business
[with data]?
These are your business
objectives.
• How do you plan to
achieve it [with data]?
These are your use cases.
UNDERSTAND
YOUR BUSINESS
GOALS
UNDERSTAND YOUR AUDIENCE
Who is going to use this analysis and how?
• CDO? Heads of Business Units? Data Science Directors?
DBAs?
• Project assessment? Operational dashboard?
Continuous improvement plan?
Understanding stakeholders and expectations will
dictate the level of technical analysis required.
UNDERSTAND YOUR AUDIENCE
What are the dimensions of requirements that matter
to your audience?
• For a technical application, it might be depth, breadth,
latency, frequency.
• For an executive perspective, it might be higher-order
requirements like ease of integration or coverage.
What are the questions your audience needs
answered? Select the dimensions that provide visibility
into those questions.
• Start with an effective
catalog of your data.
• Organize the data to be
effective. Think about how
data is produced AND how
it gets used in your
organization.
• By data source?
• By entity?
• By organization?
• By data owner?
UNDERSTAND
YOUR DATA
LINK IT ALL TOGETHER
Business
Objectives
Use Cases
Requirements
Data
VISUALIZE YOUR GAPS
SO… WHAT IS A “GAP”?
Two schools of thought:
• Purists: If a requirement isn’t met, it’s a gap.
• Pragmatists: If you can still get the job done,
it isn’t a gap.
Both views can be valuable ways of looking at your
analysis.
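The two schools of thought can be compared side by side on the same requirements list. The sketch below is a minimal illustration, not the presenters' method: the `Requirement` fields, the `blocking` flag, and the sample requirements are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    """One data requirement for a use case (all values are illustrative)."""
    use_case: str
    name: str
    met: bool       # does the current data satisfy the requirement?
    blocking: bool  # if unmet, does it stop the use case from working?


def purist_gaps(requirements):
    """Purist view: every unmet requirement is a gap."""
    return [r for r in requirements if not r.met]


def pragmatist_gaps(requirements):
    """Pragmatist view: only unmet requirements that block the job are gaps."""
    return [r for r in requirements if not r.met and r.blocking]


# Hypothetical requirements, loosely based on the "for what?" examples.
requirements = [
    Requirement("predict quarterly sales", "3 years of order history", True, True),
    Requirement("predict quarterly sales", "daily refresh", False, False),
    Requirement("automate production", "sensor data under 1s latency", False, True),
]

purist = [r.name for r in purist_gaps(requirements)]
pragmatist = [r.name for r in pragmatist_gaps(requirements)]
```

Here the purist list holds both unmet requirements, while the pragmatist list holds only the one that stops a use case from working. Reporting both keeps the full backlog visible while pointing at what to fix first.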
THE DATA PLATFORM
ARCHITECTURE
WHY BIG DATA?
1. New Capabilities
2. Economic Scalability
Edmunds.com wanted to reduce time-
to-market by speeding creation of
attribute data for new car models.
We developed a new capability to
automatically extract vehicle features
from specification guides and
categorize the features into
appropriate vehicle classes.
DATA
PLATFORMS
FOR NEW
CAPABILITIES
Existing revenue streams:
• Ads
• Price quotes (leads)
Shopping is the focus:
• Need real-time
inventory
• Accurately described
VINs
DATA PLATFORMS
FOR ECONOMIC
SCALABILITY
at NetApp
NOTE: NetApp is not an SVDS client.
http://blogs.wsj.com/cio/2012/06/12/netapp-cio-uses-big-data-to-assess-product-performance
UP VS. OUT — SAAS EDITION
[Chart: money ($, €, ¥, £) plotted against number of users. Series: revenue, cost to serve, scale-out cost. Regions: profit, loss.]
UP VS. OUT — ENTERPRISE EDITION
[Chart: money ($, €, ¥, £) plotted against data resource usage. Series: scale-up cost, scale-out cost. Use cases UC1 through UC5 are marked on the chart.]
BIG DATA
… it’s really about agility
• Linear scale-out cost
• Opex vs. capex
• Ease of purchase
BUYING AGILITY
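The difference between stepwise scale-up cost and roughly linear scale-out cost can be sketched numerically. Every capacity and price below is a made-up assumption chosen only to show the shape of the two curves, not a real vendor figure.

```python
import math


def scale_up_cost(usage, server_capacity=100, server_price=50_000):
    """Step cost: capacity arrives in large, expensive increments (assumed figures)."""
    return math.ceil(usage / server_capacity) * server_price


def scale_out_cost(usage, node_capacity=10, node_price=5_000):
    """Near-linear cost: small commodity nodes are added as usage grows (assumed figures)."""
    return math.ceil(usage / node_capacity) * node_price


# Cost of serving a few hypothetical usage levels under each model.
comparison = {
    usage: (scale_up_cost(usage), scale_out_cost(usage))
    for usage in (5, 40, 100, 105)
}
```

Under these assumptions a small use case (usage 5) costs a tenth as much on the scale-out model, and crossing a capacity boundary (usage 105) doubles the scale-up bill while the scale-out bill rises by one node.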
Scale-out systems move us from managing scarcity to
promoting utility.
• Architectural factors
• Schema on read
• Rapid deployment
• Mirror production setup
• Executes faster
• Programmer factors
• Fun to program
• Concision
• Easier to test
• Faster to write
DEVELOPMENT
AGILITY
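"Schema on read" means raw records are stored as they arrive and a structure is applied only when they are read. A minimal sketch, assuming newline-delimited JSON events with hypothetical field names (`user`, `amount`, `channel`):

```python
import json

# Raw events are kept exactly as they arrived; no table schema was declared up front.
RAW_EVENTS = [
    '{"user": "a1", "amount": "19.99", "channel": "web"}',
    '{"user": "b2", "amount": 5, "coupon": "SPRING"}',
    '{"user": "c3"}',
]


def read_purchases(lines):
    """Apply a schema at read time: pick fields, coerce types, default the rest."""
    for line in lines:
        record = json.loads(line)
        yield {
            "user": str(record["user"]),
            "amount": float(record.get("amount", 0.0)),
            "channel": record.get("channel", "unknown"),
        }


purchases = list(read_purchases(RAW_EVENTS))
total = sum(p["amount"] for p in purchases)
```

A different analysis could read the same raw lines with a different schema, for example one that keeps `coupon`, without reloading or migrating the stored data.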
WHAT IS DOCKER?
• Container technology: bundles every part of an
application
• Provides isolation for each application without the
overhead of running a virtual machine
• Ships only the parts that are needed—leaves out the
operating system
WHY SHOULD BUSINESS CARE?
• Better use of server resources than virtual machines
• A fast and reliable way of deploying applications
• It’s the ideal packaging mechanism for scale-out
distributed systems
• Easy for developers to work in an environment
identical to production
• Sharing containers leads to innovation
WHAT IS APACHE KAFKA?
• Scale-out fault-tolerant messaging system
• Comes from LinkedIn
• Supported by Confluent
USE CASES
• Stream processing
• Log aggregation
• Creating decoupled evented architectures
WHY SHOULD BUSINESS CARE?
• Scalability in a critical area of distributed applications
• Online reliability, compared to alternatives
• Will be a core building block of distributed data
architecture
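The decoupling idea behind a log-based messaging system can be shown with a toy in-memory model. This is not the Kafka client API; `TopicLog`, `publish`, and `poll` are hypothetical names, and the sketch leaves out partitioning, replication, and persistence.

```python
from collections import defaultdict


class TopicLog:
    """Toy append-only log; NOT the Kafka API, only the idea behind it."""

    def __init__(self):
        self._messages = []
        self._offsets = defaultdict(int)  # consumer group -> next index to read

    def publish(self, message):
        """Producers append and never need to know who will read."""
        self._messages.append(message)

    def poll(self, group, max_messages=10):
        """Each consumer group reads from its own offset, at its own pace."""
        start = self._offsets[group]
        batch = self._messages[start:start + max_messages]
        self._offsets[group] = start + len(batch)
        return batch


log = TopicLog()
for event in ("order_created", "order_paid", "order_shipped"):
    log.publish(event)

billing = log.poll("billing", max_messages=2)
analytics = log.poll("analytics")
billing_rest = log.poll("billing")
```

The billing and analytics consumers see the same events independently, so a new consumer can be added later without changing the producer.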
WHAT IS APACHE SPARK?
• In-memory distributed computing platform
• Comes from Berkeley AMPlab
• In production with early adopters, now integral to
every commercial Hadoop distribution
• Doesn’t need Hadoop, but runs easily on top
USE CASES
• Managing a major retailer’s inventory across a
diverse network of entities in near real time
• Managing and processing event streams for online
gaming
• Supporting data science initiatives across massive
data sets at a media analytics company
WHY SHOULD BUSINESS CARE?
• Enables use cases Hadoop didn’t provide, all in one
platform
• streaming, interactive analytics, machine learning,
graphs
• Fast
• Iteration time down, more productive
• Use existing cluster investment
• Sits on HDFS, can run under YARN
(or use Amazon S3, or Cassandra)
WHY SHOULD BUSINESS CARE?
• SparkSQL
• Use SQL skills and tools, e.g. Tableau
• Dataframes integrate external data sources into one
context: RDBMS, Hive, JSON…
• Developer-friendly
• Concise and fluid to program
• Language integration: Scala, R, Python, Java
WHAT ARE NOTEBOOKS?
• Interactive documents that contain a program and
its output
• Long history: Mathematica
• Particularly successful with data science
• Projects to watch
• Jupyter — https://jupyter.org/
• Apache Zeppelin —
https://zeppelin.incubator.apache.org/
WHY SHOULD BUSINESS CARE?
• Easy collaboration and sharing of data science
• Think “Docker for analysis”
• Easy access to data and compute resource
• A building block for more self-service analytical
capabilities
Commercial version of Notebooks + Spark is the
Databricks Cloud
ENTERPRISE DATA
ARCHITECTURE
Towards a production
DATA PLATFORM
[Diagram of the data platform. Labeled components:]
• Data Acquisition: Internal, External
• Data Ingest
• Data Repository
• Persistence
• Offline Processing
• Real-Time Processing
• Batch Processing
• Analytics
• Low Latency Access
• Data Services
• External Systems
• Data Management: Security, Operations, Data Quality, Meta Data Management and Data Lineage
CHOICES: TOOLS
CHOICES: DATABASES
Technical use cases by database type:
• Graph: social networks, ontologies, knowledge, property
• Document: logging, document archive, web content
• Key-Value: shopping cart, session data
• Columnar: sensors, network devices, Internet of Things
Graph Document Key-Value Columnar
Social networks
Ontologies
Knowledge, Property
Logging
Document archive
Web content
Shopping Cart
Session Data
Sensors
Network devices
Internet of Things
CHOICES: DATABASES
SPECIALIZED
Graph Document Key-Value Columnar
Social networks
Ontologies
Knowledge, Property
Logging
Document archive
Web content
Shopping Cart
Session Data
Sensors
Network devices
Internet of Things
CHOICES: DATABASES
GENERAL PURPOSE
CHOICES: VELOCITY
SVDS R&D TRAINS
Batch:
• Using FFT-transformed
frequency data, identify the
train based on the
fundamental frequencies of
its whistle.
• Construct the decision tree
for the train classifier based on
minimum and maximum
fundamental frequencies.
Real-Time:
• Apply FFT to audio signal
• Extract min and max
fundamental frequencies
• Classify the train into local
or express
• Send data to the Event
Detector to alert the APP
• Store results in HBase
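The real-time steps (apply an FFT, extract a frequency feature, classify) can be sketched in a few lines. This is a simplified stand-in for the SVDS pipeline: it uses a single dominant frequency and one threshold instead of minimum and maximum fundamental frequencies fed to a decision tree, and the sample rate, 440 Hz test tone, and 400 Hz threshold are assumptions for illustration only.

```python
import cmath
import math


def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dominant_frequency(samples, sample_rate):
    """Return the frequency (Hz) of the strongest non-DC bin."""
    spectrum = fft(samples)
    half = len(samples) // 2
    peak_bin = max(range(1, half), key=lambda k: abs(spectrum[k]))
    return peak_bin * sample_rate / len(samples)


def classify_train(frequency_hz, threshold_hz=400.0):
    """Hypothetical rule: whistles above the threshold are 'express'."""
    return "express" if frequency_hz > threshold_hz else "local"


SAMPLE_RATE = 2048  # Hz, assumed
N = 1024            # power of two, so each bin is 2 Hz wide
tone = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(N)]
freq = dominant_frequency(tone, SAMPLE_RATE)
label = classify_train(freq)
```

In a real whistle the strongest bin is not always the fundamental, so a production classifier would need a more careful pitch estimate than this peak pick.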
[Amazon] do services because
they’ve come to understand that
it’s the Right Thing. There are
without question pros and cons to
the SOA approach, and some of
the cons are pretty long. But
overall it’s the right thing because
SOA-driven design enables
Platforms. …
You wouldn’t really think that an
online bookstore needs to be an
extensible, programmable
platform. Would you?
+Steve Yegge
CHOICES: SERVICES
https://plus.google.com/112678702228711889851/posts/eVeouesvaVX
CHOICES: DATA RESILIENCY
Hard Failure: If the data
source is broken, so is
the app.
Stovepipe: One-to-one
relationship from data
source to product.
Multi-sourced:
Redundancy of
overlapping data
sources makes your
products more
resilient.
Graceful
Degradation: If a data
source breaks, there is
a backup and your app
continues to function.
Production data
services abstract the
probabilistic
integration of
overlapping data
sources. We call this
model a Data Mesh.
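Graceful degradation across overlapping sources can be sketched as a priority-ordered fallback. This shows only the simplest form of the idea, not the probabilistic integration described for the Data Mesh; the source names, the `fetch_with_fallback` helper, and the inventory data are hypothetical.

```python
def fetch_with_fallback(sources, key):
    """Try overlapping sources in priority order; return the first answer."""
    errors = []
    for name, fetch in sources:
        try:
            return name, fetch(key)
        except Exception as exc:  # a broken source must not break the app
            errors.append(f"{name}: {exc}")
    raise LookupError(f"all sources failed for {key!r}: {errors}")


def primary_inventory(sku):
    """Simulated outage of the preferred source."""
    raise ConnectionError("primary feed is down")


def partner_inventory(sku):
    """Backup source with overlapping coverage (made-up data)."""
    return {"sku-1": 12}[sku]


SOURCES = [("primary", primary_inventory), ("partner", partner_inventory)]
used, quantity = fetch_with_fallback(SOURCES, "sku-1")
```

With a single source this behaves like the hard-failure case: the `LookupError` surfaces as soon as that one source breaks.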
CHOICES: EXTERNAL
SYSTEMS
Applications, visualization, business
intelligence
✓ Incremental revenue
✓ Time to market
✓ Economically viable
implementation
✓ Cost avoidance
✓ Brand benefit
✓ Ecosystem friendliness
DEFINING
SUCCESS
BREAK
IDENTIFYING
STRATEGIC
WORKLOADS
HOW SVDS DOES DATA STRATEGY
• We work with your stakeholders to analyze and articulate a data
strategy.
• The data strategy provides an actionable roadmap that generates
immediate value and serves as the foundation for future
capability investments.
• We work to understand your current business and technology
landscapes in order to unlock untapped business opportunities.
• Our collaborative approach ensures that your business, product,
and technology teams become effective advocates within your
organization.
BUSINESS MODEL
TRANSFORMATION
PRODUCT RESEARCH &
RECOMMENDATION COMPANY
A product research and recommendation
company is transforming their core business
from content and information services to a
referrer of high-value transactions to partners.
SVDS devised a data strategy that enables new
analytical capabilities core to their retail
ambitions, addressing critical accuracy and
timeliness issues with unstructured data.
Based on this data strategy, they are building a
solution for near real-time product inventory
that increases their value to partners in a
complex, multi-tier market.
PERSONALIZED
USER EXPERIENCE
MEDIA & ENTERTAINMENT COMPANY
A media and entertainment company seeks to
deliver personalized content directly to users on
digital entertainment devices.
SVDS developed a data strategy and architecture
that enables real-time data ingestion, deeper
customer insight, and highly-personalized
content recommendations.
The data strategy and architecture design now
serve as the foundation for iterative, new
product development and guide technology
investments and acquisitions.
ACTION PLAN
& ROADMAP
OUR METHOD FOR DATA STRATEGY
IDENTIFY
STRATEGIC
IMPERATIVES
DEFINE
BUSINESS
OBJECTIVES
DEFINE DATA
REQUIREMENTS
IDENTIFY GAPS
IN CURRENT
SYSTEMS &
TECHNOLOGY
MAP BUSINESS
OBJECTIVES TO
USE CASES
RATIONALIZE USE CASES
INTO WORKLOADS
IDENTIFY YOUR STRATEGIC WORKLOADS
[Diagram: Use Cases 1, 2, and 3 each break down into workloads. Workload B appears three times, Workload C twice, and Workloads A and D once each.]
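One way to read the workload diagram is that workloads shared by many use cases are the strategic ones. The sketch below counts that sharing. The workload letters mirror the slide's labels, but the assignment of workloads to use cases is an assumption, since the flattened diagram does not preserve it.

```python
from collections import Counter

# Hypothetical mapping; which use case needs which workload is assumed
# for illustration and is not recoverable from the slide text.
USE_CASE_WORKLOADS = {
    "use case 1": ["A", "B", "C"],
    "use case 2": ["B", "C"],
    "use case 3": ["B", "D"],
}


def rank_workloads(mapping):
    """Count how many use cases depend on each workload, most shared first."""
    counts = Counter(w for workloads in mapping.values() for w in set(workloads))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_workloads(USE_CASE_WORKLOADS)
```

Under this assumed mapping, workload B supports all three use cases, so investing in it first would unblock the most business value.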
AN EXAMPLE
DATA STRATEGY FOR THE DOGS
NOTE: PetSmart is not an SVDS client. This is a fictional example based on public information.
http://risnews.edgl.com/retail-news/PetSmart-Leverages-Analytics-for-Personalized-Experience91783
AN EXAMPLE
DATA STRATEGY FOR THE DOGS
“We’ve been investing in new
capabilities to help us capture
and use customer and pet data,
and this year, we will deliver on
new methods to use this data to
drive growth.”
— David Lenhardt
PetSmart CEO
STRATEGIC IMPERATIVES
STRATEGIC IMPERATIVES > BUSINESS OBJECTIVES > USE CASES > WORKLOAD
Our strategy:
“To be the preferred
provider for the
lifetime needs of
pets.”
Connect with pet parents in
a personalized way
Attract and retain our most
valuable customers
Provide innovative products
& services at fair prices
Drive consistent execution
in our stores
AN EXAMPLE
Connect with pet parents
in a personalized way
Deliver personalized
recommendations and offers
Recommendation
Engine
Recommend new pet
products based on past
purchases at point of sale
Recommend upcoming
store/community events
based on customer
preferences
BUSINESS OBJECTIVES
Illustrative
Connect with pet parents in
a personalized way
1. Learn from consumer interactions
2. Optimize consumer journeys based on insights
3. Deliver personalized content to customers
. . .
USE CASES
Deliver personalized
content to customers
1. Identify customers
2. Profile behaviors
3. Understand context
4. Anticipate behaviors
5. Optimize personalization
. . .
Illustrative
WORKLOADS
Data Value Chain | Example Workloads
Acquire | Capture mobile app transactions; Accessing streaming web activity data
Ingest | Flexible data ingestion; Ingest unstructured data
Process | Data validation; Omnichannel data integration
Persist | Heterogeneous data storage; Scalable data storage
Analyze | Probabilistic data integration; Predictive modeling
Expose | Service based data access; Interactive visualization
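The chain from strategic imperative down to workload can be kept traceable with a small data model. The following is a minimal sketch, not the presenters' tooling: the class names and the trace() helper are hypothetical, while the example imperative, objective, use case, and workloads are taken from the slides.

```python
from dataclasses import dataclass, field

# Stages of the data value chain, in the order listed on the slide.
VALUE_CHAIN = ["Acquire", "Ingest", "Process", "Persist", "Analyze", "Expose"]


@dataclass
class Workload:
    name: str
    stage: str  # must be one of VALUE_CHAIN

    def __post_init__(self):
        if self.stage not in VALUE_CHAIN:
            raise ValueError(f"unknown value-chain stage: {self.stage}")


@dataclass
class UseCase:
    name: str
    workloads: list = field(default_factory=list)


@dataclass
class BusinessObjective:
    name: str
    use_cases: list = field(default_factory=list)


@dataclass
class StrategicImperative:
    name: str
    objectives: list = field(default_factory=list)


def trace(imperative):
    """Yield one (imperative, objective, use case, workload, stage) row per workload."""
    for objective in imperative.objectives:
        for use_case in objective.use_cases:
            for workload in use_case.workloads:
                yield (imperative.name, objective.name, use_case.name,
                       workload.name, workload.stage)


imperative = StrategicImperative(
    "Connect with pet parents in a personalized way",
    [BusinessObjective(
        "Deliver personalized content to customers",
        [UseCase("Identify customers", [
            Workload("Flexible data ingestion", "Ingest"),
            Workload("Data validation", "Process"),
        ])],
    )],
)

rows = list(trace(imperative))
for row in rows:
    print(" -> ".join(row))
```

Each printed row links one technical workload back to the business reason it exists, which is the point of the four-level framework.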
TECHNICAL WORKLOADS (Illustrative)
1. Identify customers | Technical Workload
Customer data (Acquire, Ingest, Persist) | Acquire multiple data sources & formats; Flexible data ingestion; Flexible & scalable data storage and processing
Identity resolution | Probabilistic data integration
Data cleansing | Data validation
Householding | Probabilistic data integration
Relationship context | Detailed views of entities
Life-time Value | Feature engineering
TECHNICAL WORKLOADS (Illustrative)
2. Profile behaviors | Technical Workload
360 degree view of customer | Detailed views of entities
Views of historical transactions | Time series analysis
Determination of ‘favorites’ | Predicting customer behavior
Map to archetype | Stream processing
Evaluate previously unseen transactions and classify | Stream processing
Update archetypes | Feature extraction; Analyze customer behavior
TECHNICAL WORKLOADS (Illustrative)
3. Understand context | Technical Workload
Characterize temporal customer behavior | Feature engineering; Analyze customer behavior
Determine goal of next interaction | Predictive modeling
Categorize content needs | Predictive modeling
TECHNICAL WORKLOADS (Illustrative)
4. Anticipate behaviors | Technical Workload
Score product offers with likelihood to respond | Integrate internal systems; Service based data access
Score content options with likelihood to respond | Integrate internal systems; Service based data access
Identify next best action | Third party structured data integration; Business rules execution
TECHNICAL WORKLOADS (Illustrative)
5. Optimize personalization | Technical Workload
Apply business rules, constraints to personalization options | Business rules execution
Select optimal personalization to achieve goal | Optimization execution
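Several technical workloads recur across the five use cases, which is where a shared platform capability pays off. As a rough sketch of that rationalization step, the snippet below lists the workloads that serve more than one use case. The mapping is transcribed from the illustrative slides; the function name shared_workloads is hypothetical, and "Business rule execution" is treated as the same workload as "Business rules execution".

```python
from collections import defaultdict

# Technical workloads per use case, transcribed from the illustrative slides.
# Assumption: "Business rule execution" and "Business rules execution"
# name the same workload, so both are written as "Business rules execution".
USE_CASE_WORKLOADS = {
    "Identify customers": {
        "Acquire multiple data sources & formats",
        "Flexible data ingestion",
        "Flexible & scalable data storage and processing",
        "Probabilistic data integration",
        "Data validation",
        "Detailed views of entities",
        "Feature engineering",
    },
    "Profile behaviors": {
        "Detailed views of entities",
        "Time series analysis",
        "Predicting customer behavior",
        "Stream processing",
        "Feature extraction",
        "Analyze customer behavior",
    },
    "Understand context": {
        "Feature engineering",
        "Analyze customer behavior",
        "Predictive modeling",
    },
    "Anticipate behaviors": {
        "Integrate internal systems",
        "Service based data access",
        "Third party structured data integration",
        "Business rules execution",
    },
    "Optimize personalization": {
        "Business rules execution",
        "Optimization execution",
    },
}


def shared_workloads(mapping):
    """Return {workload: sorted use cases} for workloads used by two or more use cases."""
    users = defaultdict(set)
    for use_case, workloads in mapping.items():
        for workload in workloads:
            users[workload].add(use_case)
    return {w: sorted(u) for w, u in users.items() if len(u) > 1}


shared = shared_workloads(USE_CASE_WORKLOADS)
for workload in sorted(shared):
    print(f"{workload}: {', '.join(shared[workload])}")
```

Workloads that appear in this list are natural candidates to build once and reuse, in line with the checklist item "Exploit patterns and reuse".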
PRIORITIES
DIMENSIONS
OVERCOME YOUR ASSUMPTIONS
FOCUS ON THE VALUE
DEVELOPMENT HORIZONS
Illustrative
TECHNICAL WORKLOAD PRIORITIZATION (Illustrative)
Technical Workload | Strategic Value | Technical Feasibility | Accessibility of Required Skills | Architectural Fit | Prod Roll-out Effort
Real time recommendations | 10
Omnichannel data integration | 10
Predictive modeling | 9
Unstructured text analytics | 8
Behavioral analytics | 7
Data quality monitoring | 6
Pattern recognition | 5
Heterogeneous data storage | 3
Data ingestion | 3
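One way to turn the five prioritization criteria into a ranking is a weighted score. This is only a sketch under stated assumptions: the slide does not give a formula, so the weights, the 1 to 10 scale, the inversion of roll-out effort, and every score below are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical weights for the five criteria on the slide; they sum to 1.
WEIGHTS = {
    "strategic_value": 0.35,
    "technical_feasibility": 0.20,
    "skills_accessibility": 0.15,
    "architectural_fit": 0.15,
    "rollout_effort": 0.15,
}


@dataclass(frozen=True)
class WorkloadScore:
    name: str
    strategic_value: int        # 1 (low) to 10 (high)
    technical_feasibility: int  # 1 (low) to 10 (high)
    skills_accessibility: int   # 1 (low) to 10 (high)
    architectural_fit: int      # 1 (low) to 10 (high)
    rollout_effort: int         # 1 (little effort) to 10 (heavy effort)

    def priority(self):
        """Weighted sum of the criteria; effort is inverted so less effort scores higher."""
        total = 0.0
        for criterion, weight in WEIGHTS.items():
            score = getattr(self, criterion)
            if criterion == "rollout_effort":
                score = 11 - score
            total += weight * score
        return total


def rank(workloads):
    """Return workloads ordered from highest to lowest priority."""
    return sorted(workloads, key=lambda w: w.priority(), reverse=True)


# Hypothetical scores for three of the workloads named on the slide.
candidates = [
    WorkloadScore("Real time recommendations", 10, 5, 4, 6, 8),
    WorkloadScore("Data quality monitoring", 6, 9, 8, 9, 3),
    WorkloadScore("Data ingestion", 3, 9, 9, 9, 2),
]

for workload in rank(candidates):
    print(f"{workload.name}: {workload.priority():.2f}")
```

With these made-up numbers, a high-value but hard-to-deliver workload ranks below easier foundational work, which illustrates why strategic value alone should not set the order of the roadmap.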
DEFINE YOUR ROADMAP
PROJECT ACTION PLAN (Illustrative)
Plan → Prove → Pilot → Production (Agile Build Process)
We define a project plan to build a specific capability.
For each capability, we describe a project to build
technical workloads that implement use cases that
address high-priority business objectives.
Silicon Valley Data Science employs an agile development
process as we work with our clients from planning and
proofs of concept to pilot implementations and, finally,
full-scale production systems.
PATH FORWARD (Illustrative)
Horizon I, Horizon II, Horizon III, Horizon IV
[Timeline figure with labels: 2-3 months, 5-6 months, 3-4 months, 3-4 months, 0 months]
DEVISING A PROJECT PLAN: INPUTS & APPROACH
• Technical Workload Assessment
• Data Gaps
• Project Roadmaps
• Workload Rationalization
• Development Horizons
LATHER, RINSE, REPEAT
MAKE SURE IT’S FLEXIBLE
• Technology moves incredibly fast, and competitive
landscapes are highly dynamic.
• Your data strategy should be a living document,
revisited often and revised as conditions change.
MAKE SURE IT’S ACTIONABLE
• If it isn’t clear how you’re going to execute your
strategy, then you don’t have the right one.
• Must work within the realm of the possible and
practical.
FROM IDEA TO PRODUCTION
We identify the business goals, distill those into use cases, and then
work in short, iterative cycles to achieve tangible gains.
Plan Prototype Pilot Production
What can we do with data?
MODERNIZING DATA TECHNOLOGY
HEALTH MANAGEMENT COMPANY
Aging data infrastructure and brittle application
integration were inhibiting growth and business
insight for a health management company.
Their data strategy focused on creating a
concrete roadmap for migrating to a new data
platform so that technology and infrastructure
are no longer a barrier to growth and
transparency.
Based on this data strategy, they are building a
new data platform in stages that allows them to
add new products and services to capture more
market opportunity.
Case Study: Data Strategy
Major Pharmaceutical Company
Defined a data strategy that will help enable business growth and
expansion into new markets
Challenge
• Ongoing need to improve
discovery and better predict new
targets for drug development
• Difficulty integrating new data
sources into identification &
discovery processes
• Inability to connect business
strategy & aims with specific,
tangible projects
Solution
• SVDS devised a data strategy with a
concrete roadmap for migrating to
a new data platform
• Recommended data technology &
architecture which supports highest
value projects
• Outlined cultural, technological,
organizational, and collaboration
challenges & objectives
Results
• Identified specific opportunity areas
to increase GTM efficiency
• Prescribed Common Data and
Analytics Platform for Commercial
and R&D operations
• Recommended projects for
Predictive Modeling & Data
Exploration
DATA STRATEGY CHECKLIST
☐ Identify your business objectives
☐ Go from objectives to tactics
☐ Include all stakeholders in the conversation
☐ Look at how technology can support strategic workloads
☐ Exploit patterns and reuse
☐ Prioritize the possibilities to figure out where to start
☐ Define your roadmap with an end-point in mind
☐ Lather, rinse, repeat
THE CHIEF DATA
OFFICER
DO YOU NEED
EXECUTIVE HELP?
To download a free PDF, go to:
www.svds.com/CDOreport
EMERGENCE OF THE CDO
• Started with heavily regulated industries such as
government and finance
• Now becoming common in “disruptable” industries
such as retail and telecommunications
RESPONSIBILITIES OF THE CDO
Centralization:
• Data from internal silos
• Data from external APIs and real-time streams
• The organization’s priorities
RESPONSIBILITIES OF THE CDO
Evangelization:
• Technical chops, business savvy, and the
diplomacy skills to translate between the two
RESPONSIBILITIES OF THE CDO
Facilitation:
• Coordinate stakeholders across the organization
• Free up resources and lower barriers
• Offer tools and training to help others succeed
CHALLENGES FOR THE CDO
Building technical bridges:
• Working with data in different silos, formats, etc.
Mining for business value:
• “If you don’t have good business questions it
doesn’t matter what kind of technology you
have.” — Joy Bonaguro
UNDERSTANDING THE CDO
“While technology is inevitably involved when working
with data, the defining goal of the CDO is not
technological, but business-oriented. The ideal CDO
exists to drive business value.”
— Julie Steele
DECIDING TO HIRE A CDO
Know why you want one:
• Are you part of a regulated industry?
• Do you need to move from being product-centric
to customer-centric?
• Could you add products or services?
• Could your current processes and outcomes be
optimized even further?
• Are there insights in one part of your company
that could benefit others?
DECIDING TO HIRE A CDO
Look for the right skill set:
• Technical chops
• Business savvy
• Diplomacy and political skills
• Executive-level experience
THE AVAILABILITY GAP
“The spike in demand for Chief Digital Officers has been
felt globally. In Europe, the number of search requests
for this role has risen by almost a third in the last 24
months. The United States has seen the same growth in
half that time.”
— Russell Reynolds Associates
PREPPING FOR SUCCESS
Companies that are eager and prepared for real change
will be the most appealing to qualified CDO candidates.
THE EXPERIMENTAL
ENTERPRISE
“…let’s seek to understand how
the new generation of
technology companies are
doing what they do, what the
broader consequences are for
businesses and the economy.”
– Marc Andreessen
DIGITAL NERVOUS SYSTEM
Data is your business.
[Diagram: Disruptive Change, with labels Cloud Computing, Customer Content, Internet of Things, User Experience, SaaS & Apps, Business Intelligence, Consumer IT, Regulation, Employees, Partners, Contractors, Suppliers, and "?"]
FROM: Innosight Executive Briefing Winter 2012
SILICON VALLEY’S DATA MACHINE
UP VS. OUT
[Chart: cost ($, €, ¥, £) against data resource usage, with a scale-up cost curve and a scale-out cost curve; use cases UC1 through UC5 are marked]
The legacy of big data is business agility.
BUILD FOR EXPERIMENTS
• Make it cheap
• Failure as a feature
• Ask good questions
• Make it quick
• Both learning and adaptation
• Enable the feedback loop
• Don’t break things
• Make operations a platform for innovation
• APIs, platforms, simulation
THE EXPERIMENTAL ENTERPRISE
Supports investigative work and
builds a solid layer for production.
Conducts experiments and responds
to the changing environment.
Makes foundational infrastructure
readily accessible.
LEAD A DATA REVOLUTION
• You can only win with situational awareness
• New architectures offer new opportunities
• Creating data-driven value requires a new approach
• Create an Experimental Enterprise
• Business must lead, and understand the potential of
the technology
To view SVDS speakers and scheduling,
or to receive a copy of our slides, go to:
www.svds.com/StrataCA2017
THANK YOU
Ask how we can help
info@svds.com
Edd Wilder-James (@edd)
Scott Kurth (@ScottWKurth)
March 2017