Data analytics gets a lot of hype.
A lot of that hype is good hype.
The kind that powerful data-analytics initiatives deserve.
But a lot of it is bad hype.
The kind that confuses data-analytics with things it isn’t…
The difference between analytics and AI, ML and DL
- Data analytics:
- A system for discovering, interpreting, and communicating
meaningful patterns in data.
- Artificial intelligence (AI):
- This is a machine’s ability to learn through
problem solving, and improve on that ability over time, by itself, until it
demonstrates ‘intelligent’ behaviours like visual perception and decision-making.
- Machine learning (ML):
- This is a system’s ability to automatically learn from
– and respond to – data without additional programming.
- Deep learning (DL):
- This is about using multiple layers of artificial neural
networks to power complex data initiatives, unearthing insights humans
couldn’t possibly spot on their own.
So let’s talk about what
data analytics really is
- without the frills.
Data analytics is the practice of applying an
algorithmic or mechanical process to derive insights
from data that inform queries set by users.
Here’s what it really is:
Data analytics is how you give the smartest people in
your business the ability to find verifiable answers to
the most important questions.
The kinds of questions that help you run your business well. Things like:
What do my highest
lifetime value customers
have in common?
If I could predict X (the
weather, the traffic, etc)
…how would I change
Do some product managers
have an outsized impact on
our speed to market?
Do we lose customers
gradually or suddenly?
What are the anomalous
behaviors of a fraudster
and how quickly can we
Could certain elements
of our business data
indicate the presence
How would we build on our
prevention policies if we
could detect banking fraud
in real time?
See, analytics for the sake of analytics is meaningless.
You need to be aiming for a
specific business outcome:
a clear goal.
Speaking of specific outcomes:
– here are the three big
things you’ll be able to do
when you embark on a
your business with new product
ideas and market insights
Customer and supplier 360 | Market segmentation |
Recommendation engines | Predictive product development
your products and services
Predictive maintenance | Predictive health | Customer churn
prediction | Financial risk assessment and prediction
Cyber security | Fraud prevention | Anti money laundering |
Insider trading | SPAM detection
(And yes, build the foundations for ML and
AI initiatives further down the line.)
“If your company isn’t good at analytics, you’re not ready for AI”
Harvard Business Review, 2017
So let’s talk about how to achieve
those three big outcomes.
If you haven’t already, you’ll want to do
something called data “governance”.
This is where you get your data ready for analysis
– and make sure it stays tidy.
Data can be messy. Here’s an example. Some customers get called ‘customers’ only
once they’ve paid. Others get called ‘customers’ when they’re only using a free trial.
If you’re a company trying to analyze your customer list, that small discrepancy could
Data governance is about finding discrepancies like that, then setting up
organization-wide policies, standards, and metrics for dealing with them.
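To make the 'customer' example concrete, here is a minimal sketch of how one list gives two different answers depending on the definition you pick. The records, the `status` field, and the `count_customers` helper are all hypothetical, invented purely for illustration.

```python
# Hypothetical customer records; names and the "status" field are made up.
records = [
    {"name": "Acme", "status": "paid"},
    {"name": "Birch", "status": "trial"},
    {"name": "Cobalt", "status": "paid"},
    {"name": "Dune", "status": "trial"},
]


def count_customers(rows, include_trials):
    """Count customers under one explicitly chosen definition."""
    allowed = {"paid", "trial"} if include_trials else {"paid"}
    return sum(1 for row in rows if row["status"] in allowed)


# Same list, two definitions, two different headline numbers.
paid_only = count_customers(records, include_trials=False)
with_trials = count_customers(records, include_trials=True)
print("Paid customers only:", paid_only)
print("Including free trials:", with_trials)
```

An organization-wide standard amounts to agreeing which of those two counts everyone reports.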
The next step is where
the fun starts
This is where you actually ask the data questions.
The good news – it’s not that hard to write a basic data query.
want to know:
How many of our products that
cost between 50 and 100 dollars
are out of stock?
How to ask
SELECT * FROM Products
WHERE Price BETWEEN 50 AND 100
AND Inventory = 0
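If you'd like to try a query like this yourself, here is a minimal sketch using Python's built-in sqlite3 module. The table contents are made-up sample rows, and the column layout is an assumption based only on the names used in the query. It uses COUNT(*) because the question asks "how many".

```python
import sqlite3

# In-memory database with a few made-up products (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (Name TEXT, Price REAL, Inventory INTEGER)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?)",
    [
        ("Kettle", 60, 0),    # in price range, out of stock
        ("Toaster", 80, 5),   # in price range, in stock
        ("Mug", 10, 0),       # out of stock, but too cheap
        ("Blender", 100, 0),  # BETWEEN is inclusive, so 100 counts
    ],
)

# Count the out-of-stock products priced between 50 and 100 dollars.
out_of_stock_count = conn.execute(
    "SELECT COUNT(*) FROM Products "
    "WHERE Price BETWEEN 50 AND 100 AND Inventory = 0"
).fetchone()[0]
print("Out-of-stock products in range:", out_of_stock_count)
```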
And with some practice, you’ll be writing more complex data queries, like:
want to know:
From my selected sample of
different hotels and room
availabilities, which particular
hotels have at least two rooms
available for less than $186?
How to ask
SELECT a.hotel_name, a.city,
COUNT(*) AS room_count FROM
hotels a INNER JOIN rooms b
ON a.hotel_code = b.hotel_code
AND b.price < 186 GROUP BY a.hotel_name,
a.city HAVING COUNT(*) >= 2
ORDER BY room_count DESC
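Here is a sketch of a query along these lines running against sample data, again with Python's built-in sqlite3 module. The hotels, cities, and prices are invented for illustration, and the table layouts are assumptions inferred from the column names in the query. The sketch uses `>= 2` to express "at least two" and puts the price filter in a WHERE clause, which gives the same result for an inner join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up sample data; only the column names come from the query itself.
conn.executescript("""
CREATE TABLE hotels (hotel_code INTEGER, hotel_name TEXT, city TEXT);
CREATE TABLE rooms (hotel_code INTEGER, price REAL);
INSERT INTO hotels VALUES
    (1, 'Harbour Inn', 'Lisbon'),
    (2, 'Park View', 'Leeds'),
    (3, 'Old Mill', 'Bath');
INSERT INTO rooms VALUES
    (1, 120), (1, 150), (1, 300),
    (2, 180), (2, 190),
    (3, 90), (3, 99), (3, 110);
""")

query = """
SELECT a.hotel_name, a.city, COUNT(*) AS room_count
FROM hotels a
INNER JOIN rooms b ON a.hotel_code = b.hotel_code
WHERE b.price < 186
GROUP BY a.hotel_name, a.city
HAVING COUNT(*) >= 2
ORDER BY room_count DESC
"""
results = conn.execute(query).fetchall()
for hotel_name, city, room_count in results:
    print(hotel_name, city, room_count)
```

In this sample, Park View drops out because only one of its rooms costs less than $186.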
Now for the tech.
The first thing you need to know
– the cloud is way better for data
You can ingest terabytes
of data,
Storing all that data
is easier and it takes less
time to access it.
You can run advanced
data transformations and
Your data is more
How older data warehouse architectures
Your data warehouse architecture has a huge impact on your ability to do data
Older architectures can make analytics projects cost as much as 60-80%¹ more
– they slow things down, make it hard to pivot, and generally get in the way of
everything you want from an analytics initiative.
They just weren’t designed for what we’re doing today.
In a nutshell:
Cloud = faster, cheaper, smarter, better data insights.
It’s why we only ever recommend
cloud-native services for data analysis,
like Google Cloud Platform
By the way, that’s the same platform that Google use to analyse their own data
– you know, all the information on the planet.
Using Google Cloud Platform, some pretty
big companies have brought about some
pretty neat outcomes…
Online supermarket Ocado used to store
their data in clunky, inaccessible silos.
That had to change. It was slowing down
their operations and significantly delaying
their customer-response times.
So they migrated nearly all their
business information – over 100TB of
data – to Google Cloud Platform.
- Urgent-customer-email response times
are 4x faster
- Contact-center efficiency is up 7%
- Data queries get responded to 80x faster
– at 33% lower cost
- IT overheads have reduced significantly
- Human disease
Every day the Broad Institute generates
25TB of data, searching for life-saving
insights hidden in our DNA.
To manage all that data, they recently
ditched their in-house data-analytics
function and storage setup for Google
- Human genomes are sequenced 400%
- Global collaboration is seamless – with no
compromise on security
- The company can now scale dynamically,
The Telegraph used to use old-school
legacy tech for their enterprise data
warehousing. Every day their costs
mounted up as their teams struggled to
access actionable data insights.
So they switched to Google Cloud
- Ad-campaign-performance data gets
processed 18x faster
- Errors in data processing and analysis are
detected and assessed in real time
- Up to 4TB of data can be processed in
under a minute
your story say?