STAT I ST I CA L P R O G RA M M I N G
I N JAVAS C R I PT
D av i d S i m o n s
@ Swa m Wi t h Tu rt l e s
slides:
www.tinyurl.com/stats-js
demos:
swamwithturtles.github.io/js-statistics
code:
github.com/SwamWithTurtles/js-statistics
WHO AM I?
Freelance
Software
Developer
@SwamWithTurtles
Java and
JavaScript
Afraid of goats?
WHO AM I?
DATA
NERD
CONTENTS
THEORY CASE STUDIES
JAVASCRIPT
APPLICATION
WHAT IS
DATA?
GAINING
INSIGHTS
RANDOMNESS SIMULATION
LEARNING THROUGH
Reward: What shape is the internet?
Data
BEHIND THE HOOD
API
DB
ADMIN
INTERFACE
SCHEDULED
TASKS
3RD
PARTY
APIS
WHAT DATA
WAS THERE?
SO…
WHAT DATA
WAS THERE?
• Counts of lists (e.g. brands,
products etc.)
• Stock levels and prices of
products
• Days an item has been out
of stock
WHAT DATA
WAS THERE?
• Non-functional data
• Numbers of users
• Performance for users
• Performance of third
party APIs
• Robustness of system
(Uptime, status codes,
frequency of errors)
THERE IS DATA
EVERYWHERE
THE LESSON?
What is data?
What is good data?
WHAT DATA
SHOULD I CARE
ABOUT?
• Data you get repeatedly
• Data you can extract
‘information’ from
• Normally this means
numerical data, though
NLP is getting big!
• Data that answers valuable
questions
Gaining Insights
A dataset:
Identification WIND CEILING TEMP DEWPT RHX
USAF NCDC Date HrMn I Type QCP Dir Q I Spd Q Hgt Q I I Temp Q Dewpt Q RHx
865300,99999,19860401,0000,4,FM-12, ,110,1,N, 7.2,1,22000,1,C,N, 21.6,1, 19.2,1, 86,
865300,99999,19860401,0300,4,FM-12, ,110,1,N, 5.1,1,22000,1,C,N, 19.4,1, 18.5,1, 95,
865300,99999,19860401,0600,4,FM-12, ,070,1,N, 7.2,1,03600,1,C,N, 19.2,1, 999.9,9,999,
865300,99999,19860401,0900,4,FM-12, ,070,1,N, 6.2,1,00120,1,C,N, 19.2,1, 18.9,1, 98,
865300,99999,19860401,1200,4,FM-12, ,070,1,N, 7.7,1,03600,1,C,N, 21.6,1, 18.3,1, 82,
865300,99999,19860401,1500,4,FM-12, ,040,1,N, 9.8,1,03600,1,C,N, 23.0,1, 18.8,1, 77,
865300,99999,19860401,1800,4,FM-12, ,030,1,N, 6.2,1,03600,1,C,N, 19.6,1, 19.0,1, 96,
865300,99999,19860401,2100,4,FM-12, ,050,1,N, 6.7,1,03600,1,C,N, 19.0,1, 18.7,1, 98,
865300,99999,19860402,0000,4,FM-12, ,340,1,N, 7.2,1,03600,1,C,N, 20.0,1, 19.4,1, 96,
865300,99999,19860402,0300,4,FM-12, ,360,1,N, 4.1,1,03600,1,C,N, 19.4,1, 19.1,1, 98,
865300,99999,19860402,0600,4,FM-12, ,999,1,C, 0.0,1,03600,1,C,N, 19.2,1, 18.9,1, 98,
865300,99999,19860402,0900,4,FM-12, ,999,1,C, 0.0,1,00210,1,C,N, 19.0,1, 18.7,1, 98,
865300,99999,19860402,1200,4,FM-12, ,200,1,N, 2.6,1,00210,1,C,N, 20.4,1, 20.1,1, 98,
865300,99999,19860402,1500,4,FM-12, ,210,1,N, 5.1,1,00750,1,C,N, 23.2,1, 19.3,1, 79,
865300,99999,19860402,1800,4,FM-12, ,200,1,N, 3.1,1,00750,1,C,N, 26.4,1, 18.4,1, 62,
865300,99999,19860402,2100,4,FM-12, ,999,1,C, 0.0,1,22000,1,C,N, 26.2,1, 17.1,1, 57,
865300,99999,19860403,0000,4,FM-12, ,140,1,N, 4.1,1,22000,1,C,N, 19.2,1, 17.0,1, 87,
865300,99999,19860403,0300,4,FM-12, ,999,1,C, 0.0,1,22000,1,C,N, 15.8,1, 15.2,1, 96,
865300,99999,19860403,0600,4,FM-12, ,999,1,C, 0.0,1,22000,1,C,N, 15.4,1, 14.0,1, 91,
865300,99999,19860403,1200,4,FM-12, ,060,1,N, 5.1,1,22000,1,C,N, 21.0,1, 19.8,1, 93,
865300,99999,19860403,1500,4,FM-12, ,060,1,N, 4.1,1,00900,1,C,N, 24.8,1, 21.3,1, 81,
865300,99999,19860403,1800,4,FM-12, ,050,1,N, 7.7,1,09000,1,C,N, 28.0,1, 21.4,1, 67,
865300,99999,19860403,2100,4,FM-12, ,040,1,N, 5.1,1,09000,1,C,N, 25.4,1, 21.4,1, 79,
865300,99999,19860404,0000,4,FM-12, ,060,1,N, 6.2,1,03600,1,C,N, 22.2,1, 21.3,1, 95,
865300,99999,19860404,0300,4,FM-12, ,050,1,N, 5.1,1,09000,1,C,N, 21.0,1, 20.7,1, 98,
865300,99999,19860404,0600,4,FM-12, ,060,1,N, 6.2,1,22000,1,C,N, 20.2,1, 19.9,1, 98,
865300,99999,19860404,1200,4,FM-12, ,040,1,N, 5.1,1,00120,1,C,N, 20.4,1, 19.5,1, 95,
865300,99999,19860404,1500,4,FM-12, ,020,1,N, 7.7,1,00420,1,C,N, 24.2,1, 20.4,1, 79,
865300,99999,19860404,1800,4,FM-12, ,250,1,N, 4.1,1,00750,1,C,N, 25.6,1, 20.7,1, 74,
865300,99999,19860404,2100,4,FM-12, ,250,1,N, 5.1,1,00750,1,C,N, 23.6,1, 20.4,1, 82,
865300,99999,19860405,0000,4,FM-12, ,180,1,N, 6.2,1,00420,1,C,N, 20.2,1, 19.6,1, 96,
summary statistics
SUMMARY
STATISTICS
• A statistic is a function of
the data we have inputted
• It aims to capture
information about values
to make it more
understandable
THE FAMOUS
ONE:
• Mean (‘average’)
• Sum all of the data
and divide by the
number of items
• Gives a sense of ‘size’
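The mean described above can be sketched in a few lines. This is a minimal illustration, not code from the talk: the function name `mean` is an assumption, and the sample values are the first five TEMP readings from the dataset shown earlier.

```javascript
// Arithmetic mean: sum the values, then divide by how many there are.
// `mean` is an illustrative helper name, not part of any library.
function mean(values) {
  if (values.length === 0) {
    throw new Error("mean of an empty data set is undefined");
  }
  const total = values.reduce((sum, x) => sum + x, 0);
  return total / values.length;
}

// First five temperature readings from the dataset above.
const temperatures = [21.6, 19.4, 19.2, 19.2, 21.6];
console.log(mean(temperatures));
```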
Group 1:
Group 2:
OTHER
STATISTICS
• “Location”
• Mean, Mode, Median
• “Spread”
• Standard Deviation
• “Shape”
• Skew, Kurtosis
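A rough sketch of two "location" statistics and one "spread" statistic from the list above, written in plain JavaScript. The function names are assumptions for illustration. The standard deviation here is the population form (divide by n); a sample standard deviation would divide by n - 1 instead.

```javascript
// Median: the middle value of the sorted data (average of the two
// middle values when the count is even).
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Mode: the most frequent value. On a tie, the value that reaches the
// highest count first wins.
function mode(values) {
  const counts = new Map();
  let best = values[0];
  for (const v of values) {
    const c = (counts.get(v) || 0) + 1;
    counts.set(v, c);
    if (c > counts.get(best)) best = v;
  }
  return best;
}

// Population standard deviation: square root of the mean squared
// distance from the mean.
function standardDeviation(values) {
  const m = values.reduce((s, x) => s + x, 0) / values.length;
  const variance =
    values.reduce((s, x) => s + (x - m) ** 2, 0) / values.length;
  return Math.sqrt(variance);
}

const sample = [2, 4, 4, 4, 5, 5, 7, 9];
console.log(median(sample), mode(sample), standardDeviation(sample));
```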
DEMO
Distributions
What is a random variable?
Discrete Variables
Can be any of a list of values, each with its own probability
HEADS    0.5
TAILS    0.5
2     1/36
3     2/36
4     3/36
5     4/36
6     5/36
7     6/36
8     5/36
9     4/36
10    3/36
11    2/36
12    1/36
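The second table appears to be the distribution of the total of two fair six-sided dice (that reading is an assumption; the slide does not label it). Under that assumption, the probabilities can be reproduced by enumerating all 36 equally likely outcomes:

```javascript
// Enumerate every outcome of two fair six-sided dice and count how
// many outcomes produce each total.
function twoDiceDistribution() {
  const counts = new Map();
  for (let a = 1; a <= 6; a++) {
    for (let b = 1; b <= 6; b++) {
      counts.set(a + b, (counts.get(a + b) || 0) + 1);
    }
  }
  return counts; // total -> number of outcomes, out of 36
}

for (const [total, count] of twoDiceDistribution()) {
  console.log(`${total}: ${count}/36`);
}
```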
This makes sense:
X = Result of a coin flip
HEADS    0.5
TAILS    0.5
But:
X won’t always have the
same value
RANDOM VARIABLES
X = Result of a coin flip
HEADS    0.5
TAILS    0.5
X is a
Random Variable
This is its distribution
DEMO…
Continuous
A numerical variable,
that can be any number
(sometimes within a range)
height
weight
Math.random()
HOW DO WE DEFINE THE
DISTRIBUTION?
Math.random() height
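One way to get a feel for a continuous distribution is to approximate it empirically: draw many values, sort them into equal-width bins, and look at the fraction that lands in each bin. The sketch below does this for `Math.random()`; the helper name `histogram` and its parameters are assumptions for illustration, not something shown in the slides.

```javascript
// Approximate a continuous distribution by binning many draws.
// `draw` is any zero-argument function returning a number.
function histogram(draw, sampleSize, binCount, min, max) {
  const bins = new Array(binCount).fill(0);
  const width = (max - min) / binCount;
  for (let i = 0; i < sampleSize; i++) {
    const x = draw();
    if (x < min || x >= max) continue; // ignore values outside the range
    // Clamp the index to guard against floating-point rounding at the top edge.
    const index = Math.min(binCount - 1, Math.floor((x - min) / width));
    bins[index] += 1;
  }
  // Convert counts to fractions of the whole sample.
  return bins.map((count) => count / sampleSize);
}

// Math.random() should put roughly the same share in every bin.
const fractions = histogram(Math.random, 100000, 10, 0, 1);
console.log(fractions.map((f) => f.toFixed(3)).join(" "));
```

Passing a different `draw` function (for example one that returns measured heights) would show a differently shaped histogram over the same kind of bins.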
DEMO
SO WHAT?
ERRR…
• When we do data analysis,
we’re really looking at the
range of values a random
variable can be…
• … and asking questions
about its distribution.
YOU’RE AN
AUDITOR
IMAGINE…
AUDITING A
LEDGER
• Make a list of all incoming
and outgoing transactions
• These are random
variables.
• What is their distribution?
Does it deviate from what
we expect?
BENFORD’S LAW
http://www.journalofaccountancy.com/Issues/1999/May/nigrini
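Benford's law says that in many naturally occurring sets of numbers, the leading digit d appears with probability log10(1 + 1/d), so 1 leads about 30% of the time. A minimal sketch of the auditing check follows, comparing observed leading-digit shares against that expectation. The function names and the `ledger` values are hypothetical, made up purely for illustration.

```javascript
// Expected share of each leading digit d (1 to 9) under Benford's law.
function benfordExpected(d) {
  return Math.log10(1 + 1 / d);
}

// Leading non-zero digit of a non-zero amount, ignoring its sign.
function leadingDigit(amount) {
  const digits = Math.abs(amount).toExponential(); // e.g. "1.205e+2"
  return Number(digits[0]);
}

// Observed share of each leading digit in a list of amounts.
// Index 0 of the result is unused; indexes 1 to 9 hold the shares.
function leadingDigitShares(amounts) {
  const counts = new Array(10).fill(0);
  let used = 0;
  for (const amount of amounts) {
    if (amount === 0 || !Number.isFinite(amount)) continue;
    counts[leadingDigit(amount)] += 1;
    used += 1;
  }
  return counts.map((c) => (used === 0 ? 0 : c / used));
}

// Hypothetical ledger entries, far too few for a real audit.
const ledger = [120.5, -18.99, 1043, 2.75, 310, 19.2, 0.0047, 88];
const observed = leadingDigitShares(ledger);
for (let d = 1; d <= 9; d++) {
  console.log(d, observed[d].toFixed(2), benfordExpected(d).toFixed(3));
}
```

A real check would use far more transactions and a proper goodness-of-fit test rather than eyeballing the two columns.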
INTUITIVE
USER INPUTS
DESIGNING
OUR TASK…
• Designing a system that
tries to understand what
happens under financial
system “shocks”
• So: a user would input a
shock, its impacts would
propagate and we would
see our bottom line.
OUR FIRST ATTEMPT
• Shock ‘sliders’ that scaled linearly
0%
25%
BOOM
90%
BUST
DISTRIBUTION OF FINANCIAL
CHANGES
SO…
• Shock ‘sliders’ that scaled linearly
0%
8%
BOOM
105%
BUST
Change that happens
with 75% chance
Change that happens
with 10% chance
Randomness
MAKING RANDOM VARIABLES
SOME
WARNINGS
• Exactly what randomness
means is a fuzzy question.
• These numbers are not
‘cryptographically’
random.
JAVASCRIPT’S
ENTRY TO
RANDOMNESS
• Different runtimes can
implement it differently.
• V8 implements Multiply-With-Carry:
• Take a sequence of ‘seed’
values
• Iteratively perform modular
arithmetic-based operations
• Extend the initial seed values
to a longer sequence.
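The three steps above can be sketched as a small multiply-with-carry generator. This is a generic illustration, not V8's source code: the multipliers 36969 and 18000 come from George Marsaglia's published example of the technique, the name `createMwc` and the seed values are assumptions, and newer V8 releases use a different algorithm (xorshift128+).

```javascript
// A generic multiply-with-carry sketch built from two 16-bit generators.
// In each 32-bit state word, the low 16 bits hold the current value and
// the high 16 bits hold the carry from the previous step.
function createMwc(seedZ, seedW) {
  let z = seedZ >>> 0;
  let w = seedW >>> 0;
  return function next() {
    // Multiply the low half by a constant and add the carry (high half).
    z = (36969 * (z & 0xffff) + (z >>> 16)) >>> 0;
    w = (18000 * (w & 0xffff) + (w >>> 16)) >>> 0;
    // Combine both streams into one unsigned 32-bit integer.
    const combined = (((z << 16) >>> 0) + w) >>> 0;
    // Scale into [0, 1), the same range Math.random() uses.
    return combined / 4294967296;
  };
}

// Hypothetical seeds; the same seeds always give the same sequence.
const random = createMwc(362436069, 521288629);
console.log(random(), random(), random());
```

Because the whole sequence is determined by the seeds, a generator like this is predictable, which is one reason such numbers are not suitable for cryptography.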
Math.random()
WHAT ABOUT
OTHER
DISTRIBUTIONS?
BUT…
THE SHORT ANSWER
Math.random() = f( )
THE SHORT ANSWER
=
HEADS    0.5
TAILS    0.5
=
WHAT’S THE FUNCTION?
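One common answer to "what's the function?" is inverse transform sampling: take a uniform draw from `Math.random()` and push it through the inverse of the target distribution's cumulative distribution function. The slides do not name the technique, so treat this as one possible reading. The function names below are assumptions, and the `uniform` parameter exists only so the examples can be tested with fixed inputs.

```javascript
// Discrete case: a fair coin from a single uniform draw.
// Draws below 0.5 map to HEADS, the rest to TAILS.
function coinFlip(uniform = Math.random) {
  return uniform() < 0.5 ? "HEADS" : "TAILS";
}

// Continuous case: an exponential variable with the given rate.
// The exponential CDF is F(x) = 1 - exp(-rate * x), so its inverse is
// x = -ln(1 - u) / rate for a uniform draw u in [0, 1).
function exponential(rate, uniform = Math.random) {
  return -Math.log(1 - uniform()) / rate;
}

console.log(coinFlip(), exponential(2));
```

Libraries such as jStat package up samplers like this for many distributions, so hand-rolling them is mostly useful for understanding what happens underneath.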
jStat
beta
centralF
cauchy
chi-squared
exponential
gamma
inverse gamma
kumaraswamy
lognormal
normal
pareto
student t
uniform
weibull
binomial
negative binomial
hypergeometric
poisson
triangular
OR
USING RANDOMNESS
why would i want
to use
RANDOMNESS
?
STUBBED
TEST DATA
• Avoid coupling yourself to
specific test
implementations
• Spin-up life-like
environments for load
testing
NON-DETERMINISTIC
ALGORITHMS
• Modelling underlying or
random data
• Solving a problem that is
expensive or impossible to
solve perfectly
PITFALLS
CHOOSING THE
DISTRIBUTION
• What if a ‘uniform’
distribution isn’t enough?
• What if we want random
data that isn’t just
numbers?
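Both questions above can be addressed with a weighted choice: give each non-numeric option a weight and pick one with probability proportional to that weight. A minimal sketch follows, with a hypothetical helper name and made-up options and weights that might suit stubbed test data.

```javascript
// Pick one option at random, with probability proportional to its weight.
// `uniform` defaults to Math.random and can be swapped out in tests.
function weightedChoice(options, uniform = Math.random) {
  const total = options.reduce((sum, o) => sum + o.weight, 0);
  let threshold = uniform() * total;
  for (const option of options) {
    threshold -= option.weight;
    if (threshold < 0) return option.value;
  }
  // Fallback in case floating-point rounding leaves a tiny remainder.
  return options[options.length - 1].value;
}

// Hypothetical categories and weights, for illustration only.
const browsers = [
  { value: "Chrome", weight: 6 },
  { value: "Firefox", weight: 3 },
  { value: "Other", weight: 1 },
];
console.log(weightedChoice(browsers));
```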
EXAMPLE: SOCIAL NETWORK
EXAMPLE: SOCIAL NETWORK
11 Traversals
DEMO
Barabasi-Albert
Random Model
BARABASI-ALBERT
RANDOM MODEL
• Start with two linked
objects
• Add one new object at a
time
• Link that object to one
existing object, with
already ‘popular’ objects
more likely to be chosen.
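The three steps above can be sketched as follows. This is an illustrative version, not the code from the demo: it assumes "popular" means having more links, and the names `barabasiAlbert` and `degrees` are made up for the example. The trick is to keep a list in which each object appears once per link it has, so a uniformly random pick from that list favours well-connected objects.

```javascript
// Grow a network by preferential attachment.
// Returns a list of links as [newNode, existingNode] pairs.
function barabasiAlbert(nodeCount, uniform = Math.random) {
  if (nodeCount < 2) throw new Error("need at least two nodes");
  const edges = [[0, 1]]; // start with two linked objects
  // Each node appears here once per link it has, so a uniform pick
  // selects a node with probability proportional to its link count.
  const endpoints = [0, 1];
  for (let node = 2; node < nodeCount; node++) {
    const target = endpoints[Math.floor(uniform() * endpoints.length)];
    edges.push([node, target]);
    endpoints.push(node, target);
  }
  return edges;
}

// Count how many links each node ends up with.
function degrees(edges, nodeCount) {
  const degree = new Array(nodeCount).fill(0);
  for (const [a, b] of edges) {
    degree[a] += 1;
    degree[b] += 1;
  }
  return degree;
}

const network = barabasiAlbert(1000);
const linkCounts = degrees(network, 1000);
console.log("most connected node has", Math.max(...linkCounts), "links");
```

Running this a few times typically shows a handful of heavily linked hubs alongside many nodes with a single link.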
THIS
MODELS…
• Academic Citations
• Actor filmographies
• Spread of Infectious
diseases
• Social Networks
CONTENTS
THEORY CASE STUDIES
JAVASCRIPT
APPLICATION
WHAT IS
DATA?
GAINING
INSIGHTS
RANDOMNESS SIMULATION
LEARNING THROUGH
Reward: What shape is the internet?
We’re OUT of
TIME
• Data is any information we collect. Not all data is
valuable.
• Seeing trends in lots of numbers is hard. Summary
statistics and charts help us unpick its meaning.
• Data can be treated as random ‘realisations’ from a
backing distribution.
• Making random variables is easy, and can be done in
different shapes for different purposes.
WHAT IS
DATA?
GAINING
INSIGHTS
RANDOMNESS SIMULATION
LIBRARIES WE USED
GENERAL LIBRARIES
KNOCKOUT.JS
REQUIRE.JS
BOOTSTRAP
DATA MANIPULATION
LODASH
JSTAT
DATA IMPORT
PAPA PARSE
CHARTING
D3
CHART.JS
THANK YOU
David Simons
@SwamWithTurtles

Design and Development of a Provenance Capture Platform for Data ScienceDesign and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data Science
 
Choreo: Empowering the Future of Enterprise Software Engineering
Choreo: Empowering the Future of Enterprise Software EngineeringChoreo: Empowering the Future of Enterprise Software Engineering
Choreo: Empowering the Future of Enterprise Software Engineering
 
Modernizing Legacy Systems Using Ballerina
Modernizing Legacy Systems Using BallerinaModernizing Legacy Systems Using Ballerina
Modernizing Legacy Systems Using Ballerina
 
AI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by AnitarajAI in Action: Real World Use Cases by Anitaraj
AI in Action: Real World Use Cases by Anitaraj
 
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
 
Event-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream ProcessingEvent-Driven Architecture Masterclass: Challenges in Stream Processing
Event-Driven Architecture Masterclass: Challenges in Stream Processing
 
Less Is More: Utilizing Ballerina to Architect a Cloud Data Platform
Less Is More: Utilizing Ballerina to Architect a Cloud Data PlatformLess Is More: Utilizing Ballerina to Architect a Cloud Data Platform
Less Is More: Utilizing Ballerina to Architect a Cloud Data Platform
 
The Ultimate Prompt Engineering Guide for Generative AI: Get the Most Out of ...
The Ultimate Prompt Engineering Guide for Generative AI: Get the Most Out of ...The Ultimate Prompt Engineering Guide for Generative AI: Get the Most Out of ...
The Ultimate Prompt Engineering Guide for Generative AI: Get the Most Out of ...
 
Decarbonising Commercial Real Estate: The Role of Operational Performance
Decarbonising Commercial Real Estate: The Role of Operational PerformanceDecarbonising Commercial Real Estate: The Role of Operational Performance
Decarbonising Commercial Real Estate: The Role of Operational Performance
 

Statistical Programming with JavaScript

  • 1. STATISTICAL PROGRAMMING IN JAVASCRIPT David Simons @SwamWithTurtles
  • 4. WHO AM I? Freelance Software Developer @SwamWithTurtles Java and JavaScript Afraid of goats?
  • 5. WHO AM I? DATA NERD
  • 6. CONTENTS THEORY CASE STUDIES JAVASCRIPT APPLICATION WHAT IS DATA? GAINING INSIGHTS RANDOMNESS SIMULATION LEARNING THROUGH Reward: What shape is the internet?
  • 8.
  • 9. BEHIND THE HOOD API DB ADMIN INTERFACE SCHEDULED TASKS 3RD PARTY APIS
  • 10. WHAT DATA WAS THERE? SO…
  • 11. WHAT DATA WAS THERE? • Counts of lists (e.g. brands, products etc.) • Stock levels and prices of products • Days an item has been out of stock
  • 12. WHAT DATA WAS THERE? • Non-functional data • Numbers of users • Performance for users • Performance of third party APIs • Robustness of system (Uptime, status codes, frequency of errors)
  • 13. THERE IS DATA EVERYWHERE THE LESSON?
  • 15. What is good data?
  • 16. WHAT DATA SHOULD I CARE ABOUT? • Data you get repeatedly • Data you can extract ‘information’ from • Normally this means numerical data, though NLP is getting big! • Data that answers valuable questions
  • 18. A dataset:
    Identification WIND CEILING TEMP DEWPT RHX
    USAF NCDC Date HrMn I Type QCP Dir Q I Spd Q Hgt Q I I Temp Q Dewpt Q RHx
    865300,99999,19860401,0000,4,FM-12, ,110,1,N, 7.2,1,22000,1,C,N, 21.6,1, 19.2,1, 86,
    865300,99999,19860401,0300,4,FM-12, ,110,1,N, 5.1,1,22000,1,C,N, 19.4,1, 18.5,1, 95,
    865300,99999,19860401,0600,4,FM-12, ,070,1,N, 7.2,1,03600,1,C,N, 19.2,1, 999.9,9,999,
    865300,99999,19860401,0900,4,FM-12, ,070,1,N, 6.2,1,00120,1,C,N, 19.2,1, 18.9,1, 98,
    865300,99999,19860401,1200,4,FM-12, ,070,1,N, 7.7,1,03600,1,C,N, 21.6,1, 18.3,1, 82,
    865300,99999,19860401,1500,4,FM-12, ,040,1,N, 9.8,1,03600,1,C,N, 23.0,1, 18.8,1, 77,
    865300,99999,19860401,1800,4,FM-12, ,030,1,N, 6.2,1,03600,1,C,N, 19.6,1, 19.0,1, 96,
    865300,99999,19860401,2100,4,FM-12, ,050,1,N, 6.7,1,03600,1,C,N, 19.0,1, 18.7,1, 98,
    865300,99999,19860402,0000,4,FM-12, ,340,1,N, 7.2,1,03600,1,C,N, 20.0,1, 19.4,1, 96,
    865300,99999,19860402,0300,4,FM-12, ,360,1,N, 4.1,1,03600,1,C,N, 19.4,1, 19.1,1, 98,
    865300,99999,19860402,0600,4,FM-12, ,999,1,C, 0.0,1,03600,1,C,N, 19.2,1, 18.9,1, 98,
    865300,99999,19860402,0900,4,FM-12, ,999,1,C, 0.0,1,00210,1,C,N, 19.0,1, 18.7,1, 98,
    865300,99999,19860402,1200,4,FM-12, ,200,1,N, 2.6,1,00210,1,C,N, 20.4,1, 20.1,1, 98,
    865300,99999,19860402,1500,4,FM-12, ,210,1,N, 5.1,1,00750,1,C,N, 23.2,1, 19.3,1, 79,
    865300,99999,19860402,1800,4,FM-12, ,200,1,N, 3.1,1,00750,1,C,N, 26.4,1, 18.4,1, 62,
    865300,99999,19860402,2100,4,FM-12, ,999,1,C, 0.0,1,22000,1,C,N, 26.2,1, 17.1,1, 57,
    865300,99999,19860403,0000,4,FM-12, ,140,1,N, 4.1,1,22000,1,C,N, 19.2,1, 17.0,1, 87,
    865300,99999,19860403,0300,4,FM-12, ,999,1,C, 0.0,1,22000,1,C,N, 15.8,1, 15.2,1, 96,
    865300,99999,19860403,0600,4,FM-12, ,999,1,C, 0.0,1,22000,1,C,N, 15.4,1, 14.0,1, 91,
    865300,99999,19860403,1200,4,FM-12, ,060,1,N, 5.1,1,22000,1,C,N, 21.0,1, 19.8,1, 93,
    865300,99999,19860403,1500,4,FM-12, ,060,1,N, 4.1,1,00900,1,C,N, 24.8,1, 21.3,1, 81,
    865300,99999,19860403,1800,4,FM-12, ,050,1,N, 7.7,1,09000,1,C,N, 28.0,1, 21.4,1, 67,
    865300,99999,19860403,2100,4,FM-12, ,040,1,N, 5.1,1,09000,1,C,N, 25.4,1, 21.4,1, 79,
    865300,99999,19860404,0000,4,FM-12, ,060,1,N, 6.2,1,03600,1,C,N, 22.2,1, 21.3,1, 95,
    865300,99999,19860404,0300,4,FM-12, ,050,1,N, 5.1,1,09000,1,C,N, 21.0,1, 20.7,1, 98,
    865300,99999,19860404,0600,4,FM-12, ,060,1,N, 6.2,1,22000,1,C,N, 20.2,1, 19.9,1, 98,
    865300,99999,19860404,1200,4,FM-12, ,040,1,N, 5.1,1,00120,1,C,N, 20.4,1, 19.5,1, 95,
    865300,99999,19860404,1500,4,FM-12, ,020,1,N, 7.7,1,00420,1,C,N, 24.2,1, 20.4,1, 79,
    865300,99999,19860404,1800,4,FM-12, ,250,1,N, 4.1,1,00750,1,C,N, 25.6,1, 20.7,1, 74,
    865300,99999,19860404,2100,4,FM-12, ,250,1,N, 5.1,1,00750,1,C,N, 23.6,1, 20.4,1, 82,
    865300,99999,19860405,0000,4,FM-12, ,180,1,N, 6.2,1,00420,1,C,N, 20.2,1, 19.6,1, 96,
  • 19. summary statistics
  • 20. SUMMARY STATISTICS • A statistic is a function of the data we have inputted • It aims to capture information about the values to make them more understandable
  • 21. THE FAMOUS ONE: • Mean (‘average’) • Sum all of the data and divide by the number of items • Gives a sense of ‘size’
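
The mean described on this slide can be sketched in a few lines of plain JavaScript. This is only an illustration: the function name `mean` and the variable `sampleTemps` are made up here, and the sample values are the first five temperature readings from the weather data set on slide 18.

```javascript
// Arithmetic mean: sum every value, then divide by the number of items.
function mean(values) {
  if (values.length === 0) {
    throw new Error("The mean of an empty data set is undefined");
  }
  const total = values.reduce((sum, x) => sum + x, 0);
  return total / values.length;
}

// First five Temp readings from the weather data set (illustrative sample).
const sampleTemps = [21.6, 19.4, 19.2, 19.2, 21.6];
console.log("mean temperature:", mean(sampleTemps));
```

Throwing on an empty array is a design choice made for this sketch; returning `NaN` would be an equally reasonable convention.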
  • 23. OTHER STATISTICS • “Location” • Mean, Mode, Median • “Spread” • Standard Deviation • “Shape” • Skew, Kurtosis
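
As a rough sketch of one statistic from each family above, the code below computes the median (location), the standard deviation (spread) and the skewness (shape). The function names are hypothetical, and the population forms of the formulas (dividing by n rather than n − 1) are an assumption made for brevity. A library such as jStat offers ready-made versions.

```javascript
function mean(values) {
  return values.reduce((sum, x) => sum + x, 0) / values.length;
}

// Location: the middle value once the data is sorted.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b); // copy so the input is untouched
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Spread: population standard deviation (square root of the mean squared deviation).
function standardDeviation(values) {
  const m = mean(values);
  return Math.sqrt(mean(values.map((x) => (x - m) ** 2)));
}

// Shape: population skewness (mean of the cubed standardised deviations).
// Positive values indicate a longer tail to the right.
function skewness(values) {
  const m = mean(values);
  const sd = standardDeviation(values);
  return mean(values.map((x) => ((x - m) / sd) ** 3));
}

const readings = [2, 4, 4, 4, 5, 5, 7, 9];
console.log(median(readings), standardDeviation(readings), skewness(readings));
```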
  • 24. DEMO
  • 26. What is a random variable?
  • 27. Discrete Variables Can be any of a list of values, each with its own probability HEADS: 0.5, TAILS: 0.5 | 2: 1/36, 3: 2/36, 4: 3/36, 5: 4/36, 6: 5/36, 7: 6/36, 8: 5/36, 9: 4/36, 10: 3/36, 11: 2/36, 12: 1/36
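
The second table on this slide matches the distribution of the sum of two fair dice. The sketch below rebuilds it by enumerating all 36 outcomes and then draws values from it. The names `twoDiceDistribution` and `sampleDiscrete` are invented for illustration, and the optional `rng` parameter is an assumption added so the sampler can be driven by something other than `Math.random`.

```javascript
// Build the distribution of the sum of two fair dice by counting all 36 outcomes.
function twoDiceDistribution() {
  const counts = new Map();
  for (let a = 1; a <= 6; a++) {
    for (let b = 1; b <= 6; b++) {
      counts.set(a + b, (counts.get(a + b) || 0) + 1);
    }
  }
  return [...counts.entries()]
    .sort(([x], [y]) => x - y)
    .map(([value, count]) => ({ value, probability: count / 36 }));
}

// Draw one value from a discrete distribution.
// `rng` must return a number in [0, 1), as Math.random does.
function sampleDiscrete(distribution, rng = Math.random) {
  let u = rng();
  for (const { value, probability } of distribution) {
    if (u < probability) return value;
    u -= probability; // move on to the next slice of [0, 1)
  }
  // Guard against floating-point rounding leaving a sliver at the end.
  return distribution[distribution.length - 1].value;
}

const dice = twoDiceDistribution();
console.log(sampleDiscrete(dice)); // some value between 2 and 12
```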
  • 28. This makes sense: X = Result of a coin flip HEADS: 0.5, TAILS: 0.5 But: X won’t always have the same value
  • 29. RANDOM VARIABLES X = Result of a coin flip HEADS: 0.5, TAILS: 0.5 X is a Random Variable This is its distribution
  • 30. DEMO…
  • 31. Continuous A numerical variable, that can be any number (sometimes within a range) height weight Math.random()
  • 32. HOW DO WE DEFINE THE DISTRIBUTION? Math.random() height
  • 33. DEMO
  • 34. SO WHAT? ERRR…
  • 35. • When we do data analysis, we’re really looking at the range of values a random variable can be… • … and asking questions about its distribution.
  • 36. YOU’RE AN AUDITOR IMAGINE…
  • 37. AUDITING A LEDGER • Make a list of all ingoing and outgoing transactions • These are random variables. • What is their distribution? Does it deviate from what we expect?
  • 38. BENFORD’S LAW http://www.journalofaccountancy.com/Issues/1999/May/nigrini
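
Benford’s law says that in many naturally occurring sets of numbers the leading digit d appears with probability log10(1 + 1/d), so about 30% of values start with 1. A minimal sketch of checking a ledger against it follows. All function names are hypothetical, and a real audit would go on to apply a formal goodness-of-fit test rather than comparing frequencies by eye.

```javascript
// Expected share of values whose first significant digit is d (1..9).
function benfordExpected(d) {
  return Math.log10(1 + 1 / d);
}

// First significant digit of a non-zero finite number, e.g. 1234 -> 1, -0.0087 -> 8.
function leadingDigit(amount) {
  // toExponential() always starts with the first significant digit ("1.234e+3").
  return Number(Math.abs(amount).toExponential()[0]);
}

// Observed share of each leading digit; index 0 is unused and stays 0.
function leadingDigitFrequencies(amounts) {
  const counts = new Array(10).fill(0);
  let used = 0;
  for (const amount of amounts) {
    if (amount === 0 || !Number.isFinite(amount)) continue; // no leading digit
    counts[leadingDigit(amount)]++;
    used++;
  }
  return counts.map((c) => (used === 0 ? 0 : c / used));
}

// Compare observed and expected shares for each digit.
function compareToBenford(amounts) {
  const observed = leadingDigitFrequencies(amounts);
  const rows = [];
  for (let d = 1; d <= 9; d++) {
    rows.push({ digit: d, observed: observed[d], expected: benfordExpected(d) });
  }
  return rows;
}
```

Calling `compareToBenford(transactions)` on an array of amounts returns one row per digit; large gaps between `observed` and `expected` are the kind of deviation the auditing slide asks about.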
  • 39. INTUITIVE USER INPUTS DESIGNING
  • 40. OUR TASK… • Designing a system that tries to understand what happens under financial system “shocks” • So: a user would input a shock, its impacts would propagate and we would see our bottom line.
  • 41. OUR FIRST ATTEMPT • Shock ‘sliders’ that scaled linearly 0% 25% BOOM 90% BUST
  • 42. DISTRIBUTION OF FINANCIAL CHANGES
  • 43. SO… • Shock ‘sliders’ that scaled linearly 0% 8% BOOM 105% BUST Change that happens with 75% chance Change that happens with 10% chance
  • 45. MAKING RANDOM VARIABLES
  • 46. SOME WARNINGS • Exactly what randomness means is a fuzzy question. • These numbers are not ‘cryptographically’ random.
  • 47. JAVASCRIPT’S ENTRY TO RANDOMNESS • Different runtimes can implement it differently. • V8 implements Multiply-With-Carry: • Take a sequence of ‘seed’ values • Iteratively perform modular arithmetic-based operations • Extend the initial seed values to a longer sequence. Math.random()
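
To make the multiply-with-carry steps above concrete, here is a small sketch in the style of George Marsaglia's well-known two-stream generator. It is not a copy of any engine's source: the function name, the default seeds and the multipliers 36969 and 18000 are assumptions taken from that commonly cited example. Like `Math.random()`, it is not suitable for cryptographic use.

```javascript
// A seeded multiply-with-carry generator returning numbers in [0, 1).
// Both seeds should be non-zero; a zero seed makes its stream stay at zero.
function createMwcGenerator(seedZ = 362436069, seedW = 521288629) {
  let z = seedZ >>> 0; // force to unsigned 32-bit
  let w = seedW >>> 0;

  return function next() {
    // Low 16 bits are multiplied; high 16 bits act as the carry.
    z = (36969 * (z & 65535) + (z >>> 16)) >>> 0;
    w = (18000 * (w & 65535) + (w >>> 16)) >>> 0;
    // Combine the two streams into one unsigned 32-bit integer.
    const combined = ((z << 16) + w) >>> 0;
    return combined / 4294967296; // divide by 2^32
  };
}

const random = createMwcGenerator(12345, 67890);
console.log(random(), random(), random());
```

Because the generator is seeded, the same seeds always reproduce the same sequence, which `Math.random()` does not offer.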
  • 48. WHAT ABOUT OTHER DISTRIBUTIONS? BUT…
  • 49. THE SHORT ANSWER Math.random()= f( )
  • 50. THE SHORT ANSWER = HEADS: 0.5, TAILS: 0.5 =
  • 51. WHAT’S THE FUNCTION? jStat: beta, centralF, cauchy, chi-squared, exponential, gamma, inverse gamma, kumaraswamy, lognormal, normal, pareto, student t, uniform, weibull, binomial, negative binomial, hypergeometric, poisson, triangular OR
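
The idea on the preceding slides, that a draw from another distribution is some function applied to `Math.random()`, can be sketched by hand for two simple cases: a coin flip and an exponential distribution (using the inverse of its cumulative distribution function). The function names and the `rng` parameter are hypothetical. For the longer list of distributions above, a library such as jStat is the more practical route.

```javascript
// Coin flip: split [0, 1) into two halves of equal probability.
function coinFlip(rng = Math.random) {
  return rng() < 0.5 ? "HEADS" : "TAILS";
}

// Exponential distribution via inverse transform sampling.
// If U is uniform on [0, 1), then -ln(1 - U) / rate is exponential with that rate.
function sampleExponential(rate, rng = Math.random) {
  if (!(rate > 0)) {
    throw new Error("rate must be positive");
  }
  return -Math.log(1 - rng()) / rate;
}

console.log(coinFlip(), sampleExponential(2));
```

Passing the random source in as `rng` keeps both functions testable with a fixed or seeded generator.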
  • 52. USING RANDOMNESS
  • 53. why would i want to use RANDOMNESS?
  • 54. STUBBED TEST DATA • Avoid coupling yourself to specific test implementations • Spin-up life-like environments for load testing
  • 55. NON-DETERMINISTIC ALGORITHMS • Modelling underlying or random data • Solving a problem that is expensive or impossible to solve perfectly
  • 56. PITFALLS
  • 57. CHOOSING THE DISTRIBUTION • What if a ‘uniform’ distribution isn’t enough? • What if we want random data that isn’t just numbers?
  • 58. EXAMPLE: SOCIAL NETWORK
  • 59. EXAMPLE: SOCIAL NETWORK 11 Traversals
  • 60. DEMO
  • 61. Barabasi-Albert Random Model
  • 62. BARABASI-ALBERT RANDOM MODEL • Start with two linked objects • Add one new object at a time • Link that object to one existing object, with already ‘popular’ objects more likely to be chosen.
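
The three steps above can be sketched as follows. This is an illustrative take rather than the code behind the demo: the names `barabasiAlbert` and `degreeOf` are made up, and treating 'popular' as 'chosen with probability proportional to the number of links' is the standard reading of the model, assumed here.

```javascript
// Grow a network of `nodeCount` nodes by preferential attachment.
// Returns the list of links as [newNode, existingNode] pairs.
function barabasiAlbert(nodeCount, rng = Math.random) {
  if (nodeCount < 2) {
    throw new Error("The model starts with two linked nodes");
  }
  const edges = [[0, 1]]; // start with two linked objects

  // Every node appears in `endpoints` once per link it has, so a uniform pick
  // from this array selects a node with probability proportional to its degree.
  const endpoints = [0, 1];

  for (let newNode = 2; newNode < nodeCount; newNode++) {
    const target = endpoints[Math.floor(rng() * endpoints.length)];
    edges.push([newNode, target]);
    endpoints.push(newNode, target);
  }
  return edges;
}

// Count how many links touch a given node.
function degreeOf(node, edges) {
  return edges.filter(([a, b]) => a === node || b === node).length;
}

const network = barabasiAlbert(1000);
console.log("links on node 0:", degreeOf(0, network));
```

Running it a few times typically shows a handful of early nodes collecting far more links than the rest, the 'rich get richer' pattern behind the examples on the next slide.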
  • 63. THIS MODELS… • Academic Citations • Actor filmographies • Spread of Infectious diseases • Social Networks
  • 64. CONTENTS THEORY CASE STUDIES JAVASCRIPT APPLICATION WHAT IS DATA? GAINING INSIGHTS RANDOMNESS SIMULATION LEARNING THROUGH Reward: What shape is the internet?
  • 66. • Data is any information we collect. Not all data is valuable. • Seeing trends in lots of numbers is hard. Summary statistics and charts help us unpick its meaning. • Data can be treated as random ‘realisations’ from a backing distribution. • Making random variables is easy, and can be done in different shapes for different purposes. WHAT IS DATA? GAINING INSIGHTS RANDOMNESS SIMULATION
  • 67. LIBRARIES WE USED • General libraries: Knockout.js, Require.js, Bootstrap • Data manipulation: Lodash, jStat • Data import: Papa Parse • Charting: D3, Chart.js
  • 68. THANK YOU David Simons @SwamWithTurtles