Probability and Statistics for Computer Scientists, 2nd Edition

Book description

Student-Friendly Coverage of Probability, Statistical Methods, Simulation, and Modeling Tools

Incorporating feedback from instructors and researchers who used the previous edition, Probability and Statistics for Computer Scientists, Second Edition helps students understand general methods of stochastic modeling, simulation, and data analysis; make o…

Table of contents

  1. Preliminaries
  2. Dedication
  3. Preface
    1. For whom this book is written
    2. Recommended courses
    3. Prerequisites, and use of the appendix
    4. Style and motivation
    5. Computers, demos, illustrations, and MATLAB®
    6. Second edition
    7. Thanks and acknowledgments
      1. Figure 1
      1. Table 0.1
  4. Chapter 1 Introduction and Overview
    1. 1.1 Making decisions under uncertainty
      1. Summary and conclusion
    2. 1.2 Overview of this book
    3. Summary and conclusions
    4. Exercises
      1. Figure 1.1
  5. Part I Probability and Random Variables
    1. Chapter 2 Probability
      1. 2.1 Events and their probabilities
        1. 2.1.1 Outcomes, events, and the sample space
        2. 2.1.2 Set operations
      2. 2.2 Rules of Probability
        1. 2.2.1 Axioms of Probability
        2. 2.2.2 Computing probabilities of events
          1. Extreme cases
          2. Union
          3. Complement
          4. Intersection of independent events
        3. 2.2.3 Applications in reliability
      3. 2.3 Combinatorics
        1. 2.3.1 Equally likely outcomes
        2. 2.3.2 Permutations and combinations
          1. Permutations with replacement
          2. Permutations without replacement
          3. Combinations without replacement
          4. Computational shortcuts
          5. Combinations with replacement
      4. 2.4 Conditional probability and independence
        1. Conditional probability
        2. Independence
        3. Bayes Rule
        4. Law of Total Probability
      5. Summary and conclusions
      6. Exercises
        1. Figure 2.1
        2. Figure 2.2
        3. Figure 2.3
        4. Figure 2.4
        5. Figure 2.5
        6. Figure 2.6
        7. Figure 2.7
        8. Figure 2.8
    2. Chapter 3 Discrete Random Variables and Their Distributions
      1. 3.1 Distribution of a random variable
        1. 3.1.1 Main concepts
        2. 3.1.2 Types of random variables
      2. 3.2 Distribution of a random vector
        1. 3.2.1 Joint distribution and marginal distributions
        2. 3.2.2 Independence of random variables
      3. 3.3 Expectation and variance
        1. 3.3.1 Expectation
        2. 3.3.2 Expectation of a function
        3. 3.3.3 Properties
        4. 3.3.4 Variance and standard deviation
        5. 3.3.5 Covariance and correlation
        6. 3.3.6 Properties
        7. 3.3.7 Chebyshev’s inequality
        8. 3.3.8 Application to finance
      4. 3.4 Families of discrete distributions
        1. 3.4.1 Bernoulli distribution
        2. 3.4.2 Binomial distribution
          1. MATLAB notes
        3. 3.4.3 Geometric distribution
        4. 3.4.4 Negative Binomial distribution
        5. 3.4.5 Poisson distribution
        6. 3.4.6 Poisson approximation of Binomial distribution
      5. Summary and conclusions
      6. Exercises
        1. Figure 3.1
        2. Figure 3.2
        3. Figure 3.3
        4. Figure 3.4
        5. Figure 3.5
        6. Figure 3.6
    3. Chapter 4 Continuous Distributions
      1. 4.1 Probability density
        1. Analogy: pmf versus pdf
        2. Joint and marginal densities
        3. Expectation and variance
      2. 4.2 Families of continuous distributions
        1. 4.2.1 Uniform distribution
          1. The Uniform property
          2. Standard Uniform distribution
          3. Expectation and variance
        2. 4.2.2 Exponential distribution
          1. Times between rare events are Exponential
          2. Memoryless property
        3. 4.2.3 Gamma distribution
          1. Expectation, variance, and some useful integration remarks
          2. Gamma-Poisson formula
        4. 4.2.4 Normal distribution
          1. Standard Normal distribution
      3. 4.3 Central Limit Theorem
        1. Normal approximation to Binomial distribution
        2. Continuity correction
      4. Summary and conclusions
      5. Exercises
        1. Figure 4.1
        2. Figure 4.2
        3. Figure 4.3
        4. Figure 4.4
        5. Figure 4.5
        6. Figure 4.6
        1. Table 4.1
        2. Table 4.2
        3. Table 4.3
    4. Chapter 5 Computer Simulations and Monte Carlo Methods
      1. 5.1 Introduction
        1. 5.1.1 Applications and examples
      2. 5.2 Simulation of random variables
        1. 5.2.1 Random number generators
        2. 5.2.2 Discrete methods
          1. Arbitrary discrete distribution
        3. 5.2.3 Inverse transform method
          1. Arbitrary continuous distribution
          2. Discrete distributions revisited
          3. Exponential-Geometric relation
        4. 5.2.4 Rejection method
        5. 5.2.5 Generation of random vectors
        6. 5.2.6 Special methods
          1. Poisson distribution
          2. Normal distribution
      3. 5.3 Solving problems by Monte Carlo methods
        1. 5.3.1 Estimating probabilities
          1. Accuracy of a Monte Carlo study
        2. 5.3.2 Estimating means and standard deviations
        3. 5.3.3 Forecasting
        4. 5.3.4 Estimating lengths, areas, and volumes
          1. Lengths
          2. Areas and volumes
          3. Areas of arbitrary regions with unknown boundaries
        5. 5.3.5 Monte Carlo integration
          1. Accuracy of results
          2. Improved Monte Carlo integration method
          3. Accuracy of the improved method
      4. Summary and conclusions
      5. Exercises
        1. Figure 5.1
        2. Figure 5.2
        3. Figure 5.3
        4. Figure 5.4
        5. Figure 5.5
        6. Figure 5.6
        7. Figure 5.7
        8. Figure 5.8
  6. Part II Stochastic Processes
    1. Chapter 6 Stochastic Processes
      1. 6.1 Definitions and classifications
      2. 6.2 Markov processes and Markov chains
        1. 6.2.1 Markov chains
          1. Characteristics of a Markov chain
        2. 6.2.2 Matrix approach
        3. 6.2.3 Steady-state distribution
          1. Computing the steady-state distribution
          2. The limit of P^h
          3. Steady state
          4. Existence of a steady-state distribution. Regular Markov chains
          5. Conclusion
      3. 6.3 Counting processes
        1. 6.3.1 Binomial process
          1. Relation to real time: frames
          2. Markov property
        2. 6.3.2 Poisson process
          1. Continuous time
          2. Poisson process as the limiting case
          3. Rare events and modeling
      4. 6.4 Simulation of stochastic processes
          1. Discrete-time processes
          2. Markov chains
          3. Binomial process
          4. Continuous-time processes
          5. Poisson process
      5. Summary and conclusions
      6. Exercises
        1. Figure 6.1
        2. Figure 6.2
        3. Figure 6.3
        4. Figure 6.4
        5. Figure 6.5
        6. Figure 6.6
        7. Figure 6.7
        8. Figure 6.8
        9. Figure 6.9
        10. Figure 6.10
    2. Chapter 7 Queuing Systems
      1. 7.1 Main components of a queuing system
          1. Arrival
          2. Queuing and routing to servers
          3. Service
          4. Departure
      2. 7.2 Little’s Law
      3. 7.3 Bernoulli single-server queuing process
          1. Markov property
          2. Steady-state distribution
        1. 7.3.1 Systems with limited capacity
      4. 7.4 M/M/1 system
          1. M/M/1 as a limiting case of a Bernoulli queuing process
          2. Steady-state distribution for an M/M/1 system
        1. 7.4.1 Evaluating the system’s performance
          1. Utilization
          2. Waiting time
          3. Response time
          4. Queue
          5. Little’s Law revisited
          6. When a system gets nearly overloaded
      5. 7.5 Multiserver queuing systems
        1. 7.5.1 Bernoulli k-server queuing process
          1. Markov property
          2. Transition probabilities
        2. 7.5.2 M/M/k systems
          1. Steady-state distribution
        3. 7.5.3 Unlimited number of servers and M/M/∞
          1. M/M/∞ queuing system
      6. 7.6 Simulation of queuing systems
          1. Markov case
          2. General case
          3. Example: simulation of a multiserver queuing system
      7. Summary and conclusions
      8. Exercises
        1. Figure 7.1
        2. Figure 7.2
        3. Figure 7.3
        4. Figure 7.4
        5. Figure 7.5
  7. Part III Statistics
    1. Chapter 8 Introduction to Statistics
      1. 8.1 Population and sample, parameters and statistics
          1. Sampling and non-sampling errors
      2. 8.2 Simple descriptive statistics
        1. 8.2.1 Mean
          1. Unbiasedness
          2. Consistency
          3. Asymptotic Normality
        2. 8.2.2 Median
            1. Understanding the shape of a distribution
            2. Computation of a population median
            3. Computing sample medians
        3. 8.2.3 Quantiles, percentiles, and quartiles
        4. 8.2.4 Variance and standard deviation
          1. Computation
        5. 8.2.5 Standard errors of estimates
        6. 8.2.6 Interquartile range
          1. Detection of outliers
          2. Handling of outliers
      3. 8.3 Graphical statistics
        1. 8.3.1 Histogram
          1. How else may histograms look?
          2. Mixtures
          3. The choice of bins
        2. 8.3.2 Stem-and-leaf plot
        3. 8.3.3 Boxplot
          1. Parallel boxplots
        4. 8.3.4 Scatter plots and time plots
          1. MATLAB notes
      4. Summary and conclusions
      5. Exercises
        1. Figure 8.1
        2. Figure 8.2
        3. Figure 8.3
        4. Figure 8.4
        5. Figure 8.5
        6. Figure 8.6
        7. Figure 8.7
        8. Figure 8.8
        9. Figure 8.9
        10. Figure 8.10
        11. Figure 8.11
        12. Figure 8.12
    2. Chapter 9 Statistical Inference I
      1. 9.1 Parameter estimation
        1. 9.1.1 Method of moments
          1. Moments
          2. Estimation
        2. 9.1.2 Method of maximum likelihood
          1. Discrete case
          2. Continuous case
        3. 9.1.3 Estimation of standard errors
      2. 9.2 Confidence intervals
        1. 9.2.1 Construction of confidence intervals: a general method
        2. 9.2.2 Confidence interval for the population mean
        3. 9.2.3 Confidence interval for the difference between two means
        4. 9.2.4 Selection of a sample size
        5. 9.2.5 Estimating means with a given precision
      3. 9.3 Unknown standard deviation
        1. 9.3.1 Large samples
        2. 9.3.2 Confidence intervals for proportions
        3. 9.3.3 Estimating proportions with a given precision
        4. 9.3.4 Small samples: Student’s t distribution
        5. 9.3.5 Comparison of two populations with unknown variances
          1. Case 1. Equal variances
      4. 9.4 Hypothesis testing
        1. 9.4.1 Hypothesis and alternative
        2. 9.4.2 Type I and Type II errors: level of significance
        3. 9.4.3 Level α tests: general approach
          1. Step 1. Test statistic
          2. Step 2. Acceptance region and rejection region
          3. Step 3. Result and its interpretation
        4. 9.4.4 Rejection regions and power
        5. 9.4.5 Standard Normal null distribution (Z-test)
        6. 9.4.6 Z-tests for means and proportions
        7. 9.4.7 Pooled sample proportion
        8. 9.4.8 Unknown σ: T-tests
        9. 9.4.9 Duality: two-sided tests and two-sided confidence intervals
        10. 9.4.10 P-value
          1. How do we choose α?
          2. P-value
          3. Testing hypotheses with a P-value
          4. Computing P-values
          5. Understanding P-values
      5. 9.5 Inference about variances
        1. 9.5.1 Variance estimator and Chi-square distribution
        2. 9.5.2 Confidence interval for the population variance
        3. 9.5.3 Testing variance
          1. Level α test
          2. P-value
        4. 9.5.4 Comparison of two variances. F-distribution.
        5. 9.5.5 Confidence interval for the ratio of population variances
        6. 9.5.6 F-tests comparing two variances
          1. MATLAB notes
      6. Summary and conclusions
      7. Exercises
        1. Figure 9.1
        2. Figure 9.2
        3. Figure 9.3
        4. Figure 9.4
        5. Figure 9.5
        6. Figure 9.6
        7. Figure 9.7
        8. Figure 9.8
        9. Figure 9.9
        10. Figure 9.10
        11. Figure 9.11
        12. Figure 9.12
        13. Figure 9.13
        14. Figure 9.14
        15. Figure 9.15
        1. Table 9.1
        2. Table 9.2
        3. Table 9.3
        4. Table 9.4
        5. Table 9.5
        6. Table 9.6
    3. Chapter 10 Statistical Inference II
      1. 10.1 Chi-square tests
        1. 10.1.1 Testing a distribution
        2. 10.1.2 Testing a family of distributions
        3. 10.1.3 Testing independence
      2. 10.2 Nonparametric statistics
        1. 10.2.1 Sign test
        2. 10.2.2 Wilcoxon signed rank test
          1. Null distribution of Wilcoxon test statistic
          2. Exact distribution
          3. Normal approximation
        3. 10.2.3 Mann-Whitney-Wilcoxon rank sum test
          1. Mann-Whitney-Wilcoxon test in MATLAB
          2. Null distribution of Mann-Whitney-Wilcoxon test statistic
          3. Normal approximation
      3. 10.3 Bootstrap
        1. 10.3.1 Bootstrap distribution and all bootstrap samples
          1. Bootstrap distribution
        2. 10.3.2 Computer generated bootstrap samples
        3. 10.3.3 Bootstrap confidence intervals
          1. Parametric method, based on the bootstrap estimation of the standard error
          2. Nonparametric method, based on the bootstrap quantiles
      4. 10.4 Bayesian inference
        1. 10.4.1 Prior and posterior
          1. Conjugate distribution families
          2. Gamma family is conjugate to the Poisson model
          3. Beta family is conjugate to the Binomial model
          4. Normal family is conjugate to the Normal model
        2. 10.4.2 Bayesian estimation
        3. 10.4.3 Bayesian credible sets
        4. 10.4.4 Bayesian hypothesis testing
          1. Loss and risk
      5. Summary and conclusions
      6. Exercises
        1. Figure 10.1
        2. Figure 10.2
        3. Figure 10.3
        4. Figure 10.4
        5. Figure 10.5
        6. Figure 10.6
        7. Figure 10.7
        8. Figure 10.8
        9. Figure 10.9
        10. Figure 10.10
        11. Figure 10.11
        12. Figure 10.12
        13. Figure 10.13
        14. Figure 10.14
        1. Table 10.1
        2. Table 10.2
    4. Chapter 11 Regression
      1. 11.1 Least squares estimation
        1. 11.1.1 Examples
        2. 11.1.2 Method of least squares
        3. 11.1.3 Linear regression
          1. Estimation in linear regression
        4. 11.1.4 Regression and correlation
        5. 11.1.5 Overfitting a model
      2. 11.2 Analysis of variance, prediction, and further inference
        1. 11.2.1 ANOVA and R-square
        2. 11.2.2 Tests and confidence intervals
          1. Degrees of freedom and variance estimation
          2. Inference about the regression slope
          3. ANOVA F-test
          4. F-test and T-test
        3. 11.2.3 Prediction
          1. Confidence interval for the mean of responses
          2. Prediction interval for the individual response
          3. Prediction bands
      3. 11.3 Multivariate regression
        1. 11.3.1 Introduction and examples
        2. 11.3.2 Matrix approach and least squares estimation
          1. Matrix approach to multivariate linear regression
          2. Least squares estimates
        3. 11.3.3 Analysis of variance, tests, and prediction
          1. Testing significance of the entire model
          2. Variance estimator
          3. Testing individual slopes
          4. Prediction
      4. 11.4 Model building
        1. 11.4.1 Adjusted R-square
        2. 11.4.2 Extra sum of squares, partial F-tests, and variable selection
          1. Stepwise (forward) selection
          2. Backward elimination
        3. 11.4.3 Categorical predictors and dummy variables
          1. Avoid singularity by creating only (C − 1) dummies
          2. Interpretation of slopes for dummy variables
          3. MATLAB notes
      5. Summary and conclusions
      6. Exercises
        1. Figure 11.1
        2. Figure 11.2
        3. Figure 11.3
        4. Figure 11.4
        5. Figure 11.5
        6. Figure 11.6
        1. Table 11.1
  8. Part IV Appendix
    1. Chapter 12 Appendix
      1. 12.1 Inventory of distributions
        1. 12.1.1 Discrete families
        2. 12.1.2 Continuous families
      2. 12.2 Distribution tables
      3. 12.3 Calculus review
        1. 12.3.1 Inverse function
        2. 12.3.2 Limits and continuity
        3. 12.3.3 Sequences and series
        4. 12.3.4 Derivatives, minimum, and maximum
          1. Computing maxima and minima
        5. 12.3.5 Integrals
          1. Improper integrals
          2. Integration by substitution
          3. Integration by parts
          4. Computing areas
          5. Gamma function and factorial
      4. 12.4 Matrices and linear systems
          1. Multiplying a row by a column
          2. Multiplying matrices
          3. Transposition
          4. Solving systems of equations
          5. Inverse matrix
          6. Matrix operations in MATLAB
      5. 12.5 Answers to selected exercises
        1. Figure 12.1
        2. Figure 12.2
        3. Figure 12.3
        4. Figure 12.4
        5. Figure 12.5
        6. Figure 12.6
        7. Figure 12.7
        8. Figure 12.8
        9. Figure 12.9
        1. Table A1
        2. Table A2
        3. Table A3
        4. Table A4
        5. Table A5
        6. Table A6
        7. Table A7
        8. Table A8
        9. Table A9
        10. Table 12.1

Product information

  • Title: Probability and Statistics for Computer Scientists, 2nd Edition
  • Author(s): Michael Baron
  • Release date: August 2013
  • Publisher(s): Chapman and Hall/CRC
  • ISBN: 9781498760607