Data Analysis : A Gentle Introduction for Future Data Scientists.

This slim volume provides an approachable guide to the basic ideas and techniques of probability and statistics, as well as more advanced techniques such as generalised linear models, classification using logistic regression, and support-vector machines.

Bibliographic Details
Online Access:	Full Text (via EBSCO)
Main Author:	Upton, Graham
Other Authors:	Brawn, Dan
Format:	Electronic eBook
Language:	English
Published:	Oxford : Oxford University Press, Incorporated, 2023.
Subjects:	Mathematical statistics. Probabilities.

Table of Contents:

Cover
Titlepage
Copyright
Contents
Preface
1 First steps
1.1 Types of data
1.2 Sample and population
1.2.1 Observations and random variables
1.2.2 Sampling variation
1.3 Methods for sampling a population
1.3.1 The simple random sample
1.3.2 Cluster sampling
1.3.3 Stratified sampling
1.3.4 Systematic sampling
1.4 Oversampling and the use of weights
2 Summarizing data
2.1 Measures of location
2.1.1 The mode
2.1.2 The mean
2.1.3 The trimmed mean
2.1.4 The Winsorized mean
2.1.5 The median
2.2 Measures of spread
2.2.1 The range
2.2.2 The interquartile range
2.3 Boxplot
2.4 Histograms
2.5 Cumulative frequency diagrams
2.6 Step diagrams
2.7 The variance and standard deviation
2.8 Symmetric and skewed data
3 Probability
3.1 Probability
3.2 The rules of probability
3.3 Conditional probability and independence
3.4 The total probability theorem
3.5 Bayes' theorem
4 Probability distributions
4.1 Notation
4.2 Mean and variance of a probability distribution
4.3 The relation between sample and population
4.4 Combining means and variances
4.5 Discrete uniform distribution
4.6 Probability density function
4.7 The continuous uniform distribution
5 Estimation and confidence
5.1 Point estimates
5.1.1 Maximum likelihood estimation (mle)
5.2 Confidence intervals
5.3 Confidence interval for the population mean
5.3.1 The normal distribution
5.3.2 The Central Limit Theorem
5.3.3 Construction of the confidence interval
5.4 Confidence interval for a proportion
5.4.1 The binomial distribution
5.4.2 Confidence interval for a proportion (large sample case)
5.4.3 Confidence interval for a proportion (small sample)
5.5 Confidence bounds for other summary statistics
5.5.1 The bootstrap
5.6 Some other probability distributions
5.6.1 The Poisson and exponential distributions
5.6.2 The Weibull distribution
5.6.3 The chi-squared (χ²) distribution
6 Models, p-values, and hypotheses
6.1 Models
6.2 p-values and the null hypothesis
6.2.1 Two-sided or one-sided
6.2.2 Interpreting p-values
6.2.3 Comparing p-values
6.2.4 Link with confidence interval
6.3 p-values when comparing two samples
6.3.1 Do the two samples come from the same population?
6.3.2 Do the two populations have the same mean?
7 Comparing proportions
7.1 The 2 × 2 table
7.2 Some terminology
7.2.1 Odds, odds ratios, and independence
7.2.2 Relative risk
7.2.3 Sensitivity, specificity, and related quantities
7.3 The R × C table
7.3.1 Residuals
7.3.2 Partitioning
8 Relations between two continuous variables
8.1 Scatter diagrams
8.2 Correlation
8.2.1 Testing for independence
8.3 The equation of a line
8.4 The method of least squares
8.5 A random dependent variable, Y
8.5.1 Estimation of σ²

Similar Items