Data Analysis : A Gentle Introduction for Future Data Scientists.
This slim volume provides an approachable guide to the basic ideas and techniques of probability and statistics, and to more advanced techniques such as generalised linear models, classification using logistic regression, and support-vector machines.
Online Access: | Full Text (via EBSCO) |
---|---|
Format: | Electronic eBook |
Language: | English |
Published: | Oxford : Oxford University Press, Incorporated, 2023. |
Table of Contents:
- Cover
- Titlepage
- Copyright
- Contents
- Preface
- 1 First steps
- 1.1 Types of data
- 1.2 Sample and population
- 1.2.1 Observations and random variables
- 1.2.2 Sampling variation
- 1.3 Methods for sampling a population
- 1.3.1 The simple random sample
- 1.3.2 Cluster sampling
- 1.3.3 Stratified sampling
- 1.3.4 Systematic sampling
- 1.4 Oversampling and the use of weights
- 2 Summarizing data
- 2.1 Measures of location
- 2.1.1 The mode
- 2.1.2 The mean
- 2.1.3 The trimmed mean
- 2.1.4 The Winsorized mean
- 2.1.5 The median
- 2.2 Measures of spread
- 2.2.1 The range
- 2.2.2 The interquartile range
- 2.3 Boxplot
- 2.4 Histograms
- 2.5 Cumulative frequency diagrams
- 2.6 Step diagrams
- 2.7 The variance and standard deviation
- 2.8 Symmetric and skewed data
- 3 Probability
- 3.1 Probability
- 3.2 The rules of probability
- 3.3 Conditional probability and independence
- 3.4 The total probability theorem
- 3.5 Bayes' theorem
- 4 Probability distributions
- 4.1 Notation
- 4.2 Mean and variance of a probability distribution
- 4.3 The relation between sample and population
- 4.4 Combining means and variances
- 4.5 Discrete uniform distribution
- 4.6 Probability density function
- 4.7 The continuous uniform distribution
- 5 Estimation and confidence
- 5.1 Point estimates
- 5.1.1 Maximum likelihood estimation (mle)
- 5.2 Confidence intervals
- 5.3 Confidence interval for the population mean
- 5.3.1 The normal distribution
- 5.3.2 The Central Limit Theorem
- 5.3.3 Construction of the confidence interval
- 5.4 Confidence interval for a proportion
- 5.4.1 The binomial distribution
- 5.4.2 Confidence interval for a proportion (large sample case)
- 5.4.3 Confidence interval for a proportion (small sample)
- 5.5 Confidence bounds for other summary statistics
- 5.5.1 The bootstrap
- 5.6 Some other probability distributions
- 5.6.1 The Poisson and exponential distributions
- 5.6.2 The Weibull distribution
- 5.6.3 The chi-squared (χ²) distribution
- 6 Models, p-values, and hypotheses
- 6.1 Models
- 6.2 p-values and the null hypothesis
- 6.2.1 Two-sided or one-sided
- 6.2.2 Interpreting p-values
- 6.2.3 Comparing p-values
- 6.2.4 Link with confidence interval
- 6.3 p-values when comparing two samples
- 6.3.1 Do the two samples come from the same population?
- 6.3.2 Do the two populations have the same mean?
- 7 Comparing proportions
- 7.1 The 2 × 2 table
- 7.2 Some terminology
- 7.2.1 Odds, odds ratios, and independence
- 7.2.2 Relative risk
- 7.2.3 Sensitivity, specificity, and related quantities
- 7.3 The R × C table
- 7.3.1 Residuals
- 7.3.2 Partitioning
- 8 Relations between two continuous variables
- 8.1 Scatter diagrams
- 8.2 Correlation
- 8.2.1 Testing for independence
- 8.3 The equation of a line
- 8.4 The method of least squares
- 8.5 A random dependent variable, Y
- 8.5.1 Estimation of σ²