Online machine learning [electronic resource] : a practical guide with examples in Python / Eva Bartz, Thomas Bartz-Beielstein, editors.
This book deals with the exciting, seminal topic of Online Machine Learning (OML). The content is divided into three parts: the first part looks in detail at the theoretical foundations of OML, comparing it to Batch Machine Learning (BML) and discussing what criteria should be developed for a meanin...
Online Access: | Full Text (via Springer) |
---|---|
Other Authors: | Eva Bartz, Thomas Bartz-Beielstein |
Format: | Electronic eBook |
Language: | English |
Published: | Singapore : Springer, 2024. |
Series: | Machine Learning: Foundations, Methodologies, and Applications. |
Table of Contents:
- Intro
- Foreword
- Preface
- Contents
- Contributors
- 1 Introduction: From Batch to Online Machine Learning
- 1.1 Streaming Data
- 1.2 Disadvantages of Batch Learning
- 1.2.1 Memory Requirements
- 1.2.2 Drift
- 1.2.3 New, Unknown Data
- 1.2.4 Accessibility and Availability of the Data
- 1.2.5 Other Problems
- 1.3 Incremental Learning, Online Learning, and Stream Learning
- 1.4 Transitioning Batch to Online Machine Learning
- References
- 2 Supervised Learning: Classification and Regression
- 2.1 Classification
- 2.1.1 Baseline Algorithms
- 2.1.2 The Naive-Bayes Classifier
- 2.1.3 Tree-Based Methods
- 2.1.4 Other Classification Methods
- 2.2 Regression
- 2.2.1 Online Linear Regression
- 2.2.2 Hoeffding Tree Regressor
- 2.3 Ensemble Methods for OML
- 2.4 Clustering
- 2.5 Overview: OML Methods
- References
- 3 Drift Detection and Handling
- 3.1 Architectures for Drift Detection Methods
- 3.1.1 Adaptive Estimators
- 3.1.2 Change Detectors
- 3.1.3 Ensemble-Based Approaches
- 3.2 Basic Considerations for Windowing Techniques
- 3.3 Popular Drift Detection Methods
- 3.3.1 Statistical Tests for Drift and Change Detection
- 3.3.2 Control Charts
- 3.3.3 Adaptive Windowing (ADWIN)
- 3.3.4 Implicit Drift Detection Algorithms
- 3.4 OML Algorithms with Drift Detection: Hoeffding-Window Trees
- 3.4.1 Concept-Adapting Very Fast Decision Trees (CVFDT)
- 3.4.2 Hoeffding Adaptive Trees (HAT)
- 3.4.3 Overview: Hoeffding-Window Trees
- 3.4.4 Overview: HT in River
- 3.5 Drift Scaling in Online Machine Learning
- 3.5.1 Statistical Measures in a Sequential Manner
- 3.5.2 Adapted Scaling Techniques
- References
- 4 Initial Selection and Subsequent Updating of OML Models
- 4.1 Initial Model Selection
- 4.2 Updating and Changing the Model
- 4.2.1 Adding New Features
- 4.2.2 Manual Model Changes in Response to Drift
- 4.2.3 Ensuring Model Quality After a Model Update
- 4.3 Catastrophic Forgetting
- 4.3.1 Strategies for Dealing with Catastrophic Forgetting
- References
- 5 Evaluation and Performance Measurement
- 5.1 Data Selection Methods
- 5.1.1 Holdout Selection
- 5.1.2 Progressive Validation: Interleaved Test-Then-Train
- 5.1.3 Machine Learning in Batch Mode with a Prediction Horizon
- 5.1.4 Landmark Batch Machine Learning with a Prediction Horizon
- 5.1.5 Window-Batch Method with Prediction Horizon
- 5.1.6 Online-Machine Learning with a Prediction Horizon
- 5.1.7 Online-Machine Learning
- 5.2 Determining the Training and Test Data Set in the Package spotRiver
- 5.2.1 Methods for BML and OML
- 5.2.2 Methods for OML River
- 5.3 Algorithm (Model) Performance
- 5.4 Data Stream and Drift Generators
- 5.4.1 Data Stream Generators in Sklearn
- 5.4.2 SEA-Drift Generator
- 5.4.3 Friedman-Drift Generator
- 5.5 Summary
- References
- 6 Special Requirements for Online Machine Learning Methods
- 6.1 Missing Data, Imputation
- 6.2 Categorical Attributes
- 6.3 Outlier and Anomaly Detection