Online machine learning [electronic resource] : a practical guide with examples in Python / Eva Bartz, Thomas Bartz-Beielstein, editors.

This book deals with the exciting, seminal topic of Online Machine Learning (OML). The content is divided into three parts: the first part looks in detail at the theoretical foundations of OML, comparing it to Batch Machine Learning (BML) and discussing what criteria should be developed for a meanin...

Bibliographic Details
Online Access: Full Text (via Springer)
Other Authors: Bartz, Eva; Bartz-Beielstein, Thomas
Format: Electronic eBook
Language: English
Published: Singapore : Springer, 2024.
Series: Machine Learning: Foundations, Methodologies, and Applications.
Table of Contents:
  • Intro
  • Foreword
  • Preface
  • Contents
  • Contributors
  • 1 Introduction: From Batch to Online Machine Learning
  • 1.1 Streaming Data
  • 1.2 Disadvantages of Batch Learning
  • 1.2.1 Memory Requirements
  • 1.2.2 Drift
  • 1.2.3 New, Unknown Data
  • 1.2.4 Accessibility and Availability of the Data
  • 1.2.5 Other Problems
  • 1.3 Incremental Learning, Online Learning, and Stream Learning
  • 1.4 Transitioning Batch to Online Machine Learning
  • References
  • 2 Supervised Learning: Classification and Regression
  • 2.1 Classification
  • 2.1.1 Baseline Algorithms
  • 2.1.2 The Naive-Bayes Classifier
  • 2.1.3 Tree-Based Methods
  • 2.1.4 Other Classification Methods
  • 2.2 Regression
  • 2.2.1 Online Linear Regression
  • 2.2.2 Hoeffding Tree Regressor
  • 2.3 Ensemble Methods for OML
  • 2.4 Clustering
  • 2.5 Overview: OML Methods
  • References
  • 3 Drift Detection and Handling
  • 3.1 Architectures for Drift Detection Methods
  • 3.1.1 Adaptive Estimators
  • 3.1.2 Change Detectors
  • 3.1.3 Ensemble-Based Approaches
  • 3.2 Basic Considerations for Windowing Techniques
  • 3.3 Popular Drift Detection Methods
  • 3.3.1 Statistical Tests for Drift and Change Detection
  • 3.3.2 Control Charts
  • 3.3.3 Adaptive Windowing (ADWIN)
  • 3.3.4 Implicit Drift Detection Algorithms
  • 3.4 OML Algorithms with Drift Detection: Hoeffding-Window Trees
  • 3.4.1 Concept-Adapting Very Fast Decision Trees (CVFDT)
  • 3.4.2 Hoeffding Adaptive Trees (HAT)
  • 3.4.3 Overview: Hoeffding-Window Trees
  • 3.4.4 Overview: HT in River
  • 3.5 Drift Scaling in Online Machine Learning
  • 3.5.1 Statistical Measures in a Sequential Manner
  • 3.5.2 Adapted Scaling Techniques
  • References
  • 4 Initial Selection and Subsequent Updating of OML Models
  • 4.1 Initial Model Selection
  • 4.2 Updating and Changing the Model
  • 4.2.1 Adding New Features
  • 4.2.2 Manual Model Changes in Response to Drift
  • 4.2.3 Ensuring Model Quality After a Model Update
  • 4.3 Catastrophic Forgetting
  • 4.3.1 Strategies for Dealing with Catastrophic Forgetting
  • References
  • 5 Evaluation and Performance Measurement
  • 5.1 Data Selection Methods
  • 5.1.1 Holdout Selection
  • 5.1.2 Progressive Validation: Interleaved Test-Then-Train
  • 5.1.3 Machine Learning in Batch Mode with a Prediction Horizon
  • 5.1.4 Landmark Batch Machine Learning with a Prediction Horizon
  • 5.1.5 Window-Batch Method with Prediction Horizon
  • 5.1.6 Online Machine Learning with a Prediction Horizon
  • 5.1.7 Online Machine Learning
  • 5.2 Determining the Training and Test Data Set in the Package spotRiver
  • 5.2.1 Methods for BML and OML
  • 5.2.2 Methods for OML River
  • 5.3 Algorithm (Model) Performance
  • 5.4 Data Stream and Drift Generators
  • 5.4.1 Data Stream Generators in Sklearn
  • 5.4.2 SEA-Drift Generator
  • 5.4.3 Friedman-Drift Generator
  • 5.5 Summary
  • References
  • 6 Special Requirements for Online Machine Learning Methods
  • 6.1 Missing Data, Imputation
  • 6.2 Categorical Attributes
  • 6.3 Outlier and Anomaly Detection