Practical system reliability [electronic resource] / Eric Bauer, Xuemei Zhang, Douglas A. Kimber.

Learn how to model, predict, and manage system reliability/availability throughout the development life cycle. Written by a panel of authors with a wealth of industry experience, the methods and concepts presented here give readers a solid understanding of modeling and managing system and software av...

Bibliographic Details
Online Access: Full Text (via Wiley)
Main Authors: Bauer, Eric; Zhang, Xuemei (Author); Kimber, Douglas A. (Author)
Format: Electronic eBook
Language: English
Published: Piscataway, NJ : Hoboken, N.J. : IEEE Press ; Wiley, ©2009.
Table of Contents:
  • Preface
  • Acknowledgments
  • 1 Introduction
  • 2 System Availability
  • 2.1 Availability, Service and Elements
  • 2.2 Classical View
  • 2.3 Customers' View
  • 2.4 Standards View
  • 3 Conceptual Model of Reliability and Availability
  • 3.1 Concept of Highly Available Systems
  • 3.2 Conceptual Model of System Availability
  • 3.3 Failures
  • 3.4 Outage Resolution
  • 3.5 Downtime Budgets
  • 4 Why Availability Varies Between Customers
  • 4.1 Causes of Variation in Outage Event Reporting
  • 4.2 Causes of Variation in Outage Duration
  • 5 Modeling Availability
  • 5.1 Overview of Modeling Techniques
  • 5.2 Modeling Definitions
  • 5.3 Practical Modeling
  • 5.4 Widget Example
  • 5.5 Alignment with Industry Standards
  • 6 Estimating Parameters and Availability from Field Data
  • 6.1 Self-Maintaining Customers
  • 6.2 Analyzing Field Outage Data
  • 6.3 Analyzing Performance and Alarm Data
  • 6.4 Coverage Factor and Failure Rate
  • 6.5 Uncovered Failure Recovery Time
  • 6.6 Covered Failure Detection and Recovery Time
  • 7 Estimating Input Parameters from Lab Data
  • 7.1 Hardware Failure Rate
  • 7.2 Software Failure Rate
  • 7.3 Coverage Factors
  • 7.4 Timing Parameters
  • 7.5 System-Level Parameters
  • 8 Estimating Input Parameters in the Architecture/Design Stage
  • 8.1 Hardware Parameters
  • 8.2 System-Level Parameters
  • 8.3 Sensitivity Analysis
  • 9 Prediction Accuracy
  • 9.1 How Much Field Data Is Enough?
  • 9.2 How Does One Measure Sampling and Prediction Errors?
  • 9.3 What Causes Prediction Errors?
  • 10 Connecting the Dots
  • 10.1 Set Availability Requirements
  • 10.2 Incorporate Architectural and Design Techniques
  • 10.3 Modeling to Verify Feasibility
  • 10.4 Testing
  • 10.5 Update Availability Prediction
  • 10.6 Periodic Field Validation and Model Update
  • 10.7 Building an Availability Roadmap
  • 10.8 Reliability Report
  • 11 Summary
  • Appendix A System Reliability Report Outline
  • 1 Executive Summary
  • 2 Reliability Requirements
  • 3 Unplanned Downtime Model and Results
  • Annex A Reliability Definitions
  • Annex B References
  • Annex C Markov Model State-Transition Diagrams
  • Appendix B Reliability and Availability Theory
  • 1 Reliability and Availability Definitions
  • 2 Probability Distributions in Reliability Evaluation
  • 3 Estimation of Confidence Intervals
  • Appendix C Software Reliability Growth Models
  • 1 Software Characteristic Models
  • 2 Nonhomogeneous Poisson Process Models
  • Appendix D Acronyms and Abbreviations
  • Appendix E Bibliography
  • Index
  • About the Authors