Fundamentals of data science / Sanjeev J. Wagh, Manisha S. Bhende, and Anuradha D. Thakare.

Fundamentals of Data Science gives students, academicians and practitioners a complete walkthrough, from the foundational groundwork to the concepts, techniques and tools required to understand Data Science. Data Science is an umbrella term for the non-trad...

Bibliographic Details
Online Access: Full Text (via Taylor & Francis)
Main Authors: Wagh, Sanjeev (Author), Bhende, Manisha, 1977- (Author), Thakare, Anuradha, 1978- (Author)
Format: eBook
Language: English
Published: Boca Raton, FL : CRC Press, Taylor & Francis Group, 2022.
Edition: First edition.
Table of Contents:
  • Part-I Data Science Introduction
      • Importance of Data Science
          • Need for Data Science
          • What is Data Science
          • Data Science Process
          • Business Intelligence and Data Science
          • Prerequisite for Data Scientist
          • Components of Data Science
          • Tools and Skills Need
          • Summary
          • Exercise
          • References
      • Statistics and Probability
          • 2.1 Data Types
          • 2.2 Variable Types
          • 2.3 Statistics
          • 2.4 Sampling Techniques and Probability
          • 2.5 Information Gain and Entropy
          • 2.6 Probability Theory
          • 2.7 Probability Types
          • 2.8 Probability Distribution
          • 2.9 Bayes Theorem
          • 2.10 Inferential Statistics
          • 2.11 Summary
          • Exercise
          • References
      • 3. Databases for Data Science
          • 3.1 SQL-Tool for Data Science
              • 3.1.1 Basic Statistics with SQL
              • 3.1.2 Data Munging with SQL
              • 3.1.3 Filtering, Joins and Aggregation
              • 3.1.4 Window Functions and Ordered Data
              • 3.1.5 Preparing Data for Analytics Tool
          • 3.2 NoSQL for Data Science
              • 3.2.1 Why NoSQL
              • 3.2.2 Document databases for Data Science
              • 3.2.3 Wide-Column Databases for Data Science
              • 3.2.4 Graph Databases for Data Science
          • 3.3 Summary
          • Exercise
          • References
  • Part II Data Modelling and Analytics
      • Chapter 4: Data Science Methodology
          • 4.1 Analytics for Data Science
          • 4.2 Data Analytics Examples
          • 4.3 Data Analytics Life Cycle
              • 4.3.1 Data Discovery
              • 4.3.2 Data preparation
              • 4.3.3 Model Planning
              • 4.3.4 Model Building
              • 4.3.5 Communicate Results
              • 4.3.6 Operationalization
          • 4.4 Summary
          • Exercise
          • References
      • Chapter 5: Data Science Methods and Machine learning
          • 5.1 Regression Analysis
              • 5.1.1 Linear Regression
              • 5.1.2 Logistic Regression
              • 5.1.3 Multinomial Logistic Regression
              • 5.1.4 Time Series Models
          • 5.2 Machine Learning
              • 5.2.1 Decision Trees
              • 5.2.2 Naïve Bayes
              • 5.2.3 Support Vector Machines
              • 5.2.4 Nearest Neighbour learning
              • 5.2.5 Clustering
              • 5.2.6 Confusion Matrix
          • 5.3 Summary
          • Exercise
          • References
      • Chapter 6: Data Analytics and Text Mining
          • 6.1 Text Mining
              • 6.1.1 Major Text Mining Areas
          • 6.2 Text Analytics
              • 6.2.1 Text Analysis Subtasks
              • 6.2.2 Basic Text Analysis Steps
          • 6.3 Natural Language Processing
              • 6.3.1 Major Components of NLP
              • 6.3.2 Stages of NLP
              • 6.3.3 Statistical Processing of Natural Language
              • 6.3.4 Applications of NLP
          • 6.4 Summary
          • Exercise
          • References
  • Part III: Platforms for Data Science
      • Chapter 7: Data Science Tool: Python
          • Basics Of Python
          • Python libraries: Data Frame Manipulation with Pandas, Numpy
          • Data Analysis Exploration With Python
          • Time Series Data
          • Clustering with Python
          • Arch & Garch
          • Dimensionality Reduction
          • Python for Machine Learning
          • Algorithms: KNN, Decision Tree, Random Forest, SVM
          • Python IDEs for Data Science
          • Summary
          • Exercise
          • References
      • Chapter 8: Data Science Tool: R
          • 8.1 Reading and Getting Data into R
              • 8.1.1 Reading Data into R
              • 8.1.2 Writing Data into File
              • 8.1.3 Scan() function
              • 8.1.4 Built-in Datasets
          • 8.2 Ordered and Unordered Factors
          • 8.3 Arrays and Matrices
              • 8.3.1 Arrays
              • 8.3.2 Matrices
          • 8.4 Lists and Data Frames
              • 8.4.1 Lists
              • 8.4.2 Data Frames
          • 8.5 Probability Distributions
              • 8.5.1 Normal Distribution
          • 8.6 Statistical Models in R
              • 8.6.1 Model Fitting
              • 8.6.2 Marginal Effects
          • 8.7 Manipulating Objects
              • 8.7.1 Viewing Objects
              • 8.7.2 Modifying Objects
              • 8.7.3 Appending Elements
              • 8.7.4 Deleting Objects
          • 8.8 Data Distribution
              • 8.8.1 Visualizing Distributions
              • 8.8.2 Statistics in Distributions
          • 8.9 Summary
          • Exercise
          • References
      • Chapter 9: Data Science Tool: MATLAB
          • 9.1 Data Science Workflow and MATLAB
          • 9.2 Importing Data
              • 9.2.1 How Data is stored
              • 9.2.2 How MATLAB Represents Data
              • 9.2.3 MATLAB Data Types
              • 9.2.4 Automating the Import Process
          • 9.3 Visualizing and Filtering Data
              • 9.3.1 Plotting Data Contained in Tables
              • 9.3.2 Selecting Data from Tables
              • 9.3.3 Accessing and Creating Table Variables
          • 9.4 Performing Calculations
              • 9.4.1 Basic Mathematical Operations
              • 9.4.2 Using Vectors
              • 9.4.3 Using Functions
              • 9.4.4 Calculating Summary Statistics
              • 9.4.5 Correlations between Variables
              • 9.4.6 Accessing Subsets of Data
              • 9.4.7 Performing Calculations by Category
          • 9.5 Summary
          • Exercise
          • References
      • Chapter 10: GNU Octave as a Data Science Tool
          • 10.1 Vectors and Matrices
          • 10.2 Arithmetic Operations
          • 10.3 Set Operations
          • 10.4 Plotting Data
          • 10.5 Summary
          • Exercise
          • References
      • Chapter 11: Data Visualization using Tableau
          • 11.1 Introduction to Data Visualization
          • 11.2 Tableau Basics
          • 11.3 Dimensions, Measures and Descriptive Statistics
          • 11.4 Basic Charts
          • 11.5 Dashboard Design & Principles
          • 11.6 Special Chart Types
          • 11.7 Integrate Tableau with Google Sheets
          • 11.8 Summary
          • Exercise
          • References
  • Index