Book : Predictive Modeler

Home » Book

Welcome to my ebook Predictive Modeling – Principles & Practice.

My vision for the book is simple. One does not need to go through years of culinary schooling in order to prepare a great meal. All you need is a great recipe. I have tried to pack a lot of practical usefulness in powerful little recipes that you can execute quickly and easily. Take a look through the menu below, and choose your adventure!

Note: You will notice that most of the table of contents is not yet hyperlinked. This is because I am still working on those posts! I am adding new content every week. You can subscribe to get notified of new posts. Articles that include downloadable content are indicated with a little graphic. For example, indicates that the article includes a downloadable excel file, or a SQL file , or an R script , TensorFlow script etc.

1. Read.Me

> 1.1 Introduction
> 1.2 Learning Machine

2. Setting up your Predictive Modeling environment

> 2.1 The Hardware

>> 2.1.1 Computer specifications

> 2.2 The Software

>> 2.2.1 Anaconda

>>> 2.2.1.1 Installing Anaconda

>>> 2.2.1.2 Updating Anaconda

>> 2.2.2 Installing Julia Language

>> 2.2.3 Installing Python

>> 2.2.4 Installing R & RStudio

>> 2.2.5 MS SQL Server & Management Studio
>> 2.2.6 AiXQL

>>> 2.2.6.1 Download AiXQL

>>> 2.2.6.2 Installing AiXQL

>>> 2.2.6.3 Request Key

>> 2.2.7 Installing TensorFlow

>> 2.2.8 Installing PyTorch

>> 2.2.9 Installing TBD

3. Making a Numeric Prediction

> 3.1 Regression Analysis

>> 3.1.1 Ordinary Least Squares (OLS)

>>> 3.1.1.1 Python OLS: Basic Example

>>> 3.1.1.2 R OLS: Basic Example

>>> 3.1.1.3 Julia OLS: Basic Example

>>> 3.1.1.4 Python OLS: Advanced Case Study: Industrial data
>>> 3.1.1.5 Python OLS: Advanced Case Study: Very Large Data
>>> 3.1.1.6 Python OLS: Boston Housing Prices Data

>> 3.1.2 Non-Negative Least Squares

>> 3.1.3 Logistic Regression
>> 3.1.4 Stochastic Gradient Descent (SGD) Regression

>> 3.1.5 Stepwise Regression

>>> 3.1.5.1 R: Stepwise Regression

>>> 3.1.5.2 Least Angle Regression (LAR)

>> 3.1.6 Generalized Linear Model (GLM)
>> 3.1.7 Generalized Additive Model (GAM)
>> 3.1.8 Ridge Regression

>>> 3.1.8.1 Python Ridge: Boston Housing Prices Data

>> 3.1.9 Isotonic Regression
>> 3.1.10 Lasso Regression

>>> 3.1.10.1 Python Lasso: Boston Housing Prices Data

>> 3.1.11 Support Vector Regression
>> 3.1.12 Robust Regression
>> 3.1.13 ElasticNet Regression

>>> 3.1.13.1 Python ElasticNet: Boston Housing Prices Data

>> 3.1.14 Symbolic Regression

>>> 3.1.14.1 Python Symbolic Regression: A Basic Example

> 3.2 Neural Networks

>> 3.2.1 Inspired from Our Brain
>> 3.2.2 Types of Neural Networks
>> 3.2.3 The Multi-Layer Perceptron

>>> 3.2.3.1 Theory: The Multi-Layer Perceptron
>>> 3.2.3.2 Example in MS Excel

>> 3.2.4 TensorFlow

>>> 3.2.4.1 What is TensorFlow?

>>> 3.2.4.2 Tutorial: Beginner

>>> 3.2.4.3 Tutorial: Regression

>>> 3.2.4.4 TensorFlow: Boston Housing Prices Data

>>> 3.2.4.5 TensorFlow: XOR Problem

>> 3.2.5 Vector Quantization
>> 3.2.6 Python: Neural Networks

>>> 3.2.6.1 A Basic Example
>>> 3.2.6.2 Advanced: Case Study #1: Industrial Data
>>> 3.2.6.3 Advanced: Case Study #2: Large Number of Variables

>> 3.2.7 R: Neural Networks
>> 3.2.8 Julia: Neural Networks
>> 3.2.9 AiXQL: Neural Networks

> 3.3 Stochastic Machines

>> 3.3.1 Support Vector Machine
>> 3.3.2 Boltzmann Machine
>> 3.3.3 Simulated Annealing
>> 3.3.4 Genetic Algorithms
>> 3.3.5 Matrix Factorization Method

> 3.4 Time-Series Methods

>> 3.4.1 Autoregression (AR)
>> 3.4.2 Moving Average (MA)
>> 3.4.3 Autoregressive Moving Average (ARMA)
>> 3.4.4 Autoregressive Moving Integrated Average (ARIMA)
>> 3.4.5 Seasonal Autoregressive Moving Integrated Average (SARIMA)

4. Making a prediction about class or category

> 4.1 SGD Classifier
> 4.2 Linear SVC Method
> 4.3 Lazy Classifiers

>> 4.3.1 Nearest Neighbor

>>> 4.3.1.1 k-Nearest Neighbor

>> 4.3.2 K* Algorithm
>> 4.3.3 Bayesian Rules Classifier
>> 4.3.4 Locally Weighted Learning

> 4.4 Kernel Methods

>> 4.4.1 Kernel Density Estimation

> 4.5 Classification Tree Algorithms

>> 4.5.1 Decision Tree

>>> 4.5.1.1 Basic Example: Iris dataset

>> 4.5.2 Naive Bayes Tree
>> 4.5.3 CART
>> 4.5.4 CHAID
>> 4.5.5 Decision Stump
>> 4.5.6 Random Forest

>>> 4.5.6.1 Basic Example: Iris dataset

>> 4.5.7 C4.5 or J4.8
>> 4.5.8 AdaBoost

>>> 4.5.8.1 Basic Example: Iris dataset

>> 4.5.9 Gradient Boosted Tree

>>> 4.5.9.1 Basic Example: Iris dataset

>> 4.5.10 Alternating Decision Tree

> 4.6 Bayesian Classifiers

>> 4.6.1 Averaged, One-Dependence Estimators
>> 4.6.2 BayesNet
>> 4.6.3 Complement Naive Bayes
>> 4.6.4 Naive Bayes
>> 4.6.5 Hidden Naive Bayes
>> 4.6.6 DBNBText
>> 4.6.7 AODEsr (Subsumption Resolution)
>> 4.6.8 WAODE

> 4.7 Rule Based Algorithms

>> 4.7.1 Decision Table
>> 4.7.2 OneR
>> 4.7.3 ZeroR
>> 4.7.4 Conjunctive Rule
>> 4.7.5 PART
>> 4.7.6 NNGE
>> 4.7.7 PRISM
>> 4.7.8 M5Rules
>> 4.7.9 RIDOR
>> 4.7.10 JRIP
>> 4.7.11 Ordinal Learning Method
>> 4.7.12 Fuzzy Unordered Rule Induction

5. Unsupervised Learning

> 5.1 Nearest Neighbor

>> 5.1.1 k-Nearest Neighbor

> 5.2 K* Algorithm
> 5.3 Bayesian Rules Classifier
> 5.4 Locally Weighted Learning
> 5.5 Self-Organizing Maps

6. Exploring Complexity

> 6.1 Cellular Automata
> 6.2 Complex Adaptive Systems
> 6.3 Agent-based modeling

7. Measuring Performance

> 7.1 Error Types
> 7.2 Loss Functions
> 7.3 Performance Metrics

>> 7.3.1 Metric Selection

> 7.4 Validation

>> 7.4.1 Split Sampling
>> 7.4.2 Cross-Validation
>> 7.4.3 Bootstrapping

> 7.5 Estimation Error Measurement

>> 7.5.1 R-Square
>> 7.5.2 Weighted R-Square
>> 7.5.3 Adjusted R-Square
>> 7.5.4 Absolute Error
>> 7.5.5 Prediction Error
>> 7.5.6 RMSE
>> 7.5.7 Correlation Coefficient

> 7.6 Classification Error Measurement

>> 7.6.1 Confusion Matrix
>> 7.6.2 Sensitivity & Specificity
>> 7.6.3 Precision & Accuracy
>> 7.6.4 Entropy
>> 7.6.5 Kappa Statistic

> 7.7 Visualizing Performance

>> 7.7.1 Lift Charts
>> 7.7.2 ROC Curves
>> 7.7.3 Lorenz Curves & Gini Coefficient

8. Automated Predictive Modeling

> 8.1 AutoML
> 8.2 Microsoft Azure AutoML
> 8.3 LazyPredict

9. Practical Matters

> 9.1 Blueprinting & Prototyping
> 9.2 Managing Expectation
> 9.3 Communication
> 9.4 Documentation

>> 9.4.1 Excel Documentation
>> 9.4.2 SQL Documentation
>> 9.4.3 Project Documentation
>> 9.4.4 Notes & Assumptions

> 9.5 Monitoring & Maintenance
> 9.6 Folder Organization

10. Responsibility & Ethics

> 10.1 With great power…

11. Interesting Applications

> 11.1 Charting FitBit weight data

Book

Table of Contents

Recent Posts