

My guiding vision for the book was simple: you do not need years of culinary schooling to prepare a great meal. All you need is a fantastic recipe. I have tried to pack a great deal of practical usefulness into small recipes that you can work through on a Sunday afternoon. Take a look through the menu below and choose your adventure!

Table of Contents for Volume 2: Practice

Note #1: You will notice that most of the table of contents is not yet hyperlinked. This is because I am still working on those posts! I am adding new content every week, so please check back soon.

Note #2: Articles that include downloadable content are marked with a small graphic indicating the file type: a downloadable Excel file, a SQL file, an R script, etc.

2.1. Your Predictive Modeling Environment
> 2.1.1 The Machine
>> Computer specifications
> 2.1.2 The Software
> 2.1.3 Organizing your workspace
>> Folder organization
2.2. Data Preparation & Exploration
> 2.2.1 Missing Data
2.3. Finding the right predictive variables
> 2.3.1 Linear Least Median Squares Regression
> 2.3.2 Discriminant Analysis
> 2.3.3 Linear Discriminants
> 2.3.4 Quadratic Discriminants
> 2.3.5 Logistic Discriminants
2.4. Making a numeric prediction
> 2.4.1 Regression Analysis
>> Linear Least Mean Squares Regression
>> Linear Least Median Squares Regression
>> Robust Regression
>> Logistic Regression
>> Probabilistic Regression
>> Generalized Linear Model (GLM)
>> Generalized Additive Model (GAM)
>> Multivariate Adaptive Regression Splines
>> PACE Regression
>> Isotonic Regression
>> Projection Pursuit Regression
>> Gaussian Process Regression
> 2.4.2 Neural Networks
An algorithm that can be trained on data to identify past patterns and apply them to future data.
>> Inspired by Our Brain
>> Types of Neural Networks
>> The Multi-Layer Perceptron
>> Voted Perceptron
>> Radial Basis Functions
>> Vector Quantization
> 2.4.3 Stochastic Machines
>> Support Vector Machine
>>> Sequential Minimal Optimization
>> Boltzmann Machine
>> Simulated Annealing
>> Genetic Algorithms
>> Matrix Factorization Method
> 2.4.4 Time-Series Methods
2.5. Making a categorical prediction
> 2.5.1 Lazy Classifiers
>> Nearest Neighbor
>>> k-Nearest Neighbor
>> K* Algorithm
>> Bayesian Rules Classifier
>> Locally Weighted Learning
> 2.5.2 Kernel Methods
>> Kernel Density Estimation
> 2.5.3 Classification Tree Algorithms
>> Rule Tree
>> Naive Bayes Tree
>> Decision Stump
>> Random Tree
>> Random Forest
>> C4.5 or J4.8
>> ID3
>> M5P
>> Alternating Decision Tree
> 2.5.4 Bayesian Classifiers
>> Averaged, One-Dependence Estimators
>> BayesNet
>> Complement Naive Bayes
>> Naive Bayes
>> Naive Bayes Multinomial
>> Naive Bayes Multinomial Updateable
>> Hidden Naive Bayes
>> DBNBText
>> AODEsr (Subsumption Resolution)
> 2.5.5 Rule Based Algorithms
>> Decision Table
>> OneR
>> ZeroR
>> Conjunctive Rule
>> M5Rules
>> Ordinal Learning Method
>> Fuzzy Unordered Rule Induction
2.6. Unsupervised Learning
2.7. Exploring Complexity
> 2.7.1 Complexity Science
>> Cellular Automata
>> Complex Adaptive Systems
2.8. Measuring Performance
> 2.8.1 Error Types
> 2.8.2 Loss Functions
> 2.8.3 Performance Metrics
>> Metric Selection
> 2.8.4 Validation
>> Split Sampling
>> Cross-Validation
>> Bootstrapping
> 2.8.5 Estimation Error Measurement
>> R-Square
>> Weighted R-Square
>> Adjusted R-Square
>> Absolute Error
>> Prediction Error
>> Correlation Coefficient
> 2.8.6 Classification Error Measurement
>> Confusion Matrix
>> Sensitivity & Specificity
>> Precision & Accuracy
>> Entropy
>> Kappa Statistic
> 2.8.7 Visualizing Performance
>> Lift Charts
>> ROC Curves
>> Lorenz Curves & Gini Coefficient
2.9. Putting it all together
> 2.9.1 Ensemble Models
> 2.9.2 Crazy, Great Model!
2.10. Relative Performance of Algorithms

Recent posts under: Practice

Since most of the table of contents is not yet hyperlinked, the most recent posts are listed below for easier access.

Your Anaconda install includes Python. Please follow this link to install Anaconda and run a test ordinary linear least-squares regression using Python. Link: Install Anaconda & run OLS using Python. Go back to Volume 2: Practice.
In this post you will: Install Anaconda [~10 minutes] Get started with Anaconda Navigator [~3 minutes] Run Ordinary Linear Least Squares (OLS) Regression with Python [~3 minutes] Anaconda is one of the fastest ways to install Python, R, Jupyter, etc. It is essentially a software manager that will install and[...]
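The OLS fit itself, separate from the Anaconda tooling the post installs, can be sketched in plain Python using the closed-form solution for a single predictor. This is an illustrative sketch, not the notebook from the post, and the data points are made up:

```python
# Minimal ordinary least-squares fit for one predictor,
# using the closed-form solution: slope = cov(x, y) / var(x).

def ols_fit(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Example: points lying exactly on y = 2x + 1
b0, b1 = ols_fit([0, 1, 2, 3], [1, 3, 5, 7])  # b0 = 1.0, b1 = 2.0
```

In practice a library routine (such as the one used in the linked post) does this for you and also reports standard errors and fit diagnostics, but the arithmetic underneath is exactly this.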
In this post you will: Install Julia [~10 minutes] Instantiate the Julia kernel from within a Jupyter notebook (using Anaconda Navigator + Jupyter Lab) [~3 minutes] Run Ordinary Least Squares Regression [~5 minutes] Prerequisites: The following posts may be good to review before this one: Installing Anaconda Julia[...]
In order to practice predictive modeling you need a great way to work with the data. Microsoft SQL (MS SQL) is an excellent choice of data management system in my experience. The great news is that everything you need to get up and running with MS SQL is available for free![...]
The following video shows how you can install the AiXQL database & associated scripts. The AiXQL database is the central repository containing all AiXQL stored procedures. Over time, I aim to include original algorithms as well as heuristic adaptations of other common algorithms, written in SQL, for my readers[...]
Neural Networks are a very important class of machine learning algorithms. These days machine learning or artificial intelligence is fairly mainstream (e.g. intelligent cars, facial recognition, etc.). Whenever you hear about artificial intelligence in the media, there is a good chance that neural networks are behind that intelligence. Neural Nets[...]
Prerequisites In order to get the most out of the post below, please check out the following blog posts before proceeding: 1. Theory: The Multi-Layer Perceptron This is an exciting post, because in this one we get to interact with a neural network! There is a download link to an[...]
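The interaction described in that post happens through a downloadable file, but the forward pass of a small multi-layer perceptron can be sketched in a few lines of Python. The weights below are hand-picked for illustration (they approximate the XOR function); they are not taken from the post's download, and a real network would learn its own:

```python
import math

# Forward pass of a tiny multi-layer perceptron:
# 2 inputs -> 2 hidden sigmoid units -> 1 sigmoid output.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def mlp_forward(x, hidden_w, hidden_b, out_w, out_b):
    # Each hidden unit: weighted sum of inputs plus bias, squashed by sigmoid.
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(hidden_w, hidden_b)]
    # Output unit: weighted sum of hidden activations plus bias.
    return sigmoid(sum(w * h for w, h in zip(out_w, hidden)) + out_b)

# Illustrative weights approximating XOR
hidden_w = [[20, 20], [-20, -20]]
hidden_b = [-10, 30]
out_w = [20, 20]
out_b = -30

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    y = mlp_forward(x, hidden_w, hidden_b, out_w, out_b)
    print(x, round(y, 3))  # close to 0, 1, 1, 0 respectively
```

XOR is the classic example here because a single perceptron cannot represent it; one hidden layer is enough.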
There are different models of machine learning, and an important one is supervised learning. This model requires that we have input data as well as the corresponding output data. The output data acts as a "supervisor", comparing the output of the algorithm (i.e. a prediction, or Ŷ) to the actual value from the data (i.e. Y)[...]
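The "supervisor" comparison described above is usually expressed as a loss function over the predictions Ŷ and the actual values Y. A minimal sketch, using mean squared error as one common (but not the only) choice, with made-up numbers:

```python
# Supervised learning's "supervisor" step: score the predictions
# (y_hat) against the known target values (y) with a loss function.

def mean_squared_error(y, y_hat):
    """Average of the squared differences between actual and predicted."""
    return sum((a - p) ** 2 for a, p in zip(y, y_hat)) / len(y)

actual    = [3.0, 5.0, 7.0]   # Y: known outcomes
predicted = [2.5, 5.0, 8.0]   # Y-hat: the algorithm's outputs

loss = mean_squared_error(actual, predicted)  # (0.25 + 0 + 1) / 3
```

Training then amounts to adjusting the algorithm's parameters so that this loss shrinks on the data the supervisor can check.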
Neural networks are patterned after our brain. So, understanding neural networks should be just as easy as understanding ourselves! And just as hard. There is a feedback loop in the work done in the field of artificial intelligence. The inspiration for Artificial Neural Networks (ANN) is the mammalian brain. And through[...]
In this post you will: Install RStudio (~5 minutes) Run Ordinary Linear Least Squares (OLS) Regression with R (~5 minutes) Prerequisites: The following posts may be good to review before this one: Installing Anaconda Step 1: Install RStudio RStudio can be installed from within Anaconda. See the video below for a quick[...]