Saturday, 26 September 2015

SAS Analytics Interview - Quiz

SAS is used for Reporting, Data Analytics, Business Analytics, and Predictive Modeling. Understanding how SAS works is essential for delivering the right insights and reports.
Some of the questions asked in analytics interviews are listed in the previous blog. The quiz below on SAS Interview Questions helps you test your knowledge and preparedness for an analytics job.
Interview Questions SAS

Take the quiz to check your knowledge. It is a free SAS quiz.

Friday, 25 September 2015

K Means Clustering Algorithm: Explained

Classification problems can be solved by either objective segmentation or subjective segmentation.
A non-technical explanation ( http://dni-institute.in/blogs/segmentation-a-perspective-2/ ) covers when to use a subjective segmentation technique such as K Means clustering and when to use an objective segmentation method such as Decision Tree.
One of the most frequently used unsupervised algorithms is K Means. K Means Clustering is an exploratory data analysis technique and a non-hierarchical method of grouping objects together.
Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense or another) to each other than to those in other groups (clusters).
In this blog, we aim to explain the algorithm in simple steps, with an example.
Business Scenario: We have height and weight information for a set of objects. Using these two variables, we need to group the objects so that similar ones fall into the same group.
k means clusters
If you look at the above chart, you can see two visible clusters/segments, and we want these to be identified using the K Means algorithm.
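
As a minimal sketch of the idea, the Python snippet below runs a basic K Means loop on a handful of made-up height (cm) and weight (kg) values. The data, the function names (`farthest_first_centroids`, `k_means`) and the farthest-first choice of starting centroids are our assumptions for illustration, not part of the original example; real projects would usually standardise the variables and use a library implementation.

```python
import math

# Hypothetical (height in cm, weight in kg) observations, made up for illustration.
people = [
    (150, 50), (152, 53), (155, 55), (158, 52),
    (178, 80), (180, 85), (183, 82), (185, 88),
]

def farthest_first_centroids(points, k):
    """Start from the first point, then keep adding the point farthest from the chosen ones."""
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(farthest)
    return centroids

def k_means(points, k, max_iterations=100):
    """Basic K Means: assign each point to its nearest centroid, then recompute centroids."""
    centroids = farthest_first_centroids(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: index of the nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the clusters are stable
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

labels, centroids = k_means(people, k=2)
print(labels)
print(centroids)
```

On this toy data the two groups are well separated, so the assignments stop changing on the second pass, with the first four people in one cluster and the last four in the other.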

Tuesday, 22 September 2015

Analytics Interview Process and Questions: Real Case

Typically an analytics job interview process involves multiple stages or rounds. In each of these stages, various skills are evaluated.
Some of the key skills required for Analytics Professional or Data Scientist roles are
Skills required for an analyst
  • Communication & Leadership Skills
  • Functional and Business Domain Knowledge
  • Logical and Analytical Skills
  • Technical Skills - Machine Learning & Statistical Techniques
  • Technical Skills - Tools & Technology: analytical tools (e.g. SAS, R, Python), visualization tools, databases (e.g. Hadoop, Teradata)

Friday, 18 September 2015

SVM for Regression using R

Support Vector Machine for Regression using R

Predictive Modelling problems are classified as either classification or regression problems. The Support Vector Machine (SVM) algorithm can be used for both classification and regression scenarios. In the earlier blog, we explained the SVM technique and how it works using an example.
In regression problems, the target variable is continuous, and the value of the target/decision variable is estimated using a set of independent variables.
In classification problems, the decision variable is discrete: Binary, Nominal or Ordinal. In the previous blog, we applied Support Vector Machine (SVM) to a Binary Predictive Modelling scenario using R (an open source statistical computing system).
svm for regression

Tuesday, 15 September 2015

Support Vector Machine using R

Predictive Modelling problems are classified as either classification or regression problems. Within classification, different algorithms can be used depending on the level and type of the decision variable (Target Variable). A number of statistical and machine learning techniques are available for both classification and regression problems.
Some of the commonly used techniques for classification business scenarios are:
SVM Planes
In this blog, we will focus on using Support Vector Machine for classification problems. In the previous blog, we focused on explaining the concept of the Support Vector Machine (SVM) algorithm using an example. The aim is to provide a beginner-level tutorial on SVM using R.

Friday, 11 September 2015

Decision Tree Algorithm: CHAID

There are a number of different Decision Tree building algorithms available for both Regression and Classification problems. One of the great advantages of Decision Tree algorithms is that the output can be easily explained to business users.
Some of the decision tree building algorithms are
In this blog, the focus will be to explain Chi-squared Automatic Interaction Detection (CHAID) based decision tree building.
CHAID Decision Tree
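
To give a flavour of the chi-squared part, here is a small Python sketch that computes the Pearson chi-squared statistic for two candidate predictors against a binary target and keeps the one with the stronger association. The counts and the predictor names (`gender`, `region`) are hypothetical, and the sketch leaves out the category merging and the adjusted p-values that a full CHAID implementation uses.

```python
def chi_square_statistic(table):
    """Pearson chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if predictor and target were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical counts: rows are predictor categories, columns are (responder, non-responder).
candidate_tables = {
    "gender": [[30, 70], [60, 40]],
    "region": [[45, 55], [45, 55]],
}

statistics = {name: chi_square_statistic(t) for name, t in candidate_tables.items()}

# Both tables are 2x2, so the larger statistic also means the smaller p-value.
best_predictor = max(statistics, key=statistics.get)
print(statistics, best_predictor)
```

In this made-up case `gender` would be chosen for the split, because the response rate differs between its categories while it is identical across the two `region` categories.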

Read further: How does the CHAID algorithm work? Worked-out examples of the CHAID algorithm

Saturday, 5 September 2015

Support Vector Machine: Simplified

Support Vector Machine (SVM) is one of the machine learning algorithms used mainly for supervised problem sets.
Some of the other algorithms that can be used in place of SVM are Decision Tree, Random Forest, Neural Network and Logistic Regression, specifically for Binary Classification problems.
Due to the mathematical complexity involved in the SVM algorithm, it is sometimes difficult for practitioners to understand the Support Vector Machine.
In this blog, we want to take a simple example and work out the Support Vector Machine algorithm steps. Of course, we will not be able to consider all the complexities and details, but the example should be helpful in appreciating the algorithm.
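
A minimal Python sketch of the core idea, assuming a linear boundary: among a few hand-picked separating lines, the one SVM prefers is the one with the largest margin, i.e. the largest distance to its closest point. The four points, the three candidate lines and the helper name `geometric_margin` are made up for illustration; a real SVM solves an optimisation problem rather than checking candidates.

```python
import math

# Hypothetical 2-D points with class labels +1 / -1.
points = [(3, 3), (4, 3), (1, 1), (1, 0)]
labels = [1, 1, -1, -1]

# Candidate separating lines written as w.x + b = 0, stored as (w, b).
candidates = {
    "x + y = 4": ((1, 1), -4),
    "x = 2": ((1, 0), -2),
    "y = 2": ((0, 1), -2),
}

def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any point to the line; positive only if all points are on the correct side."""
    norm = math.hypot(*w)
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )

margins = {
    name: geometric_margin(w, b, points, labels)
    for name, (w, b) in candidates.items()
}

# The maximum-margin candidate is the one an SVM would favour among these three.
best = max(margins, key=margins.get)
print(best, round(margins[best], 3))
```

All three lines separate the two classes, but `x + y = 4` sits halfway between the closest opposite points (3, 3) and (1, 1), which play the role of support vectors in this toy example.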