Data Science training in Hyderabad


AGENDA

Advanced Regression

•  Poisson Regression
•  Negative Binomial
•  Zero Inflated
•  Multinomial Regression

© 2013 ExcelR Solutions. All Rights Reserved

Multinomial Regression

•  Logistic regression (Binomial distribution) is used when the output has 2 categories
•  Multinomial regression (a classification model) is used when the output has more than 2 categories
•  It is an extension of logistic regression
•  The categories have no natural ordering

Mode of transport   Car    Carpool   Bus    Rail   All modes
Count               218    32        81     122    453
Probability         0.48   0.07      0.18   0.27

•  The response variable has more than 2 categories, hence we apply a multilogit model
•  Goal: understand the impact of cost & time on the choice among the various modes of transport
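The probabilities in the table are simply each mode's share of the 453 observations. A minimal Python sketch of that arithmetic, using only the counts shown in the slide (the variable names are illustrative):

```python
# Counts per mode of transport, taken from the slide's table.
counts = {"Car": 218, "Carpool": 32, "Bus": 81, "Rail": 122}

# Total number of observations across all modes.
total = sum(counts.values())

# Empirical probability of each mode = count / total.
shares = {mode: n / total for mode, n in counts.items()}

for mode, p in shares.items():
    print(f"{mode:8s} {p:.2f}")
```

Rounded to two decimals, these shares match the Probability row of the table.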


Multinomial Regression

•  Whether it is ‘Y’ (response) or ‘X’ (predictor), a categorical variable with ‘s’ categories is handled as follows:
✓  The level lowest in numerical / lexicographical value is chosen as the baseline / reference
✓  The level missing from the output is the baseline level
✓  We can choose a baseline level of our choice using the ‘relevel’ function in R
✓  The model formulates the relationship between the transformed (logit) Y & numerical X linearly
✓  Modeling quantitative variables linearly might not always be correct
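The baseline rule can be illustrated with a small Python sketch. The helper `relevel` below is hypothetical: it only mimics the idea of moving a chosen level to the reference position and is not R's implementation. The level names are the transport modes from the slides' example.

```python
def relevel(levels, ref):
    """Return the levels with `ref` moved to the front (the baseline position).

    Hypothetical helper for illustration only; it is not R's relevel().
    """
    if ref not in levels:
        raise ValueError(f"unknown level: {ref!r}")
    return [ref] + [lvl for lvl in levels if lvl != ref]


modes = ["car", "carpool", "bus", "rail"]

# Default rule: the lexicographically lowest level becomes the baseline.
default_levels = sorted(set(modes))
default_baseline = default_levels[0]

# Override the default so that 'car' is the baseline instead.
releveled = relevel(default_levels, "car")
chosen_baseline = releveled[0]

print(default_baseline, chosen_baseline)
```

Under the lexicographic default the baseline would be 'bus'; releveling is what makes a different level, here 'car', the reference.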


Multinomial Regression - Output

Iteration History:
•  An iterative procedure is used to compute the maximum likelihood estimates
•  The number of iterations & the convergence status are reported
•  -2logL = 2 * negative log likelihood
•  Differences in -2logL between nested models approximately follow a χ2 distribution, which is used for hypothesis testing of goodness of fit

# parameters = 27
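To make -2logL concrete, here is a sketch that computes it for the simplest possible model: an intercept-only multinomial model, whose maximum likelihood probabilities are just the observed shares of each mode. This is an illustration under that assumption, not the 27-parameter model from the output; the counts come from the mode-of-transport table.

```python
import math

# Observed counts per mode (from the mode-of-transport table).
counts = {"Car": 218, "Carpool": 32, "Bus": 81, "Rail": 122}
total = sum(counts.values())

# For an intercept-only model the fitted probability of each mode is its
# observed share, so the log likelihood is sum(n_k * log(n_k / total)).
log_lik_null = sum(n * math.log(n / total) for n in counts.values())

# -2logL is twice the negative log likelihood.
neg2_log_lik_null = -2.0 * log_lik_null

print(f"log L  = {log_lik_null:.2f}")
print(f"-2logL = {neg2_log_lik_null:.2f}")
```

A model with predictors would be compared against this value: the drop in -2logL is the test statistic for the added parameters.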


Multinomial Regression - Output

•  ‘car’ has been chosen as the baseline
•  x = vector representing the values of all inputs
   log( P(choice = carpool | x) / P(choice = car | x) ) = β20 + β21 * cost.car + β22 * cost.carpool + …………….
   This equation models the log odds of carpool relative to car
•  The regression coefficient 0.636 indicates that for a 1-unit increase in ‘cost.car’, the log odds of ‘carpool’ to ‘car’ increase by 0.636, holding the other inputs fixed
•  The intercept value does not mean anything in this context
•  If we also have a categorical X, say Gender (female = 0, male = 1), then its regression coefficient (say 0.22) indicates that, relative to females, males have log odds of ‘carpool’ to ‘car’ that are higher by 0.22
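Each non-baseline mode has its own log-odds equation against ‘car’, and together they determine the choice probabilities. The sketch below shows that conversion. The log-odds values in `eta` are made up for illustration and are not taken from the fitted model; only the 0.636 coefficient for cost.car comes from the slide.

```python
import math


def multinomial_probs(log_odds_vs_baseline, baseline="car"):
    """Turn log odds against a baseline into choice probabilities."""
    # The baseline's own log odds against itself is 0 by construction.
    scores = {baseline: 0.0, **log_odds_vs_baseline}
    denom = sum(math.exp(s) for s in scores.values())
    return {mode: math.exp(s) / denom for mode, s in scores.items()}


# Hypothetical log odds of each mode relative to 'car' at some input x.
eta = {"carpool": -1.9, "bus": -1.0, "rail": -0.6}
probs = multinomial_probs(eta)

# Raise cost.car by 1 unit: the carpool-vs-car log odds go up by 0.636
# (the other equations are left unchanged here purely for simplicity).
eta_after = {**eta, "carpool": eta["carpool"] + 0.636}
probs_after = multinomial_probs(eta_after)

shift = (math.log(probs_after["carpool"] / probs_after["car"])
         - math.log(probs["carpool"] / probs["car"]))
print(f"change in log odds (carpool vs car): {shift:.3f}")
```

The probabilities always sum to 1, and the log of the ratio of any mode's probability to the baseline's recovers that mode's linear predictor.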


Probability
•  Let p = p(x | A) be the probability of an event x (say attrition) under condition A (say gender = female)

Odds
•  Then p(x | A) ÷ (1 - p(x | A)) is called the odds associated with the event

Odds Ratio
•  If there are two conditions A (gender = female) & B (gender = male), then the ratio [p(x | A) ÷ (1 - p(x | A))] / [p(x | B) ÷ (1 - p(x | B))] is called the odds ratio of A with respect to B

Relative Risk
•  p(x | A) ÷ p(x | B) is called the relative risk

https://en.wikipedia.org/wiki/Relative_risk
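The three quantities are easy to confuse, so here is a small numeric sketch. The probabilities `p_a` and `p_b` are invented for illustration (they are not from the course data) and stand for the probability of the event under conditions A and B.

```python
def odds(p):
    """Odds of an event that has probability p."""
    return p / (1.0 - p)


# Hypothetical event probabilities under condition A and condition B.
p_a = 0.20
p_b = 0.10

odds_a = odds(p_a)
odds_b = odds(p_b)

# Odds ratio of A with respect to B.
odds_ratio = odds_a / odds_b

# Relative risk of A with respect to B.
relative_risk = p_a / p_b

print(f"odds A = {odds_a:.4f}, odds B = {odds_b:.4f}")
print(f"odds ratio = {odds_ratio:.2f}, relative risk = {relative_risk:.2f}")
```

With these numbers the odds ratio and the relative risk differ, which shows they are not interchangeable; they get closer only when the event is rare under both conditions.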


Odds Ratio
•  The odds ratio is computed from the coefficients of the linear model equation by simply exponentiating them
•  Exponentiated regression coefficients are odds ratios for a unit change in a predictor variable
•  The odds ratio for a unit increase in cost.car is 1.88 for choosing carpool vs car
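The exponentiation step can be checked directly. This sketch uses the 0.636 coefficient quoted earlier for cost.car; exponentiating that rounded value gives roughly 1.89, so the slide's 1.88 may reflect rounding or truncation of the coefficient or the result.

```python
import math

# Coefficient of cost.car in the carpool-vs-car equation (from the slides).
beta_cost_car = 0.636

# Exponentiating a log-odds coefficient gives the odds ratio per unit change.
odds_ratio = math.exp(beta_cost_car)

# Equivalent reading: percentage change in the odds per unit increase.
pct_change_in_odds = (odds_ratio - 1.0) * 100.0

print(f"odds ratio = {odds_ratio:.3f}")
print(f"odds change per unit of cost.car = {pct_change_in_odds:.1f}%")
```

In words: each extra unit of cost.car multiplies the odds of choosing carpool over car by this factor, with the other inputs held fixed.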


Goodness of fit

Linear                    GLM
Analysis of Variance      Analysis of Deviance
Residual Sum of Squares   Residual Deviance
OLS                       Maximum Likelihood

•  Residual Deviance is -2 log L
•  Adding more parameters to the model will reduce the Residual Deviance even if they are not going to be useful for prediction
•  In order to control this, a penalty of “2 * number of parameters” is added to the Residual Deviance
•  This penalized value of -2 log L is called the AIC criterion
•  AIC = -2 log L + 2 * number of parameters

Note: “Multilogit Model with Interaction”
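The AIC formula above can be sketched in a few lines. The deviance values below are hypothetical and chosen only to show the penalty at work; the parameter count of 27 is the one reported in the model output, and the 39-parameter model is an invented comparison.

```python
def aic(neg2_log_lik, n_params):
    """AIC = -2 log L + 2 * number of parameters."""
    return neg2_log_lik + 2 * n_params


# Hypothetical residual deviances (-2 log L) for two candidate models.
aic_small = aic(neg2_log_lik=1000.0, n_params=27)
aic_large = aic(neg2_log_lik=990.0, n_params=39)

# The larger model has the lower deviance, but the penalty outweighs the gain.
preferred = "small" if aic_small < aic_large else "large"

print(aic_small, aic_large, preferred)
```

Here the extra 12 parameters cut the deviance by only 10 while adding a penalty of 24, so the smaller model has the lower AIC and would be preferred.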
