Mean Continuous Outcome Ratio Generalized Linear Model Log Link

We saw this material at the end of Lesson 6. But as a Latin proverb says, "Repetition is the mother of study" (Repetitio est mater studiorum). Let's look at the basic structure of GLMs again before studying a specific example of Poisson regression.

The logistic regression model is an example of a broad class of models known as generalized linear models (GLMs). GLMs also include linear regression, ANOVA, Poisson regression, etc.

There are three components to a GLM:

  • Random Component – refers to the probability distribution of the response variable (Y); e.g. the binomial distribution for Y in binary logistic regression.
  • Systematic Component – refers to the explanatory variables (X1, X2, ..., Xk) combined into a linear predictor; e.g. β0 + β1x1 + β2x2, as we have seen in logistic regression.
  • Link Function, η or g(μ) – specifies the link between the random and systematic components. It says how the expected value of the response relates to the linear predictor of the explanatory variables; e.g. η = logit(π) for logistic regression.
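As a minimal sketch of how the systematic component and the link fit together, the Python snippet below evaluates one linear predictor and maps it back to a mean under three common links. The names LINKS, linear_predictor and expected_response, and the coefficient and covariate values, are hypothetical and chosen only for illustration; the random component (the distribution of Y around that mean) is not modeled here.

```python
import math

# Each link is a pair (g, g_inverse): g maps the mean mu to the linear
# predictor eta, and g_inverse maps eta back to mu.
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "logit": (lambda mu: math.log(mu / (1.0 - mu)),
              lambda eta: 1.0 / (1.0 + math.exp(-eta))),
    "log": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
}


def linear_predictor(beta, x):
    """Systematic component: eta = beta0 + beta1*x1 + ... + betak*xk."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))


def expected_response(beta, x, link):
    """Mean of the response implied by the linear predictor and the link."""
    _, g_inverse = LINKS[link]
    return g_inverse(linear_predictor(beta, x))


beta = [0.5, 1.0, -2.0]  # hypothetical coefficients beta0, beta1, beta2
x = [1.0, 0.25]          # hypothetical covariate values x1, x2
eta = linear_predictor(beta, x)  # 0.5 + 1.0*1.0 - 2.0*0.25 = 1.0

for name in LINKS:
    print(name, expected_response(beta, x, name))
```

The same value of η gives a different mean under each link, which is why the link has to be chosen to match the range of the response: any real number for the identity link, (0, 1) for the logit, and positive values for the log.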

For a more detailed discussion refer to Agresti (2007), Ch. 3, Agresti (2002), Ch. 4 (pages 115-118, 135-132), Agresti (1996), Ch. 4, and/or McCullagh & Nelder (1989).

Simple Linear Regression

Models how the expected value (mean) of a continuous response variable depends on a set of explanatory variables.

Yi = β0 + βxi + εi

or

E(Yi ) = β0 + βxi

  • Random component: Y is the response variable and has a normal distribution; generally we assume εi ~ N(0, σ²).
  • Systematic component: X is the explanatory variable (continuous or discrete), and the systematic component is linear in the parameters: β0 + βxi.
  • Link function: Identity link, η = g(E(Yi)) = E(Yi).
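For a normal random component with the identity link, maximum likelihood estimation of β0 and β coincides with ordinary least squares. The sketch below computes those estimates in closed form; the function name fit_simple_linear and the data are hypothetical, made up only to illustrate the calculation.

```python
def fit_simple_linear(x, y):
    """Least-squares (equivalently, normal-theory ML) estimates of b0 and b1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


x = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical explanatory values
y = [2.1, 3.9, 6.2, 7.8, 10.1]  # hypothetical responses
b0, b1 = fit_simple_linear(x, y)
print(b0, b1)  # the fitted mean is E(Y) = b0 + b1 * x
```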

Binary Logistic Regression

Models how a binary response variable depends on a set of explanatory variables.

  • Random component: The distribution of Y is binomial.
  • Systematic component: The Xs are explanatory variables (can be continuous, discrete, or both) and enter linearly in the parameters: β0 + β1x1 + ... + βkxk.
  • Link function: Logit, η = logit(π) = log(π / (1 − π)).
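One consequence of the logit link is that a coefficient acts multiplicatively on the odds: raising x by one unit multiplies the odds of success by exp(β1). The sketch below checks this numerically for a single predictor; the coefficients b0 and b1 are hypothetical values, not estimates from any data set in this lesson.

```python
import math


def inv_logit(eta):
    """Inverse of the logit link: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def odds(p):
    """Odds of success for a success probability p."""
    return p / (1.0 - p)


b0, b1 = -1.0, 0.5  # hypothetical coefficients, not estimated from any data


def prob(x):
    """pi(x) implied by logit(pi) = b0 + b1 * x."""
    return inv_logit(b0 + b1 * x)


# Raising x by one unit multiplies the odds by exp(b1), whatever x is.
odds_ratio = odds(prob(3.0)) / odds(prob(2.0))
print(prob(2.0), prob(3.0), odds_ratio, math.exp(b1))
```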

Loglinear Models

Model the expected cell counts as a function of the levels of categorical variables.

  • Random component: The distribution of the counts is Poisson.
  • Systematic component: The Xs are discrete variables used in the cross-classification and enter linearly in the parameters; e.g. λ + λ_i^A + λ_j^B for the independence model in a two-way table of factors A and B.
  • Link function: Log, η = log(μ).
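To see what "linear on the log scale" means for cell counts, the sketch below fits the simplest loglinear model, independence in a two-way table, where the fitted counts are (row total × column total) / n. The function name independence_fit and the 2 × 2 table are hypothetical, used only for illustration.

```python
import math


def independence_fit(table):
    """Fitted counts mu_ij = (row total i) * (column total j) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


counts = [[10, 20], [30, 40]]  # hypothetical 2 x 2 table of counts
mu = independence_fit(counts)
log_mu = [[math.log(m) for m in row] for row in mu]

# Under independence the log fitted counts are additive in row and column
# effects, so the log odds ratio of the fitted table is zero.
log_odds_ratio = log_mu[0][0] + log_mu[1][1] - log_mu[0][1] - log_mu[1][0]
print(mu, log_odds_ratio)
```

Because log(μ_ij) splits into a row term plus a column term, any interaction contrast of the log fitted counts vanishes; adding an association term to the model is what lets the log odds ratio differ from zero.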

Logit and loglinear models are related in the sense that loglinear models are more general than logit models, and some logit models are equivalent to certain loglinear models (e.g. consider the admissions data example or the boy scout example).

  • If you have a binary response variable in the loglinear model, you can construct the logits to help with the interpretation of the loglinear model.
  • Some logit models with only categorical variables have equivalent loglinear models.

Below we consider the boy scout data and the homogeneous association model (DS, BS, DB), and see once again how this ties in with the discussion in Section B of Lesson 5.

The loglinear model is also equivalent to the Poisson regression model when all explanatory variables are discrete. For more on Poisson regression models see the next section of this lesson, Agresti (2007), Sec. 3.3, Agresti (2002), Section 4.3 (for counts), Section 9.2 (for rates), and Section 13.2 (for random effects), and Agresti (1996), Section 4.3.

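Loglinear model, with d indexing delinquency (d = 1 for delinquent, d = 2 for not delinquent), i indexing S, and j indexing B:

log(μ_dij) = λ + λ_d^D + λ_i^S + λ_j^B + λ_di^DS + λ_dj^DB + λ_ij^SB

If we focus on delinquent status, let

π_ij = Pr(Yes Delinquent | S = i, B = j)

and the logit model for a boy's delinquent status is

logit(π_ij) = log(π_ij / (1 − π_ij)) = β0 + β_i^S + β_j^B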

Compare this to model (4) in Section B of Lesson 5, where β1 and β2 are equivalent to β_i^S for the three levels of S, and β3 is equivalent to β_j^B for the two levels of B.
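The step from the loglinear model (DS, BS, DB) to the logit model can be written out explicitly. The derivation below is a sketch in notation assumed here rather than taken from the lesson: d indexes delinquency (d = 1 for delinquent, d = 2 for not), i indexes S, and j indexes B. Every term that does not involve D cancels when the two log expected counts are subtracted.

```latex
\begin{align*}
\log \mu_{dij} &= \lambda + \lambda_d^{D} + \lambda_i^{S} + \lambda_j^{B}
  + \lambda_{di}^{DS} + \lambda_{dj}^{DB} + \lambda_{ij}^{SB} \\
\operatorname{logit}(\pi_{ij}) &= \log\frac{\mu_{1ij}}{\mu_{2ij}}
  = \log \mu_{1ij} - \log \mu_{2ij} \\
&= (\lambda_1^{D} - \lambda_2^{D}) + (\lambda_{1i}^{DS} - \lambda_{2i}^{DS})
  + (\lambda_{1j}^{DB} - \lambda_{2j}^{DB}) \\
&= \beta_0 + \beta_i^{S} + \beta_j^{B}
\end{align*}
```

So the intercept of the logit model comes from the D main effect, and the S and B effects in the logit model come from the DS and DB associations in the loglinear model; the SB association drops out entirely.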

GLM Table based on Agresti (2002), pg. 118

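| Model                | Random      | Link              | Systematic  |
|----------------------|-------------|-------------------|-------------|
| Linear Regression    | Normal      | Identity          | Continuous  |
| ANOVA                | Normal      | Identity          | Categorical |
| ANCOVA               | Normal      | Identity          | Mixed       |
| Logistic Regression  | Binomial    | Logit             | Mixed       |
| Loglinear            | Poisson     | Log               | Categorical |
| Poisson Regression   | Poisson     | Log               | Mixed       |
| Multinomial response | Multinomial | Generalized Logit | Mixed       |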

Advantages of GLMs over Traditional Regression

  • We do not need to transform the response Y to have a normal distribution.
  • The choice of link is separate from the choice of random component, so we have more flexibility in modeling.
  • If the link produces additive effects, then we do not need constant variance.
  • The models are fitted via maximum likelihood estimation, so the estimators have optimal properties.
  • All the inference tools and model checking we discussed for logistic regression and loglinear models apply to other GLMs too; e.g., Wald and likelihood ratio tests, deviance, residuals, confidence intervals, overdispersion.
  • Often a single procedure in a software package, e.g. PROC GENMOD in SAS or glm() in R, fits all of these models, with options to vary the three components.
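To illustrate the idea of one fitting procedure with interchangeable components, here is a minimal sketch of iteratively reweighted least squares (IRLS), a standard way to compute GLM maximum likelihood estimates, for a single predictor. It is not the implementation behind PROC GENMOD or glm(); the names FAMILIES and fit_glm and all data values are hypothetical, and only the canonical link of each family is supported.

```python
import math

# family name: (inverse link, derivative of the link d(eta)/d(mu), variance function)
FAMILIES = {
    "normal": (lambda eta: eta, lambda mu: 1.0, lambda mu: 1.0),
    "binomial": (lambda eta: 1.0 / (1.0 + math.exp(-eta)),
                 lambda mu: 1.0 / (mu * (1.0 - mu)),
                 lambda mu: mu * (1.0 - mu)),
    "poisson": (lambda eta: math.exp(eta), lambda mu: 1.0 / mu, lambda mu: mu),
}


def fit_glm(x, y, family, iterations=50):
    """IRLS for eta_i = b0 + b1 * x_i under the chosen family (canonical link)."""
    inv_link, dlink, variance = FAMILIES[family]
    b0 = b1 = 0.0
    for _ in range(iterations):
        sw = swx = swxx = swz = swxz = 0.0
        for xi, yi in zip(x, y):
            eta = b0 + b1 * xi
            mu = inv_link(eta)
            z = eta + (yi - mu) * dlink(mu)              # working response
            w = 1.0 / (variance(mu) * dlink(mu) ** 2)    # working weight
            sw += w
            swx += w * xi
            swxx += w * xi * xi
            swz += w * z
            swxz += w * xi * z
        # Solve the 2 x 2 weighted least-squares normal equations.
        det = sw * swxx - swx * swx
        b0 = (swxx * swz - swx * swxz) / det
        b1 = (sw * swxz - swx * swz) / det
    return b0, b1


# Hypothetical data sets, one per family.
normal_fit = fit_glm([1.0, 2.0, 3.0, 4.0, 5.0], [2.1, 3.9, 6.2, 7.8, 10.1], "normal")
binomial_fit = fit_glm([0, 0, 0, 0, 1, 1, 1, 1], [0, 0, 0, 1, 0, 1, 1, 1], "binomial")
poisson_fit = fit_glm([0, 0, 1, 1], [1, 3, 4, 8], "poisson")
print(normal_fit, binomial_fit, poisson_fit)
```

Only the entry looked up in FAMILIES changes between the three calls; the fitting loop is identical. With a binary x the binomial and Poisson fits can be checked by hand, since the fitted means must equal the observed group means (proportions 1/4 and 3/4, and mean counts 2 and 6).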

But there are some limitations too:

  • The systematic component must be a linear function; i.e. it can contain only a linear predictor.
  • Responses must be independent.

There are ways around these restrictions; e.g. consider our analysis of matched data, or use NLMIXED in SAS, or consider other models and alternative software packages.

Some additional references:

  • Collett, D. (1991). Analysis of Binary Data.
  • Fey, M. (2002). Measuring a binary response's range of influence in logistic regression. American Statistician, 56, 5-9.
  • Hosmer, D.W. & Lemeshow, S. (1989). Applied Logistic Regression.
  • Fienberg, S.E. The Analysis of Cross-Classified Categorical Data. 2nd ed. Cambridge, MA.
  • McCullagh, P. & Nelder, J.A. (1989). Generalized Linear Models. 2nd ed.
  • Pregibon, D. (1981). Logistic regression diagnostics. Annals of Statistics, 9, 705-724.
  • Rice, J.C. (1994). Logistic regression: An introduction. In B. Thompson (ed.), Advances in Social Science Methodology, Vol. 3, 191-245. Greenwich, CT: JAI Press. Popular introduction.
  • SAS Institute (1995). Logistic Regression Examples Using the SAS System, Version 6.
  • Strauss, D. (1999). The many faces of logistic regression. American Statistician.

Next we will see more on Poisson regression...


Source: https://online.stat.psu.edu/stat504/lesson/beyond-logistic-regression-generalized-linear-models-glm
