gam: Fitting Generalized Additive Models in gam: Generalized Additive Models (2024)

gam R Documentation

Fitting Generalized Additive Models

Description

gam is used to fit generalized additive models, specified by giving a symbolic description of the additive predictor and a description of the error distribution. gam uses the backfitting algorithm to combine different smoothing or fitting methods. The methods currently supported are local regression and smoothing splines.

Usage

gam(formula, family = gaussian, data, weights, subset, na.action,
    start = NULL, etastart, mustart, control = gam.control(...),
    model = TRUE, method = "glm.fit", x = FALSE, y = TRUE, ...)

gam.fit(x, y, smooth.frame, weights = rep(1, nobs), start = NULL,
        etastart = NULL, mustart = NULL, offset = rep(0, nobs),
        family = gaussian(), control = gam.control())

Arguments

formula

a formula expression as for other regression models, of the form response ~ predictors. See the documentation of lm and formula for details. Built-in nonparametric smoothing terms are indicated by s for smoothing splines or lo for loess smooth terms. See the documentation for s and lo for their arguments. Additional smoothers can be added by creating the appropriate interface functions. Interactions with nonparametric smooth terms are not fully supported, but will not produce errors; they will simply produce the usual parametric interaction.

family

a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function, or the result of a call to a family function. (See family for details of family functions.)

data

an optional data frame containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which gam is called.

weights

an optional vector of weights to be used in the fitting process.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

na.action

a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The "factory-fresh" default is na.omit. A special method na.gam.replace allows for mean-imputation of missing values (assumes missing at random), and works gracefully with gam.

start

starting values for the parameters in the additive predictor.

etastart

starting values for the additive predictor.

mustart

starting values for the vector of means.

control

a list of parameters for controlling the fitting process. See the documentation for gam.control for details. These can also be set as arguments to gam() itself.

model

a logical value indicating whether the model frame should be included as a component of the returned value. Needed if gam is called and predicted from inside a user function. Default is TRUE.

method

the method to be used in fitting the parametric part of the model. The default method "glm.fit" uses iteratively reweighted least squares (IWLS). The only current alternative is "model.frame", which returns the model frame and does no fitting.

x, y

For gam: logical values indicating whether the response vector and model matrix used in the fitting process should be returned as components of the returned value.

For gam.fit: x is a model matrix of dimension n * p, and y is a vector of observations of length n.

...

further arguments passed to or from other methods.

smooth.frame

for gam.fit only. This is essentially a subset of the model frame corresponding to the smooth terms, and has the ingredients needed for smoothing each variable in the backfitting algorithm. The elements of this frame are produced by the formula functions lo and s.

offset

this can be used to specify an a priori known component to be included in the additive predictor during fitting.
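The three ways of supplying family described above can be sketched as follows. This is only an illustration under stated assumptions: stats::glm is used as a stand-in because it accepts the same three forms without needing the gam package, and the simulated data frame d is hypothetical.

```r
# Illustration only: glm() stands in for gam(); the data frame d is simulated.
set.seed(2)
d <- data.frame(x = rnorm(50))
d$y <- rbinom(50, 1, plogis(d$x))

# 1. A character string naming a family function
f_chr  <- glm(y ~ x, family = "binomial", data = d)
# 2. The family function itself
f_fun  <- glm(y ~ x, family = binomial, data = d)
# 3. The result of a call to a family function (link made explicit)
f_call <- glm(y ~ x, family = binomial(link = "logit"), data = d)

# All three specifications describe the same model, so the coefficients agree
same_12 <- isTRUE(all.equal(coef(f_chr), coef(f_fun)))
same_23 <- isTRUE(all.equal(coef(f_fun), coef(f_call)))
```

The third form is the one to use when a non-default link is wanted, since the link is an argument of the family function.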

Details

The gam model is fit using the local scoring algorithm, which iteratively fits weighted additive models by backfitting. The backfitting algorithm is a Gauss-Seidel method for fitting additive models, by iteratively smoothing partial residuals. The algorithm separates the parametric from the nonparametric part of the fit, and fits the parametric part using weighted linear least squares within the backfitting algorithm. This version of gam remains faithful to the philosophy of GAM models as outlined in the references below.
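The idea of backfitting, smoothing partial residuals one term at a time in Gauss-Seidel fashion, can be sketched in base R. This is a minimal sketch, not the package's internal code: the helper backfit, the simulated data, and the use of stats::smooth.spline for every term are assumptions made for illustration. It handles the unweighted Gaussian case only and treats every term as a smooth, so it omits the parametric/nonparametric separation and the local scoring outer loop.

```r
# Minimal backfitting sketch for y = alpha + f1(x1) + f2(x2) + error.
# Hypothetical helper; uses only base R (stats::smooth.spline).
set.seed(1)
n  <- 200
x1 <- runif(n)
x2 <- runif(n)
y  <- sin(2 * pi * x1) + (x2 - 0.5)^2 + rnorm(n, sd = 0.2)

backfit <- function(y, X, df = 4, tol = 1e-6, maxit = 50) {
  alpha <- mean(y)
  f <- matrix(0, nrow = length(y), ncol = ncol(X))  # one column per smooth term
  for (it in seq_len(maxit)) {
    f_old <- f
    for (j in seq_len(ncol(X))) {
      # Partial residuals: remove the intercept and all other current terms
      r   <- y - alpha - rowSums(f[, -j, drop = FALSE])
      sm  <- smooth.spline(X[, j], r, df = df)
      fj  <- predict(sm, X[, j])$y
      f[, j] <- fj - mean(fj)  # centre each term so the intercept is identifiable
    }
    # Stop when the smooth terms change very little between sweeps
    if (sum((f - f_old)^2) <= tol * (1 + sum(f_old^2))) break
  }
  list(alpha = alpha, f = f, fitted = alpha + rowSums(f), iter = it)
}

fit <- backfit(y, cbind(x1, x2))
```

Each inner sweep updates one term using the latest estimates of the others, which is what makes the scheme Gauss-Seidel rather than Jacobi.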

An object gam.slist (currently set to c("lo","s","random")) lists the smoothers supported by gam. Corresponding to each of these is a smoothing function (gam.lo, gam.s, etc.) that takes particular arguments and produces particular output, custom built to serve as a building block in the backfitting algorithm. This allows users to add their own smoothing methods. See the documentation for these methods for further information. In addition, the object gam.wlist (currently set to c("s","lo")) lists the smoothers for which efficient backfitters are provided. These are invoked if all the smoothing methods are of one kind (either all "lo" or all "s").

Value

gam returns an object of class Gam, which inherits from both glm and lm.

Gam objects can be examined by print, summary, plot, and anova. Components can be extracted using the extractor functions predict, fitted, residuals, deviance, formula, and family. A Gam object can be modified using update. It has all the components of a glm object, with a few more. This also means it can be queried, summarized, etc. by methods for glm and lm objects. Other generic functions that have methods for Gam objects are step and preplot.

The following components must be included in a legitimate 'Gam' object. The residuals, fitted values, coefficients and effects should be extracted by the generic functions of the same name, rather than by the "$" operator. The family function returns the entire family object used in the fitting, and deviance can be used to extract the deviance of the fit.

coefficients

the coefficients of the parametric part of the additive.predictors, which multiply the columns of the model matrix. The names of the coefficients are the names of the single-degree-of-freedom effects (the columns of the model matrix). If the model is overdetermined there will be missing values in the coefficients corresponding to inestimable coefficients.

additive.predictors

the additive fit, given by the product of the model matrix and the coefficients, plus the columns of the $smooth component.

fitted.values

the fitted mean values, obtained by transforming the component additive.predictors using the inverse link function.

smooth, nl.df, nl.chisq, var

these four characterize the nonparametric aspect of the fit. smooth is a matrix of smooth terms, with a column corresponding to each smooth term in the model; if no smooth terms are in the Gam model, all these components will be missing. Each column corresponds to the strictly nonparametric part of the term, while the parametric part is obtained from the model matrix. nl.df is a vector giving the approximate degrees of freedom for each column of smooth. For smoothing splines specified by s(x), the approximate df will be the trace of the implicit smoother matrix minus 2. nl.chisq is a vector containing a type of score test for the removal of each of the columns of smooth. var is a matrix like smooth, containing the approximate pointwise variances for the columns of smooth.

smooth.frame

This is essentially a subset of the model frame corresponding to the smooth terms, and has the ingredients needed for making predictions from a Gam object.

residuals

the residuals from the final weighted additive fit; also known as working residuals, these are typically not interpretable without rescaling by the weights.

deviance

up to a constant, minus twice the maximized log-likelihood. Similar to the residual sum of squares. Where sensible, the constant is chosen so that a saturated model has deviance zero.

null.deviance

The deviance for the null model, comparable with deviance. The null model will include the offset, and an intercept if there is one in the model.

iter

the number of local scoring iterations used to compute the estimates.

bf.iter

a vector of length iter giving the number of backfitting iterations used at each inner loop.

family

a three-element character vector giving the name of the family, the link, and the variance function; mainly for printing purposes.

weights

the working weights, that is the weights in the final iteration of the local scoring fit.

prior.weights

the case weights initially supplied.

df.residual

the residual degrees of freedom.

df.null

the residual degrees of freedom for the null model.

The object will also have the components of a lm object: coefficients, residuals, fitted.values, call, terms, and some others involving the numerical fit. See lm.object.
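The relationships among the components listed above (the predictor on the link scale, the fitted means obtained through the inverse link, and the stored working residuals) can be checked on a plain glm fit. This is a hedged sketch: a glm object is used as a stand-in because a Gam object inherits from glm, the data frame d is simulated, and whether every identity carries over unchanged to a fit with smooth terms is an assumption.

```r
# Illustration with stats::glm as a stand-in; the data frame d is simulated.
set.seed(3)
d <- data.frame(x = rnorm(100))
d$y <- rbinom(100, 1, plogis(0.5 * d$x))
fit <- glm(y ~ x, family = binomial, data = d)

# Fitted means are the inverse link applied to the predictor on the link scale
eta <- predict(fit, type = "link")
mu  <- family(fit)$linkinv(eta)
same_fitted <- isTRUE(all.equal(unname(mu), unname(fitted(fit))))

# The stored $residuals component holds working residuals, which is why the
# residuals() extractor (with an explicit type) is preferred over "$"
same_working <- isTRUE(all.equal(unname(fit$residuals),
                                 unname(residuals(fit, type = "working"))))

# Working weights from the final iteration versus the supplied case weights
w_work  <- weights(fit, type = "working")
w_prior <- weights(fit, type = "prior")
```

Here w_prior is all ones because no weights argument was supplied, while w_work varies with the fitted means.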

Author(s)

Written by Trevor Hastie, following closely the design in the "Generalized Additive Models" chapter (Hastie, 1992) in Chambers and Hastie (1992), and the philosophy in Hastie and Tibshirani (1990). This version of gam is adapted from the S version to match the glm and lm functions in R.

Note that this version of gam is different from the function with the same name in the R package mgcv, which uses only smoothing splines with a focus on automatic smoothing parameter selection via GCV. To avoid issues with S3 method handling when both packages are loaded, the object class in package "gam" is now "Gam".

References

Hastie, T. J. (1992) Generalized additive models. Chapter 7 of Statistical Models in S, eds J. M. Chambers and T. J. Hastie, Wadsworth & Brooks/Cole.

Hastie, T. and Tibshirani, R. (1990) Generalized Additive Models. London: Chapman and Hall.

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. New York: Springer.

See Also

glm, family, lm.

Examples

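library(gam)

# Binomial model with a smoothing-spline term; trace is passed to gam.control
data(kyphosis)
gam(Kyphosis ~ s(Age, 4) + Number, family = binomial, data = kyphosis,
    trace = TRUE)

# Loess terms, with mean-imputation of missing values
data(airquality)
gam(Ozone^(1/3) ~ lo(Solar.R) + lo(Wind, Temp), data = airquality,
    na.action = na.gam.replace)

# Parametric polynomial plus a smooth term, fitted on a subset
gam(Kyphosis ~ poly(Age, 2) + s(Start), data = kyphosis, family = binomial,
    subset = Number > 2)

# Fit, summarize, plot, and predict on new data
data(gam.data)
Gam.object <- gam(y ~ s(x, 6) + z, data = gam.data)
summary(Gam.object)
plot(Gam.object, se = TRUE)

data(gam.newdata)
predict(Gam.object, type = "terms", newdata = gam.newdata)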