Tags
- MDS

Numerical Optimization

Use cases
- Maximum-likelihood estimation
- \(\chi^{2}\) test of normality
Further resources
Useful packages
Get the article source from GitHub

Use cases

Maximum-likelihood estimation

Weibull distribution

library(MASS)
X <- rweibull(100, shape=1.5, scale=100)
fitdistr(X, densfun="weibull", start=list(shape=1, scale=50))

     shape        scale   
   1.4418009   90.8713787 
 ( 0.1142759) ( 6.5847708)

\(\chi^{2}\) test of normality

The \(\chi^{2}\) normality test that compares observed and expected category frequencies after partitioning the value range into discrete bins (categories) does not work correctly when the sample mean and standard deviation are taken as estimates \(\hat{\mu}\) and \(\hat{\sigma}\). The test statistic then does not actually have a \(\chi^{2}\) distribution.

DV <- rnorm(50, mean=0, sd=1)
mean(DV)

[1] -0.09821994

sd(DV)

[1] 0.9775147

Instead, it is necessary to estimate \(\hat{\mu}\) and \(\hat{\sigma}\) taking into account the chosen categories. One possibility to do this is by finding the minimum \(\chi^{2}\) estimates, i.e., those estimates that minimize the observed \(X^{2}\) statistic.

nCls <- 4  # number of classes

## category limits -> cut points for equal-probability intervals
limits <- qnorm(seq(from=1/nCls, to=(nCls-1)/nCls, length.out=nCls-1),
                mean=0, sd=1)

## determine observed category frequencies
breaks   <- c(-Inf, limits, Inf)
DVcut    <- cut(DV, breaks=breaks)
observed <- table(DVcut)

## function that calculates the empirical X^2 test statistic
## needs to be minimized -> objective function
minFunMin <- function(param, brks, obs) {
    if(param[2] < 1e-4) { return(NA_real_) }
    probs    <- diff(pnorm(brks, mean=param[1], sd=param[2]))
    expected <- sum(obs) * probs
    sum((obs-expected)^2 / expected)
}

Optimization = minimization of objective function

resMinChisq <- optim(c(mean(DV), sd(DV)), minFunMin,
                     brks=breaks, obs=observed, gr=NULL, method="BFGS")
resMinChisq$par                         # min-chi^2-estimate

[1] -0.05177565  0.99517343

The other possibility is a grouped maximum likelihood estimate - where the likelihood comes from the multinomial distribution.

## objective function
## negative multinomial log likelihood
minFunGML <- function(param, brks, obs) {
    if(param[2] < 1e-4) { return(NA_real_) }
    probs <- diff(pnorm(brks, mean=param[1], sd=param[2]))
    -dmultinom(obs, size=sum(obs), prob=probs, log=TRUE)
}

## optimize -> minimize objective function
resGrML <- optim(c(mean(DV), sd(DV)), minFunGML,
                 brks=breaks, obs=observed, gr=NULL, method="BFGS")
resGrML$par

[1] -0.05208171  0.99805581

Further resources

CRAN Task View ‘Optimization’

Useful packages

fitdistrplus

Get the article source from GitHub

R markdown - markdown - R code - all posts