Simple maximum likelihood estimation

By default, this function performs a maximum likelihood estimation for one or several parameters, but it can be used for any other optimization problems. The interface is intented to be rather simple while allowing more advanced parametrizations.

Usage

smle(data, ...)

# S3 method for default
smle(data, f, param_init, max = TRUE, ...)

# S3 method for intensity
smle(data, f, param_init, max = TRUE, ...)

Arguments

data: The data set to work with. It can be a vector (if there is only one variable), a data frame (if there is one or more variables) or an intensity object.
...: Additional arguments to be passed to optim.
f: A function to be maximized, typically a log-likelihood function. This function must have only two arguments: data and param, which must correspond to the data argument of smle and a named vector of the parameter(s) to be estimated.
param_init: Either a named vector with proposed initial values of the parameter(s), or a function that returns such a vector. This parameter is not needed if the parameter param of f is already provided with such a named vector.
max: Does f need to be maximized? Set to FALSE to require a minimization of f.

Value

An object of class smle, which is a list containing the following components:

`call`	The call.
`coef`	The estimated coefficients.
`coef_se`	The standard errors of the estimated coefficients.
`vcov`	The variance-covariance matrix of the estimated coefficients.
`data`	The `data` parameter.
`f`	The `f` parameter.
`nobs`	The number of observations.
`full_input, full_output`	The full input and output of the `optim` function.

Details

The optim tool does the hard work under the hood. Extra arguments (e.g. method of optimization to be used) can be passed to optim through the ... argument. Note that contrary to the default optim arguments, smle tries to solve a maximization problem using the method "L-BFGS-B" by default (see optim documentation for more information).

Examples

set.seed(12345)
data <- rlogis(100, location = 5, scale = 2)
ll_logis <- function(data, param = c(location = 0, scale = 1)) {
    sum(dlogis(x = data, location = param[["location"]],
               scale = param[["scale"]], log = TRUE))
}
res <- smle(data, ll_logis)
res
#> location: 5.111 (± 0.3764)
#> scale: 2.179 (± 0.183)
summary(res)
#> Maximum likelihood estimation
#> 
#> Call:
#> smle.default(data = data, f = ll_logis)
#> 
#> Coefficients:
#>          Estimate Std.Err Z value    Pr(>z)    
#> location  5.11143 0.37642  13.579 < 2.2e-16 ***
#> scale     2.17878 0.18297  11.908 < 2.2e-16 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

# Using the magrittr syntax:
require(magrittr)
data %>% smle(ll_logis)
#> location: 5.111 (± 0.3764)
#> scale: 2.179 (± 0.183)

# Comparision with the output of fitdistr (MASS package), which works for a
# limited number of predefined distributions:
require(MASS)
fitdistr(data, "logistic")
#>    location      scale  
#>   5.1114264   2.1787668 
#>  (0.3764149) (0.1829733)

# Example with an intensity object:
require(magrittr)
require(dplyr)
#> Loading required package: dplyr
#> 
#> Attaching package: ‘dplyr’
#> The following object is masked from ‘package:MASS’:
#> 
#>     select
#> The following object is masked from ‘package:epiphy’:
#> 
#>     count
#> The following objects are masked from ‘package:stats’:
#> 
#>     filter, lag
#> The following objects are masked from ‘package:base’:
#> 
#>     intersect, setdiff, setequal, union
data <- tomato_tswv$field_1929 %>%
    filter(t == 1) %>%
    incidence() %>%
    clump(unit_size = c(x = 3, y = 3))
ll_betabinom <- function(data, param) {
    sum(dbetabinom(x = data[["i"]], size = data[["n"]],
                   prob = param[["prob"]], theta = param[["theta"]],
                   log = TRUE))
}
epsilon <- 1e-7
res <- smle(data, ll_betabinom, param_init = c(prob = 0.5, theta = 0.5),
            lower = c(prob  = 0 + epsilon,
                      theta = 0 + epsilon),
            upper = c(prob = 1 - epsilon,
                      theta = Inf))
res
#> prob: 0.1813 (± 0.01191)
#> theta: 0.04939 (± 0.01976)
summary(res)
#> Maximum likelihood estimation
#> 
#> Call:
#> smle.intensity(data = data, f = ll_betabinom, param_init = c(prob = 0.5, 
#>     theta = 0.5), lower = c(prob = 0 + epsilon, theta = 0 + epsilon), 
#>     upper = c(prob = 1 - epsilon, theta = Inf))
#> 
#> Coefficients:
#>       Estimate  Std.Err Z value  Pr(>z)    
#> prob  0.181326 0.011913 15.2204 < 2e-16 ***
#> theta 0.049393 0.019759  2.4997 0.01243 *  
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

param_init <- data.frame(lower = c(0 + epsilon, 0 + epsilon),
                         start = c(0.5, 0.5),
                         upper = c(1 - epsilon, Inf))
rownames(param_init) <- c("prob", "theta")
res <- smle(data, ll_betabinom, param_init)
res
#> prob: 0.1813 (± 0.01191)
#> theta: 0.04939 (± 0.01976)
summary(res)
#> Maximum likelihood estimation
#> 
#> Call:
#> smle.intensity(data = data, f = ll_betabinom, param_init = param_init)
#> 
#> Coefficients:
#>       Estimate  Std.Err Z value  Pr(>z)    
#> prob  0.181326 0.011913 15.2204 < 2e-16 ***
#> theta 0.049393 0.019759  2.4997 0.01243 *  
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1