Fisher information for irregular likelihood functions


I recently came across the following problem: how do you work out the Fisher information for a parameter \theta \in \Omega when the likelihood function, say p(x; \theta), is continuous but not everywhere differentiable with respect to the parameter? The standard formula for the Fisher information

 {\rm J}(\theta) = -\int \left(\frac{\partial^2 l}{\partial \theta^2}\right) p(x; \theta) \, dx

assumes regularity conditions which no longer hold, and hence is not applicable. After hunting around with Google for some time, I came across the following (freely downloadable) paper

H. E. Daniels
The Asymptotic Efficiency of a Maximum Likelihood Estimator
Fourth Berkeley Symp. on Math. Statist. and Prob., University of California Press, 1961, 1, 151-163

which, fortunately, had exactly what I needed. It turns out we can still compute the Fisher information, without the existence of second derivatives, using

 {\rm J}(\theta) = \int \left(\frac{\partial l}{\partial \theta}\right)^2 p(x; \theta) \, dx

provided a set of weaker conditions holds. To get my head around the issue, I decided to look at a simple problem of working out the Fisher information for the mean \mu \in \Omega = \mathbb{R} of a Laplace distribution

  p(x; \mu, s) = \frac{1}{2s} \exp(-|x - \mu| / s)

The log-likelihood is now given by

  l(\mu, s) = -\log(2s) - \frac{|x-\mu|}{s} = \left\{\begin{array}{ll} -\log(2s) - \frac{x-\mu}{s} & x > \mu \\ -\log(2s) + \frac{x-\mu}{s} & x < \mu \end{array}\right.

The first derivative with respect to \mu is

\frac{\partial l}{\partial \mu} = \left\{\begin{array}{ll} \frac{1}{s} & x > \mu \\ -\frac{1}{s} & x < \mu \end{array}\right.

Therefore, the Fisher information is

 {\rm J}(\mu) = \int_{-\infty}^\mu \left(\frac{\partial l}{\partial \mu}\right)^2 p(x; \mu, s) \, dx + \int_{\mu}^\infty \left(\frac{\partial l}{\partial \mu}\right)^2 p(x; \mu, s) \, dx = \frac{1}{s^2}
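As a quick sanity check, here is a minimal Monte Carlo sketch (assuming NumPy; the parameter values and sample sizes are arbitrary choices). The MLE of \mu for a Laplace sample is the sample median, so by the asymptotic efficiency result its variance over repeated samples should be close to 1/(n {\rm J}(\mu)) = s^2/n.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, s = 0.0, 2.0        # arbitrary location and scale
n, reps = 1_000, 5_000  # observations per sample, number of replicates

# Draw many Laplace samples; the MLE of mu is the sample median.
x = rng.laplace(loc=mu, scale=s, size=(reps, n))
mle = np.median(x, axis=1)

# Asymptotic theory: var(mle) ~ 1 / (n * J(mu)) = s**2 / n
print(np.var(mle))   # should be close to 0.004
print(s ** 2 / n)    # 0.004
```

Note that the squared score here is constant at 1/s^2, so estimating its expectation by simulation would be trivial; checking the variance of the median instead exercises the asymptotic-efficiency claim that motivates the Fisher information in the first place.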

The Fisher information for the scale parameter can be obtained in a similar manner. The first derivative with respect to s is

\frac{\partial l}{\partial s}  = -\frac{1}{s} + \frac{1}{s^2} |x - \mu| = \frac{1}{s^2} (|x - \mu| - s )

Since |x - \mu| is exponentially distributed with mean s under the Laplace density,

 {\rm E}\{|x-\mu|\} = s, \qquad {\rm E}\{|x-\mu|^2\} = 2 s^2

the Fisher information for the scale parameter is

 {\rm J}(s) = \frac{1}{s^4} {\rm E} \left\{(|x-\mu| - s)^2 \right\} = \frac{1}{s^4} \left( {\rm E}\{|x-\mu|^2\} - 2 s \, {\rm E}\{|x-\mu|\} + s^2 \right) = \frac{2s^2 - 2s^2 + s^2}{s^4} = \frac{1}{s^2}
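The scale result is also easy to check numerically. Here is a minimal sketch, again assuming NumPy and arbitrary parameter values, that estimates {\rm J}(s) as the expected squared score:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, s = 0.0, 2.0   # arbitrary location and scale
n = 1_000_000      # Monte Carlo sample size

x = rng.laplace(loc=mu, scale=s, size=n)

# Score with respect to s: dl/ds = (|x - mu| - s) / s**2
score_s = (np.abs(x - mu) - s) / s ** 2

# Fisher information as the expected squared score; should be close to 1/s**2 = 0.25.
print(np.mean(score_s ** 2))
```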

The paper by Daniels details the exact conditions needed for the Fisher information to be derived in this way. Note that there is a mistake in one of the proofs; the correction is detailed in

J. A. Williamson
A Note on the Proof by H. E. Daniels of the Asymptotic Efficiency of a Maximum Likelihood Estimator
Biometrika, 1984, 71, 651-653

Unfortunately, the Williamson paper requires a JSTOR subscription for download.
