## Fisher information for irregular likelihood functions

I recently came across the following problem: how do you work out the Fisher information for a parameter $\theta \in \Omega$ when the likelihood function, say $p(\theta; x)$, is continuous but not everywhere differentiable with respect to the parameter? The standard formula for the Fisher information ${\rm J}(\theta) = -\int \left(\frac{\partial^2 l}{\partial \theta^2}\right) p(\theta; x)\, dx$

assumes regularity conditions which no longer hold, and hence is not applicable. After hunting around with Google for some time, I came across the following (freely downloadable) paper

H. E. Daniels
The Asymptotic Efficiency of a Maximum Likelihood Estimator
Fourth Berkeley Symp. on Math. Statist. and Prob., University of California Press, 1961, 1, 151-163

which, fortunately, had exactly what I needed. It turns out that we can still compute the Fisher information, without the existence of second derivatives, using ${\rm J}(\theta) = \int \left(\frac{\partial l}{\partial \theta}\right)^2 p(\theta; x)\, dx$

provided a set of weaker conditions holds. To get my head around the issue, I decided to look at the simple problem of working out the Fisher information for the mean $\mu \in \Omega = \mathbb{R}$ of a Laplace distribution $p(x,\mu; s)= \frac{1}{2s} \exp(-| x - \mu| / s)$
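As a quick sanity check of this density (a sketch with variable names and test values of my own choosing, not anything from the Daniels paper), the following verifies numerically that $p(x,\mu; s)$ integrates to one:

```python
import numpy as np

# Laplace density with location mu and scale s, as defined above
def laplace_pdf(x, mu, s):
    return np.exp(-np.abs(x - mu) / s) / (2.0 * s)

mu, s = 1.5, 0.7  # arbitrary test values

# Riemann sum on a fine grid; the mass beyond +/- 40 s is negligible
xs = np.linspace(mu - 40.0 * s, mu + 40.0 * s, 200_001)
dx = xs[1] - xs[0]
total = np.sum(laplace_pdf(xs, mu, s)) * dx  # should be close to 1
```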

The log-likelihood is now given by $l(\mu; s) = -\log(2s) - \frac{|x-\mu|}{s} = \left\{\begin{array}{ll} -\log(2s) - \frac{x-\mu}{s} & x > \mu \\ -\log(2s) + \frac{x-\mu}{s} & x < \mu \end{array} \right.$ The first derivative with respect to $\mu$ is $\frac{\partial l}{\partial \mu} = \left\{\begin{array}{ll} \frac{1}{s} & x > \mu \\ -\frac{1}{s} & x < \mu \end{array} \right.$ Therefore, the Fisher information is

${\rm J}(\mu) = \int_{-\infty}^\mu \left(\frac{\partial l}{\partial \mu}\right)^2 p(x,\mu; s)\, dx + \int_{\mu}^\infty \left(\frac{\partial l}{\partial \mu}\right)^2 p(x,\mu; s)\, dx = \frac{1}{s^2}$
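This can be checked by simulation. Since the squared score $\left(\frac{\partial l}{\partial \mu}\right)^2$ equals $1/s^2$ for every $x \neq \mu$, a Monte Carlo estimate of its expectation recovers $1/s^2$ essentially exactly (a quick sketch, with arbitrary parameter values of my choosing):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, s = 1.5, 0.7  # arbitrary test values
x = rng.laplace(loc=mu, scale=s, size=200_000)

# Score for mu: dl/dmu = sign(x - mu) / s (undefined only on the null set x == mu)
score = np.sign(x - mu) / s

# Fisher information estimate E[score^2]; compare with 1/s^2
J_mu = np.mean(score**2)
```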

The Fisher information for the scale parameter can be obtained in a similar manner. The first derivative with respect to $s$ is $\frac{\partial l}{\partial s} = -\frac{1}{s} + \frac{1}{s^2} |x - \mu| = \frac{1}{s^2} (|x - \mu| - s )$

Since ${\rm E}\{|x-\mu|\} = s$ and ${\rm E}\{|x-\mu|^2\} = 2 s^2$,

the Fisher information for the scale parameter is ${\rm J}(s) = \frac{1}{s^4} {\rm E} \left\{(|x-\mu| - s)^2 \right\} = \frac{1}{s^4} \left( 2s^2 - 2s^2 + s^2 \right) = \frac{1}{s^2}$
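Both the moments and the resulting Fisher information can be verified by simulation as well (again a sketch, with parameter values of my choosing):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, s = 1.5, 0.7  # arbitrary test values
x = rng.laplace(loc=mu, scale=s, size=500_000)

# Moments used above: E|x - mu| = s and E|x - mu|^2 = 2 s^2
m1 = np.mean(np.abs(x - mu))
m2 = np.mean(np.abs(x - mu) ** 2)

# Score for s: dl/ds = (|x - mu| - s) / s^2; Fisher information is E[score^2]
score = (np.abs(x - mu) - s) / s**2
J_s = np.mean(score**2)  # should be close to 1/s^2
```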

The paper by Daniels details the exact conditions needed for the Fisher information to be derived in this way. Note that there is a mistake in one of the proofs; the correction is detailed in

J. A. Williamson
A Note on the Proof by H. E. Daniels of the Asymptotic Efficiency of a Maximum Likelihood Estimator
Biometrika, 1984, 71, 651-653