The Hodges–Lehmann estimator (HL) of the location parameter for samples of data is defined as the median of the averages of all data pairs. This set of all pairwise averages is usually referred to as the Walsh averages. The Hodges-Lehmann estimator has some excellent efficiency and robustness properties if the observed data is generated by a symmetric distribution.

Below is a simple C function to compute the HL estimator for a given array of observations.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 | // Comparison function needed for quicksort int compare (const void * a, const void * b) { const double *da = (const double *) a; const double *db = (const double *) b; return (*da > *db) - (*da < *db); } // Function to compute the Hodges-Lehmann estimator double hodgeslehmann(double *x, int n) { // store and compute Walsh averages unsigned int nw=n*(n+1)/2; double *walsh_avgs=(double*) malloc(sizeof(double)*nw); unsigned int k=0; for(unsigned int i=0; i<n; i++) { for(unsigned int j=i; j<n; j++) { walsh_avgs[k]=(x[i]+x[j])/2.0; k++; } } // sort the vector qsort(walsh_avgs,nw,sizeof(double),compare); // median of walsh averages double m=walsh_avgs[nw/2]; if(nw % 2 == 0) { m=((walsh_avgs[nw/2] + walsh_avgs[nw/2 - 1]) / 2.0); } // done free(walsh_avgs); return m; } |

I have also written a MATLAB driver function that allows the aforementioned C code to be executed within MATLAB, see the Software section to download this implementation. An efficient (pure) MATLAB implementation that avoids the use of nested *for* loops is a little bit tricky. Here is a simple (memmory intensive!) implementation that relies on the *ndgrid(.)* function. This code works reasonably well as long as the sample size is not too big.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | function m=hodglehm_slow(x) %% Length of vector x=x(:); % make sure x is a column vector n=length(x); %% compute cartesian product [a,b]=ndgrid(x,x); %% Sum of all pairs A=triu((a+b)./2); %% Extract upper diagonal Ix=triu(true(n)); A=A(Ix); %% Compute median of averages m=median(A(:)); %% done return; |

We can get the standard error of the HL estimator in MATLAB by using the bootstrap:

1 2 3 4 5 6 | % Generate 50 observations from a Student-t distribution with 2 dof x=trnd(2,50,1); % Compute standard errors for the mean and HL estimators std(bootstrp(1000, @(a) mean(a), x)) std(bootstrp(1000, @(a) hodglehm_slow(a), x)) |