On the implementation of LIR: the case of simple linear regression with interval data
This paper considers the problem of simple linear regression with interval-censored data. That is, <InlineEquation ID="IEq1"> <EquationSource Format="TEX">$$n$$</EquationSource> <EquationSource Format="MATHML"> <math xmlns:xlink="http://www.w3.org/1999/xlink"> <mi>n</mi> </math> </EquationSource> </InlineEquation> pairs of intervals are observed instead of the <InlineEquation ID="IEq2"> <EquationSource Format="TEX">$$n$$</EquationSource> <EquationSource Format="MATHML"> <math xmlns:xlink="http://www.w3.org/1999/xlink"> <mi>n</mi> </math> </EquationSource> </InlineEquation> pairs of precise values for the two variables (dependent and independent). Each of these intervals is closed but possibly unbounded, and contains the corresponding (unobserved) value of the dependent or independent variable. The goal of the regression is to describe the relationship between (the precise values of) these two variables by means of a linear function. Likelihood-based Imprecise Regression (LIR) is a recently introduced, very general approach to regression for imprecisely observed quantities. The result of a LIR analysis is in general set-valued: it consists of all regression functions that cannot be excluded on the basis of likelihood inference. These regression functions are said to be undominated. Since the interval data can be unbounded, a robust regression method is necessary. Hence, we consider the robust LIR method based on the minimization of the residuals’ quantiles. For this method, we prove that the set of all the intercept-slope pairs corresponding to the undominated regression functions is the union of finitely many polygons. 
We give an exact algorithm for determining this set (i.e., for determining the set-valued result of the robust LIR analysis), and show that it has worst-case time complexity <InlineEquation ID="IEq3"> <EquationSource Format="TEX">$$O(n^{3}\log n)$$</EquationSource> <EquationSource Format="MATHML"> <math xmlns:xlink="http://www.w3.org/1999/xlink"> <mrow> <mi>O</mi> <mo stretchy="false">(</mo> <msup> <mi>n</mi> <mn>3</mn> </msup> <mo>log</mo> <mi>n</mi> <mo stretchy="false">)</mo> </mrow> </math> </EquationSource> </InlineEquation>. We have implemented this exact algorithm as part of the R package <Emphasis FontCategory="NonProportional">linLIR</Emphasis>. Copyright Springer-Verlag Berlin Heidelberg 2014
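To illustrate why a quantile-based criterion copes with unbounded intervals, the following minimal Python sketch computes, for one candidate line with intercept a and slope b, the smallest and largest absolute residual attainable within each observed rectangle, and then takes an order statistic of those bounds. It is not the exact algorithm of the paper and not the linLIR interface: the function names, the rectangle encoding (xl, xu, yl, yu), and the index k are assumptions made for this illustration only.

```python
import math


def residual_bounds(a, b, box):
    """Bounds of |y - a - b*x| over the rectangle [xl, xu] x [yl, yu].

    Hypothetical helper: endpoints may be -math.inf or math.inf.
    """
    xl, xu, yl, yu = box
    if b == 0:
        # Avoid 0 * inf (which is nan): with zero slope, x plays no role.
        bx_lo = bx_hi = 0.0
    else:
        bx_lo, bx_hi = sorted((b * xl, b * xu))
    # Range of the signed residual y - a - b*x over the rectangle.
    lo = yl - a - bx_hi
    hi = yu - a - bx_lo
    # The line crosses the rectangle exactly when 0 lies in [lo, hi].
    lower = 0.0 if lo <= 0.0 <= hi else min(abs(lo), abs(hi))
    upper = max(abs(lo), abs(hi))
    return lower, upper


def kth_smallest(values, k):
    """k-th smallest value (1-based), i.e. an empirical quantile."""
    return sorted(values)[k - 1]


# Three interval observations; the third is unbounded in x.
data = [
    (0.0, 1.0, 0.0, 1.0),
    (1.0, 2.0, 1.0, 3.0),
    (2.0, math.inf, 2.0, 3.0),
]
a, b = 0.0, 1.0  # candidate line y = x
bounds = [residual_bounds(a, b, box) for box in data]
lower = [lo for lo, _ in bounds]
upper = [up for _, up in bounds]

# k = 2 is an arbitrary choice for the illustration.
print(kth_smallest(lower, 2), kth_smallest(upper, 2))
```

In this example the unbounded rectangle has an infinite upper residual, yet the second-smallest upper residual is still finite. A criterion based on the mean or the maximum of the residuals would be infinite for every line, which is why the robust method relies on quantiles.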