Some sequential estimation problems in logistic regression models

Let $(\mathbf{X}_i, Y_i)$, $i = 1, 2, \ldots$, be a random sample satisfying a logistic regression model; that is, for each $i$, $\log[P(Y_i = 1 \mid \mathbf{X}_i)/P(Y_i = 0 \mid \mathbf{X}_i)] = \mathbf{X}_i^{T}\beta_0$, where $Y_i \in \{0,1\}$, $\mathbf{X}_i \in \mathbf{R}^p$, and $\beta_0 \in \mathbf{R}^p$ is the unknown parameter vector of the logistic regression model. It is known that $\sqrt{n}(\hat{\beta}_n - \beta_0) \stackrel{\mathcal{L}}{\longrightarrow} N(0_p, \Sigma^{-1})$, where $\hat{\beta}_n$ is an MLE of $\beta_0$ and $\Sigma$ is the Fisher information matrix. If $\Sigma$ is known, then $R_d = \{Z \in \mathbf{R}^p : n(Z - \hat{\beta}_n)^T \Sigma (Z - \hat{\beta}_n) \le n\lambda d^2\}$ defines a confidence ellipsoid for $\beta_0$ with maximum axis $\le 2d$ and $P(\beta_0 \in R_d) \approx 1 - \alpha$, provided $n \ge a^2/(\lambda d^2)$, where $\lambda$ is the smallest eigenvalue of $\Sigma$ and $a$ satisfies $P(\chi^2(p) \le a^2) = 1 - \alpha$. If $\Sigma$ is unknown, then $\lambda$ is usually unknown as well. Hence no fixed sample size can be used to construct a confidence ellipsoid with prescribed accuracy and confidence level. In this work, a sequential procedure is proposed to overcome this difficulty. The procedure is shown to be asymptotically consistent and efficient; that is, as $d$ approaches 0, the coverage probability converges to the required confidence level and the ratio of the expected sample size to the unknown best fixed sample size converges to 1. Similar asymptotic properties are also obtained for fixed proportional accuracy problems and for two-stage procedures.
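
The sample-size condition $n \ge a^2/(\lambda d^2)$ can be illustrated with a minimal Python sketch. It is restricted to $p = 2$, where the chi-square quantile has the closed form $a^2 = -2\log\alpha$ and the smallest eigenvalue of a symmetric $2 \times 2$ matrix can be written explicitly. All function names are hypothetical. The plug-in stopping rule shown last is only an assumed, generic form of a sequential rule (stop at the first $n$ for which the estimated eigenvalue satisfies the inequality); the abstract does not state the actual rule used in this work.

```python
import math


def chi2_quantile_2df(conf_level):
    """Return a^2 with P(chi2(2) <= a^2) = conf_level (closed form, p = 2 only)."""
    return -2.0 * math.log(1.0 - conf_level)


def smallest_eigenvalue_2x2(s11, s12, s22):
    """Smallest eigenvalue of the symmetric matrix [[s11, s12], [s12, s22]]."""
    mean = 0.5 * (s11 + s22)
    radius = math.hypot(0.5 * (s11 - s22), s12)
    return mean - radius


def best_fixed_sample_size(a2, lam, d):
    """Smallest integer n with n >= a^2 / (lambda * d^2), for known lambda."""
    return math.ceil(a2 / (lam * d * d))


def plug_in_stopping_time(lam_estimates, a2, d, n_start=1):
    """Assumed plug-in rule: first n with n >= a^2 / (lam_hat_n * d^2).

    lam_estimates yields a hypothetical estimate lam_hat_n of lambda for
    n = n_start, n_start + 1, ...  Returns None if the rule never triggers.
    """
    for n, lam_hat in enumerate(lam_estimates, start=n_start):
        if lam_hat > 0 and n >= a2 / (lam_hat * d * d):
            return n
    return None


# Illustrative values (not from the source): Sigma = diag(2, 0.5), d = 0.1.
a2 = chi2_quantile_2df(0.95)
lam = smallest_eigenvalue_2x2(2.0, 0.0, 0.5)
n_star = best_fixed_sample_size(a2, lam, 0.1)
```

With these illustrative values, `lam` is 0.5 and `n_star` is 1199. When the eigenvalue estimates equal the true $\lambda$ at every step, the plug-in rule stops at the same $n$, which is the sense in which the best fixed sample size serves as the benchmark for the expected sample size.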