Estimation of circadian parameters and investigation in cyanobacteria via semiparametric varying coefficient periodic models
This dissertation includes three components. Component 1 provides an estimation procedure for circadian parameters in cyanobacteria. Component 2 explores the relationship between baseline and amplitude by model selection under the framework of smoothing splines. Component 3 investigates properties of hypothesis testing. The following three paragraphs briefly summarize these three components, respectively.

Varying coefficient models are frequently used in statistical modeling. We propose a semiparametric varying coefficient periodic model which is suitable for studying periodic patterns. This model has ample applications in the study of the cyanobacteria circadian clock. To achieve the desired flexibility, the model we consider may not be globally identifiable. We propose to perform local approximations by kernel-based methods and focus on estimating one solution that is biologically meaningful. Asymptotic properties are developed. Simulations show that the gain of our procedure over the commonly used method is substantial. The methodology is illustrated by an application to a cyanobacteria dataset.

Smoothing splines can be implemented, but a direct application with the penalty selected by generalized cross-validation often leads to non-convergence. We propose an adjusted cross-validation instead, which resolves the difficulties. Biologists believe that the amplitude function of the periodic component is proportional to the baseline function. To verify this belief, we propose a full model without any assumptions regarding such a relationship, and two reduced models in which the ratio of baseline to amplitude is a constant and a quadratic function of time, respectively. We use model selection techniques, the Akaike information criterion (AIC) and the Schwarz Bayesian information criterion (BIC), to determine the optimal model. Simulations show that AIC and BIC select the correct model with high probability. Application to cyanobacteria data shows that the full model is the best model.

To investigate the same problem as in Component 2 by a formal hypothesis testing procedure, we develop kernel-based methods. In order to construct the test statistic, we derive the global degrees of freedom for the residual sum of squares. Simulations show that the proposed tests perform well. We apply the proposed procedures to the data and conclude that the baseline and amplitude functions share no linear or quadratic relationship.
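
As an illustration only, the comparison between a full model and a constant-ratio reduced model by AIC and BIC might be sketched as follows. The sketch rests on several assumptions that do not come from the dissertation: a known 24-hour period, a cosine periodic component, a quadratic polynomial basis standing in for smoothing splines, Gaussian errors, simulated data, and hypothetical function names (`rss_full`, `rss_reduced`, `aic_bic`). It does not reproduce the adjusted cross-validation or the kernel-based procedures.

```python
import numpy as np

PERIOD = 24.0  # assumed known period in hours (illustrative choice)

def poly_basis(t, degree=2):
    # Columns 1, s, s^2 with s = t / PERIOD: a simple stand-in for a spline basis.
    return np.vander(t / PERIOD, degree + 1, increasing=True)

def rss_full(t, y, degree=2):
    # Full model: y = b(t) + a(t) * cos(2*pi*t/PERIOD), with b and a unrelated.
    B = poly_basis(t, degree)
    X = np.hstack([B, B * np.cos(2 * np.pi * t / PERIOD)[:, None]])
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return float(resid @ resid), X.shape[1]

def rss_reduced(t, y, degree=2, ratios=np.linspace(-2.0, 2.0, 401)):
    # Reduced model: a(t) = c * b(t), so y = b(t) * (1 + c * cos(2*pi*t/PERIOD)).
    # For a fixed c the model is linear in the coefficients of b,
    # so we profile the residual sum of squares over a grid of c values.
    B = poly_basis(t, degree)
    osc = np.cos(2 * np.pi * t / PERIOD)
    best = np.inf
    for c in ratios:
        X = B * (1.0 + c * osc)[:, None]
        resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
        best = min(best, float(resid @ resid))
    return best, B.shape[1] + 1  # coefficients of b plus the ratio c

def aic_bic(rss, k, n):
    # Gaussian errors: -2 log-likelihood equals n * log(rss / n) up to a constant.
    fit = n * np.log(rss / n)
    return fit + 2 * k, fit + k * np.log(n)

# Simulated data (not the cyanobacteria data): amplitude NOT proportional to baseline.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 96.0, 200)
baseline = 10.0 + 0.05 * t
amplitude = 1.0 + 0.001 * t**2
y = baseline + amplitude * np.cos(2 * np.pi * t / PERIOD) + rng.normal(0.0, 0.5, t.size)

for name, (rss, k) in {"full": rss_full(t, y), "reduced": rss_reduced(t, y)}.items():
    aic, bic = aic_bic(rss, k, t.size)
    print(f"{name:8s} k={k}  RSS={rss:9.2f}  AIC={aic:9.2f}  BIC={bic:9.2f}")
```

The model with the smaller criterion value is preferred. Because the reduced model is nested in the full one, its residual sum of squares can never be smaller; the criteria decide whether the improvement in fit justifies the extra parameters.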