Bengio, Yoshua; Dugas, Charles - Centre Interuniversitaire de Recherche en Analyse des … - 2002
We consider sequential data sampled from an unknown process, so that the data are not necessarily i.i.d. We develop a measure of generalization for such data and consider a recently proposed approach to optimizing hyper-parameters, based on the computation of the gradient of a model...