Rationale: Accurate estimation and prediction of health care costs play a crucial role in the policy and resource allocation decisions of health plans and other health care agencies. A particularly problematic issue underlying difficulties in filling that role is the impact of patient heterogeneity on the process. Most previous attempts to address this issue have not balanced simplicity and transparency against statistical soundness and unbiasedness in the implicit loss function and decision calculus used to choose among methods.

Objectives: Development of any cost model should take into account three features of cost data: 1) cost data are typically non-negative and skewed to the right; 2) health costs are often hierarchical, and such hierarchical structures introduce correlations among the cost data; and 3) there is often a significant amount of heteroskedasticity in the data, which generates biased estimates if not accounted for. Many methods have been proposed that address one or more of these three difficulties. However, sophisticated methods that do so require a high level of statistical knowledge, are not easily interpretable or understood by managers of health care agencies, and also fail to predict costs accurately across the continuum of heterogeneity in severity of illness. Less sophisticated methods, in turn, tend to be biased and/or perform very poorly in particular regions of the cost distribution. Our objective is to develop an approach that employs simple, generally understood ordinary least squares (OLS) methods in a new way to address the empirical features of cost data.

Methodology: Case mix diagnostic and demographic information generally employed for risk adjustment can be used first to classify individual patients into spending-type groups, and then used again to predict health care costs with a separate OLS regression in each group. This is a special case of sub-classification models in statistics, closely related to the mixture model approach applied to this problem by Deb and Burgess (2006), but computationally much simpler to implement. For comparison, we focus on residual mean square error (RMSE) for overall evaluation, mean absolute prediction error (MAPE) for observation-by-observation evaluation, and predictive ratios by decile for evaluation across the distribution of costs. We use a 50/50 split-sample validation against an array of alternative methods used in the literature, on the population of FY2001 users of the US Department of Veterans Affairs (VA) health care system, to compare models to a three-piece implementation of the new model proposed here.
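To make the proposed estimator concrete, the sketch below implements one possible reading of the three-piece approach in Python: a preliminary pooled OLS on the risk-adjustment covariates assigns each patient to a low, medium, or high spending-type group, and a separate OLS is then fit within each group. The quantile-based grouping rule, the function names, and the reuse of the same covariates as the group classifier are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

def fit_three_piece_ols(X, y, cuts=(1.0 / 3.0, 2.0 / 3.0)):
    """Illustrative three-piece OLS for cost prediction.

    Stage 1: a pooled OLS on the risk-adjustment covariates X assigns each
    patient to a low / medium / high spending-type group (here by quantile
    cuts of the stage-1 prediction -- an assumption, not the paper's rule).
    Stage 2: a separate OLS is fit within each group.
    """
    X1 = np.column_stack([np.ones(len(y)), X])         # add intercept column
    beta0, *_ = np.linalg.lstsq(X1, y, rcond=None)      # stage-1 pooled OLS
    edges = np.quantile(X1 @ beta0, cuts)               # group boundaries
    groups = np.digitize(X1 @ beta0, edges)             # 0 = low, 1 = medium, 2 = high
    betas = {g: np.linalg.lstsq(X1[groups == g], y[groups == g], rcond=None)[0]
             for g in np.unique(groups)}                # stage-2 OLS within each group
    return beta0, edges, betas

def predict_three_piece(X, beta0, edges, betas):
    """Assign new patients to a group with the stage-1 fit, then predict
    with that group's stage-2 OLS coefficients."""
    X1 = np.column_stack([np.ones(len(X)), X])
    groups = np.digitize(X1 @ beta0, edges)
    yhat = np.empty(len(X))
    for g, b in betas.items():
        yhat[groups == g] = X1[groups == g] @ b
    return yhat
```

Because each piece is an ordinary regression, the coefficients within each spending-type group remain directly interpretable to non-statistician managers, which is the transparency argument made above.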
Results: In the FY2001 VA patient sample there are 3,744,264 patients, and in a population this large the 50/50 validation does not reveal any overfitting or any significant differences between the validation and estimation samples. In particular, we compare six other models to our proposed three-piece OLS model: two other simple models (OLS and no-intercept OLS) and four more complex models, namely two GLM specifications (log Gaussian and log gamma) and two log retransformation models (one using the Duan correction, the other the Zhou correction). The OLS models, including the three-piece model, have predicted means closest to the observed mean, and thus also do best on the RMSE criterion; the GLM log Gaussian model comes closest to them among the more complicated models. The three-piece model does even better on the MAPE criterion (e.g., $2,785 vs. $3,312 for OLS). In the predictive ratio results, the 10th decile is always pulled in and estimated accurately, and the three-piece model actually does slightly worse in that decile. However, a well known problem with the OLS method is overprediction in the 7th through 9th deciles; the three-piece model does much better there, and it also addresses the negative prediction problem well in the first three deciles. The more complex models each have biases in particular deciles: the GLM log Gaussian model dramatically overpredicts most of the lower deciles and underpredicts the 10th decile, while the other more complicated methods overpredict the 10th decile by ratios exceeding 2.0, though they do better than the GLM log Gaussian model in the other deciles.

Conclusion: One problem with ordinary least squares models in general is that they can predict negative costs. Many researchers have estimated no-intercept ordinary least squares models and taken other steps to avoid this problem; the approach we propose here accomplishes much the same outcome without the bias that no-intercept models create, and it predicts costs for all low-cost patients better than standard ordinary least squares approaches. It also performs much better than other, more complex statistical approaches on all of the evaluation criteria. We have taken a simple approach to sub-classification, as opposed to more complex methods previously suggested (i.e., mixture models), and this achieves the balance between "simplicity and transparency" and "statistical soundness and unbiasedness" that we established as our objective.
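For readers who want to reproduce the comparison criteria, the short sketch below shows one way the three evaluation measures discussed above could be computed. It assumes RMSE is taken as the root of the mean squared prediction error, MAPE as the mean absolute prediction error in dollars, and predictive ratios as the ratio of mean predicted to mean observed cost within deciles of observed cost; these operationalizations and the function name are illustrative assumptions rather than the paper's exact definitions.

```python
import numpy as np

def cost_model_criteria(y_obs, y_pred, n_groups=10):
    """Illustrative versions of the three evaluation criteria used in the text.

    Assumptions: RMSE = root of the mean squared prediction error, MAPE = mean
    absolute prediction error (in dollars), and predictive ratios = mean
    predicted / mean observed cost within deciles of observed cost.
    """
    err = y_obs - y_pred
    rmse = np.sqrt(np.mean(err ** 2))
    mape = np.mean(np.abs(err))
    order = np.argsort(y_obs)                      # sort patients by observed cost
    deciles = np.array_split(order, n_groups)      # decile 1 (lowest) ... 10 (highest)
    ratios = [y_pred[d].mean() / y_obs[d].mean() for d in deciles]
    return rmse, mape, ratios
```

Under this reading, a predictive ratio above 1.0 in a decile indicates overprediction for that part of the cost distribution (as with the ratios exceeding 2.0 reported for the 10th decile), and a ratio below 1.0 indicates underprediction.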