Simple Models and Classification in Networked Data
When entities are linked by explicit relations, classification methods that take advantage of the network can perform substantially better than methods that ignore the network. This paper argues that studies of relational classification in networked data should include simple network-only methods as baselines for comparison, in addition to the non-relational baselines that generally are used. In particular, comparing more complex algorithms with algorithms that only consider the network (and not the features of the entities) allows one to factor out the contribution of the network structure itself to the predictive power of the model. We examine several simple methods for network-only classification on previously used relational data sets, and show that they can perform remarkably well. The results demonstrate that the inclusion of network-only classifiers can shed new light on studies of relational learners.
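
To make the idea of a network-only classifier concrete, the following is a minimal sketch of one possible baseline: each unlabeled entity takes the majority label among its labeled neighbors, and entity features are never consulted. It is an illustrative assumption, not necessarily one of the methods evaluated in the paper; the function name `neighbor_vote`, the toy graph, and the labels are all hypothetical.

```python
from collections import Counter


def neighbor_vote(graph, known_labels, default=None):
    """Predict a label for each unlabeled node by majority vote over
    its labeled neighbors. Only link structure and known labels are
    used; node features play no role (hypothetical baseline)."""
    predictions = {}
    for node, neighbors in graph.items():
        if node in known_labels:
            continue  # nothing to predict for already-labeled nodes
        votes = Counter(
            known_labels[n] for n in neighbors if n in known_labels
        )
        if votes:
            predictions[node] = votes.most_common(1)[0][0]
        else:
            # No labeled neighbor: fall back to the supplied default.
            predictions[node] = default
    return predictions


# Toy undirected graph as an adjacency list (hypothetical data).
graph = {
    "a": ["b", "c", "d"],
    "b": ["a", "c"],
    "c": ["a", "b"],
    "d": ["a", "e"],
    "e": ["d"],
}
known_labels = {"b": "x", "c": "x", "e": "y"}

predictions = neighbor_vote(graph, known_labels)
print(predictions)
```

Comparing the accuracy of a baseline like this against a full relational learner on the same data gives a rough sense of how much predictive power comes from the link structure alone.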