Minimum Data Base Determination using Machine Learning

The exploitation of large data bases frequently requires large and usually expensive resources, in terms of both storage and processing time. It is possible to obtain equivalent reduced data sets in which the statistical information of the original data is preserved while redundant constituents are discarded. The physical embodiment of the relevant features of the data base is therefore more economical. The author proposes a method for obtaining an optimal transformed representation of the original data which is, in general, considerably more compact than the original without impairing its informational content. To certify the equivalence of the original data set (FD) and the reduced one (RD), the author applies an algorithm which relies on a Genetic Algorithm (GA) and a multivariate regression algorithm (AA). Through the combined application of GA and AA, the equivalent behavior of FD and RD may be guaranteed with a high degree of statistical certainty.

Year of publication:	2016
Authors:	Kuri-Morales, Angel Fernando
Published in:	International Journal of Web Services Research (IJWSR). - IGI Global, ISSN 1546-5004, ZDB-ID 2172665-6. - Vol. 13.2016, 4 (01.10.), p. 1-18
Publisher:	IGI Global
Subject:	Compaction | Data Bases | Machine Learning | Statistics

More details

Type of publication:	Article
Language:	English
Other identifiers:	10.4018/IJWSR.2016100101 [DOI]
Source:	Other ZBW resources

Persistent link: https://www.econbiz.de/10012048740