Minimum Data Base Determination using Machine Learning
The exploitation of large data bases frequently requires a large and usually expensive investment of resources, in terms of both storage and processing time. It is possible to obtain equivalent reduced data sets in which the statistical information of the original data is preserved while redundant constituents are dispensed with. The physical embodiment of the relevant features of the data base is therefore more economical. The author proposes a method for obtaining an optimal transformed representation of the original data which is, in general, considerably more compact than the original without impairing its informational content. To certify the equivalence of the original data set (FD) and the reduced one (RD), the author applies an algorithm which relies on a Genetic Algorithm (GA) and a multivariate regression algorithm (AA). Through the combined application of GA and AA, the equivalent behavior of FD and RD may be guaranteed with a high degree of statistical certainty.
| Year of publication: | 2016 |
|---|---|
| Authors: | Kuri-Morales, Angel Fernando |
| Published in: | International Journal of Web Services Research (IJWSR). - IGI Global, ISSN 1546-5004, ZDB-ID 2172665-6. - Vol. 13.2016, 4 (01.10.), p. 1-18 |
| Publisher: | IGI Global |
| Subject: | Compaction; Data Bases; Machine Learning; Statistics |
Similar items by subject
- timeseriesdb : manage and archive time series data in establishment statistics with R and PostgreSQL
  Bannert, Matthias, (2015)
- Credit risk database for SME financial inclusion
  Nguyen, Lan H., (2020)
- Mortality data correction in the absence of monthly fertility records
  Boumezoued, Alexandre, (2021)