Transcription factor-binding site identification and gene classification via fusion of the supervised-weighted discrete kernel clustering and support vector machine
The genetic regulatory mechanism heavily influences a substantial portion of the biological functions and processes needed to sustain life. For a comprehensive mechanistic understanding of biological processes, it is important to identify the common transcription factor (TF) binding sites (TFBSs) from a set of promoter sequences of co-regulated genes and to classify genes that are co-regulated by certain TFs, thereby providing insight into the mechanism that underlies the interaction among co-regulated genes and into complicated genetic regulation. We propose a new supervised-weighted discrete kernel clustering (SWDKC) classification method for the identification of TFBSs and the classification of genes. Our SWDKC method gave a smaller misclassification error rate than the other methods on both the simulated data and the real NF-κB data. We verify that the selected over-represented TFBSs serve as informative TFBSs from a biological point of view.
Year of publication: | 2014 |
---|---|
Authors: | Sohn, Insuk ; Shim, Jooyong ; Hwang, Changha ; Kim, Sujong ; Lee, Jae Won |
Published in: | Journal of Applied Statistics. - Taylor & Francis Journals, ISSN 0266-4763. - Vol. 41.2014, 3, p. 573-581 |
Publisher: | Taylor & Francis Journals |
Similar items by person
- Sohn, Insuk, (2009)
- Shim, Jooyong, (2009)
- Sohn, Insuk, (2008)