Research Publications - Abstract

Department of Computer Science & Computer Engineering

Wang, D., and Li, X.
Publication Year: 2009
Paper Title: GAPK: Genetic Algorithms with Prior Knowledge for Motif Discovery in DNA Sequences
Conference Name: 2009 IEEE Congress on Evolutionary Computation (IEEE CEC 2009)
Venue: Trondheim, Norway
Volume: IEEE CEC 2009
Pages: 277 - 284
Abstract: Discovery of transcription factor binding sites (TFBSs), or DNA motifs, in the promoter regions of genes plays a key role in understanding the regulation of gene expression. In the past decade, computational approaches to motif search, including evolutionary computation techniques, have demonstrated good potential, and some results reported in the literature are quite promising. Recently, progress on evolutionary mining of motifs has been made and documented in GAME and GALF-P, where GAME employs a Bayesian-based scoring function and GALF-P aims to improve algorithm performance with local filtering and adaptive post-processing. To improve discovery performance in terms of recall, precision and algorithm reliability, this paper presents an alternative genetic algorithm, termed GAPK, for the motif discovery problem. In the proposed GAPK framework, prior knowledge of the motifs in a given dataset is used to initialize the population. Our technical contributions include a matrix representation for k-mers, a mismatch-based filtering method for search space reduction, a model mismatch score (MMS) as the fitness function, new genetic operations and a model refinement process. Benchmark datasets associated with eight transcription factors are used in our experiments. Comparative studies were carried out with well-known tools including GAME, GALF-P, MEME, MDScan and AlignACE. Results show that our method outperforms the other techniques in terms of F-measure.
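The abstract names a matrix representation for k-mers, mismatch-based filtering and the F-measure, but does not define them. The following is a minimal sketch of one plausible reading, not the authors' implementation: it assumes a 4 x k binary (one-hot) matrix, a Hamming-style mismatch count against a hypothetical consensus string, and the standard balanced F-measure. All function names and the max_mismatches parameter are illustrative assumptions, and the paper's model mismatch score (MMS) is not reproduced here.

```python
BASES = "ACGT"  # assumed row order of the matrix; not specified in the abstract


def kmer_to_matrix(kmer):
    """Encode a k-mer as a 4 x k binary matrix (rows A, C, G, T)."""
    kmer = kmer.upper()
    return [[1 if ch == base else 0 for ch in kmer] for base in BASES]


def mismatches(a, b):
    """Count positions at which two equal-length k-mers differ."""
    if len(a) != len(b):
        raise ValueError("k-mers must have the same length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def filter_kmers(sequence, consensus, max_mismatches):
    """Keep the k-mers of `sequence` within `max_mismatches` of `consensus`.

    Hypothetical filter: the paper's actual filtering rule may differ.
    """
    k = len(consensus)
    windows = (sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return [w for w in windows if mismatches(w, consensus) <= max_mismatches]


def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, filter_kmers("TTACGTAA", "ACGT", 1) scans the five 4-mers of the sequence and keeps only those with at most one mismatch against the assumed consensus, which shrinks the set of candidates that a search would need to evaluate.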
Last Updated: 14 October, 2009