International Conference on Information Technology and Computer Science, 3rd (ITCS 2011)
V. E. Muhin
National Technical University of Ukraine
W. B. Hu
Wuhan University
Given a collection of strings S = {s1, ..., sn} over an alphabet Σ, a superstring s of S is a string containing each si as a substring; that is, for each i, 1 ≤ i ≤ n, s contains a block of |si| consecutive characters that match si exactly. The shortest superstring problem is to find a superstring s of minimum length. This problem is NP-hard and has applications in computational biology and data compression. In this paper, we characterize the shortest superstring as a Hamiltonian path in a directed graph and introduce an efficient (polynomial-time) approach for it.
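
To make the definitions above concrete, the following is a minimal illustrative sketch, not the approach proposed in the paper. It assumes an overlap graph whose nodes are the input strings and whose edge weights are suffix-prefix overlap lengths, and it enumerates every Hamiltonian path by brute force, which takes exponential time and is suitable only for tiny inputs. The function names `is_superstring`, `overlap`, and `shortest_superstring_bruteforce` are hypothetical and chosen for illustration.

```python
from itertools import permutations


def is_superstring(s, strings):
    """Return True if every string in `strings` occurs in `s` as a contiguous block."""
    return all(t in s for t in strings)


def overlap(a, b):
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def shortest_superstring_bruteforce(strings):
    """Try every Hamiltonian path in the overlap graph (assumed model, exponential time)."""
    # Remove duplicates, then drop strings contained in another string:
    # they are covered automatically and never change the answer.
    unique = list(dict.fromkeys(strings))
    nodes = [t for t in unique if not any(t != u and t in u for u in unique)]
    if not nodes:
        return ""
    best = None
    for order in permutations(nodes):
        # Merge consecutive strings along the path, reusing the maximal overlap.
        merged = order[0]
        for prev, nxt in zip(order, order[1:]):
            merged += nxt[overlap(prev, nxt):]
        if best is None or len(merged) < len(best):
            best = merged
    return best


if __name__ == "__main__":
    reads = ["ACGT", "CGTA", "GTAC"]
    result = shortest_superstring_bruteforce(reads)
    print(result, len(result), is_superstring(result, reads))
```

For the three reads in the usage example, each consecutive pair overlaps in three characters, so the path visiting them in the given order yields a superstring of length 6, the minimum possible for three distinct strings of length 4.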

I. Introduction
II. DNA Sequencing Problem
III. DNA Overlap Graph Model
IV. Conclusions
