INSTITUTE OF PHYSICS PUBLISHING
NANOTECHNOLOGY
Nanotechnology 17 (2006) R27–R39
doi:10.1088/0957-4484/17/2/R01
TOPICAL REVIEW
DNA computing: applications and challenges

Z Ezziane

Dubai University College, College of Information Technology, PO Box 14143, Dubai, UAE
Received 17 August 2005
Published 21 December 2005
Online at stacks.iop.org/Nano/17/R27

Abstract

DNA computing is a discipline that aims at harnessing individual molecules at the nanoscopic level for computational purposes. Computation with DNA molecules possesses an inherent interest for researchers in computers and biology. Given their vast parallelism and high-density storage, DNA computing approaches are employed to solve many combinatorial problems. However, the exponential scaling of the solution space prevents applying an exhaustive search method to problem instances of realistic size, and therefore artificial intelligence models are used in designing methods that are more efficient. DNA has also been explored as an excellent material and a fundamental building block for building large-scale nanostructures, constructing individual nanomechanical devices, and performing computations. Molecular-scale autonomous programmable computers have been demonstrated, allowing both input and output information to be in molecular form. This paper presents a review of recent advances in DNA computing and presents major achievements and challenges for researchers in the foreseeable future.
1. Introduction

DNA (deoxyribonucleic acid) computing research was inspired by the similarity between the way DNA works and the operation of a theoretical device known as a Turing machine. Turing machines process information and store it as a sequence, or list, of symbols, which relates very naturally to the way biological machinery works. Biomolecular computing, where computations are performed by biomolecules, is challenging traditional approaches to computation both theoretically and technologically. The idea that molecular systems can perform computations is not new and was indeed more natural in the pre-transistor age. Most computer scientists know of von Neumann's discussions of self-reproducing automata in the late 1940s, some of which were framed in molecular terms (McCaskill 2000). Important was the idea, which appears less natural in the current age of dichotomy between hardware and software, that the computations of a device can alter the device itself. This vision is natural at the scale of molecular reactions, although it may appear as a fantasy to those running huge chip
production facilities. Alan Turing also looked beyond purely symbolic processing to natural bootstrapping mechanisms in his work on self-structuring in molecular and biological systems (McCaskill 2000). In biology, the idea of molecular information processing took hold starting from the unravelling of the genetic code and translation machinery, and extended to genetic regulation, cellular signalling, protein trafficking, morphogenesis and evolution, all of which have progressed independently of developments in the neurosciences. The essential role of information processing in evolution, and the ability to address these issues on laboratory timescales at the molecular level, was first demonstrated by Adleman's key experiment (Adleman 1994), which showed that the tools of laboratory molecular biology could be used to program computations with DNA in vitro. DNA computing approaches can be performed either in vitro (purely chemically) or in vivo (i.e. inside cellular life forms). The huge information storage capacity of DNA and the low energy dissipation of DNA processing led to an explosion of interest in massively parallel DNA computing. For serious proponents of the field, however, there never was
Figure 1. Example of a DNA molecule.
Figure 2. A DNA molecule with sticky ends.
a question of brute-force search with DNA solving the problem of exponential growth in the number of alternative solutions indefinitely. Artificial intelligence methods are used to address this combinatorial issue in DNA computing (Impagliazzo et al 1998, Sakamoto et al 1999), as discussed later in this review. DNA computing is often compared with quantum computing. Quantum computing requires advanced physical technology to isolate the mixed quantum states necessary to implement efficient computations for combinatorially complex problems such as factorization. DNA computing, by contrast, operates in naturally noisy environments, such as a glass of water, and provides an evolvable platform for computation in which the computer construction machinery itself is embedded. Since DNA computing is linked to molecular construction, such as nanomechanical devices and other nanoscale structures, the computations may eventually also be employed to build three-dimensional self-organizing, partially electronic or, more remotely, even quantum computers (McCaskill 2000).
2. The structure of DNA

DNA is the major example of a biological molecule that stores information and can be manipulated, via enzymes and nucleic acid interactions, to retrieve information. Just as a string of binary data is encoded with zeros and ones, a strand of DNA is encoded with four bases (known as nucleotides), represented by the letters A, T, C, and G. Each strand, according to chemical convention, has a 5' and a 3' end; hence, any single strand has a natural orientation. Figure 1 presents a DNA molecule composed of ten pairs of nucleotides. Bonding occurs by the pairwise attraction of bases: A bonds with T and G bonds with C. The pairs (A, T) and (G, C) are therefore known as complementary base pairs. DNA computing relies on developing algorithms that solve problems using the information encoded in the sequence of nucleotides that make up DNA's double helix, and then breaking and making new bonds between them to reach the answer.

The nucleotides are spaced every 0.35 nm along the DNA molecule, giving DNA a remarkable data density, estimated at one bit per cubic nanometre, and potentially exabyte (10^18 byte) amounts of information in a gram of DNA (Chen et al 2004). In two dimensions, assuming one base per square nanometre, the data density is over one million Gbits per square inch, whereas the data density of a typical high-performance hard drive is about 7 Gbits per square inch (Ryu 2000). DNA computing is also massively parallel and can reach approximately 10^20 operations s^-1, compared with existing teraflop supercomputers.

Another important property of DNA is its double-stranded nature. The bases A and T, and C and G, can bind together, forming base pairs. Therefore, every DNA sequence has a natural complement. For example, if sequence S is AATTCGCGT, its complement, S', is TTAAGCGCA. Both S and S' will hybridize to form double-stranded DNA. This complementarity can be used for error correction. If an error occurs in one of the strands of double-stranded DNA, repair enzymes can restore the proper DNA sequence by using the complementary strand as a reference. In DNA replication, there is one error for every 10^9 copied bases, whereas hard drives have one error for every 10^13 bits with Reed-Solomon correction (Ryu 2000).

From the basic principle of base-pair complementarity, DNA contains two elements crucial to any computer: (1) a processing unit, in the form of enzymes that denature, replicate and anneal DNA, operations capable of cutting, copying, and pasting; and (2) a storage unit encoded in DNA strings (Thaker 2004). Hence, when enzymes work on multiple DNA molecules at the same time, DNA computing becomes massively parallel and ultimately very powerful. The power of DNA computing comes from its memory capacity and parallel processing. For example, in bacteria, DNA can be replicated at a rate of about 500 base pairs a second, which is 10 times faster than in human cells. This represents about 1000 bits s^-1, but when many copies of the replication enzymes work on DNA in parallel, the number of DNA strands increases exponentially (2^n after n iterations). After 30 iterations, the aggregate rate rises to about 1 Tbit s^-1.
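Base-pair complementarity and the parallel-replication throughput argument above can be sketched in a few lines of code. This is our own illustration, not from the paper; the function names and the idealized assumption that each doubling adds a fully productive copy are ours.

```python
# Watson-Crick complementarity and the parallel replication throughput
# argument, as a toy model (names and assumptions are illustrative).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand: str) -> str:
    """Return the base-wise complement of a DNA strand."""
    return "".join(COMPLEMENT[base] for base in strand)

def hybridizes(s: str, t: str) -> bool:
    """True if two strands of equal length are fully complementary."""
    return len(s) == len(t) and complement(s) == t

# The example from the text: S and its complement S'.
S = "AATTCGCGT"
S_prime = complement(S)       # "TTAAGCGCA"
assert hybridizes(S, S_prime)

# Throughput if each doubling adds a full copy working in parallel:
# one replication site manages ~1000 bits/s; after n doublings there
# are 2**n strands being extended at once.
def parallel_rate(base_rate_bits: int, doublings: int) -> int:
    return base_rate_bits * 2 ** doublings

print(parallel_rate(1000, 30))   # 1073741824000, i.e. ~1 Tbit/s
```

The idealization ignores reagent depletion and enzyme kinetics; it only illustrates why 30 doublings of a 1000 bits s^-1 process reach the Tbit s^-1 scale quoted above.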
2.1. Matching DNA sticky ends

Restriction enzymes catalyse the cutting of both strands of a DNA molecule at very specific DNA base sequences, called recognition sites. Recognition sites are typically 4-8 DNA base pairs long. Figure 2 shows a DNA molecule in which four nucleotides at the left end and five at the right end are not paired with nucleotides on the opposite strand. Such a molecule has sticky ends. There are over 100 different restriction enzymes, each of which cuts at its specific recognition site(s). A restriction enzyme leaves short sticky ends of DNA that will match and attach to the sticky ends of any other DNA that has been cut with the same enzyme. DNA ligase joins the matching sticky ends of DNA pieces from different sources that have been cut by the same restriction enzyme. Many restriction enzymes work by finding palindromic sections of DNA (regions where the sequence on one strand, read 5' to 3', is the same as the sequence on the complementary strand read in the same direction). The process of joining matching sticky DNA ends is used extensively in the field of DNA technology to produce substances such as insulin and interferon, and to splice genes that alter a cell or organism from its original DNA for some benefit. For example, in agriculture gene splicing has been used to delay the ripening process of tomatoes, to make more nutritious corn, to make rice that contains carotenes and to produce plants with natural pesticides.
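The palindromic-recognition-site idea can be made concrete in software. The sketch below (our illustration; helper names are ours) checks that a site reads the same on both strands and cuts the top strand at a fixed offset within each occurrence, the way EcoRI cuts after the G of its GAATTC site, leaving sticky ends.

```python
# Locating a palindromic restriction site and cutting the top strand.
# EcoRI (site GAATTC, top-strand cut after the first G) is the standard
# example; the `cut` helper and its offset convention are ours.
COMP = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(s: str) -> str:
    return "".join(COMP[b] for b in reversed(s))

def is_palindromic_site(site: str) -> bool:
    """Restriction sites read the same 5'->3' on both strands."""
    return site == reverse_complement(site)

def cut(strand: str, site: str, offset: int):
    """Cut the top strand at `offset` within each occurrence of `site`.
    Sticky ends arise because the bottom strand is cut at the mirrored
    position; here we only return the top-strand fragments."""
    fragments, start = [], 0
    i = strand.find(site)
    while i != -1:
        fragments.append(strand[start:i + offset])
        start = i + offset
        i = strand.find(site, i + 1)
    fragments.append(strand[start:])
    return fragments

assert is_palindromic_site("GAATTC")            # EcoRI site
print(cut("ACCTGGAATTCCTTAA", "GAATTC", 1))     # ['ACCTGG', 'AATTCCTTAA']
```

Any two fragments produced with the same enzyme expose complementary overhangs, which is why DNA ligase can rejoin pieces from different sources.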
3. DNA computers A DNA computer is a collection of specially selected DNA strands whose combinations will result in the solution to
Figure 3. The von Neumann architecture for a self-replicating system: a universal computer coupled to a universal constructor.
some problem, and a nanocomputer is considered as a machine that uses DNA to store information and perform complex calculations. Benenson et al (2003) observed the unique properties of DNA as a fundamental building block in the fields of supramolecular chemistry, nanotechnology, nanocircuits, molecular switches, molecular devices, and molecular computing. Many designs for miniature computers aimed at harnessing the massive storage capacity of DNA have been proposed over the years. Earlier schemes relied on a molecule known as ATP, a common source of energy for cellular reactions, as a fuel source. However, Benenson et al (2003) designed a new model in which a DNA molecule provides both the initial data and sufficient energy to complete the computation. Both models of the molecular computer are so-called automata. Given an input string composed of two different symbols, an automaton uses predetermined rules to arrive at an output value that answers a particular question. A specific enzyme acts as the computer's hardware by cutting a piece of the input molecule and releasing the energy stored in its bonds. This heat energy then powers the next computation (Graham 2003). Positional control combined with appropriate molecular tools should enable researchers and practitioners to build a truly overwhelming range of molecular structures. One of the outcomes will be a general-purpose programmable device that is able to make copies of itself. von Neumann carried out a detailed analysis of self-replicating systems in a theoretical cellular automata model. In this model, as depicted in figure 3, he used a universal computer for control and a universal constructor to build more automata. The universal constructor was a robotic arm that, under computer control, could move in two dimensions and alter the state of the cell at the tip of its arm.
By sweeping systematically back and forth, the arm could eventually build any structure that the computer instructed it to. In his three-dimensional model, von Neumann retained the idea of a positional device and a computer to control it. The architecture of Drexler's assembler, as depicted in figure 4, is a specialization of the more general architecture proposed by von Neumann. As before, there are a computer and a constructor. However, the computer has shrunk to a molecular computer, while the constructor combines two features: a robotic positional device and a well-defined set of chemical operations that take place at the tip of the positional device (http://www.zyvex.com/nanotech/MITtecRvwSmlWrld/article.html). The complexity of a self-replicating system must be reasonable and acceptable. In addition, the complexity of an assembler, in terms of bytes, should not be beyond the complexity that can be dealt with by today's engineering capabilities. As indicated in table 1, the primary observation to
Figure 4. Drexler's architecture for an assembler: a molecular computer coupled to a molecular constructor, which combines molecular positional capability with tip chemistry.
Table 1. Complexity of self-replicating systems (megabytes).

von Neumann's universal constructor    About 0.63
Internet Worm                          About 0.63
Mycoplasma genitalium                  0.14
E. coli                                1.16
Drexler's assembler                    12.5
Human                                  800
NASA Lunar Manufacturing Facility      13 000
be drawn from these data is that simpler designs and proposals for self-replicating systems both exist and are well within current design capabilities. The engineering effort required to design systems of such complexity will be significant, but should not be greater than that involved in the design of existing systems such as computers. Self-replication is used as a means to an end, not as an end in itself. A system able to make copies of itself but unable to make much of anything else would not be very useful. The purpose of self-replication in the context of manufacturing is to permit the low-cost replication of a flexible and programmable manufacturing system. Hence, the objective is to build a system that can be reprogrammed to make a very wide range of molecularly precise structures (http://www.zyvex.com/nanotech/selfRep.html).

3.1. Self-assembling nanostructures with DNA

DNA molecular structures and intermolecular interactions are particularly amenable to the design and synthesis of complex molecular objects. Winfree et al (1998) used a molecular self-assembly approach for the fabrication of objects specified with nanometre precision. Their results demonstrated the potential of using DNA to create self-assembling periodic nanostructures, thereby leading the way to nanotechnology. A few years later, Mao et al (2000) reported a one-dimensional algorithmic self-assembly of DNA triple-crossover molecules that can be used to execute four steps of a logical (cumulative XOR) operation on a string of binary bits. Their results suggest that computation by self-assembly may be scalable. Figure 5 depicts a simplified version of the implementation of the XOR cellular automaton using the Sierpinski rules (Rothemund et al 2004). Figure 5 has five horizontal parts: (A), (B), (C), (D), and (E).
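The cumulative XOR rule that the Sierpinski tiles implement is easy to simulate in software. The sketch below is our own minimal model, not code from the cited papers: each cell of a new row is the XOR of the two cells above it, and a single nucleating 1 grows into the Sierpinski triangle.

```python
# Minimal simulation of the XOR cellular automaton behind the
# Sierpinski tile set (our illustration; function names are ours).
def next_row(row):
    """One synchronous update; the boundary 0s model the nucleating frame
    that supplies inputs at the edges of the growing assembly."""
    padded = [0] + row + [0]
    return [padded[i] ^ padded[i + 1] for i in range(len(padded) - 1)]

def sierpinski(steps):
    """Rows of the space-time history, starting from a single 1."""
    row, rows = [1], [[1]]
    for _ in range(steps):
        row = next_row(row)
        rows.append(row)
    return rows

for r in sierpinski(7):              # prints the Sierpinski triangle
    print("".join(" #"[bit] for bit in r))
```

The four tile types T-00, T-11, T-01 and T-10 described below correspond to the four (x, y) input pairs of the `x ^ y` update; error-free tile growth traces out exactly the rows this loop prints.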
On the left of (A), two time steps of the execution are shown as a space-time history, with cells updated synchronously according to the XOR function. The right side of (A) shows the Sierpinski triangle. Part (B) translates the space-time history into a tiling, in which for each possible input pair a tile T-xy is generated so that it bears the inputs represented as shapes on the lower
[Figure 5 panels: (A) the space-time history of the XOR automaton (rows at t = 0 and t = 1) and the resulting Sierpinski triangle; (B) translation of the space-time history into a tiling of input/output tiles T-xy, with z = x xor y; (C) the four Sierpinski rule tiles T-00, T-11, T-01 and T-10; (D) initial conditions provided by nucleating structures (0s and 1s), with error-free growth yielding the Sierpinski pattern; (E) error-prone growth including mismatch errors.]
Figure 5. The XOR cellular automaton implementation using tile-based self-assembly. (This figure is in colour only in the electronic version)
half of each side and the output as shapes duplicated on the top half of each side. Part (C) shows the four Sierpinski rule tiles: T-00, T-11, T-01, and T-10 represent the four entries of the truth table for XOR: 0 XOR 0 = 0, 1 XOR 1 = 0, 0 XOR 1 = 1, and 1 XOR 0 = 1. Part (D) shows how error-free growth results in the Sierpinski pattern, and part (E) uses symbols to indicate mismatch errors.

DNA nanostructures provide a programmable methodology for bottom-up nanoscale construction of patterned structures, utilizing macromolecular building blocks called DNA tiles, based on branched DNA. These tiles have sticky ends that match the sticky ends of other DNA tiles, facilitating further assembly into larger structures known as DNA tiling lattices. In principle, DNA tiling assemblies can be made to form any computable two- or three-dimensional pattern, however complex, with the appropriate choice of the tiles' component DNA (Reif et al 2005). One potential approach is to use patterned DNA as scaffolds or templates for organizing and positioning molecular electronics and other components, such as molecular sensors, with precision and specificity. The programmability lets this scaffolding have the patterning required for fabricating complex devices made of these components. Sung et al (2004) discussed the fabrication and characterization of an original class of nanostructures based on DNA scaffolds. They reported the self-assembly of one- and two-dimensional DNA scaffolds, which served as templates for the targeted deposition of ordered nanoparticle and molecular arrays. Turberfield (2003) proposed using self-assembling DNA
nanostructures as scaffolds for constructing and positioning molecular-scale electronic devices and wires. A principal challenge in DNA tiling self-assembly is the control of assembly errors. This is particularly relevant to computational self-assemblies, which, with complex patterning at the molecular scale, are prone to a rather high rate of error, ranging between approximately 0.5% and 5% (Reif et al 2005). The limitation and/or elimination of these errors in self-assembly represents the major challenge for nanostructure self-assembly.

3.2. DNA nanomachines

DNA has been explored as an excellent material for building large-scale nanostructures, constructing individual nanomechanical devices, and performing computations (Seeman 2003). A variety of DNA nanomechanical devices have previously been constructed that demonstrate motions such as open/close (Yurke et al 2000, Simmel and Yurke 2001, 2002, Liu and Balasubramanian 2003), extension/contraction (Li and Tan 2002, Alberti and Mergny 2003, Feng et al 2003), and motors/rotation (Mao et al 1999, Yan et al 2002, Niemeyer and Adler 2002), mediated by external environmental changes such as the addition and removal of DNA fuel strands (Li and Tan 2002, Alberti and Mergny 2003, Simmel and Yurke 2001, 2002, Yan et al 2002, Yurke et al 2000) or changes in the ionic composition of the solution (Mao et al 1999, Liu and Balasubramanian 2003). The DNA walker could ultimately be used to carry out computations and to precisely transport nanoparticles of
material. The walker can be programmed in several ways to this end. For example, information can be encoded in the walker fragments as well as in the track so that, while performing motion, the walker simultaneously carries out computation. Yin et al (2005a, 2005b) designed an autonomous DNA walking device in which a walker moves unidirectionally along a linear track. Sherman and Seeman (2004) constructed a DNA walking device controlled by DNA fuel strands. Reif (2003) designed an autonomous DNA walking device and an autonomous DNA rolling device that move in a random bidirectional fashion along DNA tracks. Shin and Pierce (2004) designed a DNA walker for molecular transport. Recently, Yin et al (2005a, 2005b) encoded computational power into a DNA walking device embedded in a DNA lattice, and thereby accomplished the design of an autonomous nanomechanical device capable of universal computation and translational motion. Implementing controllable molecular nanomachines made of DNA is one of the objectives of DNA computing and DNA nanotechnology (Takahashi et al 2005). Control of DNA machines has been implemented using different methods: (1) DNA strands that hybridize with target machines and drive their state transitions, (2) DNA strands used as catalysts for the formation of double helices in such nanomachines, and (3) the B-Z transition of DNA, capable of switching the conformation of the DNA motor (Mao et al 1999). Various approaches have implemented the first method. Yurke et al (2000) reported the construction of a DNA machine in which DNA is used not only as a structural material, but also as 'fuel'. Simmel and Yurke (2001) described a DNA-based molecular machine which has two movable arms that are pushed apart when a strand of DNA, the fuel strand, hybridizes with a single-stranded region of the molecular machine. Yan et al (2002) implemented a robust DNA mechanical device controlled by hybridization topology.
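The first control method, state transitions driven by fuel strands, can be caricatured as a two-state machine. The sketch below is our own toy model in the spirit of the Yurke-style tweezers, not the cited experiments: a fuel strand complementary to the machine's single-stranded loop closes the device, and an "antifuel" strand (the full complement of the fuel) removes the fuel via its exposed toehold, reopening it. All sequences and names are made up.

```python
# Toy fuel-strand-driven DNA device (illustrative model only).
COMP = {"A": "T", "T": "A", "C": "G", "G": "C"}

def reverse_complement(s: str) -> str:
    return "".join(COMP[b] for b in reversed(s))

class Tweezers:
    def __init__(self, loop: str, toehold: str = "TTTT"):
        # Fuel closes the device; its toehold stays single-stranded.
        self.fuel = reverse_complement(loop) + toehold
        # Antifuel binds the toehold and strips the fuel off by branch
        # migration, leaving an inert waste duplex.
        self.antifuel = reverse_complement(self.fuel)
        self.state = "open"

    def add(self, strand: str) -> str:
        if self.state == "open" and strand == self.fuel:
            self.state = "closed"    # fuel hybridizes with the loop
        elif self.state == "closed" and strand == self.antifuel:
            self.state = "open"      # fuel removed as a waste duplex
        return self.state

tw = Tweezers("ACGTACGT")
print(tw.add(tw.fuel))        # closed
print(tw.add(tw.antifuel))    # open
```

The point of the model is the cycle: each open/close round consumes one fuel/antifuel pair, which is why such devices are said to run on DNA as fuel.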
On the other hand, implementations of the second method have also been reported. Seelig (2004) presented experimental results on the control of the decay rates of a metastable DNA 'fuel', discussed how the fuel complex can serve as the basic ingredient for a DNA hybridization catalyst, and proposed a method for implementing arbitrary digital logic circuits. Turberfield and Mitchell (2003) described kinetic control of DNA hybridization, which has the potential to increase the flexibility and reliability of DNA self-assembly by inhibiting the hybridization of complementary oligonucleotides. The proposed DNA catalysts were shown to be effective in promoting hybridization and in using DNA as a fuel to drive free-running artificial molecular machines.
4. DNA computing DNA computing is a novel and fascinating development at the interface of computer science and molecular biology. It has emerged in recent years, not simply as an exciting technology for information processing, but also as a catalyst for knowledge transfer between information processing, nanotechnology, and biology. This area of research has the potential to change our understanding of the theory and practice of computing.
4.1. Biomolecular computing

Biomolecular computers are molecular-scale, programmable, autonomous computing machines in which the input, output, software, and hardware are made of biological molecules (Benenson and Shapiro 2004). Biomolecular computers hold the promise of direct computational analysis of biological information in its native biomolecular form, avoiding its conversion into an electronic representation (Adar et al 2004). This has led to the pursuit of autonomous, programmable computers which can be modelled as finite automata (McAdams and Arkin 1997). An automaton can be stochastic, namely it has two or more competing transitions for each state-symbol combination, each with a prescribed probability. A stochastic automaton is useful for processing uncertain information, like most biological information. Because of the stochastic nature of biomolecular systems, a stochastic biomolecular computer would be more suitable for analysing biological information than a deterministic one (McAdams and Arkin 1997). Stochastic molecular automata have been constructed in which stochastic choice is realized by means of competition between alternative paths, and choice probabilities are programmed by the relative molar concentrations of the software molecules coding for the alternatives. This approach was used in the construction of a molecular computer capable of probabilistic logical analysis of disease-related molecular indicators (Adar et al 2004). Benenson et al (2001) described a programmable finite automaton comprising DNA and DNA-manipulating enzymes that solves computational problems autonomously. The automaton's hardware consists of a restriction nuclease and ligase, the software and input are encoded by double-stranded DNA, and programming amounts to choosing appropriate software molecules.
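At the software level, the two-state, two-symbol automata realized by Benenson et al are ordinary finite automata. The sketch below is our own illustration: the transition table plays the role of the "software molecules", and this particular program (accept inputs with an even number of b's) is a representative example, not necessarily the exact program from the paper.

```python
# A programmable two-state finite automaton over the alphabet {a, b}.
# Choosing a different transition table = choosing different software
# molecules (names and example program are ours).
def run_automaton(transitions, start, accepting, word):
    state = start
    for symbol in word:
        state = transitions[(state, symbol)]
    return state in accepting

# Program: accept exactly the inputs containing an even number of b's.
even_b = {("S0", "a"): "S0", ("S0", "b"): "S1",
          ("S1", "a"): "S1", ("S1", "b"): "S0"}

print(run_automaton(even_b, "S0", {"S0"}, "abab"))    # True: two b's
print(run_automaton(even_b, "S0", {"S0"}, "babba"))   # False: three b's
```

A stochastic automaton of the kind described above would replace each table entry with several weighted alternatives, the weights playing the role of the relative molar concentrations of competing transition molecules.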
Their experiments used 10^12 automata sharing identical software, running independently and in parallel on inputs in a 120 µl solution at room temperature, at a combined rate of 10^9 transitions s^-1, with a transition fidelity greater than 99.8%, and consuming less than 10^-10 W. It has also been demonstrated that a single DNA molecule can provide both the input data and all of the necessary fuel for a molecular automaton (Benenson et al 2003). Those experiments showed 3 × 10^12 automata µl^-1 performing 6.6 × 10^10 transitions s^-1 µl^-1 with a transition fidelity of 99.9%, dissipating about 5 × 10^-9 W µl^-1 as heat at ambient temperature. An autonomous biomolecular computer was described recently (Benenson et al 2004) which analyses the levels of messenger RNA (mRNA) species and, in response, generates a molecule capable of affecting levels of gene expression. The designed biomolecular computer works at a concentration of close to 10^12 computers µl^-1. The modularity of their design facilitates improving each biomolecular computer component independently. They demonstrated how regulation of the computer by other biological molecules, such as proteins, and the output of other biologically active molecules, such as RNA interference, can be explored concurrently and independently. Progress in the development of molecular computers may lead to a 'Doctor in a Cell': a biomolecular computer that operates inside the living
organism, for example the human body, programmed with medical information to diagnose potential diseases and produce the required drugs in situ. This will ultimately lead to a device capable of processing DNA inside the human body, finding abnormalities and creating healing drugs. However, major changes will be needed for the molecular computer to operate in vivo (Shapiro et al 2004). The Shapiro Lab is renowned for the creation of biomolecular computing devices which are so tiny that more than a trillion fit into one drop of water. These devices are made entirely of DNA and other biological molecules. A recent version was programmed by Shapiro and his research team to identify signs of specific cancers in a test tube, to diagnose the type of cancer and to release drug molecules in response. Though cancer-detecting computers are still at a very early stage, and can thus far only function in test tubes, Shapiro and his research team envision future biomolecular devices that may be injected directly into the human body to detect and prevent or cure disease. At the Shapiro Lab, recent research deals mainly with the energy consumption of a computer. They were able to construct a molecular computer whose sole energy source is its input, a combination that is unthinkable in the realm of electronic computers; this energy is extracted as the input data molecule is destroyed during computation (http://www.wisdom.weizmann.ac.il/~udi/). Recently, they initiated the BioSPI project, which is concerned with developing predictive models for molecular and biochemical processes. Such processes, carried out by networks of proteins, mediate the interaction of cells with their environment and are responsible for most of the information processing inside cells. To this end, they developed a new computer system, called BioSPI, for the representation and simulation of biochemical networks (Shapiro et al 2002).

4.2. Solving problems using DNA computing

4.2.1.
Finite state problems. To compete with silicon, it is important to develop the capability of biomolecular computation to quickly execute basic operations, such as arithmetic and Boolean operations, that are executed in single steps by conventional machines. In addition, these basic operations should be executable in a massively parallel fashion (Reif 1998). Guarnieri and Bancroft (1999) developed a DNA-based addition algorithm employing successive primer extension reactions to implement the carries and the Boolean logic required in binary addition (similar methods can be used for subtraction). Guarnieri, Fliss, and Bancroft (Guarnieri et al 1996) prototyped the first biomolecular addition operations (on single bits) in recombinant DNA, presenting the development of a DNA-based algorithm for addition. The DNA representation of two non-negative binary numbers was presented in a form permitting a chain of primer extension reactions to carry out the addition operation, and they demonstrated the feasibility of this algorithm by executing a simple example biochemically. However, the approach suffered from some limitations: (1) only two numbers were added, so it did not take advantage of the massively parallel processing capabilities of biomolecular computation; and (2) the outputs were encoded distinctly from the inputs, hence it did not allow for repetitive operations.
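The logic behind these primer-extension addition schemes is a ripple-carry chain: each extension step consumes one bit of each operand plus the running carry. The sketch below is ours; it abstracts away the DNA encoding entirely and shows only the chained carry logic.

```python
from itertools import zip_longest

# Ripple-carry addition of two little-endian bit lists, mirroring the
# chain of primer extension reactions in the schemes above (the DNA
# encoding itself is not modelled).
def ripple_add(a_bits, b_bits):
    """Add two little-endian bit lists; returns little-endian sum bits."""
    out, carry = [], 0
    for a, b in zip_longest(a_bits, b_bits, fillvalue=0):
        total = a + b + carry
        out.append(total & 1)   # sum bit produced at this step
        carry = total >> 1      # carry fed into the next extension step
    if carry:
        out.append(carry)
    return out

# 6 (little-endian [0, 1, 1]) + 7 ([1, 1, 1]) = 13 ([1, 0, 1, 1])
print(ripple_add([0, 1, 1], [1, 1, 1]))   # [1, 0, 1, 1]
```

The second limitation noted above, outputs encoded distinctly from inputs, corresponds here to `out` using a different representation than `a_bits`/`b_bits` would in DNA; the later chained-arithmetic methods remove exactly that restriction.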
Subsequent proposed methods (Orlian et al 1998, Leete et al 1997, Gupta et al 1997) for basic operations such as arithmetic (addition and subtraction) allow chaining of the output of these operations into the inputs of further operations, and allow operations to be executed in a massively parallel fashion. Rubin et al (1997) presented an experimental demonstration of a biomolecular computation method for chained integer arithmetic.

4.2.2. Combinatorial problems. DNA computing methods have been employed on complex computational problems such as the Hamiltonian path problem (HPP) (Adleman 1994), the maximal clique problem (Ouyang et al 1997), the satisfiability problem (SAT) (Liu et al 2000), and chess problems (Faulhammer et al 2000). The advantage of these approaches is the huge parallelism inherent in DNA-based computing, which has the potential to yield vast speedups over conventional electronic-based computers for such search problems. The computational problem considered by Adleman (1994) was a simple instance of the directed travelling salesman problem (TSP), also called the Hamiltonian path problem (HPP). The technique used for solving the problem was a new technological paradigm, termed DNA computing. Adleman's experiment represents a landmark demonstration of data processing and communication at the level of biological molecules. It was the first DNA computer set up to solve the TSP. This problem uses the scenario of a door-to-door salesman who must visit several connected cities without going through any city twice. To solve this problem using DNA, the first step is to assign a genetic sequence to each city. For example, the city of Los Angeles might be coded GCACAGT. If two cities are connected, the connecting sequence is formed from the last three letters of the departure city and the first three letters of the arrival city. For example, if Los Angeles connected to New York, the last three letters of Los Angeles (AGT) would be joined to the first three letters of New York.
The TSP seems a simple puzzle; however, the most advanced supercomputers would take years to calculate the optimal route for 50 cities (Parker 2003). Adleman solved the problem for seven cities within a second, using DNA molecules in a standard reaction tube. He represented each of the seven cities as separate, single-stranded DNA molecules, 20 nucleotides long, and all possible paths between cities as DNA molecules composed of the last ten nucleotides of the departure city and the first ten nucleotides of the arrival city. Mixing the DNA strands with DNA ligase and adenosine triphosphate (ATP) resulted in the generation of all possible random paths through the cities. However, the majority of these paths were not applicable to the situation, because they were either too long or too short, or they did not start or finish in the right city. Adleman then filtered out all the paths that did not start and end with the correct molecules, and those that did not have the correct length and composition. Any remaining DNA molecules represented a solution to the problem. The DNA computer provides enormous parallelism: in one fiftieth of a teaspoon of solution, approximately 10^14 DNA strands were simultaneously concatenated in about one second. The Adleman approach to the HPP is shown in figure 6. An instance of the HPP which is solved
[Figure 6 flowchart: Start → generate strands encoding random paths → keep only the potential solutions → identify uniquely the HP solution, discarding non-matching strands. A second flowchart fragment: Start → generate all possible n-bit strings in S → let j = 1, with w and x representing literals → generate a new set S by merging extracted strings and increment j, with Y/N branches on tests i and j.]
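Adleman's generate-and-filter strategy, generate many random paths by ligation, then discard those with the wrong endpoints, length or composition, can be sketched in silico. The toy directed graph below is ours, not Adleman's seven-city instance, and random walks stand in for random ligation.

```python
import random

# A toy directed graph with a unique Hamiltonian path LA -> NY -> CHI -> DAL.
EDGES = [("LA", "NY"), ("LA", "CHI"), ("NY", "CHI"),
         ("NY", "DAL"), ("CHI", "DAL")]
CITIES = {"LA", "NY", "CHI", "DAL"}

def random_path(start, length, rng):
    """Random walk along directed edges: the analogue of random ligation."""
    path = [start]
    while len(path) < length:
        successors = [b for a, b in EDGES if a == path[-1]]
        if not successors:
            break                      # dead end: path stays too short
        path.append(rng.choice(successors))
    return path

def is_hamiltonian(path, start, end):
    """Filter: correct endpoints, correct length, every city visited once."""
    return (path[0] == start and path[-1] == end
            and len(path) == len(CITIES) and set(path) == CITIES)

rng = random.Random(0)
candidates = [random_path("LA", len(CITIES), rng) for _ in range(10000)]
solutions = {tuple(p) for p in candidates if is_hamiltonian(p, "LA", "DAL")}
print(solutions)   # the unique Hamiltonian path: {('LA', 'NY', 'CHI', 'DAL')}
```

The 10 000 trials play the role of the ~10^14 strands in the tube: the search works only because enough random paths are generated that a solution is almost surely among them, which is exactly the exponential-scaling limitation discussed in the introduction.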