The concept of hidden genes was recently introduced in genetic algorithms (GAs) to handle systems architecture optimization problems, where the number of design variables is variable. Selecting the hidden genes in a chromosome determines the architecture of the solution. This paper presents two categories of mechanisms for selecting (assigning) the hidden genes in the chromosomes of GAs. These mechanisms dictate how the chromosome evolves in the presence of hidden genes. In the proposed mechanisms, a tag is assigned for each gene; this tag determines whether the gene is hidden or not. In the first category of mechanisms, the tags evolve using stochastic operations. Eight different variations in this category are proposed and compared through numerical testing. The second category introduces logical operations for tags evolution. Both categories are tested on the problem of interplanetary trajectory optimization for a space mission to Jupiter, as well as on mathematical optimization problems. Several numerical experiments were designed and conducted to optimize the selection of the hidden genes algorithm parameters. The numerical results presented in this paper demonstrate that the proposed concept of tags and the assignment mechanisms enable the hidden genes genetic algorithms (HGGA) to find better solutions.

## Introduction

Systems architecture optimization problems arise in several applications, such as automated construction (in which hundreds or thousands of robots fabricate large, complex structures), autonomous emergency response, smart buildings, transportation, medical technology, and electric grids [1]. In these complex systems, automated system design optimization is crucial to achieving the design objectives. The task of design optimization includes optimizing the system architecture (topology) in addition to the system variables. Optimizing the system architecture renders the problem a variable-size design space (VSDS) optimization problem (the number of design variables to be optimized is itself a variable). Consider, for example, the optimization of an interplanetary space trajectory. The objective is to design a trajectory for a spacecraft to travel from the home planet to the target planet with, for instance, minimum fuel consumption. As can be seen in Fig. 1, the spacecraft can apply deep space maneuvers (DSMs), which are propulsive impulses used to change the velocity of the spacecraft instantaneously; these DSMs consume fuel in proportion to the magnitude of the impulse. The spacecraft can also benefit from a free change in momentum through as many flybys of other planets as needed. When the spacecraft performs a flyby maneuver, we need to determine the height of closest approach to the flyby planet as well as the plane of the flyby maneuver. Hence, changing the number of flybys changes the total number of variables.

The segment between any two planets is called a leg. A leg can have any number of DSMs. The architecture of a solution refers to the sequence of flybys and the number of DSMs in each leg. Determining the mission architecture then means determining the number of flybys, the flyby planets, and the number of DSMs in each leg. Other, nonarchitecture variables include the launch and arrival dates, the dates and times of flybys, the dates and times of DSMs, and the magnitudes and directions of the DSM impulses. This is a VSDS optimization problem. Another example is the optimization of a microgrid system, where there are several energy sources and co-located energy storage devices that can either sink or source power with their corresponding sources. The net power at each source/storage is metered to the grid main bus using a boost converter. For an efficient design of the microgrid, the number of storage elements (*N*) and their capacities need to be optimized. Storage is expensive, and designing a microgrid with properly sized storage is an open problem. Associated with computing the optimal *N* are the optimal values for the duty ratios at the converters that control the power metered to the main bus from each source. A more complex situation arises when *M* microgrids have the ability to interconnect. This provides a large number of permutations for exchanging power.

where $\mathbf{x}=[x_1,x_2,\ldots,x_N]^T$, *N* is the number of design variables, and $\mathbf{x}^u$ and $\mathbf{x}^l$ are the upper and lower bounds of the variables **x**, respectively. The number of variables *N* in this formulation is variable, and its value dictates the architecture of the solution. The vector **g** includes all the inequality constraints whereas the vector **h** includes all the equality constraints.
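A generic statement consistent with these symbols can be sketched as follows. The objective symbol $f$ and the choice of minimization are assumptions made here for illustration, since only the variables, their bounds, and the constraint vectors are named above.

$$
\min_{\mathbf{x},\,N} \; f(\mathbf{x})
\quad \text{subject to} \quad
\mathbf{g}(\mathbf{x}) \le \mathbf{0}, \qquad
\mathbf{h}(\mathbf{x}) = \mathbf{0}, \qquad
\mathbf{x}^l \le \mathbf{x} \le \mathbf{x}^u
$$

In this form, the dimension *N* of **x** is itself a decision quantity, which is what distinguishes a VSDS problem from a fixed-size one.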

Research on algorithms that can handle VSDS optimization problems (sometimes referred to as variable-length optimization) started about two decades ago. Standard GAs are not suitable for VSDS problems because they are designed to work only on problems with a fixed number of variables. Reference [6] presents a variable-length genetic programming method and compares it to the simulated annealing and the stochastic-iterated hill climbing methods on program discovery problems. A VSDS GA is presented in Ref. [7], in which a random operator is introduced to change the chromosome length, for the Kauffman NK model problem. This random operator depends on the identity of genes, which is given by their position relative to one end of the genotype. Reference [8] continues the work of Ref. [7] and analyzes the optimal location for the crossover point in VSDS problems. When two parents have different chromosome lengths, and given a selection for the crossover point in parent 1, Ref. [8] suggests that the crossover point in parent 2 be chosen such that the difference between the swapped segments is minimized. The method proposed in Ref. [8] is a search over all the possible crossover points in parent 2 to find the best cutoff point. The VSDS GA in Ref. [9] uses a two-point crossover, with different cutoff points in each parent, resulting in different lengths of the children chromosomes. This method is most useful in problems with variables of the same identity, like the angles of a polygon, where adding or removing one angle results in a new polygon (e.g., triangle to rectangle or vice versa). Reference [10] presents a number of variable-length representation evolutionary algorithms that improve the sampling of a VSDS, with application in evolutionary electronics. In Ref. [11], the number of different chromosome lengths is set a priori, and both parents have the same crossover point (same gene index of the cutoff).
Therefore, the length of the chromosome is switched from parents to children in Ref. [11] (the length of child 2 is equal to the length of parent 1 and the length of child 1 is equal to the length of parent 2). This method does not provide information regarding the optimal length of a solution. A different approach in VSDS GA is to have equal-length chromosomes in each generation, yet the chromosome length is allowed to change among different generations as presented in Refs. [12] and [13]. In this method, the GA starts with short-length chromosomes and the best solution in a generation is transferred to the next generation with a longer chromosome length. In this way, the GA handles fixed-size chromosomes in each generation, and there is no need to define new evolutionary operations for GA. A dynamic-size multiple population GA was developed in Ref. [14], where each generation consists of a number of subpopulations; all chromosomes in each subpopulation are of the same length. Hence, each subpopulation evolves over subsequent generations as in a standard GA. The size of each subpopulation, however, changes dynamically over subsequent generations such that more fit subpopulations are allowed to increase in size whereas lower fit subpopulations decrease in size. This approach has been applied to the trajectory optimization problem and demonstrated success in finding best-known solution architectures. The computational cost of this method, however, is relatively high since it implements GA over several subpopulations in parallel. Also, only a finite number of architectures (assumed a priori) can be investigated using the method in Ref. [14]. A structured chromosome GA was developed in Refs. [15] and [16], where the standard one layer chromosome is replaced with a multilayer chromosome for coding the variables; the number of genes in one layer is dictated by the values of some of the genes in the upper layers. 
Hence, it was possible to code solutions of different architectures. Yet, this structured chromosome approach introduces new definitions for the crossover operation such that meaningful swapping between chromosomes of different layers is guaranteed. Some other algorithms are designed for specific problems. For instance, Refs. [17] and [18] present tailored algorithms that search for the optimal structural topology in truss and frame structures, respectively. The dissertation in Ref. [19] presents a study on topology optimization of nanophotonic devices and makes a comparison between the homogenization method [20] and GAs [2]. As can be seen from the above discussion, many of the VSDS optimization algorithms are problem-specific. The dynamic-size multiple population GA has a high computational cost [14]. The structured chromosome GA is relatively complex to develop since it requires new definitions for all GAs operations.

Inspired by the concept of hidden genes in biology, Ref. [21] presented the method of hidden genes genetic algorithms (HGGA) for solving VSDS optimization problems. In space trajectory optimization, the HGGA successfully found the best-known solution architectures as reported in Refs. [22] and [23]. Genetic algorithms are among the methods used for systems architecture optimization problems that are not VSDS. Because the HGGA is based on GAs, it can handle such systems architecture optimization problems. The added capability of the HGGA is that it can optimize among different solution architectures and can also develop new architectures that might not be known a priori, and hence it can handle VSDS problems. The method used in Ref. [21] to determine which genes are hidden in each chromosome in each generation, however, was very primitive. In Ref. [21], genes in a chromosome are hidden only if the chromosome represents a nonfeasible solution. Hence, the HGGA does not attempt to hide genes if the chromosome is a feasible solution. Subsequent developments on HGGA have introduced tags for the genes [24], where each gene is assigned a binary tag that determines whether it is hidden or not; hence, a gene can be hidden even in feasible solutions if hiding that gene results in a more fit solution. In this paper, the problem of selecting the hidden genes in each generation is addressed. This paper develops mechanisms for the tags to evolve over generations. Two new concepts for hidden genes selection are presented in this paper. Section 2 presents a review of the most recent developments on HGGA. Section 3 presents the new methods of tags evolution. Section 4 presents VSDS mathematical functions and tests on the tags evolution methods. Section 5 presents the results of implementing these methods to solve an interplanetary space trajectory optimization problem. Finally, Sec. 6 presents a statistical analysis for the proposed methods.

## Hidden Genes Genetic Algorithms

### The Hidden Genes Concept in Biology.

In genetics, the DNA is organized into long structures called chromosomes. Contained in the DNA are segments called genes. Each gene is an instruction for making a protein. These genes are written in a specific language. This language has only three-letter words, and the alphabet is only four letters. Hence, the total number of words is 64. The difference between any two persons is essentially due to the difference in the instructions written with these 64 words. Genes make proteins according to these words. Since not all proteins are made in every cell, not every gene is read in every cell. For example, an eye cell does not need any breathing genes on, and so they are shut off in the eye. Seeing genes are likewise shut off in the lungs. Another layer of coding tells what genes a cell should read and what genes should be hidden from the cell [25]. A gene that is being hidden will not be transcribed in the cell. There are several ways to hide genes from the cell. One way is to cover up the start of a gene by chemical groups that get stuck to the DNA. In another way, a cell makes a protein that marks the genes to be read; Fig. 2 is an illustration of this concept. Some of the DNA in a cell is usually wrapped around nucleosomes, but a lot of the DNA is not. The locations of the nucleosomes can control which genes get used in a cell and which are hidden [25].

### Concept of Optimization Using Hidden Genes Genetic Algorithms.

In the HGGA, the concept of hidden genes is used in GA optimization to hide some of the genes in a chromosome (solution); these hidden genes represent variables that do not appear in this candidate solution. In a topology optimization problem, the number of design variables depends on the specific values of some of the system’s variables. Selecting different values for the system’s variables changes the length of the chromosome. Different solutions have different number of design variables. So, in the design space, we have chromosomes of different lengths. Let *L*_{max} be the length of the longest possible chromosome. In the hidden genes concept, all chromosomes in the population are allocated a fixed length equal to *L*_{max}. In a general solution (a point in the design space), some of the variables (part of the chromosome) will be ineffective in objective function evaluations; the genes describing these variables are referred to as hidden genes. The hidden genes, however, will take part in the genetic operations in generating future generations. To illustrate this concept, consider Fig. 3. Suppose we have two chromosomes, the first chromosome is represented by five genes (represented by five binary bits in this example) and the second chromosome is represented by three genes. Suppose also that the maximum possible length for a chromosome in the population is fixed at 7. Hence, we can say that the first chromosome is augmented by two hidden genes and the second chromosome is augmented by four hidden genes. The hidden genes are not used to evaluate the fitness of the chromosomes. Because all chromosomes have the same length, standard definitions of GAs operations can still be applied. Mutation may alter the value of a hidden gene. A crossover operation may swap parts of the chromosome that have hidden genes. A hidden gene in a parent may become an effective gene in the offspring. 
These hidden genes that become effective take part in the objective function evaluations in the new generations. Figure 4 shows a simple example of two parents with hidden genes and the resulting children after a crossover operation. In Fig. 4, genes are binary numbers and hidden genes are shown in gray. The crossover point is between genes 2 and 3. As can be seen, the parent chromosomes swap the genes after the crossover point; as a result, the gene values change in both children chromosomes, and the number and/or the location of the hidden genes change. Determining which genes in the children need to be hidden is crucial for the efficiency of HGGA; assignment mechanisms are developed in this paper.
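The fixed-length idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene values, the boolean `hidden` masks, and the function names are hypothetical and are not taken from Fig. 3 or Fig. 4. It shows that a standard single-point crossover applies unchanged because every chromosome has length *L*_{max}, while only the non-hidden genes would be passed to the fitness evaluation.

```python
L_MAX = 7  # fixed chromosome length, matching the length-7 illustration


def effective_genes(genes, hidden):
    """Return only the genes that would take part in the fitness evaluation."""
    return [g for g, h in zip(genes, hidden) if not h]


def single_point_crossover(parent1, parent2, point):
    """Standard single-point crossover; valid because both parents have equal length."""
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2


# Hypothetical parents: 5 effective + 2 hidden genes, and 3 effective + 4 hidden genes.
p1 = [1, 0, 1, 1, 0, 0, 1]
p2 = [0, 1, 1, 0, 1, 1, 0]
hidden1 = [False] * 5 + [True] * 2
hidden2 = [False] * 3 + [True] * 4

# Crossover point between genes 2 and 3; hidden genes are swapped like any others.
c1, c2 = single_point_crossover(p1, p2, point=2)
```

Which genes of `c1` and `c2` are then marked hidden is deliberately left out of this sketch, since that assignment is the subject of the mechanisms developed in the paper.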

## Hidden Genes Assignment Methods

This paper addresses the question of which genes are selected to be hidden in the HGGA, in each chromosome, in each generation, during the search for the optimal solution (optimal configuration). This mechanism of assigning hidden genes in a chromosome is vital for the efficient performance of HGGA. In previous work on HGGA [21], the mechanism that was used to assign the hidden genes (called the “feasibility mechanism”) was primitive. The feasibility mechanism initially assumes no hidden genes in a chromosome; if the obtained chromosome is feasible, then there are no hidden genes. If the solution is not feasible, then, starting from one end of the chromosome, the algorithm hides genes one by one until the chromosome becomes feasible.
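The feasibility mechanism described above can be sketched as follows. The sketch makes two assumptions that the text does not fix: genes are hidden starting from the right end of the chromosome, and feasibility is checked by a user-supplied callable (here the hypothetical `is_feasible`) that receives the list of active genes.

```python
def feasibility_mechanism(genes, is_feasible):
    """Hide genes one by one from the right end until the solution is feasible.

    Returns a list of booleans (True = hidden). `is_feasible` is a hypothetical
    callable supplied by the user; it receives the list of active genes.
    """
    hidden = [False] * len(genes)
    for k in range(len(genes) - 1, -1, -1):
        active = [g for g, h in zip(genes, hidden) if not h]
        if is_feasible(active):
            return hidden
        hidden[k] = True  # still infeasible: hide the next gene from the right
    return hidden  # all genes hidden; feasibility of this state is not checked


# Toy constraint (an assumption for illustration): active genes must sum to at most 10.
mask = feasibility_mechanism([4, 3, 2, 5, 6], lambda active: sum(active) <= 10)
```

Note that a chromosome that is feasible from the start is returned with no hidden genes, which is the limitation of this mechanism that the tags are meant to remove.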

In genetics, as discussed in Sec. 2.1, a cell makes a protein that marks the genes to be read. Inspired by genetics, it is proposed to use a tag for each of the genes that have the potential to be hidden (configuration genes) [24]. This tag determines whether that gene is hidden or not. The tag is implemented as a binary digit that can take a value of “1” or “0,” as shown in Fig. 5. For each gene *x*_{i} that can be hidden, a tag, tag_{i}, is assigned to decide whether it is hidden or not. If tag_{i} is 1, then *x*_{i} is hidden, and if it is 0, *x*_{i} is active.

The values of these tags evolve dynamically as chromosomes change during the optimization process. Preliminary work in Ref. [24] suggests mechanisms for tags evolution. This paper presents two different concepts for tags evolution. Both concepts are shown below with different variations on each concept.

### Logical Evolution of Tags.

During the chromosome crossover operation, the tags for the children are computed from the tags of the parents using logical operations. Using a logical operation is not new to GAs. Reference [27] used logical operations in the chromosome crossover operation in GAs. Reference [28] presents a crossover operation where the similar genes in the parents are copied to the two children while the remaining genes in each child are randomly chosen from the two parents. Here, however, the logical operator is applied only to the tags. The crossover of two parents results in two children. Three logics are studied in this paper for the logical evolution of tags:

*Logic A:* For one child, a gene is *hidden* if the same gene is hidden in any of the parents (Hidden-OR). For the second child, a gene is *active* if the same gene is active in any of the parents (Active-OR). It is possible to think of the logic of the second child as the AND logical operation when used with the hidden state—that is, in the second child, a gene is hidden if the same gene is hidden in both of the parents (Hidden-AND). The resulting children from crossover operation are shown in Fig. 6 along with the resulting new tags for the children.

*Logic B:* The Hidden-OR logic is used for both children. Even though the tags will be the same for both children, the two children represent two different solutions because they have different gene values.

*Logic C:* The Active-OR logic is used for both children.
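The three logics can be sketched in Python as bitwise operations on the parents' tag lists, using the convention stated earlier that a tag of 1 means hidden and 0 means active. The function names are illustrative assumptions, not names used in the paper.

```python
def hidden_or(tags1, tags2):
    """Hidden-OR: the child gene is hidden (1) if it is hidden in either parent."""
    return [a | b for a, b in zip(tags1, tags2)]


def active_or(tags1, tags2):
    """Active-OR: the child gene is active (0) if it is active in either parent.

    Equivalent to Hidden-AND: hidden only if hidden in both parents.
    """
    return [a & b for a, b in zip(tags1, tags2)]


def child_tags(tags1, tags2, logic):
    """Return the tag lists of the two children for logic 'A', 'B', or 'C'."""
    if logic == "A":
        return hidden_or(tags1, tags2), active_or(tags1, tags2)
    if logic == "B":
        return hidden_or(tags1, tags2), hidden_or(tags1, tags2)
    if logic == "C":
        return active_or(tags1, tags2), active_or(tags1, tags2)
    raise ValueError("logic must be 'A', 'B', or 'C'")


# Hypothetical parent tags for illustration.
first, second = child_tags([1, 0, 1, 0], [1, 1, 0, 0], "A")
```

With these parent tags, Hidden-OR tends to hide more genes in the child and Active-OR tends to activate more, so Logic A produces one child of each kind.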

### Stochastic Evolution of Tags.

In this concept, the tags are evolved using crossover and/or mutation operations, in a similar way to that of the design variables. Eight mechanisms are investigated using this concept. These mechanisms are:

*Mechanism A*: tags evolve through a mutation operation with a certain mutation probability. In this mechanism, the tags are separate from the design variables in the chromosome.

*Mechanism B*: tags evolve through mutation and crossover operations. In this mechanism, the tags are considered as discrete variables similar to the design variables in the chromosome. The tags are appended to the design variables; and hence their values are optimized along with the other variables through the selection, crossover, and mutation operations. In this mechanism, the number of design variables is increased. The computational cost of evaluating the cost function is not changed though.

*Mechanism C*: tags evolve through a crossover operation. In this mechanism, the tags are considered as discrete variables similar to the design variables in the chromosome; yet only crossover operation can be applied to the tags.

*Mechanism D*: tags evolve through a mutation operation. In this mechanism, the tags are considered as discrete variables similar to the design variables in the chromosome; yet only mutation operation is applied to the tags. The mutation probability in this mechanism is the same as that used to mutate the main chromosomes.

*Mechanism E*: tags crossover independently from the genes. In other words, the tags may swap while the genes do not, or vice versa. This mechanism can be interpreted as a 2-D multiple crossover operator, one direction through tags and one direction through genes. Before applying the crossover operator, tags undergo a mutation operation.

*Mechanism F*: tags crossover using a logical fitness-guided (arithmetic) crossover in which two intermediate chromosomes are produced. In these intermediate chromosomes, the genes are produced from a single crossover operator on parents and the tags are the outcome of the Active-OR logic on the parents’ tags. In other words, parent *X* will have intermediate offspring *Xx*, and parent *Y* will have intermediate offspring *Yy* using the arithmetic crossover operation. The actual offspring is then created by a fitness-guided crossover operation on the parents, and it is closer to the parent whose intermediate offspring has better cost.

*Mechanism G*: the arithmetic crossover is used with a modified cost function based on the number of genes that are hidden. The offspring is biased toward the better parent (lower cost) with more hidden genes. The cost function is modified as follows: $f_{\text{modified}}(X)=f(X)-\sum_{i=1}^{M}\text{flag}_i$, where flag_{i} is the value of the tag for gene *i*.

*Mechanism H:* the arithmetic crossover is used with a modified cost function based on the number of hidden genes. The offspring is biased toward the better parent (lower cost) with fewer hidden genes. The cost function is $f_{\text{modified}}(X)=f(X)+\sum_{i=1}^{M}\text{flag}_i$.
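The modified cost functions of mechanisms G and H differ only in the sign of the hidden-gene count. A minimal Python sketch, assuming a minimization problem and tags given as a list of 0/1 values (1 = hidden); the function name is hypothetical:

```python
def modified_cost(cost, tags, mechanism):
    """Bias the cost by the number of hidden genes (tags equal to 1).

    Mechanism 'G' subtracts the count, favoring solutions with more hidden genes;
    mechanism 'H' adds it, favoring solutions with fewer hidden genes.
    """
    n_hidden = sum(tags)
    if mechanism == "G":
        return cost - n_hidden
    if mechanism == "H":
        return cost + n_hidden
    raise ValueError("mechanism must be 'G' or 'H'")
```

For example, a solution with cost 10.0 and two hidden genes is scored 8.0 under mechanism G and 12.0 under mechanism H.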

The last four mechanisms (E through H) are presented in detail in Ref. [24]; hence no results will be presented for them in this paper. However, rankings are presented in Sec. 6 for all the mechanisms for comparison. Also, the numerical results presented in this paper will be compared to those presented in Ref. [24]. Numerical testing for the first four mechanisms (A through D) is presented in Sec. 4.

The HGGA method presented in this paper is relatively simple to implement. The genes undergo the standard GAs operations while the binary tags undergo different operations depending on the selected mechanism/logic as detailed above. Hence, any existing code for GAs can be appended by a code that handles the tags to create the proposed VSDS GAs.
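As an illustration of how small such an add-on can be, the sketch below shows two tag operators that could sit alongside an existing GA: a bit-flip mutation in the spirit of mechanism A, and a single-point crossover on the tags in the spirit of mechanism C. The function names, the single-point form of the tag crossover, and the use of a seeded `random.Random` instance are assumptions made for this example.

```python
import random


def mutate_tags(tags, p_mut, rng):
    """Flip each binary tag independently with probability p_mut."""
    return [1 - t if rng.random() < p_mut else t for t in tags]


def crossover_tags(tags1, tags2, point):
    """Single-point crossover applied to the tag lists only."""
    return tags1[:point] + tags2[point:], tags2[:point] + tags1[point:]


rng = random.Random(0)  # seeded generator so that runs are repeatable
new_tags = mutate_tags([0, 1, 0, 0, 1], p_mut=0.03, rng=rng)
swapped = crossover_tags([1, 1, 1], [0, 0, 0], point=1)
```

The gene values themselves would continue to go through the selection, crossover, and mutation operators of the existing GA code.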

## Test Cases: Variable-Size Design Space Mathematical Functions

For each gene *x*_{i}, if its associated tag, tag_{i}, is 1 (hidden), then the gene *x*_{i} is hidden and hence $f(x_i)$ is set to zero (does not exist). This is consistent with the physical system test cases presented in Sec. 5. Unlike the hidden gene tags, the chromosomes evolve through the standard GA selection, mutation, and crossover operations. For the chromosomes, a single-point crossover and adaptive feasible mutation operators are selected.

In the test cases presented in this section, the population size is 400, the number of generations is 50, the elite count is 20, the genes mutation probability is 0.01, and the crossover probability is 0.95. For the purpose of statistical analysis, each numerical experiment is repeated *n* times (*n* identical runs) and the success rate in finding the optimal solution is assessed. The success rate *S*_{r} is computed as $S_r=j_s/n$, where *n* is the number of runs and *j*_{s} is a counter that counts how many times the optimal solution is obtained as the solution at the end of an experiment [29].
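The two rules above, zeroing the contribution of hidden genes and computing the success rate as the ratio of successful runs to total runs, can be sketched as follows. The tolerance `tol` used to decide whether a run reached the optimum is an assumption of this example; the criterion is not stated in the text.

```python
def tagged_sum(x, tags, term):
    """Sum term(x_i) over active genes only; hidden genes (tag 1) contribute zero."""
    return sum(term(xi) for xi, t in zip(x, tags) if t == 0)


def success_rate(final_values, f_opt, tol):
    """S_r = j_s / n, where j_s counts runs whose final value is within tol of f_opt."""
    j_s = sum(1 for v in final_values if abs(v - f_opt) <= tol)
    return j_s / len(final_values)


# Hypothetical data: a quadratic term and four run outcomes.
value = tagged_sum([1.0, 2.0, 3.0], [0, 1, 0], lambda v: v * v)
rate = success_rate([1.0, 1.05, 2.0, 1.0], f_opt=1.0, tol=0.1)
```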

### Results Using Stochastically Evolving Tags

#### Schwefel 2.26 Function.

Here, it is assumed that *N* = 5 (maximum possible number of variables is 5). Minimizing *f*(*X*), the optimal solution is known, and it is $f_{\min}=-418.9829$. First, the standard GA is used to solve this problem. In using the standard GA, all variables are active. The results show that a minimum of −418.8912161 is obtained using the standard GA. For the HGGA, there are *N* tags in this case. The results of HGGA mechanism A for different mutation probability values are presented in Table 1.

As shown in Table 1, the minimum obtained function value is −418.8967549 which is lower than the solution obtained using the standard GA. The occurrence probability (success rate) for mechanism A is 100%. For mechanism B, the minimum obtained function value is −418.4088183 with the occurrence probability of 100%. For mechanism D, the results show that the minimum obtained function value is −418.1721324 with the occurrence probability of 90%. For mechanism C, the minimum obtained function value is −418.2119 with the occurrence probability of 100%. The results of these four HGGA mechanisms on the Schwefel 2.26 function are summarized in Fig. 7, where the percentages on the figure indicate the success rate of the mechanism.

#### Egg Holder Function.

where $-512 \le x_i \le 512$.

This is an interesting case study for the HGGA. Each function $f_i(x_i,x_{i+1})$ is a function of two variables. Hence, if a variable *x*_{i} is hidden, this does not necessarily mean that the function *f*_{i} goes to zero. There are a few ways of handling this type of function, depending on what these functions represent in a physical system. For example, in the interplanetary trajectory optimization problem (discussed in detail later in this paper), a planetary flyby event is associated with a few variables, among them the flyby height and the flyby plane orientation angles. If one of these variables is hidden, then the whole flyby event is hidden and hence the other variables in this event are also hidden. This example suggests that a function $f_i(x_i,x_{i+1})$ (which could represent the cost of a flyby) would assume a value of zero if any of the variables *x*_{i} or $x_{i+1}$ is hidden. In other situations, however, this is not the case. A function $f_i(x_i,x_{i+1})$ may have a nonzero value despite one of the variables *x*_{i} or $x_{i+1}$ being hidden. Here, in this mathematical function, a tag, tag_{i}, is assigned to each function *f*_{i}, and hence the value of tag_{i} determines whether the value of the function *f*_{i} is zero or not.

First, the standard GA is used to optimize this function assuming *N* = 5. The best solution found by the standard GA has a function value of −3657.862773. For the HGGA, there are *N* − 1 tags for the *N* − 1 functions *f*_{i}. Different mutation probability values are tested for mechanism A and the results are presented in Table 2 for *N* = 5.

As shown in Table 2, the mutation probability of 0.03 has the lowest cost function of −3692.314081 with the occurrence probability of 60%. This solution is better than the solution obtained using the standard GA.

A similar analysis is conducted for mechanism B. The results show that the lowest obtained function value is −3599.845154 with mutation probability of 0.0001 and occurrence probability of 10%. For mechanism D, the crossover and mutation operations are applied to the *N* design variables (*x*_{i}) and only mutation is applied to the remaining *N* − 1 variables (the tags). The results show that the lowest obtained function value is −3484.255119 with mutation probability of 0.01 and occurrence probability of 10%. For mechanism C, the tags are variables (total number of GA variables $2N-1$), and crossover and mutation operations are carried out on the *N* design variables. Only the crossover operation is applied to the remaining *N* − 1 variables (the tags). The minimum obtained function value is −3559.5751 with the occurrence probability of 10%. Mechanism A produced the best solution in this test for the Egg holder function. The results of these tests are summarized in Fig. 8, where the percentages on the figure indicate the success rate of the mechanism.
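The per-function tagging used for the Egg holder test can be sketched as follows. The sketch assumes the commonly used definition of the Egg holder function as a chain of *N* − 1 two-variable terms; that definition is an assumption here, chosen because it matches the description of functions $f_i(x_i,x_{i+1})$ on $-512 \le x_i \le 512$. The function names are hypothetical.

```python
import math


def egg_holder_term(xi, xnext):
    """One two-variable term f_i(x_i, x_{i+1}) of the standard Egg holder function."""
    a = xnext + 47.0
    return (-a * math.sin(math.sqrt(abs(a + xi / 2.0)))
            - xi * math.sin(math.sqrt(abs(xi - a))))


def tagged_egg_holder(x, tags):
    """Sum the N-1 terms, skipping term i when tags[i] == 1 (hidden)."""
    if len(tags) != len(x) - 1:
        raise ValueError("expected one tag per term, i.e. len(x) - 1 tags")
    return sum(egg_holder_term(x[i], x[i + 1])
               for i in range(len(tags)) if tags[i] == 0)


# Known minimizer of the two-variable case, used here only as a sanity check.
two_var_min = egg_holder_term(512.0, 404.2319)
```

Because each tag switches a whole term rather than a single variable, a variable can still influence the cost through a neighboring active term even when one of the terms it appears in is hidden.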

#### Styblinski-Tang Function.

Here, *N* = 5. The standard GA yields a minimum function value of −195.8304861. For the HGGA mechanism A, the obtained function value is −195.8306992, with a mutation probability of 0.03 and an occurrence probability of 100%.

For mechanism B, the results show that the minimum obtained function value is −195.828109 with mutation probability of 0.01 and occurrence probability of 100%. For mechanism D, the results show that the minimum obtained function value is −195.8261996 with mutation probability of 0.005 and occurrence probability of 100%. For mechanism C, the minimum obtained function value is −195.82982 with the occurrence probability of 100%.
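As a cross-check of the magnitudes reported here, the sketch below evaluates the Styblinski-Tang function with tags. It assumes the commonly used definition $f(\mathbf{x})=\tfrac{1}{2}\sum_i (x_i^4-16x_i^2+5x_i)$, whose minimum for five active variables is approximately −195.8308; that definition is an assumption of this example, although it agrees with the values quoted in this section.

```python
def styblinski_tang_term(xi):
    """Single-variable term of the standard Styblinski-Tang function."""
    return 0.5 * (xi ** 4 - 16.0 * xi ** 2 + 5.0 * xi)


def tagged_styblinski_tang(x, tags):
    """Sum the terms of active genes; a hidden gene (tag 1) contributes zero."""
    return sum(styblinski_tang_term(xi) for xi, t in zip(x, tags) if t == 0)


x_star = [-2.903534] * 5  # known per-variable minimizer of the standard function
all_active = tagged_styblinski_tang(x_star, [0, 0, 0, 0, 0])
one_hidden = tagged_styblinski_tang(x_star, [0, 0, 0, 0, 1])
```

Under this definition each term is negative at the minimizer, so hiding a gene raises the cost and the best solution keeps all five variables active.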

### Results Using Logically Evolving Tags.

The logical mechanisms for tags assignment presented in Sec. 3.1 are tested on the three mathematical functions described above. The results are summarized in Table 3 that lists the obtained solution using each logic, for each of the functions.

Comparing the results obtained in this section to the results obtained using the stochastic assignment mechanisms, it can be concluded that for the Egg holder function, logic C works best among all the methods while mechanism A works best for the Schwefel 2.26 and the Styblinski-Tang functions.

## Test Cases: Interplanetary Trajectory Optimization

In general, interplanetary trajectory optimization is a challenging problem, and in some cases, it is nearly impossible to find the optimal solution. Several examples of such complex problems can be found in the Global Trajectory Optimization Competition online portal of the European Space Agency [30]. A typical problem statement can be written as follows: For a given range of the departure date from the home planet Earth, a given range of the arrival date at a target planet, and a given dry mass of the spacecraft, find the mission architecture (that is, how many flybys in the mission and how many DSMs in each leg), as well as the dates and times of flybys, the flyby planets, the dates and times of DSMs, the magnitudes and directions of these DSMs, and the exact launch and arrival dates, such that a given objective is optimized (e.g., the fuel mass needed for the whole mission is minimized). This VSDS optimization problem is used in this paper to test the ability of the HGGA, with the new definitions of hidden genes tags and evolution mechanisms, to autonomously search for the optimal architecture as well as the nonarchitecture variables. This type of problem is characterized by the high computational cost of evaluating the cost function, and hence the high computational cost of the optimization process in general. Both logic A and mechanism A for the evolution of hidden genes tags are tested. For each interplanetary trajectory optimization problem, the numerical experiment is repeated identically 100 times and the success rate is calculated.

### Earth–Jupiter Mission Using Hidden Genes Genetic Algorithms.

The Earth–Jupiter mission is optimized using the HGGA and the results are compared to the best-known solution in the literature. The design variables and their given upper and lower bounds are listed in Table 4. In this test, it is assumed that the maximum number of flybys is 2. Hence, the chromosome has 2 genes for the flyby planets; each gene carries the planet identification number (the planet identification number ranges from 1 to 8 for all the planets in the solar system as shown in Table 4). Each of these two genes has a tag. If both tags have a value of 1, then the two genes are hidden and that solution has no flybys. If one gene is hidden and one is active, then the solution has one flyby and the flyby planet identification number is the value of that gene. The same concept is applied to the DSMs. In this test case, it is assumed that the maximum number of DSMs in each leg is 2. Since the maximum number of flybys is 2, the maximum number of legs is 3, and hence the maximum number of DSMs is 6. For each DSM, we need to compute the optimal time (TD) at which this DSM occurs. A gene and a tag are added for each DSM time TD, and hence there are 6 genes and 6 tags for TD_{i} ($i=1,\ldots,6$) in this mission. Note that if a flyby is hidden, then its leg disappears and the DSMs in that leg automatically become hidden. Note also that even if a flyby exists, a DSM in its leg can be hidden depending on the value of its own tag. Each DSM is an impulse represented by a vector of three components (it has the units of velocity). So, the chromosome has genes for 6 × 3 = 18 scalar components of the DSMs. Note that these 18 genes are grouped in groups of three genes since each three are the components of one DSM vector; hence, if one DSM is hidden, then its three genes are hidden together. Table 4 shows the ranges for the 6 DSMs, each of which has three components. The time of flight (TOF) in each leg is also a variable; there is a gene for each TOF in the mission.
Hence, in this Jupiter mission, we have three genes for the TOFs. Note that there are no tags associated with the TOF genes since the state of each gene (hidden or active) is determined by the flyby tags. If a flyby exists, then there is an active gene for a TOF associated with it. There is at least one TOF in a mission; this case of only one TOF corresponds to the case of no flybys. To complete a flyby maneuver, we need to calculate the optimal values for the normalized altitude *h* of the spacecraft above the planet as well as the plane angle *η* of the maneuver. Hence, two genes for the altitudes and two genes for the angles are added. Similar to the TOF variables, no tags are needed for the *h* and *η* genes. There are also six genes for the departure impulse, flight direction, the arrival date, and the departure date.
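The chromosome layout described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `decode` and `Architecture` are hypothetical, and the sketch assumes that DSMs 0 and 1 belong to the first leg, DSMs 2 and 3 to the second, and so on, and that the legs that remain when a flyby is hidden are the first ones in that ordering. The only convention taken from the text is that a tag value of 1 means hidden. Summing the gene counts given above (2 + 6 + 18 + 3 + 2 + 2 + 6) gives 39 genes and 8 tags.

```python
from dataclasses import dataclass
from typing import List

MAX_FLYBYS = 2
MAX_DSMS_PER_LEG = 2
MAX_LEGS = MAX_FLYBYS + 1                # 3 legs at most
MAX_DSMS = MAX_LEGS * MAX_DSMS_PER_LEG   # 6 DSMs at most

HIDDEN, ACTIVE = 1, 0                    # tag convention from the text: 1 = hidden

# Gene counts as listed in the text (their sum, 39, is computed here).
GENE_COUNTS = {
    "flyby_planet": MAX_FLYBYS,          # tagged
    "dsm_time": MAX_DSMS,                # tagged
    "dsm_components": 3 * MAX_DSMS,      # hidden together with their DSM
    "tof": MAX_LEGS,                     # state follows the flyby tags
    "flyby_altitude": MAX_FLYBYS,        # state follows the flyby tags
    "flyby_plane_angle": MAX_FLYBYS,     # state follows the flyby tags
    "departure_arrival": 6,              # always active
}


@dataclass
class Architecture:
    flyby_planets: List[int]   # planet IDs of the active flybys, in order
    active_dsms: List[int]     # indices (0..5) of the active DSMs
    n_tofs: int                # one TOF per existing leg


def decode(flyby_genes, flyby_tags, dsm_tags):
    """Derive the mission architecture from the flyby genes and all tags."""
    flybys = [p for p, t in zip(flyby_genes, flyby_tags) if t == ACTIVE]
    n_legs = len(flybys) + 1
    active_dsms = []
    for k in range(MAX_DSMS):
        leg = k // MAX_DSMS_PER_LEG      # assumed DSM-to-leg mapping
        # A DSM is active only if its leg exists and its own tag is active.
        if leg < n_legs and dsm_tags[k] == ACTIVE:
            active_dsms.append(k)
    return Architecture(flybys, active_dsms, n_legs)


if __name__ == "__main__":
    print(sum(GENE_COUNTS.values()), "genes in total")
    print(decode([2, 3], [ACTIVE, ACTIVE], [1, 1, 0, 1, 1, 1]))
```

In the usage line, both flyby genes are active and only one DSM tag is active, so the decoded architecture has two flybys, three TOFs, and a single DSM.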

Due to the high computational cost of evaluating the fitness of a solution, and hence the high computational cost of searching for the optimal solution, the problem is usually solved in two steps [23]. The first step assumes no DSMs in the trajectory; this step is called zero-DSM. In the second step, the flyby planet sequence obtained from the first step is held fixed while the optimal values of the remaining variables are searched for; this step is called the multigravity assist with deep space maneuvers (MGADSM). Each of the two steps has an element of the architecture to be optimized: the first step optimizes the sequence of flybys and the second step optimizes the number of DSMs in each leg. This two-step approach is detailed in Ref. [23] and has been shown to be computationally efficient. The number of generations and the population size are selected to be 100 and 300, respectively.
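As a rough illustration of how the two steps divide the architecture search, the sketch below replaces the GA with exhaustive enumeration and takes both cost functions as caller-supplied arguments. The names `two_step_search`, `zero_dsm_cost`, and `mgadsm_cost` are hypothetical, and the continuous variables that the actual method also optimizes are omitted; only the split between the flyby sequence and the per-leg DSM counts is shown.

```python
import itertools


def two_step_search(zero_dsm_cost, mgadsm_cost,
                    planets=range(1, 9), max_flybys=2, max_dsms_per_leg=2):
    """Toy two-step architecture search (enumeration stands in for the GA).

    zero_dsm_cost(seq)        -> cost of flyby sequence `seq` with no DSMs
    mgadsm_cost(seq, n_dsms)  -> cost of `seq` with n_dsms[k] DSMs in leg k
    """
    # Step 1 (zero-DSM): pick the flyby sequence, assuming no DSMs at all.
    sequences = [seq
                 for n in range(max_flybys + 1)
                 for seq in itertools.product(planets, repeat=n)]
    best_seq = min(sequences, key=zero_dsm_cost)

    # Step 2 (MGADSM): hold the sequence fixed, pick the DSM count per leg.
    n_legs = len(best_seq) + 1
    dsm_options = itertools.product(range(max_dsms_per_leg + 1), repeat=n_legs)
    best_dsms = min(dsm_options, key=lambda d: mgadsm_cost(best_seq, d))
    return best_seq, best_dsms
```

With at most two flybys and eight candidate planets, step 1 of this toy version compares 1 + 8 + 64 = 73 sequences; fixing the sequence before step 2 keeps the DSM search from being repeated for each of them.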

### Earth–Jupiter Mission: Numerical Results and Comparisons

All mechanisms evolving the tags over subsequent generations were tested on the Earth–Jupiter problem. Mechanism A (with a mutation probability of 5%) generated the best solution and so its solution is presented here in detail. The solution of the first phase shows that the trajectory consists of 2 flybys around planets Venus and then Earth; the mission sequence is then Earth–Venus–Earth–Jupiter (EVEJ). The second phase results in adding one DSM, and the total mission cost is 10.1266 km/s (the fuel consumption can be measured in velocity units). The resulting mission is detailed in Table 5.

Logics A and C were also tested on this problem; logic A outperformed logic C, and hence only the results of logic A are presented here. The total cost for the mission is 10.1181 km/s. The detailed results of both steps are presented in Table 6.

Comparing the results of logic A with mechanism A, we can see that the cost of the mission using logic A is slightly better than that obtained using mechanism A. The mission architecture is the same from both methods, while the values of the other variables are slightly different. The success rate of an algorithm is a measure of how many times the algorithm finds the best-found solution in a repeated experiment. This experiment was repeated 200 times using logic A and the success rate is 75.5%, as shown in Fig. 9.

Previous solutions in the literature for this problem can be divided into two categories. The first category of methods does not search for the optimal architecture; rather, the trajectory is optimized for a given architecture. Reference [31], for instance, presents a minimum cost solution trajectory for this Earth–Jupiter mission, assuming a fixed planet sequence of EVEJ. The departure, arrival, and flyby dates were also assumed fixed, with a launch in 2016 and a mission duration of 1862 days. The primer vector theorem solution has four DSMs. Two DSMs are applied in the first two legs. The total cost for this solution is 10.267 km/s, which is slightly higher than the cost obtained in this paper. The method presented in this paper, however, has the advantage of the autonomous search for the optimal architecture of the solution. The solution obtained in this paper has the same planet sequence of EVEJ but a different DSM architecture compared to Ref. [31]. Reference [23] presents the solution to this problem using the HGGA but without the tags concept. Reference [23] implements a simple feasibility check in assigning the hidden genes in each chromosome. The solution in Ref. [23] also finds the planet sequence of EVEJ, and has a total mission cost of 10.182 km/s, which is slightly higher than the cost obtained in this paper. This problem was also solved using mechanisms E and F (presented in Sec. 3) and the results were presented in Ref. [24]. The total cost obtained using mechanism E is 10.1438 km/s and using mechanism F is 10.9822 km/s, both of which are higher than the cost obtained in this paper. The mission trajectory obtained using mechanism A is shown in Fig. 10.
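To put the reported costs side by side, the short script below computes the relative gap of each cost to the lowest one. The cost values are those quoted in this section; the dictionary labels and the choice of a relative-gap measure are illustrative assumptions, not part of the original analysis.

```python
# Total mission costs (km/s) as quoted in this section.
reported_costs_km_s = {
    "logic A (this paper)": 10.1181,
    "mechanism A (this paper)": 10.1266,
    "mechanism E, Ref. [24]": 10.1438,
    "HGGA without tags, Ref. [23]": 10.182,
    "primer vector, Ref. [31]": 10.267,
    "mechanism F, Ref. [24]": 10.9822,
}

best_cost = min(reported_costs_km_s.values())

# Relative gap of each method to the lowest reported cost.
gaps = {name: (cost - best_cost) / best_cost
        for name, cost in reported_costs_km_s.items()}

for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name:32s} {reported_costs_km_s[name]:8.4f} km/s  (+{100 * gap:.2f}%)")
```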

As a demonstration of how the tags evolve over subsequent generations, consider this Earth–Jupiter problem solved using logic C. The population size is 300 and the number of generations is 100. Six tags are examined. Figure 11 shows the number of times each tag has a value of “1” in each generation. For example, tag 6 takes a value of “1” in all the population members in generations 55 and above. In the 30th generation, for instance, tag 6 takes a value of “1” in only 40 chromosomes and takes a value of “0” in the other 260 chromosomes. The other five tags converge to a value of “0” in the last population in all the chromosomes.
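The per-generation tag counts plotted in Fig. 11 amount to a column sum over the population's tag vectors. A minimal sketch follows, assuming the tags of every generation were stored as nested lists; the function name `tag_counts_per_generation` and the storage layout are hypothetical.

```python
def tag_counts_per_generation(history):
    """Count, per generation, how many chromosomes have each tag set to 1.

    history[g][m][j] is the value (0 or 1) of tag j in population member m
    of generation g. Returns counts with counts[g][j] as described above.
    """
    return [[sum(column) for column in zip(*population)]
            for population in history]


if __name__ == "__main__":
    # Tiny made-up example: 2 generations, 3 members, 2 tags.
    history = [
        [[1, 0], [0, 0], [1, 1]],
        [[1, 0], [1, 0], [1, 0]],
    ]
    print(tag_counts_per_generation(history))
```

For the experiment described above, each generation would contribute 300 rows of six tags, and a count of 300 for a tag means it has converged to 1 in the whole population.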

## Statistical Analysis

A statistical analysis is conducted on the methods presented in this paper. Two different analysis tools are implemented. The first is evaluating the success rate for each method in solving different problems. The second tool ranks the proposed algorithms using the Sign test [32].

For each mathematical function presented in Sec. 4, the success rate of each logical and stochastic mechanism is calculated numerically. The same numerical experiment is repeated; the solution obtained in each run is compared to the best-obtained solution, and the success rate is updated as the experiment is repeated. For each of the three logical mechanisms presented in this paper, the success rates in finding the best solution for the Schwefel 2.26 function are shown in Fig. 12. Logic B has a success rate of about 70%, which is less than that of logics A and C, which is about 100%. For each of the three logical mechanisms presented in this paper, the success rates in finding the best solution for the Egg Holder function are shown in Fig. 13. Logic B has a success rate less than that of logics A and C. Both logics A and C settle at a success rate of about 30%. In all three mathematical functions, logics A and C have very close success rates and logic B has a lower success rate.
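The running success rate described above can be sketched as follows. The text does not state how closely a run must match the best-obtained solution to count as a success, so the tolerance `tol` is an assumption, as are the function name and the minimization convention.

```python
def running_success_rate(final_costs, best_cost, tol=1e-3):
    """Success rate after each repetition of the experiment.

    final_costs : best cost found in each repeated run (minimization assumed)
    best_cost   : best-obtained solution used as the reference
    tol         : assumed tolerance for counting a run as a success
    """
    rates = []
    successes = 0
    for n_runs, cost in enumerate(final_costs, start=1):
        if abs(cost - best_cost) <= tol:
            successes += 1
        rates.append(successes / n_runs)   # rate over the runs so far
    return rates


if __name__ == "__main__":
    # Made-up run results, for illustration only.
    print(running_success_rate([1.0, 2.0, 1.0, 1.0005], best_cost=1.0))
```

The last entry of the returned list is the overall success rate; a rate of 75.5% over 200 repetitions, for example, corresponds to 151 successful runs.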

For the stochastic mechanisms presented in this paper, the success rate for the Schwefel 2.26 function is shown in Fig. 14. In Fig. 14, the mutation rate is 0.001 for mechanism A, 0.01 for mechanism B, and 0.03 for mechanism D. As shown in Fig. 14, mechanisms A, B, and C have higher success rates than mechanism D. Note that mechanism D has also resulted in higher function values.

The success rate was also computed for the Earth–Jupiter trajectory optimization problem, using both mechanism A and logic A. The results are plotted in Fig. 9. The success rate for both approaches is about 75%.

The Sign test is a pairwise comparison between different algorithms. It has been applied to the algorithms presented in this paper. Each algorithm is ranked based on the total number of cases in which the algorithm produces the best function value. Table 7 summarizes the results obtained in Sec. 4, where the best function value obtained using each algorithm is listed. Table 7 also lists the results obtained using mechanisms E, F, G, and H, which are detailed in Ref. [24]; they are used here for the purpose of ranking. Also listed in Table 7 are the solutions obtained using the No-tag HGGA, which was originally developed in Ref. [21].

Using the data presented in Table 7, the Sign rank is computed by counting how many times each algorithm resulted in the best function value among all functions. Table 8 lists the Sign rank for each algorithm. Mechanism A has the highest rank, followed by logic A. It is to be noted, though, that this metric is most meaningful when the number of tested functions is high.
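The counting step behind the Sign rank can be sketched as below. The algorithm names and values in the usage example are placeholders, not the entries of Table 7; the sketch also assumes that lower function values are better and that tied algorithms each receive a count, neither of which is stated explicitly in the text.

```python
def sign_rank(best_values):
    """Count, per algorithm, on how many functions it gave the best value.

    best_values maps an algorithm name to its list of best function values,
    one entry per test function, in the same order for every algorithm.
    Minimization is assumed; ties credit every tied algorithm.
    """
    names = list(best_values)
    n_functions = len(best_values[names[0]])
    wins = {name: 0 for name in names}
    for f in range(n_functions):
        best = min(best_values[name][f] for name in names)
        for name in names:
            if best_values[name][f] == best:
                wins[name] += 1
    return wins


if __name__ == "__main__":
    # Placeholder data for three hypothetical algorithms on three functions.
    table = {
        "alg_1": [1.0, 5.0, 3.0],
        "alg_2": [2.0, 4.0, 3.0],
        "alg_3": [3.0, 6.0, 9.0],
    }
    print(sign_rank(table))
```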

## Conclusion

The concept of binary tags is introduced in GAs to enable hiding some of the genes in a chromosome so that they can be used to search for optimal architectures in VSDS problems. The proposed binary tags concept mimics biological cells in hiding the genes that are not supposed to be effective in the cell, while they could be effective in other cells. Mechanisms for assigning the hidden genes of a chromosome are proposed and investigated in this paper. Two categories of mechanisms for evolving the values of these tags in subsequent generations are proposed: one uses logical operations to evolve the tags, while the other uses stochastic operations. Numerical tests were conducted on mathematical optimization problems as well as on the interplanetary trajectory optimization for a spacecraft mission from Earth to Jupiter. The application of the new hidden genes assignment mechanisms to the space trajectory optimization problem and the mathematical optimization problems demonstrated their capability in searching for the optimal architecture, in addition to improving the solution compared to the original HGGA approach that does not implement the tags concept. It is demonstrated in this paper that, for the trajectory optimization problem, it is possible to autonomously compute the optimal number of flybys, the planets to fly by, and the optimal number of deep space maneuvers, in addition to the rest of the design variables, using the proposed algorithms. Statistical analysis conducted in this paper on the mathematical optimization problems showed that, in terms of optimality of the solution, mechanism A and logic A performed better than the other algorithms.

## Acknowledgment

Superior, a high-performance computing cluster at Michigan Technological University, was used in obtaining the results presented in this publication.

## Funding Data

Directorate for Engineering, National Science Foundation (1446622).