Train dispatching program for high-speed railway station based on genetic algorithm

In case of train delays, the centralized traffic control system becomes ineffective, and the workload of dispatchers increases dramatically. Based on a genetic algorithm, the author designs a program that reschedules trains appropriately under delays, minimizing the total delay time and the number of gate changes. The author transforms the initial problem into a compromise combinatorial optimization model, with total delay time, gate changes and conflicting routes as objectives. The high weight on conflicting routes ensures efficiency and a high probability of obtaining a feasible solution. Because the variables are discrete, the author designs a special encoding and evolution method suited to this problem. Using a special treatment of conflicts and of chromosome initialization, the program can quickly construct a new timetable from the scheduled timetable, the predicted arrival times and (optionally) the order of trains, which improves the efficiency and safety of dispatching in high-speed railway stations. The method was tested with synthetic data for the Shanghai-Kunming section of Hangzhou East Railway Station.


Introduction
Large high-speed railway stations usually have extensive track and platform layouts and are often adjacent to an EMU depot. With many types of train movements, such as origination, termination, entering and leaving the depot, and passing through, these stations have a large number of conflicting and opposing train routes, which makes safe and efficient train dispatching considerably more difficult. Especially during the station's peak hours, a single delayed train affects the station's operational plan and the adjacent trains. In severe cases, it may completely disrupt the existing timetable. When large-scale train delays occur, manual adjustment becomes difficult. Dispatchers need to consider the nature and class of trains, as well as factors such as water supply, waste disposal, track switching operations and EMU depot operations, so it is difficult to make the optimal decision in a short time by relying purely on experience.
Kroon used Amsterdam Central Station and Utrecht Station in the Netherlands as examples to study the compilation of train routes [1]. However, the calculated results could not effectively solve the problem. Subsequently, in 2001, the model was refined by dividing train routes into three types: arrival, departure, and origination-termination routes. The aim was to optimize the selection of preferred routes for trains [2].
Liu abstracted the arrangement of throat section turnouts as a network graph with multiple starting points and multiple junctions [3]. By establishing an optimization model and solving it computationally, the study carried out calculations on existing passenger stations and drew conclusions regarding the reconstruction of the throat sections of Liuzhou Station and Guiyang Station. Zhou optimized the occupation of turnouts at technical stations [4]. Employing a method similar to Liu's, and after fixing a usage plan for the station tracks, the author focused on selecting arrival and departure routes. Shi, with the optimization objective of ensuring the priority of high-level trains and maximizing the efficiency of station tracks, used a simulated annealing algorithm to solve the problem [5].
In Li's work, the technical tasks and workflow of train stations were analyzed in detail [6]. Against this background, an in-depth study of optimizing route selection for trains was carried out. Firstly, based on a description method for station networks, a refined model covering the various station elements was constructed. Secondly, train routes were described from both mathematical and formal perspectives, leading to methods for generating train route tables. Thirdly, station train operations and shunting operations were modeled through an information model. By avoiding train crossings on station routes given the completed planned operations, a train route selection model was established, which reduced train travel time and delay rates. By analyzing the temporal correlations between route selection and route arrangement, the route selection problem was formulated as a 0-1 integer programming problem. The results showed that an immune evolutionary algorithm could compute route selections satisfying the constraints with good convergence. To address nonlinear models, Long proposed new optimization algorithms based on modern optimization algorithms [7]. Calculations demonstrated that the new algorithms solve faster than the modern optimization algorithms they build on.
Chen conducted a comprehensive study of the route selection problem at train stations [8]. Firstly, the utilization of station tracks and the selection of arrival-departure routes in a throat section were optimized jointly. Secondly, arrival-departure routes were extended from one end of the throat area to both ends of the station throat section and combined with the arrival-departure lines; the result was termed the "through-station route". Thirdly, aiming at the joint optimization of through-station routes and shunting locomotive utilization, a 0-1 programming model for route selection was constructed, with compatibility between arrival-departure lines and turnouts as constraints, and the algorithmic approach of Long was adopted to solve the model. Li focused on the utilization of station tracks at large passenger stations during peak hours, considering the dual objectives of minimizing delay time for station tracks and maximizing passenger service quality, and proposed a multi-objective optimization model [9]. Xia considered various uncertainties in station track utilization and established a stochastic planning model to improve efficiency and convenience for passengers [10].
Building on the contributions of previous scholars, this study focuses on computational methods for dealing with train delays at large high-speed railway stations. In case of train delays, the author aims to generate efficient decision-support suggestions for adjusting track utilization at the station within a short time, thereby reducing the operational burden on dispatchers and improving their work efficiency.

Problem statement
The information given for the problem, that is, the input data of the program, is as follows: the trains entering the station at the Shanghai-Kunming section of Hangzhou East Railway Station, and the order of trains entering or leaving each direction of the section. The information for each train includes its unique train number, scheduled arrival and departure times, predicted arrival time, directions of entry and exit, and the station track it occupies.
After processing the inputs, the program provides the final decision on the arrival time, departure time and station track of each train, with the goal of making the total delay and the number of gate changes as small as possible. The author makes several assumptions to reduce the complexity of the problem while preserving the validity of the analysis. The first assumption concerns the minimum dwell time and the available tracks (Table 1). The second assumption concerns the train intervals (Table 2). The third assumption is that the route of each train from each direction to each platform is unique. According to an analysis of the topology of this case, the Shanghai-Kunming section of Hangzhou East Railway Station, whenever parallel routes exist there is always one route that minimizes conflicts with all other routes, so the author selects a unique route for each train. If this model is applied to other stations, they are required to satisfy this assumption.
Moreover, the author assumes that the actual departure time cannot be earlier than the scheduled departure time, as required by transportation organization. Since the author works with discrete variables, the given arrival and departure times are accurate only to the minute.

Symbol table and index description
First, trains are indexed by $i \in \{1, 2, \dots, n\}$, where $n$ is the total number of trains. Second, the author indexes tracks and routes uniformly: tracks keep their original numbers from 14 to 25, and routes are indexed in a fixed order from 31 to 87. The meaning of each entering and leaving direction $d$ is given in Table 3 (one of the listed directions is EMU depot line B). Last, the assignment types $a$ are listed in Table 4. For the decision variables, $T_1 = (t_{1,1}, t_{1,2}, \dots, t_{1,n})$ represents the actual arrival time vector of all trains, $T_2 = (t_{2,1}, t_{2,2}, \dots, t_{2,n})$ represents the actual departure time vector of all trains, and $G = (g_1, g_2, \dots, g_n)$ represents the station track number vector of all trains. The total decision vector is synthesized in the following way: $x_i = (t_{1,i}, t_{2,i}, g_i)$ and $\vec{x} = (x_1, x_2, \dots, x_n)$, where $t_{1,i} = \vec{x}(3i-2)$, $t_{2,i} = \vec{x}(3i-1)$, $g_i = \vec{x}(3i)$, and $\vec{x}(k)$ represents the $k$-th component of vector $\vec{x}$.
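The layout of the decision vector, in which each train contributes its actual arrival time, actual departure time and station track as three consecutive components, can be illustrated with a minimal Python sketch. The function names `pack` and `unpack` and the sample values are assumptions made for this example and are not taken from the author's program.

```python
from typing import List, Tuple

# (arrival minute, departure minute, station track number) of one train
Train = Tuple[int, int, int]


def pack(trains: List[Train]) -> List[int]:
    """Flatten the per-train triples into one decision vector of length 3n."""
    x: List[int] = []
    for arrival, departure, track in trains:
        x.extend((arrival, departure, track))
    return x


def unpack(x: List[int], i: int) -> Train:
    """Return the triple of train i (1-based): components 3i-2, 3i-1 and 3i."""
    base = 3 * (i - 1)  # 0-based list position of component 3i-2
    return x[base], x[base + 1], x[base + 2]


# Two hypothetical trains: times in minutes, tracks 14 and 22.
x = pack([(5, 9, 14), (7, 12, 22)])
second_train = unpack(x, 2)
```

Keeping the three variables of a train adjacent means that a prefix of the vector corresponds to the complete strategies of the first few trains, which is convenient for the crossover operation.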

Constraints.
The first constraint is the ingress and egress direction order constraint. Since high-speed railway stations may not have the authority to change the order of trains, and a higher-level dispatcher may be required to decide the order in which trains enter and leave the station, the author treats this as an optional constraint.
Suppose direction $d$ has an order $c_1, c_2, \dots, c_m$, indicating that train $c_k$ needs to pass this direction ahead of train $c_{k+1}$. Let $\delta_k$ be the access symbol of train $c_k$ ($\delta_k = 1$ for entering, $\delta_k = 2$ for exiting), and let $t_{1,i}$ and $t_{2,i}$ denote the actual arrival and departure times of train $i$. Then the direction order constraint is $t_{\delta_{k+1}, c_{k+1}} - t_{\delta_k, c_k} \ge \Delta_k^d$ for $k = 1, \dots, m-1$, where $\Delta_k^d$ indicates the minimum time interval of direction $d$ between the two consecutive trains, taken from the train interval assumptions (Table 2). The second constraint is the arrival time constraint. The actual arrival time of the train must not be earlier than the predicted arrival time, that is, $t_{1,i} \ge t_{1,i}^{\mathrm{pre}}$, where $t_{1,i}^{\mathrm{pre}}$ is the predicted arrival time of train $i$. The third constraint is the departure time constraint. The actual departure time of the train can be earlier than neither the scheduled departure time nor the theoretical earliest departure time. The theoretical earliest departure time is the time at which the train finishes its assignment after arriving at the station.
The fourth constraint is the station track constraint. The assignment type of each train restricts the station tracks it may occupy, which are drawn from 14G, 15G, …, 18G, 21G, …, 25G. At the same time, the ingress and egress directions of each train also affect the station tracks it can occupy. The final available station tracks for the train are the intersection of both constraints (Table 5).
Before moving to the next section, it is worth explaining why conflicting routes are not treated as a constraint. To keep the algorithm effective and to prevent the feasible region from having a bad topological structure, the author places the number of route conflicts in the objective function and assigns it a large weight. The purpose is to enable the algorithm to quickly obtain an acceptable solution that is unlikely to contain conflicts.
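The per-train constraints of this section (arrival time, departure time and station track) can be checked as in the following minimal Python sketch. The class `TrainInput`, its field names and the sample values are hypothetical, the theoretical earliest departure is assumed here to be the arrival time plus a minimum dwell time, and the optional direction order constraint is left out.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class TrainInput:
    """Hypothetical per-train input data; all times are in whole minutes."""
    predicted_arrival: int
    scheduled_departure: int
    min_dwell: int                   # assumed minimum dwell time of the assignment
    allowed_tracks: FrozenSet[int]   # tracks allowed by assignment type and directions


def train_is_feasible(info: TrainInput, arrival: int, departure: int, track: int) -> bool:
    """Check the arrival, departure and station track constraints for one train."""
    return (
        arrival >= info.predicted_arrival            # not earlier than predicted arrival
        and departure >= info.scheduled_departure    # not earlier than scheduled departure
        and departure >= arrival + info.min_dwell    # not earlier than earliest departure
        and track in info.allowed_tracks             # track must be available to this train
    )


info = TrainInput(predicted_arrival=10, scheduled_departure=12,
                  min_dwell=3, allowed_tracks=frozenset({14, 15, 16}))
```

Route conflicts are deliberately absent from this check, in line with the decision above to penalize them in the objective function instead.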

Objective function.
Denote $f(\vec{x}) = \lambda_1 f_1 + \lambda_2 f_2 + \lambda_3 f_3$, where $\lambda_1, \lambda_2, \lambda_3$ are the weighting coefficients, $f_1$ is the total delay in minutes, $f_2$ is the number of track changes, and $f_3$ is the number of conflicting routes. In practical application, the author takes $\lambda_1 = 1$, $\lambda_2 = 10$ and $\lambda_3 = 100$. The ratio $\lambda_2 / \lambda_1$ can be interpreted as follows: in practice, the impact of changing one station track approximates the impact of $\lambda_2 / \lambda_1 = 10$ minutes of total delay. $f_1$ is obtained by summing the delay minutes of all trains, and $f_2 = \sum_{i=1}^{n} (g_i \ne g_i^{\mathrm{sch}})$, where $(g_i \ne g_i^{\mathrm{sch}})$ represents the Boolean value of the difference between the actual station track and the scheduled station track of train $i$, that is, 1 if the two tracks differ and 0 otherwise. To calculate $f_3$, the number of conflicting routes, let $c_{r_1, r_2}$ indicate the conflict Boolean value of $r_1$ and $r_2$, where $c_{r_1, r_2} = 1$ when $r_1$ and $r_2$ conflict and $c_{r_1, r_2} = 0$ when they do not. Notice that $c_{r,r} = 1$.
Define the occupation function $U_{r,t}$, indicating the number of trains occupying route or track $r$ during $[t, t+1)$. The program marks the time during which each train occupies a route or track in $U_{r,t}$. Then $f_3$ can be written as the difference of two sums. The first sum, $\sum_{r_1, r_2, t} U_{r_1,t} U_{r_2,t} c_{r_1, r_2}$, represents the number of conflicts over all routes and tracks, but it includes an overcount for $r_1 = r_2$, which means self-conflict. Because any route or track can carry an occupation of up to 1 without conflict, the second sum subtracts this allowance of 1 occupation.
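The weighted objective and the conflict count can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's implementation: the occupation counts are held in a dictionary, the conflict relation is a set of unordered pairs, and the correction for self-conflict is assumed to subtract the occupation count itself, so that a resource used by a single train contributes zero. The exact correction term used in the paper is not recoverable from the text.

```python
from typing import Dict, List, Set, Tuple


def conflict_count(occ: Dict[int, List[int]], conflicting: Set[Tuple[int, int]]) -> int:
    """Count conflicts from per-minute occupation counts.

    occ[r][t] is the number of trains on route or track r during minute t.
    A resource always conflicts with itself; the assumed correction removes
    the contribution of a single, conflict-free occupation.
    """
    resources = list(occ)
    horizon = len(next(iter(occ.values())))
    total = 0
    for t in range(horizon):
        for r1 in resources:
            if occ[r1][t] == 0:
                continue
            for r2 in resources:
                clash = r1 == r2 or (r1, r2) in conflicting or (r2, r1) in conflicting
                if clash:
                    total += occ[r1][t] * occ[r2][t]
            total -= occ[r1][t]  # assumed self-conflict correction
    return total


def objective(delay: int, changes: int, conflicts: int,
              weights: Tuple[int, int, int] = (1, 10, 100)) -> int:
    """Weighted sum of total delay, track changes and conflicts."""
    l1, l2, l3 = weights
    return l1 * delay + l2 * changes + l3 * conflicts


# Hypothetical data: routes 31 and 32 conflict and overlap during minute 1.
occ = {31: [1, 1, 0], 32: [0, 1, 0], 14: [1, 1, 1]}
n_conflicts = conflict_count(occ, {(31, 32)})
```

With the weights 1, 10 and 100, a conflict-free timetable with 149 minutes of delay and one track change evaluates to 159, while a single conflict adds at least 100 to the objective value.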

Algorithm design

Initializing chromosomes.
The goal is to generate random vectors satisfying the constraints of Section 2.3.2 as quickly as possible. The initial idea was to generate random vectors in a sufficiently large hyperrectangle and to repeat the process until the vectors satisfied all constraints. In testing, however, the author found this far from efficient: it took about 10 times longer than the other steps. The author therefore transformed this step into a graph problem and solved it with topological sorting, which proved effective.
Every time constraint in Section 2.3.2 has the form of either $t_i \ge b_i$ or $t_i - t_j \ge b_{i,j}$. The former can be transformed into the latter by adding a virtual time $t_0$ that is set to 0, so that $t_i \ge b_i$ becomes $t_i - t_0 \ge b_{i,0}$. All the time variables $t_i$ and the virtual time $t_0$ can be viewed as nodes of a graph, and the constraint $t_i - t_j \ge b_{i,j}$ represents an edge from $t_j$ to $t_i$ with weight $b_{i,j}$.
Since the existence of a solution is guaranteed by the problem itself, the graph contains no cycle, so a topological ordering exists. The only node with indegree 0 is the virtual time node $t_0$, so $t_0$ must be ordered first. Initialize all nodes with value 0 (noting that $t_0 = 0$), and visit all nodes and their outgoing edges in topological order. For each edge from $t_j$ to $t_i$, set the value of $t_i$ to the maximum of its current value and $t_j + b_{i,j}$. This yields the smallest time vector satisfying all constraints.
To obtain random vectors that form the population of chromosomes, simply enlarge the weights on all edges randomly and execute the topological sorting method again. After $N$ repetitions, where $N$ is the population size, the initial chromosomes are generated.
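The initialization by topological sorting can be sketched as follows. This is a minimal Python illustration, assuming that node 0 plays the role of the virtual time and that each edge `(j, i, b)` encodes one constraint "time of node i minus time of node j is at least b". The function names and the parameter `max_slack` (the largest random enlargement of an edge weight) are assumptions of this example.

```python
import random
from collections import deque
from typing import List, Tuple

Edge = Tuple[int, int, int]  # (j, i, b): value[i] - value[j] >= b


def earliest_times(num_nodes: int, edges: List[Edge]) -> List[int]:
    """Smallest values satisfying every difference constraint (Kahn's algorithm)."""
    outgoing: List[List[Tuple[int, int]]] = [[] for _ in range(num_nodes)]
    indegree = [0] * num_nodes
    for j, i, b in edges:
        outgoing[j].append((i, b))
        indegree[i] += 1
    value = [0] * num_nodes
    queue = deque(v for v in range(num_nodes) if indegree[v] == 0)
    visited = 0
    while queue:
        j = queue.popleft()
        visited += 1
        for i, b in outgoing[j]:
            value[i] = max(value[i], value[j] + b)  # relax the edge j -> i
            indegree[i] -= 1
            if indegree[i] == 0:
                queue.append(i)
    if visited != num_nodes:
        raise ValueError("constraint graph contains a cycle")
    return value


def random_times(num_nodes: int, edges: List[Edge], max_slack: int,
                 rng: random.Random) -> List[int]:
    """Enlarge every edge weight by a random non-negative amount, then re-solve."""
    loosened = [(j, i, b + rng.randint(0, max_slack)) for j, i, b in edges]
    return earliest_times(num_nodes, loosened)


# Hypothetical constraints on three time variables (nodes 1 to 3).
edges = [(0, 1, 5), (1, 2, 3), (0, 2, 10), (0, 3, 2), (2, 3, 4)]
tightest = earliest_times(4, edges)
sample = random_times(4, edges, max_slack=6, rng=random.Random(0))
```

Because the random enlargements are non-negative, every sampled vector still satisfies the original constraints, so no rejection step is needed.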

Crossover.
Define the parameter $P_c$ as the probability of the crossover operation. This probability means that an expected $P_c \cdot N$ chromosomes are subjected to crossover, where $N$ is the population size.
To determine the parents of the crossover operation, repeat the following process for $i = 1, \dots, N$: generate a random number $r$ from $[0, 1]$; if $r < P_c$, select chromosome $\vec{x}_i$ as a parent. Randomly permute the selected parents $\vec{x}'_1, \vec{x}'_2, \vec{x}'_3, \dots$ and combine them into the pairs $(\vec{x}'_1, \vec{x}'_2), (\vec{x}'_3, \vec{x}'_4), \dots$ If the number of selected vectors is odd, the last one is discarded.
Intuitively, the two chromosomes of a pair exchange their strategies for the first half of the trains. If either chromosome is illegal after crossover, the crossover operation is abandoned.
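The crossover step can be sketched as follows, as a minimal Python illustration rather than the author's code. Chromosomes are assumed to be flat lists with three consecutive components per train, `k` is the number of leading trains whose strategies are exchanged, and `is_feasible` is a hypothetical callback that checks all constraints.

```python
import random
from typing import Callable, List

Chromosome = List[int]


def crossover(population: List[Chromosome], p_c: float, k: int,
              is_feasible: Callable[[Chromosome], bool],
              rng: random.Random) -> List[Chromosome]:
    """Pair randomly selected parents and swap the strategies of the first k trains."""
    # Each chromosome becomes a parent with probability p_c.
    chosen = [idx for idx in range(len(population)) if rng.random() < p_c]
    rng.shuffle(chosen)
    if len(chosen) % 2 == 1:
        chosen.pop()  # an odd leftover parent is discarded
    result = [list(c) for c in population]
    cut = 3 * k  # three components (arrival, departure, track) per train
    for a, b in zip(chosen[0::2], chosen[1::2]):
        child_a = population[b][:cut] + population[a][cut:]
        child_b = population[a][:cut] + population[b][cut:]
        # The exchange is abandoned if either child violates a constraint.
        if is_feasible(child_a) and is_feasible(child_b):
            result[a], result[b] = child_a, child_b
    return result


pop = [[1, 2, 14, 3, 4, 15], [5, 6, 16, 7, 8, 17]]
swapped = crossover(pop, 1.0, 1, lambda c: True, random.Random(0))
```

Cutting at a multiple of three keeps each train's arrival time, departure time and track together, so a child inherits complete per-train strategies from each parent.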

Variation.
Define the parameter $P_{mt}$ as the probability of time mutation. This probability means that an expected $P_{mt} \cdot N$ chromosomes are subjected to time mutation, where $N$ is the population size.
First, determine the parents of the time mutation operation in the same way as for crossover. For each selected parent $\vec{x}$, denote its time variables as the vector $T$. For this mutation operation, give an upper bound $D$ on the time change and randomly select the mutation position $p \in [1, 2n]$. Generate a random integer $\Delta$ from $[-D, D]$, add it to the $p$-th component of $T$, and call the modified vector $T'$. If $T'$ is illegal, halve $D$ and mutate again until a feasible time vector is obtained (ultimately $D < 1$, and $T' = T$ must be legal).
To mutate the station track, define the parameter $P_{mg}$ analogously. Select parents as above and randomly select the mutation position $p \in [1, n]$. For the $p$-th train, randomly select one of its available station tracks.
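Both mutation operators can be sketched in Python as follows. The sketch makes several assumptions that are not stated in the text: the mutation position is kept fixed while the bound is halved, halving uses integer division, and `is_feasible` and `available` (the list of allowed tracks per train) are hypothetical inputs.

```python
import random
from typing import Callable, List, Sequence


def mutate_time(times: List[int], d_max: int,
                is_feasible: Callable[[List[int]], bool],
                rng: random.Random) -> List[int]:
    """Shift one time component by a random integer, halving the bound on failure."""
    pos = rng.randrange(len(times))  # mutation position among the time components
    bound = d_max
    while bound >= 1:
        candidate = list(times)
        candidate[pos] += rng.randint(-bound, bound)
        if is_feasible(candidate):
            return candidate
        bound //= 2  # shrink the allowed change and try again
    return list(times)  # bound fell below 1: keep the unchanged, feasible vector


def mutate_track(tracks: List[int], available: Sequence[Sequence[int]],
                 rng: random.Random) -> List[int]:
    """Reassign one randomly chosen train to one of its available tracks."""
    pos = rng.randrange(len(tracks))
    candidate = list(tracks)
    candidate[pos] = rng.choice(available[pos])
    return candidate
```

Halving the bound guarantees termination: after finitely many failures the change is forced to zero and the original feasible vector is returned.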

Evaluation and selection.
The author uses an order-based evaluation function to evaluate and select chromosomes. For the chromosome population $\vec{x}_1, \vec{x}_2, \dots, \vec{x}_N$, calculate the objective function $f(\vec{x}_i)$ of each chromosome. Then sort the chromosomes by objective function value from smallest to largest.
Denote by $r_i$ the position of $\vec{x}_i$ after sorting. The evaluation function is then defined as $\mathrm{eval}(\vec{x}_i) = \alpha (1 - \alpha)^{r_i - 1}$, where $\alpha \in (0, 1)$ is a predetermined parameter, and the author takes $\alpha = 0.05$. Then select chromosomes based on this evaluation function. For the chromosomes in sorted order, calculate the cumulative probability $q_0 = 0$, $q_i = \sum_{j=1}^{i} \mathrm{eval}(\vec{x}_j)$. Generate a random number $r$ from the interval $(0, q_N]$. If $r \in (q_{i-1}, q_i]$, select the $i$-th chromosome in the sorted order. Repeat the procedure $N$ times to obtain $N$ newly selected chromosomes.
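Order-based evaluation followed by roulette-wheel selection can be sketched in Python as follows. The geometric weight `alpha * (1 - alpha) ** (rank - 1)` is a commonly used rank-based evaluation function and is assumed here; the function name `rank_select` and the sample population are illustrative only.

```python
import random
from bisect import bisect_left
from itertools import accumulate
from typing import Callable, List, Sequence


def rank_select(population: Sequence[List[int]],
                objective: Callable[[List[int]], float],
                alpha: float, rng: random.Random) -> List[List[int]]:
    """Select len(population) chromosomes with rank-based roulette-wheel sampling."""
    ranked = sorted(population, key=objective)  # smallest objective value first
    # Assumed evaluation: rank 1 gets alpha, rank 2 gets alpha * (1 - alpha), ...
    evals = [alpha * (1 - alpha) ** rank for rank in range(len(ranked))]
    cumulative = list(accumulate(evals))
    selected = []
    for _ in range(len(ranked)):
        r = rng.uniform(0, cumulative[-1])
        # First position whose cumulative value reaches r.
        idx = min(bisect_left(cumulative, r), len(ranked) - 1)
        selected.append(ranked[idx])
    return selected


pop = [[3], [1], [2]]
picked = rank_select(pop, lambda c: c[0], 0.05, random.Random(0))
```

Because the weights depend only on rank, the selection pressure is insensitive to the very large gaps in objective value that the conflict weight of 100 creates between feasible and infeasible chromosomes.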
Output.
After each evaluation, update the chromosome $\vec{x}_{\mathrm{best}}$ with the minimum objective function value found so far. Finally, the program outputs the best solution found during the execution process.

Case testing
The author created a running schedule for 21 trains in the Shanghai-Kunming section of Hangzhou East Railway Station within one hour and set delays randomly. After several executions of the program, the best solution found for this case has 149 minutes of delay with 1 change of gate. Both the scheduled and the adjusted train operations are drawn below, where each bar represents the time interval during which a train occupies a station track or direction (Figures 1 and 2). Different colors represent different train statuses: blue trains are on schedule, purple trains are ahead of schedule, and red trains are delayed. The program slightly adjusted the dwell time of trains to minimize the overall delay time, and the station track of G3 was changed from 22G to 17G to avoid conflicts with G12 and G13 that would cause widespread delay. Intuitively the program is relatively effective, and further analysis of the algorithm is conducted below.

Algorithm analysis
Overview.
In terms of solving efficiency, genetic algorithms use natural selection and genetic mechanisms to search for the optimal solution, and can usually find relatively good solutions in a relatively short time. In addition, the convergence speed of the algorithm is affected by parameter settings such as population size, crossover rate and mutation rate. By adjusting and optimizing these parameters, the solving efficiency of the algorithm was improved. In testing, the author adjusted $P_c$, $P_{mt}$ and $P_{mg}$, and reduced the size of the chromosome population to shorten the time of each loop without losing much accuracy.
In terms of solution quality, a genetic algorithm can find relatively good solutions, but in most cases a single run may fail to find the optimal solution, or even a feasible one. To increase the solving accuracy, the author tested the stability of the program and calculated the number of executions required for a high probability of success.

Efficiency analysis.
The time complexity and space complexity of a computer program are important indicators of the efficiency of an algorithm. Time complexity describes how the running time changes with the problem size, while space complexity describes the storage space.
For the genetic algorithm the author adopts, the complexity of calculating the objective function is $O(M^3)$, where $M$ represents the total number of routes, ports and tracks; in the test case $M = 90$. The determining factor is the calculation of the conflicting routes, $\sum_{r_1, r_2, t} U_{r_1,t} U_{r_2,t} c_{r_1, r_2}$, where $U_{r,t}$ is the occupation of route or track $r$ at time $t$ and $c_{r_1, r_2}$ is the conflict Boolean value. The time complexity of crossover and of mutation is $O(n)$ each, where $n$ is the number of trains. Therefore, the time complexity of one evolution of the genetic algorithm is $O(N(M^3 + n))$, where $N$ is the population size.
The space complexity mainly depends on factors such as population size and encoding method. Let the population size be $N$; the space complexity of storing the chromosomes is $O(Nn)$. When calculating the objective function, the program constructs an occupation matrix $U$ with one row per route or track and one column per time slot. Since the chromosomes and the occupation matrix are overwritten in every loop, the total space complexity is independent of the number of evolutions and is the sum of these two terms.

Stability analysis.
The stability of a genetic algorithm refers to its ability to converge steadily to the optimal or a feasible solution. To evaluate the stability of the program, the author used repeated experiments: the program was executed 200 times for the above case, and the results and running times were analyzed. For each execution of the genetic algorithm, if the best objective function value did not change for 50 consecutive evolutions, the execution was terminated.
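The termination rule used in these experiments can be sketched as a small Python loop. Here `step` is a hypothetical callable that performs one evolution and returns the best objective value of the resulting population; the name and the sample values are assumptions of this example.

```python
import itertools
from typing import Callable, Tuple


def run_until_stagnant(step: Callable[[], float], patience: int = 50) -> Tuple[float, int]:
    """Run evolutions until the best value has not improved for `patience` evolutions."""
    best = float("inf")
    stale = 0        # consecutive evolutions without improvement
    generations = 0
    while stale < patience:
        value = step()
        generations += 1
        if value < best:
            best, stale = value, 0
        else:
            stale += 1
    return best, generations


# Hypothetical sequence of best values: improves to 3, then stagnates.
values = itertools.chain([5, 4, 4, 3], itertools.repeat(3))
best, generations = run_until_stagnant(lambda: next(values), patience=3)
```

A stagnation counter of this kind bounds the running time of each execution without fixing the number of evolutions in advance.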
The 200 executions of the program took 791.26 seconds in total, an average of 3.96 seconds per execution. Sorting all the output results, the author obtained the plot of the objective function values for the 200 executions shown in Figure 3. In Figure 3, the results form roughly 5 "ladders", and only on the first ladder are the objective function values below 200, which corresponds to feasible solutions. Analyzing the data in detail, the frequency of obtaining a feasible solution ($f_3 = 0$) is 39.5%. The frequency of the optimal value 159 is 10%, and the frequency of an objective function value below 170 is 20.5%.
Suppose that the number of feasible solutions obtained follows a binomial distribution with success rate $p = 39.5\%$ per execution. For the probability of obtaining at least one feasible solution to reach 99.5%, the genetic algorithm must be executed at least 11 times, since $1 - (1 - 0.395)^{11} \approx 0.996$ while $1 - (1 - 0.395)^{10} \approx 0.993$. This takes about 43.52 seconds. The predicted running time is acceptable for practical situations.
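The number of executions and the expected running time can be reproduced with a short calculation, assuming independent executions with the measured success rate. The function name `runs_needed` is chosen for this example.

```python
import math


def runs_needed(p: float, target: float) -> int:
    """Smallest k with 1 - (1 - p)**k >= target, for independent runs."""
    return math.ceil(math.log(1 - target) / math.log(1 - p))


k = runs_needed(0.395, 0.995)      # success rate 39.5%, target 99.5%
seconds_per_run = 791.26 / 200     # measured average running time
total_seconds = k * seconds_per_run
```

The same function shows how sensitive the required running time is to the per-run success rate, which is the quantity that the weight on conflicting routes is meant to raise.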

Conclusion
The problem arose from the fact that the centralized traffic control system is not capable of dispatching trains under large-scale train delays. The program uses the scheduled timetable, the predicted arrival times and (optionally) the order of trains as inputs, and outputs an adjusted timetable that minimizes the total delay time and the number of gate changes. The author treats the conflicting routes as an optimization goal rather than a constraint in order to increase computational efficiency, and transforms all linear constraints into a graph so that chromosome initialization can be accelerated by topological sorting.
With the arrival time, departure time and station track as discrete variables of the genetic algorithm, the author modified the usual initialization, crossover and mutation operations. Intuitively, the crossover operation switches the strategies of the first half of the trains for each pair of parents, and the mutation operation changes one time variable or one station track variable of the selected chromosomes. The initialization operation transforms all constraints on the difference between two time variables into a graph and performs topological sorting. Based on the hypothesis of unique train routes, the program precomputes a conflict matrix from each direction to each station track. It can then calculate the conflicts in each chromosome by matrix multiplication. Combining this quantity with the total delay time and the number of gate changes under different weights, the program calculates the evaluation function and selects the better-performing chromosomes.
The program proved effective in the case the author synthesized. This paper then analyzed its efficiency and stability. With the weighting coefficient of conflicting routes set relatively high, a single execution of the program has a success rate of nearly 40%. Under a binomial model, reaching a success rate of 99.5% requires a running time of approximately 40 seconds. Considering that the test case spans about one hour, the efficiency and accuracy of the program are generally acceptable.

Figure 3. The graph of objective function values over 200 runs.

Table 1. Assumption of minimum dwell time and available track.

Table 2. Assumption of train intervals.

Table 3. The meaning of direction numbering $d$.

Table 5. Available station tracks based on assignment type.