Actually, I did this test:
Given a grid of size NxN with F pieces of food randomly distributed in it, try to build an agent which moves inside the space to get all the food. The agent has a DNA which consists in a set of directions: North, South, West and East. These are the D steps that it will try to do starting from the location (0, 0).Well, I did this program in Python in 195 lines of code (and still I didn't make my best to avoid verbose code). About 100 are comments. This gives an idea about how fast to write in Python is.
Anyway, I tried a simulation with these parameters:
- Grid is 3x3, that is 9 cells.
- 3 food pieces in the space.
- The max number of steps is 7. This is sufficient to reach all the 3 pieces of food on a 9-cells grid.
- The population is of 5 agents, every agent is tested on a copy of the grid, so no race condition.
- 100 tests for each variation.
- 10000 maximum ages of evolution.
So, each of the 100 tests will have a maximum of 10000 ages. If you can't get the food within this range, the test fails, else it pass.
Now, the test is on how the population is built: in the first age, the 5 agents are randomly built.
But from the second, the genetic algorithm apply:
- calculate the fitness of the 5 agents;
- cross the 2 most fitting agents, and produce 2 new agents;
- keep the most fitting agent in the set;
- mutate the most fitting agent;
- mutate the second most fitting agent OR the the least fitting agent;
The mutation is of ONE base of the DNA. Which is rather limited mutation.
Of course, the fit function calculates how much food do you get with that DNA sequence.
Now, it seems logical that keeping the most fitting agents in the set is better, because it provides higher probability that a single mutation will give the optimum. But this isn't true at all (and this isn't a news :D), because a single mutation often isn't enough to push the most fitting agents toward a better direction.
For example, if the food has a total value of 10 (4 + 5 + 1), an agent may use a DNA of length 7 to get food for a value of 9. Which is pretty good in nature, but it's the optimal solution.
The problem is that if the 1 food is far, while the 4 and 5 foods are near the origin, the agent may have a DNA which is totally wrong and just can't reach the last piece of food with a single mutation.
It may happen that all the 5 agents in the population will evolve toward a local optimum, which is the DNA of the fittest, which can reach only 9 food.
Now, we can introduce more mutations, OR we can just keep an high variation in the DNA by not discarding the least-fittest agent.
In fact, with this test program I show that this is a better strategy. Here a test result:
('Executed ', 100, ' tests with passing rate ', 100.0, ' and total iterations: ', 10249)
('Executed ', 100, ' tests with passing rate ', 68.0, ' and total iterations: ', 320407)
As you see, leaving the least-fit agent in the population, will produce a 100% hit in 10249 ages (being 100 tests, we have an average of 102 ages to get all the food).
While discarding the least-fit agent and keeping all the best ones, won't always succeed, as the hit ratio is 68% - giving a limit of 10000 iterations - and with a total number of iterations that is 32 times higher.
Without the limit of 10000, some configuration won't evolve and the problem will be unsolved.
This is done with DNA long 7. Reducing it to 6, will show that the least-fitting method is anyway faster and more efficient. Actually 6 is the lowest-limit to get 3 foods in the space. In fact, the least-fit method will achieve the solution more than the best-fit method.
EDIT: I fixed the code, now (again) it runs happily with a limit of 10000 iterations.
Here with a DNA long 6
('Executed ', 100, ' tests with passing rate ', 100.0, ' and total iterations: ', 26707, ' with limit ', 10000)
('Executed ', 100, ' tests with passing rate ', 70.0, ' and total iterations: ', 300610, ' with limit ', 10000)
Here with a DNA long 5
('Executed ', 100, ' tests with passing rate ', 77.0, ' and total iterations: ', 247228, ' with limit ', 10000)
('Executed ', 100, ' tests with passing rate ', 50.0, ' and total iterations: ', 500384, ' with limit ', 10000)
EDIT: I fixed the code, now (again) it runs happily with a limit of 10000 iterations.
Here with a DNA long 6
('Executed ', 100, ' tests with passing rate ', 100.0, ' and total iterations: ', 26707, ' with limit ', 10000)
('Executed ', 100, ' tests with passing rate ', 70.0, ' and total iterations: ', 300610, ' with limit ', 10000)
Here with a DNA long 5
('Executed ', 100, ' tests with passing rate ', 77.0, ' and total iterations: ', 247228, ' with limit ', 10000)
('Executed ', 100, ' tests with passing rate ', 50.0, ' and total iterations: ', 500384, ' with limit ', 10000)
Here the source code GenFood.py.
Stay --sync
10 commenti: