Archive for the ‘Artificial intelligence’ Category

Kubisme BlockBattle AI

Sunday, May 29th, 2016

Coding should be fun. That’s why I like to participate in coding challenges/competitions. Especially those where you have to code an AI (see also: Truusje).

This time I participated in Block Battle on TheAIGames.com. Its name might have given it away (partly): it’s a two-player Tetris-like battle. You have to play the game of Tetris, and if you play well your opponent gets lines of garbage (and you get them if he plays well). When one of the two dies, the other wins. Quite simple. I named my bot Kubisme, Dutch for Cubism.

You know two blocks in advance: the current block to move and the next one. You also know the current position of your opponent (he gets the same blocks). When you both have a lot of playing space, the situation your opponent is in is not really important, but that changes when the fields get crowded.

Approach of Kubisme

In most of these challenges speed is key to being successful. That gives you no other option than to represent the board with a bitboard; in my case an array of Int16s, one short (integer) per line. This not only allows storing a lot of positions in a search tree, but also enables quick manipulation of positions and calculating scores based on their characteristics.
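As an illustration of such a bitboard (a Python sketch, not Kubisme’s actual C# code; the width of 10 matches the Block Battle field), each row is one integer and bit i tells whether column i is filled:

```python
WIDTH = 10
FULL_ROW = (1 << WIDTH) - 1          # 0b1111111111: every column filled

def clear_full_rows(rows):
    """Drop full rows and push empty rows on top (rows[0] is the top row)."""
    kept = [r for r in rows if r != FULL_ROW]
    cleared = len(rows) - len(kept)
    return [0] * cleared + kept, cleared

# Single cells are tested and set with plain bit operations:
row = 0
row |= 1 << 3                        # fill column 3
filled = bool(row & (1 << 3))

field, n = clear_full_rows([0b0000001000, FULL_ROW, 0b1111111110])
```

Because a whole row fits in one short, a check like “is this line full” is a single comparison, which is what makes storing and scoring huge numbers of positions feasible.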

Move generator

When you do Minimax search (with alpha–beta pruning) you also need a fast move generator. Here I made a clear distinction between three cases:

  1. I have enough space to rotate my block before dropping it
  2. I have a (potentially reachable) hole I can get a block in
  3. I’m running out of space

The vast majority of moves you will find (unless you are running out of space) will come from option 1. The moves required to get there can be pre-generated; you don’t have to find a path, you just scan the board for where the block (in every rotation) will stick/fit, and you know upfront how many options there will be. For an O there will be 8, for I, S and Z there will be 17, and for J, L and T it is 34.
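A drop-only generator in that spirit could look like this (my own Python sketch under the assumption of a 10×20 field; Kubisme’s C# version works on the bitboard directly):

```python
WIDTH, HEIGHT = 10, 20

def normalize(cells):
    """Shift a piece, given as (col, row) cells, against the top-left corner."""
    mc = min(c for c, r in cells)
    mr = min(r for c, r in cells)
    return tuple(sorted((c - mc, r - mr) for c, r in cells))

def rotations(cells):
    """All distinct 90-degree rotations of a piece."""
    seen, result, cur = set(), [], cells
    for _ in range(4):
        norm = normalize(cur)
        if norm not in seen:
            seen.add(norm)
            result.append(norm)
        cur = [(r, -c) for c, r in cur]   # rotate 90 degrees
    return result

def fits(rows, cells, col, row):
    """True if the piece fits at (col, row) without overlap or going off-field."""
    for c, r in cells:
        if col + c >= WIDTH or row + r >= HEIGHT:
            return False
        if rows[row + r] & (1 << (col + c)):
            return False
    return True

def drops(rows, piece):
    """Yield (rotation, column, landing row) for every drop placement."""
    for rot in rotations(piece):
        width = 1 + max(c for c, r in rot)
        for col in range(WIDTH - width + 1):
            if not fits(rows, rot, col, 0):
                continue
            row = 0
            while fits(rows, rot, col, row + 1):
                row += 1
            yield rot, col, row

S_BLOCK = [(1, 0), (2, 0), (0, 1), (1, 1)]
moves = list(drops([0] * HEIGHT, S_BLOCK))
```

On an empty 10-wide field this indeed yields 17 placements for S and Z and 34 for J, L and T, matching the counts above.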

I waited some time before implementing path finding to fill some of those reachable holes. When I did, I implemented it as follows: first do a quick scan to see if there is a (potentially) reachable hole. If there is, do some path finding to get to that hole (and return those moves before the regular ones as described for option 1). This saves a lot of time, and it has a positive effect on the move generator itself, because it returns this move (which is in most cases an improvement over keeping the hole) before the others.

Search depth

To get good Minimax results (with alpha–beta pruning, etc.) you need a branching factor that is not too bad. For the first two ply that is okay, because you know which block you have to check. Still, I defined per block the number of options I tested the child nodes for (for an O-block 5 on ply 1 and 4 on ply 2, for a T-block 14 versus 10 children). For ply 3 I took per node the average of the best response per block, and for ply 4 and higher the average of 3 random blocks. Those random blocks were picked once per turn, so that all ply 4 and ply 5 (and sometimes ply 6) nodes tested the same blocks, to prevent strange outliers (and because of speed: picking 3 random blocks all the time costs some time). I experimented with skipping the search for reachable holes (just testing the dropped blocks) for ply 4 and deeper. Instead of 350k to 450k nodes/s it checked up to 750k nodes/s and sometimes reached ply 8, but the results were (just slightly) better when applying the expensive/extensive search for all nodes, especially after the introduction of the T-spin bonus.
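The branching scheme above can be summarized in a small sketch (hypothetical helpers: `children` stands in for the move generator and `evaluate` for the evaluation function; this is an illustration, not Kubisme’s search):

```python
BLOCKS = "OISZJLT"

def node_value(field, ply, max_ply, known, sampled, evaluate, children):
    """Average the best reply per block; which blocks depends on the ply."""
    if ply > max_ply:
        return evaluate(field)
    if ply <= 2:
        blocks = [known[ply - 1]]    # current and next block are known
    elif ply == 3:
        blocks = list(BLOCKS)        # average the best response per block
    else:
        blocks = sampled             # 3 blocks, drawn once per turn
    total = 0.0
    for block in blocks:
        total += max(node_value(child, ply + 1, max_ply, known, sampled,
                                evaluate, children)
                     for child in children(field, block))
    return total / len(blocks)
```

Drawing the 3 sample blocks once per turn (instead of per node) keeps sibling subtrees comparable, which is what prevents the outliers mentioned above.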

Evaluation

As most contestants I experimented with a lot of characteristics of the field to take into account. In the version that ran during the finals I had basically 15 parameters. Almost all of them were a curve of their own:

Score(height) = a * Math.Pow(height, power) + delta

Where a, power and delta were determined in long (local) simulation sessions. The parameter that could make the biggest contribution to the evaluation on its own was the one for (double) T-spin potential.

Score(height) = 1.14 * Math.Pow(height, 2.01) - 35
{ -34, -30, -25, -16, -6, 7, 22, 40, 60, 82, 107, 134, 163, 195, 230, 266, 305, 347, 390, 437, 485, 536 }

Where the height in this case was the row where the T-spin potential existed. As you can see, Kubisme puts a high value on having a potential (double) T-spin, as long as it is placed low in the field; near the top it gives a penalty instead.
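The published numbers can be reproduced straight from the formula (to within a point or two, since the printed constants are rounded); a quick Python check:

```python
import math

def tspin_potential(height):
    # The tuned curve from the post: a = 1.14, power = 2.01, delta = -35.
    return 1.14 * math.pow(height, 2.01) - 35

published = [-34, -30, -25, -16, -6, 7, 22, 40, 60, 82, 107, 134,
             163, 195, 230, 266, 305, 347, 390, 437, 485, 536]
computed = [round(tspin_potential(h)) for h in range(1, 23)]
```

The sign flip around height 5–6 is exactly the behaviour described: a penalty for T-spin potential near the top, and a growing bonus the lower it sits.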

To make Kubisme find even more T-spins, the evaluation also valued single T-spin potential, and gave a bonus for the two-row clear by a T-block.

I also made the distinction between reachable and unreachable holes. A hole was marked as reachable when on (at least) one side two (or more) free cells were reachable. Those values ended up being -26 for reachable holes and -38 for unreachable ones.
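In Python pseudocode the distinction could look like this (my reading of the rule above, not the exact implementation; a hole is an empty cell somewhere below a filled cell in its column):

```python
WIDTH = 10

def filled(rows, r, c):
    return bool(rows[r] & (1 << c))

def holes(rows):
    """Count (reachable, unreachable) holes on a bitboard (rows[0] is the top)."""
    reachable, unreachable = 0, 0
    for c in range(WIDTH):
        covered = False
        for r in range(len(rows)):
            if filled(rows, r, c):
                covered = True
            elif covered:
                # Reachable when on some side both the neighbour cell and
                # the cell above that neighbour are free.
                sides = any(
                    0 <= c + d < WIDTH
                    and not filled(rows, r, c + d)
                    and not filled(rows, r - 1, c + d)
                    for d in (-1, 1))
                if sides:
                    reachable += 1
                else:
                    unreachable += 1
    return reachable, unreachable

def score_holes(rows):
    r, u = holes(rows)
    return -26 * r - 38 * u   # the tuned weights from the post
```

With these weights a reachable hole (-26) hurts noticeably less than an unreachable one (-38), so the search prefers holes it can still fix.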

Opponent

I tried different ways to take the current state of the opponent into account. All failed. The only thing that worked was checking if the opponent could be killed within 2 ply, or if Kubisme could be. All other approaches failed when I searched more than 2 ply.

Tweaking parameters

To get the parameters right, I ran zillions of games (2 ply per bot) via a genetic algorithm. Doing this with ‘only’ 2 ply, I could run roughly 25 games per second, while with 3 ply that number dropped to just 0.7 games per second.
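The tuning loop itself can be sketched generically (illustration only; the real fitness was the outcome of many bot-versus-bot games, faked here with a toy function):

```python
import random

def evolve(fitness, dims, pop_size=16, generations=30, sigma=0.3, seed=1):
    """Tiny (mu + lambda) evolutionary loop over parameter vectors."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(dims)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]          # keep the best half unchanged
        pop = parents + [[g + rng.gauss(0, sigma) for g in p]
                         for p in parents]     # plus mutated copies
    return max(pop, key=fitness)

# Toy fitness: pretend the win rate peaks when both weights are 0.5.
best = evolve(lambda w: -sum((g - 0.5) ** 2 for g in w), dims=2)
```

Keeping the parents unchanged (elitism) guarantees the best parameter set never gets worse from one generation to the next.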

Weaknesses

Kubisme has at least 3 weaknesses that I tried, but failed, to solve.

  1. If the frequency of T-blocks is (way) lower than expected, it handles positions worse than most competitors.
  2. In some situations holes were created without any logical explanation, especially when the first filled row was low and there were plenty of options to avoid this.
  3. Sometimes it blocked the access to accessible free cells, adding some holes. This led to bigger chunks of attached unreachable holes.

Output

To see what Kubisme was doing I tried to make some human readable output:

01/01. +2.55 0.009s ( 0.0kN, 1.8kN/s): {left,left,left,turnleft,left,drop}
01/02. +2.84 0.012s ( 0.1kN, 6.7kN/s): {right,right,right,drop}
01/03. +2.78 0.022s ( 6.1kN, 276.7kN/s): {left,drop}
01/04. +2.88 0.061s ( 31.5kN, 512.0kN/s): {left,drop}
01/05. +3.30 0.243s (127.9kN, 527.3kN/s): {right,drop}
01/06. +3.86 0.990s (562.8kN, 568.6kN/s): {right,drop}

(..)

// cleaning up a hole
10/01. +16.95 0.002s ( 0.0kN, 15.7kN/s): {down,down,down,down,down,down,down,down,down,down,left,left,left,down,turnleft,down,down,turnleft}
10/02. +16.94 0.004s ( 0.6kN, 150.4kN/s): {down,down,down,down,down,down,down,down,down,down,left,left,left,down,turnleft,down,down,turnleft}
10/03. +17.93 0.084s ( 25.5kN, 302.6kN/s): {down,down,down,down,down,down,down,down,down,down,left,left,left,down,turnleft,down,down,turnleft}
10/04. +18.65 0.567s (213.6kN, 376.9kN/s): {down,down,down,down,down,down,down,down,down,down,left,left,left,down,turnleft,down,down,turnleft}

(..)

// Sometimes Kubisme changed his mind more than once
38/01. +31.21 0.000s ( 0.0kN, 70.7kN/s): {turnright,drop}
38/02. +31.31 0.001s ( 0.2kN, 262.9kN/s): {turnright,drop}
38/03. +31.51 0.025s ( 10.7kN, 429.9kN/s): {skip}
38/04. +31.71 0.134s ( 54.5kN, 408.0kN/s): {right,right,right,right,turnright,drop}
38/05. +32.75 0.499s (236.1kN, 473.5kN/s): {right,right,right,right,turnright,drop}

(..)

// Spotted a win in 2
43/01. +27.99 0.002s ( 0.0kN, 12.7kN/s): {down,down,left,turnright,turnright,left}
43/02. +oo 2 0.004s ( 0.4kN, 121.5kN/s): {down,down,left,turnright,turnright,left}

// Spotted a win in 1
44/01. +oo 1 0.001s ( 0.0kN, 41.9kN/s): {down,right,turnleft,down,down,down,turnleft}

Final thoughts

This competition was fun! Special thanks to my friend and colleague Ad (developer of the BBKing bot) for all the nice talks and thoughts shared. It was awesome watching games where Kubisme was totally rocking (like this game against TaroKong where it made a 1.59 point/block average!). When things went bad, because a new version was okay during test runs but failed miserably live, or was just playing crap without any indication why, it could also be somewhat frustrating.

In the end Kubisme finished 6th out of 329 competitors, just one spot ahead of Ad’s BBKing (the main goal anyway ;)) and best Dutch participant. I could have been luckier in the finals, but things could have turned out worse too. Winner artoppod and hogeris were clearly the two strongest bots. After them, 6 bots had more or less the same level, including Kubisme (and BBKing).

Code

For those who like to see the actual code: https://github.com/Corniel/AIGames.BlockBattle.Kubisme

Truusje my lovely ant(s)

Saturday, December 24th, 2011

Anyone who had told me two months ago that I would be number one of the Netherlands at a massive AI programming challenge, and number 72 of 7897, would not have been taken seriously. But he (or she) would have been right. Although the number of 7897 is up for debate: it includes a lot of starter bots (bots with hardly any participant effort).


NB: I used my own first name, Corniel, during the contest.

It was my good friend (and colleague) JCK who found out about the contest, on a Friday afternoon. With not that much work left we both gave it a try immediately. JCK started with a simple but well-working implementation of diffusion. It worked quite well and his first version peaked at 145 and outplayed me completely.

I did things differently. To start, I decided to rewrite the starter kit; I just wanted to know what was going on. But of course I started with a name: Truusje!

using System;
using System.Collections.Generic;
using System.Text;

namespace HelloWorld.Ants
{
public class Instruction : IEquatable<Instruction>
{
/// <summary>Represents the GO instruction.</summary>
public static readonly Instruction Go = new Instruction() { Type = InstructionType.go };
/// <summary>Represents the READY instruction.</summary>
public static readonly Instruction Ready = new Instruction() { Type = InstructionType.ready };
/// <summary>Represents the END instruction.</summary>
public static readonly Instruction End = new Instruction() { Type = InstructionType.end };

/// <summary>Constructor.</summary>
/// <remarks>Sets some defaults.</remarks>
private Instruction()
{
this.Value = -1;
this.Row = -1;
this.Col = -1;
this.Color = AntsColor.None;
}

/// <summary>Gets and set the type.</summary>
public InstructionType Type { get; set; }

/// <summary>Gets and set the value.</summary>
public long Value { get; set; }

/// <summary>Gets and set the row.</summary>
public int Row { get; set; }

/// <summary>Gets and set the column.</summary>
public int Col { get; set; }

/// <summary>Gets and set the color.</summary>
public int Color { get; set; }

/// <summary>Gets and set the direction.</summary>
public DirectionType Direction { get; set; }

/// <summary>Represents the instruction as System.String.</summary>
/// <remarks>
/// The ToString is equal to the parsed input or required output.
/// </remarks>
public override string ToString()
{
var sb = new StringBuilder();
sb.Append(this.Type);
if (this.Value >= 0)
{
sb.Append(' ').Append(this.Value);
}
else if (this.Row >= 0 && this.Col >= 0)
{
sb.Append(' ').Append(this.Row).Append(' ').Append(this.Col);
if (this.Color >= AntsColor.Own)
{
sb.Append(' ').Append(this.Color);
}
else if (this.Direction != DirectionType.X)
{
sb.Append(' ').Append(this.Direction);
}
}
return sb.ToString();
}

/// <summary>Gets a hash code.</summary>
public override int GetHashCode()
{
return ToString().GetHashCode();
}

/// <summary>Implements equals.</summary>
public override bool Equals(object obj)
{
if (obj is Instruction)
{
return Equals((Instruction)obj);
}
return false;
}

/// <summary>Implements equals.</summary>
public bool Equals(Instruction other)
{
if (object.Equals(other, null)) { return false; }
return
this.Type == other.Type &&
this.Value == other.Value &&
this.Row == other.Row &&
this.Col == other.Col &&
this.Color == other.Color;
}

/// <summary>Equals operator.</summary>
public static bool operator ==(Instruction inst0, Instruction inst1)
{
if (!object.Equals(inst0, null))
{
return inst0.Equals(inst1);
}
return object.Equals(inst1, null);
}
/// <summary>Not-equals operator.</summary>
public static bool operator !=(Instruction inst0, Instruction inst1)
{
return !(inst0 == inst1);
}

/// <summary>Parses an instruction.</summary>
public static Instruction Parse(string line)
{
var instr = new Instruction();
var tp = InstructionType.None;

string[] tokens = line.Split();

if (tokens.Length > 0)
{
tp = (InstructionType)Enum.Parse(typeof(InstructionType), tokens[0]);

if (TokenLength[tp] == tokens.Length)
{
if (tokens.Length == 2)
{
if (tp == InstructionType.player_seed)
{
instr.Value = long.Parse(tokens[1]);
}
else
{
instr.Value = (int)uint.Parse(tokens[1]);
}
}
if (tokens.Length == 4)
{
if (tp == InstructionType.o)
{
instr.Direction = (DirectionType)Enum.Parse(typeof(DirectionType), tokens[3]);
}
else
{
instr.Color = (int)uint.Parse(tokens[3]);
}
}
if (tokens.Length == 3 || tokens.Length == 4)
{
instr.Row = (int)uint.Parse(tokens[1]);
instr.Col = (int)uint.Parse(tokens[2]);
}

instr.Type = tp;
return instr;
}
}
throw new ArgumentException(string.Format("The line '{0}' is not a valid instruction.", line));
}

/// <summary>Parses a multi line input.</summary>
public static List<Instruction> ParseMultiLine(string text)
{
var list = new List<Instruction>();

var lines = text.Split(new string[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);

foreach(var line in lines)
{
list.Add(Instruction.Parse(line));
}
return list;
}

/// <summary>Creates a move based on a row, a column and a direction.</summary>
public static Instruction CreateMove(int row, int col, DirectionType dir)
{
return new Instruction()
{
Type = InstructionType.o,
Row = row,
Col = col,
Direction = dir,
};
}

/// <summary>Helper for parsing instructions.</summary>
private static Dictionary<InstructionType, int> TokenLength = new Dictionary<InstructionType, int>()
{
{ InstructionType.None, 0 },

{ InstructionType.ready, 1 },
{ InstructionType.go, 1 },
{ InstructionType.end, 1 },

{ InstructionType.player_seed, 2 },
{ InstructionType.players, 2 },
{ InstructionType.cols, 2 },
{ InstructionType.rows, 2 },
{ InstructionType.turntime, 2 },
{ InstructionType.loadtime, 2 },
{ InstructionType.viewradius2, 2 },
{ InstructionType.attackradius2, 2 },
{ InstructionType.spawnradius2, 2 },

{ InstructionType.turn, 2 },
{ InstructionType.turns, 2 },

{ InstructionType.f, 3 },
{ InstructionType.r, 3 },
{ InstructionType.w, 3 },
{ InstructionType.d, 4 },
{ InstructionType.a, 4 },
{ InstructionType.h, 4 },

{ InstructionType.o, 4 },
};
}
}

I guess a lot of developers would argue that this is gold plating to the limit, but it worked for me. The other changes were not half as big as this one.

Then the real coding could start. So let’s get dirty. I went for a multiple strategy pattern: a lot of strategies that give their advice, and a picking mechanism to pick the best and apply them. Along the way I noticed that I had a strict order in which I’d like to do my moves, so I tweaked the pattern. From then on, my strategies had a hierarchy: only if no move came through higher in the hierarchy could a strategy give its advice.
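That picking order can be sketched in a few lines (an illustration, not the actual C# implementation): a strategy lower in the hierarchy only gets to advise when nothing above it produced a move.

```python
def pick_move(strategies, ant):
    """Ask each strategy in priority order; the first advice wins."""
    for strategy in strategies:
        move = strategy(ant)          # a strategy returns a move or None
        if move is not None:
            return move
    return None
```
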

A second big change came when I moved the state to a strategy. I introduced some extra events (triggered the old-fashioned way, just by calling them directly) and gave direct access to this strategy (and the combat and queue strategies). The abstract base Strategy ended up this way:

using System.Collections.Generic;
using System.Linq;

namespace HelloWorld.Ants
{
public abstract class Strategy
{
/// <summary>Constructor.</summary>
/// <param name="bot">The underlying bot.</param>
protected Strategy(Truusje bot)
{
this.Bot = bot;
}

/// <summary>Gets the underlying bot.</summary>
public Truusje Bot { get; protected set; }

/// <summary>Gets the (main) score table.</summary>
public int[,] Scores { get; protected set; }

/// <summary>Returns true if the score table represents distances, otherwise false.</summary>
protected abstract bool ScoresAreDistances { get; }

/// <summary>Initializes the strategy.</summary>
public virtual void Initialize()
{
this.Scores = Map.New<int>(Bot.Settings);
}

/// <summary>Handles the UpdateInit.</summary>
public virtual void OnUpdateInit() { }

/// <summary>Handles the UpdateFood.</summary>
public virtual AntsFood OnUpdateFood(AntsFood food) { return food; }
/// <summary>Handles the UpdateWater.</summary>
public virtual AntsWater OnUpdateWater(AntsWater water) { return water; }

/// <summary>Handles the UpdateOwnHill.</summary>
public virtual AntsHill OnUpdateOwnHill(AntsHill hill) { return hill; }
/// <summary>Handles the UpdateEnemyHill.</summary>
public virtual AntsHill OnUpdateEnemyHill(AntsHill hill) { return hill; }

/// <summary>Handles the UpdateOwnAnt.</summary>
public virtual AntsAnt OnUpdateOwnAnt(AntsAnt ant) { return ant; }
/// <summary>Handles the UpdateEnemyAnt.</summary>
public virtual AntsAnt OnUpdateEnemyAnt(AntsAnt ant) { return ant; }

/// <summary>Handles the UpdateAfter.</summary>
public virtual void OnUpdateAfter() { }

/// <summary>Handles the TurnInit.</summary>
public virtual void OnTurnInit() { }

/// <summary>Handles the TurnAfterStrategy.</summary>
/// <remarks>
/// This one is called for an ant that uses this strategy.
/// </remarks>
public virtual void OnTurnAfterStrategy(AntsLoc oldLoc, AntsLoc newLoc, DirectionType dir, TruusjeCandidateMove move) { }
/// <summary>Handles the TurnAfter.</summary>
/// <remarks>
/// This one is called for every ant that moved.
/// </remarks>
public virtual void OnTurnAfter(AntsLoc oldLoc, AntsLoc newLoc, DirectionType dir, TruusjeCandidateMove move) { }
/// <summary>Handles the TurnFinish.</summary>
/// <remarks>
/// On turn finish extra work can be done that is not required but
/// useful. It should handle the time management as strictly and safely
/// as possible.
/// </remarks>
public virtual void OnTurnFinish() { }

/// <summary>Returns true if the strategy can give a move, otherwise false.</summary>
public abstract bool CanMove(AntsAnt ant, AntsLoc loc, DirectionType dir);
/// <summary>Gets a move.</summary>
public abstract TruusjeCandidateMove GetMove(AntsAnt ant, AntsLoc loc, DirectionType dir);

/// <summary>Creates a candidate move.</summary>
/// <remarks>
/// Binds to the strategy.
/// </remarks>
public virtual TruusjeCandidateMove CreateMove(AntsAnt ant, AntsLoc loc, DirectionType dir, int score, AntsAntType type)
{
return new TruusjeCandidateMove(ant, loc, dir, score, type, this);
}

/// <summary>Breaks on a condition.</summary>
public void BreakWhen(AntsLoc loc, int r, int c, bool condition)
{
BreakWhen(loc.Row == r && loc.Col == c && condition);
}
/// <summary>Breaks on a condition.</summary>
public void BreakWhen(int turn, AntsLoc loc, int r, int c)
{
BreakWhen(turn, loc.Row == r && loc.Col == c);
}
/// <summary>Breaks on a condition.</summary>
public void BreakWhen(int turn, bool condition)
{
BreakWhen(Bot.Turn == turn && condition);
}
/// <summary>Breaks on a condition.</summary>
/// <remarks>
/// Work-around as conditional breakpoints are just way too slow, with thanks to JCK.
/// </remarks>
public void BreakWhen(bool condition)
{
#if DEBUG
if (condition)
{
if (System.Diagnostics.Debugger.IsAttached)
{
System.Diagnostics.Debugger.Break();
}
}
#endif
}
}

/// <summary>Extensions.</summary>
public static class StrategyExtensions
{
/// <summary>Gets a specific strategy from the list.</summary>
public static T Get<T>(this IEnumerable<Strategy> strategies) where T : Strategy
{
return (T)strategies.First(str => str.GetType() == typeof(T));
}
}
}

Okay, I had my own framework and my giant strategy pattern, but dude, Truusje needs things to DO. Walk to food and so on. So I needed directions. A lot of people talked about A*, and that they had implemented it. Some thought of themselves as extremely smart because they did. I didn’t. Because I’m not that smart, and more importantly: I saw no need for it (yet). So what did I do to determine where to go?

First of all I came to the conclusion that long distance planning (in both directions) was not useful: you just can’t look that far into the future. Secondly I noticed that A* was designed to find the smallest distance from one (or a small set of) point(s) to another. That’s not what I was searching for; beforehand, I had no clue where to go.

For my food strategy I therefore just kept stepping away in all directions from all the food I knew of, updating the score map (in this case actually representing the distance to a food), and I stopped searching when I found an ant. I had wild plans of fine-tuning this with complex assignment strategies, even starting complex and switching to something simpler during the game (as I needed my time to calculate other things, and there was not that much to win with optimization anymore).
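As a sketch (my reconstruction in Python, not Truusje’s code): a multi-source BFS that starts at every known food tile and stops expanding a branch once it reaches one of our ants; the map wraps around, as it does in the Ants challenge.

```python
from collections import deque

def food_distances(rows, cols, foods, ants, water):
    """Score map: distance from every tile to the nearest food."""
    INF = rows * cols
    dist = [[INF] * cols for _ in range(rows)]
    queue = deque()
    for r, c in foods:                       # all food tiles start at 0
        dist[r][c] = 0
        queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        if (r, c) in ants:
            continue                         # an ant was found; stop this branch
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = (r + dr) % rows, (c + dc) % cols   # the map wraps
            if (nr, nc) not in water and dist[nr][nc] > dist[r][c] + 1:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist
```

Because the search floods outward from the food rather than from each ant, one pass serves every ant at once, and stopping at the first ant found keeps the work bounded.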

For hills, both my own and the enemies’, I kept a map with the distance to them. Each turn, for every ant I knew of, I updated these maps for the neighborhood of these ants (three steps max). This is of course not the fastest way to create these maps, but it is highly efficient in spreading the calculation time. During my calculations I did not even keep track of what I did; I just managed to handle every tile only once.

For my combat I used an implementation of what was called SimpleCombat at the forums. My first own attempt (without reading the thoughts of others) did not take into account whether a move was possible. Furthermore I was struggling with the results: when was it safe, and when not? Although I had acceptable results, one week before the deadline I started from scratch. The big difference I made was that I now did take into account where an enemy could or could not go, and the results were given as a chance of Safe or Die. Therefore every strategy could have its own risk management. Plans to guess these chances better never came to life, but the intentions were there.

Another improvement with my new SimpleCombat was to first look at all attacking moves, then at all staying/defence moves, and only at the end check if I should flee. Because staying is safer than attacking, Truusje sometimes started an attack where the first ant wasn’t supported by its friends. A bit annoying.

As the main goal of the game was razing hills, I made a Raze Enemy Hill strategy. It just picked an enemy hill and gave candidate moves for ants to run to it. This was extremely successful against weaker bots, and bots with just a simple static defense of their hills. In the finals (with only the top 200 left) it tended to be too aggressive. I dropped from a stable 50th position to a final ranking of 72.

This was a known issue. Its strength of course is that it is hard to defend against that many ants; a bot has more to do than just stopping you. However, when a bot managed to stop Truusje (and the good bots did quite often), Truusje sacrificed a lot of ants (in a lot of cases too many). And there was no abort mission implemented, nor the option to look for another target hill for my other ants. I had the idea, but time…

And like a lot of you, I tried strategies that didn’t work. I tried a strategy for multiple hills, where I kept an ant on half of my hills (when I had 2 hills or more). No good results. I tried a lot of spreading approaches; a lot of them had bad results too. I tried a basic static defense; it didn’t work for me, just as a borrowed symmetry detection didn’t.

At the end I had a more (or less) dynamic hill defense. I made a formation around my hill at a distance depending on the number of ants I had. When there was no danger I only used the even rows, otherwise I used all tiles. Furthermore I ordered ants nearby to run at enemy ants if they came close to my hill. This worked well for me, especially because those ants were still allowed to run to food if they were close to it.

Like almost all competitors I had tons of ideas that needed more time and research. But in the end I’m satisfied with the result. Although the top bots crushed Truusje hard, Truusje itself was doing fine against the big majority. And best of all: I liked its aggressive style, running at the enemy till the bitter ant end…

Code and compiled versions can be found here: www.corniel.nl/download.