1
votes

(Tic-Tac-Toe: a 2 player game (player x and player o) on 15x15 board, where one player who forms a chain of 5 stones first wins)

More description


So I've implemented a Tic-Tac-Toe algorithm that uses simple alpha beta pruning..

This is what I have so far:

def ab_minimax(state):
    max_position, max_value, alpha, beta = None, None, float('-inf'), float('inf')
    positions = getAvailablePositions()

    for position in positions:
        temp = next_state(state, position)
        value = self.ab_min_move(temp, 3, alpha, beta)

        if max_value < value:
            max_position = position
            max_value = value

    return max_position

def ab_max_move(state, depth, alpha, beta):
    if game_ended(state) or depth == 0:
        return get_score(state)

    value = float('-inf')

    positions = getAvailablePositions()

    for position in positions:
        temp = next_state(state, position)
        value = max(value, self.ab_min_move(temp, depth-1, alpha, beta))

        if value >= beta:
            return value

        alpha = max(alpha, value)

        if alpha >= beta:
            break

    return value

def ab_min_move(state, depth, alpha, beta):
    if game_ended(state) or depth == 0:
        return get_score(state)

    value = float('inf')

    positions = getAvailablePositions()

    for position in positions:
        temp = next_state(state, position)
        value = min(value, ab_max_move(temp, depth-1, alpha, beta))

        if value <= alpha:
            return value

        beta = min(beta, value)

        if alpha >= beta:
            break           

    return value

This works fine, but obviously it requires too much time for the agent to return a move..

Then I came across this idea of Threat-Space search, which basically aims at placing "threats."

In this tic-tac-toe, threats are attacking sequences like ".ooo." "xoooo." (if I'm player o)

The problem is, I have no idea where I should place this threat-space search in my

alpha beta function....

I'm pretty sure the idea is to combine the threat space search with the original alpha beta minimax

algorithm,,, but I'm not sure where and how this should be done...

Can someone give me some explanation or really brief pseudocode..?

Thanks!

1

1 Answers

2
votes

Another idea here is that localize your minimax search , that is if your grid is sparsely occupied then no need to evaluate mini-max search on whole of the grid only evaluate mini-max on next positions which are in vicinity of already occupied region. Like at the beginning of the game you can consider only 5*5 grid as your state spaces rather than a 15*15 grid, this small optimization can save you lots of time. And as the grid gets filled you can see that the no of valid states itself get reduced hence your algorithm will remain as fast.