Carnegie Mellon’s Universal Robots
A decade ago – before every man and his neopet could run a neural net to produce greeting cards, Love Hearts, or a new Harry Potter novel – the image of AI for most people was something like the HAL 9000 in Kubrick’s 2001: A Space Odyssey: a computer with a mind of its own, capable of emotion, creativity, and cold-blooded space-murder.
Nowadays AI means something more prosaic. Your phone’s keyboard predicts text with an AI, as does Google’s route-generating and safety-conscious self-driving car. The power of these more practical – and less space-murderous – AIs is expanding. As they expand they are turning their minds towards areas where there is Scrooge McDucky coin to be made; the stock markets are under constant surveillance by AI traders, as are the forexes. It was only a matter of time before they turned their disinterested silicon minds towards the game of poker.
Poker bots have been a worry since the early days of the game moving online, but the reality of them was never particularly threatening. While short-stacking could be reduced to a handful of push-fold options, the edges were limited. The constraints of limit hold’em meant it was largely solved by AIs back in 2015, but no-limit, with its cloaks and daggers within levels within wheels, held out a little longer.
That all changed late last year with a Human vs Robot showdown in which Libratus, Carnegie Mellon’s digital poker shark, played a series of heads-up matches against the pros Dong Kim, Jason Les, Jimmy Chou, and Daniel McAulay.
In the 120,000 hands played, Libratus was able to liberate shekels from the pros at a rate of 14.8 big blinds per hundred hands: an astonishing win rate against players of that tier.
To work out what the implications are for poker we might need to look a little more at the context of this victory.
Artificial Intelligence and the Problem of Consciousness
The Blessed Ramon Llull of the Franciscan Third Order is – controversially – sometimes credited as the earliest of the AI pioneers. His often bizarre and esoteric works were written in the 1200s and include a diagram for a kind of mechanical knowledge generator: a series of wheels within wheels on which fundamental facts and logical instructions were to be written, so that by rotating the wheels new truths could be created entirely mechanically.
In practice much of what is considered AI works like this, taking in vast amounts of information and either brute forcing or neural netting its way to something that looks like a conclusion. Whether this sort of action constitutes intelligence is often debated and there is a joke among those who create these sorts of things that AI is just whatever hasn’t been done yet.
This is perhaps due to a kind of high-court pornography definition of intelligence (“I know it when I see it”) that makes the unconquered mountain range seem more like a job for “real” intelligence than the individual peaks reached by individual bots. For example, optical character recognition and the ability to recognise human speech were both suggested as true problems for AI until Captcha security and Siri became part of our everyday world.
Deep Blue Chess and Creativity
The classic example was chess. We are introduced to HAL in 2001 playing and beating a human astronaut over a digital chessboard. For centuries chess was seen as a test of intelligence for the human mind, and the myth that the smarter (rather than more experienced) chess player will always win made the game a favorite for generals and politicians.
So when computers seemed like they might be able to do some of what human brains can do, the race was on to produce a chess computer that could put a Grandmaster in the shade. The day came on the 10th of February 1996, when Deep Blue won its first game against Kasparov. Kasparov went on to win that six-game match with three wins and two draws, but when he came back for a rematch a year later Deep Blue won three-and-a-half to two-and-a-half. Kasparov – whose reputation as a belligerent sonofabitch is almost as notable as his chess record – accused the Deep Blue team of cheating and called for a rematch. Job already done, Deep Blue was retired.
Deep Blue used a huge library of established “best moves” to guide its play in recognisable or common openings, augmenting this with an ability to brute force the huge number of possible moves and assess the results with a point system. It could hardly be considered a creative machine.
Since then, the computers have really upped their game. Specifically the game of Go.
AlphaGo and the Impossible Mountain
Go was the next obvious mountain to climb. The game has a simple set of rules but incredibly complex play, and on top of the complexity it is impossible to beat with naked brute force because of the size of the board. The grid of 19 x 19 intersections makes for 361 possible opening moves, and the number of possible move sequences exceeds the number of atoms in the observable universe (roughly 10^80) after about forty moves. This in a game where most matches run three to five times as long as that.
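A quick back-of-the-envelope check on that claim, counting move sequences as a falling product of empty intersections (a deliberate simplification that ignores captures and passes):

```python
import math

# Count go move sequences as 361 * 360 * 359 * ... (each stone fills one
# empty intersection) and find how many moves it takes for the count to
# pass 10^80, roughly the number of atoms in the observable universe.
log_total = 0.0  # track log10 of the running product, to avoid huge integers
moves = 0
while log_total <= 80:
    log_total += math.log10(361 - moves)
    moves += 1

print(moves)  # the threshold falls at roughly thirty-odd moves
```

Even this stripped-down count crosses the atoms-in-the-universe line comfortably inside the first forty moves, which is the point of the comparison.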
So AlphaGo had to get smarter, not faster. Here the key was the huge archive of top-level go games played by humans. By analysing more than 100,000 human-played games, AlphaGo essentially taught itself how the best players respond in each scenario.
This prepped it to beat Lee Sedol four games to one. The next move was AlphaGo Zero, which learned entirely by playing itself. Instead of a database of games it was simply given the rules and set about playing game after game. Initially the moves it made were random, but each win led to an update in strategy. It took just three days before AlphaGo Zero could beat the original AlphaGo in a match. Forty days later it was winning 90% of the games it played against the final version of the original AlphaGo.
The progress also marked a move from game-specific AIs to the general-purpose AI that Libratus used.
This gives some insight into how Libratus is designed. The poker-playing bot is in fact not just a poker player but an all-rounder designed simply to learn. Give it the rules for medical diagnostics, international diplomacy, or Texas hold’em and then let it play. The bot has a set of initial strategies determined by quick game-theory calculations, which it then corrects as it plays. The longer it plays the better it gets, until you’re cured, at peace, or hitting the ATM again.
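Libratus builds that initial blueprint with counterfactual-regret-style self-play, whose engine is a loop called regret matching. A minimal sketch of the loop, in a toy rock-paper-scissors game against an assumed rock-heavy opponent (the payoffs are the standard ones; the opponent strategy and iteration count are illustrative):

```python
# Regret matching: the self-correcting loop at the heart of CFR-style bots.
# We play rock-paper-scissors against a fixed opponent who plays too much
# rock, and watch the average strategy converge on the exploit (all paper).
PAYOFF = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]  # rows: our R/P/S, cols: theirs
OPP = [0.5, 0.25, 0.25]                        # assumed rock-heavy opponent

regrets = [0.0, 0.0, 0.0]
avg = [0.0, 0.0, 0.0]
ITERS = 1000
for _ in range(ITERS):
    # Current strategy: play actions in proportion to positive regret
    # (uniform until any regret has accumulated).
    pos = [max(r, 0.0) for r in regrets]
    total = sum(pos)
    strategy = [p / total for p in pos] if total > 0 else [1 / 3] * 3
    # Expected value of each pure action, and of the current mixed strategy.
    action_ev = [sum(PAYOFF[a][b] * OPP[b] for b in range(3)) for a in range(3)]
    ev = sum(s * u for s, u in zip(strategy, action_ev))
    # Regret: how much better each pure action would have done.
    for a in range(3):
        regrets[a] += action_ev[a] - ev
        avg[a] += strategy[a]

avg = [a / ITERS for a in avg]  # average strategy ends up almost all paper
```

Run at poker scale, the same correction loop – regrets accumulated over billions of simulated hands – produces the blueprint the bot then sharpens as it plays.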
The main difference in this sort of game is the lack of definite information. The opponent’s position is known in go and chess. At the poker table you must not only figure out strategy and counter-strategy for your own hand, but also account for the full range of hands your opponent may have, weight those strategies by probability, and throw in some randomising balance (one root of Libratus is the Latin libra, ‘balance’).
The continuous range of bet sizes in no-limit (a stack of $20,000 represents 20,000 different bet sizes to be analysed) is handled by defining a series of abstract bet sizes and rounding actual bets to the nearest of these.
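A minimal sketch of that rounding step, with a hypothetical grid of pot-fraction bet sizes (the real abstraction is far finer-grained than this):

```python
# Hypothetical action abstraction: a handful of allowed bet sizes,
# expressed as fractions of the pot. The grid itself is an assumption.
GRID = [0.5, 1.0, 2.0, 4.0]  # half pot, pot, two pots, four pots

def abstract_bet(bet, pot):
    """Round a real-money bet to the nearest size in the abstraction grid."""
    frac = bet / pot
    return min(GRID, key=lambda g: abs(g - frac))

abstract_bet(600, 1000)   # a 0.6-pot bet maps to the half-pot node
abstract_bet(3500, 1000)  # a 3.5-pot overbet maps to the four-pot node
```

Collapsing thousands of possible bets onto a few nodes is what makes the game tree small enough to solve at all; the price is some blurriness at the boundaries between sizes.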
To reduce variance, the players (each playing 30,000 hands heads-up against Libratus) were paired by the software. Each player in a pair got the same decks with positions swapped, so that where Libratus received hand A in one match it received hand B in the other, cancelling out the luck of the deal in the combined result.
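A toy model of why the mirrored pairing helps: the card luck on a given deal enters the two seats with opposite signs, so averaging the paired results cancels it and leaves only the skill difference (all numbers below are invented for illustration):

```python
def paired_estimate(skill_a, skill_b, luck):
    """Average two mirrored results; the card-luck term cancels out."""
    human_a = skill_a + luck  # this human was dealt the lucky side of the deal
    human_b = skill_b - luck  # their partner got the mirror hands
    return (human_a + human_b) / 2

# Two humans each losing 0.15 bb/hand on skill, with +/-3 bb of card luck:
paired_estimate(-0.15, -0.15, 3.0)  # the 3 bb of luck drops out entirely
```

The raw results still swing with the cards, but the paired average is a much tighter estimate of the true skill gap, which is why 120,000 hands were enough to call the result statistically significant.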
Then at the end of each day of play the team from Carnegie Mellon came in and made adjustments to the machine, quietly pointing the software in the right direction.
What This Means for Poker
The real question is what this means for poker. No one need panic immediately, since both the bot and the team required to run it are safely stowed away in an academic lab somewhere, and it is prohibitively costly to run. It has also only proved itself heads up, and a ring game may still hold too many variables to crack right now.
But in the longer run, as the AI improves and cheaper versions become more available, planning to mitigate the hurt to the game will become urgent – especially online, where a bot could go anonymously about its business without input from human hands.
The impact will depend on a number of things. How effectively websites can counter the bots will be key, and perhaps more importantly how well they can make us, the players, trust their systems. Perception will make all the difference.
The other matter will be how prevalent they become. If they remain in relatively short supply they could become simply a cost of doing business, like the rake or abusive chat messages. But in backgammon, where bots were easy to design, online games have pretty much disappeared.
There is a possibility, as bots outstrip humans but not each other, that the whole online poker economy becomes bot vs. bot, earning money for their owners: cheap, badly made AIs playing the microstakes, trying to learn enough fast enough to move up in stakes and compete in the big leagues with the digital equivalent of Isildur. Meanwhile, with handheld computers banned at the live tables, I guess the rest of us will have to go back to the six hands an hour we play at the local Grosvenor cardroom.
On the other hand there is an optimistic note to all this: while playing Sedol, AlphaGo made an extremely erratic move, the kind no human would make or recommend, and one which left the commentators baffled. As the game progressed, the quality of that move slowly emerged. The AI was being genuinely creative, adding something to the game that had never been seen before.
Similarly Libratus seemed to massively overuse overbets, sometimes going all in over a pot that was just a few thousand dollars big. The pros were baffled, but continued to lose their shirts. These computers are pushing the envelope, making new moves and creating the strategies that move the game on.
All we need to do is pay attention and put a little of our natural intelligence into learning from them.
Libratus, if you’re reading this, we might just have a coaching position for you.