How AlphaStar Became a StarCraft Grandmaster | AI and Games


One of the biggest headlines in AI research
for 2019 was the unveiling of AlphaStar – Google DeepMind’s project to create the world’s best
player of Blizzard’s real-time strategy game StarCraft II. After shocking the world in
January as the system defeated two high ranking players in closed competition, an updated
version was revealed in November that had achieved grandmaster status: ranking among
the top 0.15% of Europe’s roughly 90,000 active players. I’m Tommy Thompson, and in this episode of
AI and Games, we’re going to look at how AlphaStar works, the underpinning technology and theory
that drives it, the truth behind the media sensationalism and how it achieved Grandmaster
status in online multiplayer. You might be wondering why DeepMind is so
interested in building a StarCraft 2 bot. Ultimately, it’s because games – be they card
games, board games or video games – provide nice simplifications or abstractions of real
world problems and by solving these problems in games, there is potential for it to be
applied in other areas of society. This led AI researchers to explore games such as Chess
and Go given they’re incredibly challenging problems to master. This is largely due to
the number of unique configurations of the board in each game – often referred to as
a state space. Research in Go estimates there are 2×10^170 valid layouts of the board: that’s
2 with 170 zeroes after it… it’s a helluva big number. Meanwhile StarCraft is even more
ambitious given the map size, unit types and the range of actions at both micro level for
unit movement and macro level for build behaviours and strategy – a topic I recently discussed
in episode 47’s examination of the AI of Halo Wars 2. Now even naive estimates by researchers
on the number of valid configurations of a game of StarCraft suggest it is around 10^1685. In the case of both Go and StarCraft, these are, on paper, incredibly difficult problems for an AI system to master at an expert level.
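To get a feel for where numbers like that come from, here's a toy calculation of my own (an illustration, not the published estimate): counting only the ways to place up to 400 identical units on the build tiles of a 128×128 map, and ignoring unit types, owners, health, resources and everything else the game tracks, already produces a number with over 800 digits.

```python
from math import comb, log10

# Toy illustration: ways to place up to 400 identical units on the build
# tiles of a 128x128 map, ignoring unit types, owners, health, resources,
# tech state and everything else the real game tracks.
tiles = 128 * 128
max_units = 400

placements = sum(comb(tiles, k) for k in range(max_units + 1))
print(f"log10(placements) ~ {log10(placements):.0f}")   # roughly 815
# Even this stripped-down count has over 800 digits, and the full game
# state multiplies it out much, much further.
```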
As I explored back in episode 26, universities around the world have seen the potential of StarCraft as an AI research problem for well
over ten years now. It’s a game that has no clearly definable best strategy to win, requires
you to think about the effectiveness of your decisions over the long term, and without complete knowledge of the game world. Plus you have to process all of this and react in real time.
Hence research competitions such as the StarCraft AI Competition and the SSCAIT have operated
for the best part of a decade to try and solve the problem. However, neither of these competitions
has the support of StarCraft’s creator, Blizzard. Meanwhile, the AlphaStar project is an official
collaboration with the game’s creator, resulting in new tools such as the open-source toolkit PySC2, which allows AI systems to be trained directly within StarCraft II, as well as the largest collection of anonymised replay data from the game.
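If you want to poke at that interface yourself, PySC2 exposes StarCraft II as a reinforcement learning environment. The snippet below is a minimal sketch of the toolkit's documented usage: the map, races and feature resolutions are arbitrary example choices, nothing AlphaStar-specific, and it assumes you have StarCraft II and the free map packs installed.

```python
from pysc2.agents import random_agent
from pysc2.env import run_loop, sc2_env
from pysc2.lib import features

# A throwaway agent picking random legal actions, playing Protoss
# against the built-in easy Zerg bot on the Simple64 map.
with sc2_env.SC2Env(
    map_name="Simple64",
    players=[sc2_env.Agent(sc2_env.Race.protoss),
             sc2_env.Bot(sc2_env.Race.zerg, sc2_env.Difficulty.easy)],
    agent_interface_format=features.AgentInterfaceFormat(
        feature_dimensions=features.Dimensions(screen=84, minimap=64)),
    step_mul=8,                  # act once every 8 game steps
    game_steps_per_episode=0,    # 0 = play until the match ends
) as env:
    run_loop.run_loop([random_agent.RandomAgent()], env, max_episodes=1)
```

The same thing can be launched from the command line with `python -m pysc2.bin.agent --map Simple64`, which is a handy way to check the toolkit is installed correctly.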
Since its inception, DeepMind has directed a tremendous amount of money and effort into researching two specific disciplines within
artificial intelligence: deep learning and reinforcement learning. To explain as simply
as possible, deep learning is a process where a large artificial neural network attempts to learn how to solve a task based upon existing training data. It reconfigures parts of the network such that it gives the correct answers on the training data it’s tested against with very high accuracy. Meanwhile, reinforcement learning is a process where an AI learns to get better at a particular task through experience. Those experiences update the knowledge the system stores about how good a particular decision is to make at a given point in time. Often you can use these techniques together: first the networks learn from good examples already recorded, and then the reinforcement learning kicks in to improve that existing knowledge by solving new problems it comes up against.
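As a rough illustration of that two-stage idea (my own toy example, not DeepMind's architecture or code), here's a tiny softmax policy trained first to imitate recorded "expert" choices, then nudged further with a simple reward-weighted, REINFORCE-style update:

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_actions = 8, 4
W = np.zeros((n_features, n_actions))          # the policy's parameters

def policy(obs):
    """Softmax over action scores for one observation vector."""
    z = obs @ W
    p = np.exp(z - z.max())
    return p / p.sum()

# --- Stage 1: supervised / imitation learning on recorded examples ---
# Toy 'replays': observations paired with the action an expert took.
expert_obs = rng.normal(size=(1000, n_features))
expert_act = (expert_obs[:, 0] > 0).astype(int)    # hidden rule the expert follows
for obs, act in zip(expert_obs, expert_act):
    p = policy(obs)
    grad = -np.outer(obs, p)                   # gradient of log p(action | obs)
    grad[:, act] += obs
    W += 0.05 * grad                           # maximise likelihood of expert action

# --- Stage 2: reinforcement learning from fresh experience ---
# The agent now acts for itself and is rewarded when its action matches the
# hidden best action; the update pushes probability toward rewarded actions.
for _ in range(2000):
    obs = rng.normal(size=n_features)
    p = policy(obs)
    act = rng.choice(n_actions, p=p)
    reward = 1.0 if act == int(obs[0] > 0) else 0.0
    grad = -np.outer(obs, p)
    grad[:, act] += obs
    W += 0.05 * reward * grad                  # reward-weighted policy gradient
```

In AlphaStar the policy is a very large neural network and the actions are full StarCraft commands, but the shape of the recipe is the same: imitate first, then improve from experience.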
These approaches have proven very effective in a variety of games projects for DeepMind: first creating AI that
can play a variety of classic Atari games, then defeating 9-dan professional Go player
Lee Sedol with AlphaGo, and then creating AlphaZero, which achieved superhuman performance in Chess, Shogi and Go. The next step was to take their expertise in these areas
and apply it to StarCraft II. The project is headed up by Professor David Silver and Dr Oriol Vinyals, who longtime viewers will remember from episode 26 as the former Spanish StarCraft champion and co-creator of the Zerg bot Overmind at the 2010 StarCraft AI Competition. The team behind AlphaStar comprises over 40 academic researchers, not to mention
additional support throughout DeepMind and Blizzard in order to build the tools and systems
needed for the AI to interface with the game. Once again, another massive endeavour with
Google money helping to support it. So let’s walk through how AlphaStar works and how it
achieved grandmaster status. AlphaStar has – at the time of this video
– had two major releases unveiled in January and November of 2019. The core design of how
AlphaStar is built and learns is fairly consistent across both versions. But AlphaStar isn’t just one AI; it’s several AIs that learn from one another. Each AI is a deep neural
network that reads information from the game’s interface and then outputs instructions that can be translated into actions within the game, such as moving units or issuing build and attack commands. It’s configured in a very specific way such that it can handle the challenges of parsing the visual information of a StarCraft game alongside making decisions that have long-term ramifications. For anyone who isn’t familiar with machine learning, this is a highly specific and complex set of design decisions that I’ll refrain from talking about here, but for those who want to know more, all the relevant links are in the video description.
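To make that mapping a little more concrete, here's a heavily simplified sketch (my own illustration, not AlphaStar's real interface) of the kind of function the network has to learn: a structured observation goes in, and a structured game instruction comes out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class UnitInfo:
    """A tiny slice of what the real interface exposes per unit."""
    unit_type: str
    owner: int
    position: Tuple[int, int]
    health: int

@dataclass
class Observation:
    """What the agent 'sees' on a given game step."""
    minerals: int
    vespene: int
    supply_used: int
    supply_cap: int
    visible_units: List[UnitInfo]

@dataclass
class Action:
    """What the agent hands back to the game."""
    command: str                        # e.g. "move", "attack", "build", "train"
    target: Optional[Tuple[int, int]]   # map coordinates, if the command needs one
    unit_tags: List[int]                # which friendly units the order applies to

def act(obs: Observation) -> Action:
    """Stand-in for the neural network: observation in, instruction out.
    One hand-written rule replaces millions of learned parameters."""
    if obs.minerals >= 100 and obs.supply_used >= obs.supply_cap:
        return Action("build_supply", target=(20, 20), unit_tags=[])
    return Action("move", target=(64, 64), unit_tags=[])
```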
Now typically, when you start training neural networks, they’ll be given completely random configurations, which means the resulting
AI will make completely random decisions, and it will take time during training for it to learn how to change that anarchic random behaviour into something that’s even modestly
intelligent. When you’re making an RTS bot, that means figuring out even the most basic
of micro behaviours for units, much less the ability to build them or more coordinated
strategies using groups of them at a time. So the first set of AlphaStar bots or agents
are trained using Deep Learning by taking real-world replay data from human StarCraft
matches provided by Blizzard. Their goal is to reproduce specific behaviours they observe in the replay data to a high degree of accuracy. Essentially, they learn to imitate the players’ behaviour, both micro actions and macro tactics, by watching these replays. Naturally, the replays are drawn from high-level play within the game, but the data is anonymised, so we don’t know who these players are. Once training is completed, these AlphaStar
agents can already defeat the original Elite AI built by Blizzard for StarCraft 2 in 95%
of matches played. But learning from the human data is just
the start of the learning process. It’s not enough to replicate our behaviour; they need to find a way to surpass it, and they’ll do that by playing against each other and learning from the experience. The technique adopted, population-based reinforcement learning, embraces
a common principle in computational intelligence algorithms where you can improve the best
solutions to a problem by having them compete with one another, effectively creating an arms-race dynamic between multiple competing agents. To achieve this, DeepMind created
the AlphaStar League, where several of these pre-trained AlphaStar agents battle it out,
enabling them to learn from one another. But there is an inherent danger that a machine
learning algorithm can accidentally convince itself it has found the best StarCraft AI,
especially if it evaluates how good it is by playing against other StarCraft AI that
are also training. It might be a good player, but it might have lost good knowledge along
the way, because all its competitors play very similarly and it’s trying to find a new
strategy or tactic that will give it an edge. But that might make it a worse player overall.
In AI research we call this converging on a local optimum. Imagine the space of StarCraft strategies as a series of rolling hills, where each hilltop represents a particular
strategy or style of play. Every AI player is sitting somewhere in these rolling hills
and the reinforcement learning is helping them climb to the top of the nearest hill.
But once they reach the top of the hill, they can’t go any higher and might be convinced
they’ve achieved the best possible StarCraft strategy because they defeat all the other
players. But because all those other players are on the same hill, using a similar strategy, they won’t even realise there might be a better hill for them to climb elsewhere. Hence you could have three StarCraft AI bots
A, B and C that are stuck in a situation where A can defeat B in a match, B can defeat C, but C defeats A, because the strategy behind A is so specialised it’s only good against a certain type of opponent. This was evident in early training, where ‘cheese’ strategies such as rushing with Photon Cannons or Dark Templars dominated the learning process, despite being risky moves that won’t always work. Hence there’s a need for such dominant strategies to be overcome within the training process.
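A tiny, purely illustrative example of that non-transitive trap: each bot can win every game inside its own little training pool and conclude it's "the best", even though the win relation is a cycle rather than a ladder.

```python
# Who beats whom among three over-specialised strategies: a cycle, not a ladder.
beats = {
    "cannon_rush": {"macro"},
    "macro":       {"dt_rush"},
    "dt_rush":     {"cannon_rush"},
}

def win_rate(strategy, opponents):
    return sum(1 for o in opponents if o in beats[strategy]) / len(opponents)

# Evaluated only against the pool it trained with, the bot looks unbeatable...
print(win_rate("cannon_rush", ["macro", "macro", "macro"]))       # 1.0
# ...but against the wider field of strategies it's distinctly mortal.
print(win_rate("cannon_rush", ["macro", "dt_rush", "dt_rush"]))   # ~0.33
```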
DeepMind addresses this by creating two distinct groups of AI agents in the league: the Main Agents and the Exploiters. The Main Agents
are the ones I’ve already been talking about: AI that are learning to become the best StarCraft
2 players. New main agents are added to the league based on learned experience, while existing ones are kept around to ensure information isn’t lost along the way, and are periodically pitted against the new players in combat. Meanwhile, exploiters are AI agents in the league whose job isn’t to become the best StarCraft player, but to learn the weaknesses that exist within the Main Agents and beat them. By doing so, the Main Agents are forced to learn how to overcome the weaknesses found by the exploiters, which improves their overall ability. This prevents them from settling on weird specialist strategies that prove useless in the long run. There are two types of exploiter:
the Main Exploiters, which target the latest batch of Main Agents to be created, and the League Exploiters, whose goal is to find exploits across the entire league and punish them accordingly.
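Pulling those pieces together, the league can be thought of as a loop over these roles. The sketch below is a toy simulation of that structure based on the public description, not DeepMind's code: a single "skill" number stands in for each agent's neural network, and the match itself is simulated with a coin flip weighted by skill.

```python
import copy
import random

class Agent:
    def __init__(self, role, race):
        self.role = role            # "main", "main_exploiter" or "league_exploiter"
        self.race = race
        self.skill = 0.0            # stand-in for the agent's network parameters
        self.snapshots = []         # frozen past copies, so knowledge isn't lost

def pick_opponent(agent, league):
    """Choose a training opponent according to the agent's role."""
    if agent.role == "main_exploiter":
        pool = [a for a in league if a.role == "main"]     # hunt the current mains
    elif agent.role == "league_exploiter":
        pool = [a for a in league if a is not agent]       # probe everyone...
        pool += [s for a in league for s in a.snapshots]   # ...past and present
    else:  # main agents face a mix of current rivals and frozen snapshots
        pool = [a for a in league if a is not agent] + agent.snapshots
    return random.choice(pool)

def play_match(a, b):
    """Toy match: the higher 'skill' usually wins."""
    return a if random.random() < 0.5 + 0.1 * (a.skill - b.skill) else b

league = ([Agent("main", r) for r in ("terran", "protoss", "zerg")]
          + [Agent("main_exploiter", r) for r in ("terran", "protoss", "zerg")]
          + [Agent("league_exploiter", r) for r in ("terran", "protoss", "zerg")
             for _ in range(2)])

for step in range(1000):
    for agent in league:
        winner = play_match(agent, pick_opponent(agent, league))
        agent.skill += 0.01 if winner is agent else -0.01   # stand-in RL update
    if step % 100 == 0:
        for agent in league:
            frozen = copy.copy(agent)       # periodic frozen copy of the agent
            frozen.snapshots = []           # snapshots don't keep their own history
            agent.snapshots.append(frozen)
```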
The entire AlphaStar League process is trained using Google’s distributed compute power running in the cloud, on its proprietary Tensor Processing Units, or TPUs. The actual training is broken up into two batches, one for each version of AlphaStar that DeepMind has published. So now that we know the inner workings, let’s
look at each version, how it was evaluated, and what differentiates them from one another. The first version of AlphaStar was trained
within the league for 14 days, using 16 TPUs per agent and resulting in 600 agents being built. Each agent experienced the equivalent of up to 200 years of StarCraft playtime, already far surpassing what any human could. To test them out, DeepMind invited two professional players
to their London offices in December 2018: first Dario Wunsch aka TLO, followed by Grzegorz
Komincz, known as MaNa, both of whom play for the esports organisation Team Liquid. Each played a typical 1v1 match-up of 5 games under pro match conditions, and AlphaStar defeated both TLO and MaNa handsomely, making it the first time that an AI had successfully defeated a professional StarCraft player. Now while this was a significant achievement, there were still a lot of improvements that needed to be made to the system, given that many concessions were made in the design choices for AlphaStar and in the test matches at that time. First of all, version 1 of AlphaStar was only
trained to play as Protoss and was evaluated against Protoss-playing opponents. As StarCraft
players will know, while a given pro player will typically focus on only one race, they do need to be aware of, and able to counteract, strategies from the other races. As a result, TLO was at a disadvantage during these first test matches: while he ranks at grandmaster level with Protoss, he plays professionally as Zerg. However, this was mitigated somewhat by MaNa, who is one of the strongest professional Protoss players outside
of South Korea. Secondly, version 1 of AlphaStar could only
play on one map of the game: Catalyst LE. This means that the system had not learned how to generalise its strategies such that it could apply them across different
maps. The third alteration was that the original AlphaStar bots did not look at the game through
the camera: they had their own separate vision system that allowed them to see the entire map. While fog of war was still enabled, it did give AlphaStar an advantage over human players by letting it see everything not hidden by the fog at once. Now DeepMind insists that this was actually a negligible advantage, given that the bots’ behaviour suggests they were largely focussed on specific areas of the map, much as a human would be, but it was still something
that needed to be removed for version 2. Undoubtedly the cheekiest part of this whole experiment is that, in order to keep TLO and MaNa on their toes, they never played the same bot twice across the 5 matches. As I mentioned earlier, AlphaStar is technically a collection of bots learning within the league. Hence after each match, the AlphaStar bot was cycled out, with each of them at that time optimised for a specific strategy. This
meant that TLO and MaNa couldn’t exploit weaknesses they’d spotted in a previous match. But interestingly, despite all these advantages
over TLO and MaNa, the one area where many would anticipate the bot having the upper hand is actions per minute, or APM: the number of actions a player executes in one minute of gameplay. During these matches MaNa’s APM averaged out at around 390, while TLO’s was just shy of 680, but AlphaStar had a mean of 277. This is significantly lower, and for two reasons: first, because it’s learning from humans, it largely duplicates their APM. In addition, AlphaStar’s ability to observe the world and then act has a delay of around 350ms on average, compared to the average human reaction time of 200ms.
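One way to picture constraints like these (an illustrative mechanism of my own, not how DeepMind actually enforced them) is a wrapper around the agent that only lets an observation become actionable after a reaction delay, and drops actions once a per-minute budget is spent.

```python
import collections

class HumanPacedAgent:
    """Wraps a decision-making agent so its play is paced roughly like a
    human's: observations become actionable only after a reaction delay,
    and actions beyond an APM budget are dropped. Illustrative only."""

    def __init__(self, agent, reaction_delay_ms=350, max_apm=300):
        self.agent = agent
        self.delay = reaction_delay_ms
        self.max_apm = max_apm
        self.pending = collections.deque()        # (visible_at_ms, observation)
        self.action_times = collections.deque()   # timestamps of recent actions

    def step(self, now_ms, observation):
        # New information only becomes usable after the reaction delay.
        self.pending.append((now_ms + self.delay, observation))
        if self.pending[0][0] > now_ms:
            return None                            # nothing old enough to react to
        _, obs = self.pending.popleft()

        # Enforce an actions-per-minute budget over a sliding 60-second window.
        while self.action_times and now_ms - self.action_times[0] > 60_000:
            self.action_times.popleft()
        if len(self.action_times) >= self.max_apm:
            return None                            # budget spent this minute

        action = self.agent.step(obs)              # the underlying policy decides
        if action is not None:
            self.action_times.append(now_ms)
        return action
```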
With the first version unveiled and its success noted, the next phase was to eliminate the limitations of the system and have it play
entirely like a human would. This includes being able to play on any map, with any race
and using the main camera interface as human players would. With some further improvement
to the underlying design, the AlphaStar league ran once again, but instead of running for 14 days, this time around it ran for 44 days, resulting in over 900 unique AlphaStar bots being created during the learning process. During training the best three main agents, one per race (Terran, Protoss and Zerg), were always retained, with three main exploiters (again one for
each race) and six league exploiters (two for each race) forcing the main agents to
improve. But this time instead of playing off against professional e-sports players,
the bots being trained would face off against regular players in StarCraft II’s online ranked
multiplayer. Three sets of AlphaStar agents were chosen
to play against humans. The first batch is the bots that had only completed the supervised learning from the anonymised replay data, referred to as AlphaStar Supervised, followed by two sets of Main Agents that were trained in the AlphaStar league, called AlphaStar Mid and AlphaStar Final. AlphaStar Mid is the set of Main Agents from the league after being trained for a total of 27 days, while AlphaStar Final is the final set after 44 days of training. Given each Main Agent only plays as one race, this allows for a separate Match Making Rating
or MMR for each AI to be recorded. To play off against humans, AlphaStar snuck
into the online multiplayer lobbies of Battle.net, Blizzard’s online service, and through ranked matchmaking would face off against unassuming human players, provided they were playing on
a European server. While Blizzard and DeepMind announced that players could opt-in to play
against DeepMind after patch 4.9.2 for StarCraft II, the matches were held under blind conditions, meaning that AlphaStar didn’t know who it was playing against, but also that human players would not be told they were playing against the AI, just that there was a possibility it could happen to them when playing online. This anonymity was largely to prevent humans recognising it was AlphaStar, discovering its weaknesses and then chasing it down in the matchmaking to exploit that knowledge, which could really knock it down in the ratings, given it can’t learn for itself outside of the league. To establish their MMR, the supervised agents played a minimum of 30 matches each, while the mid agents ran for 60 games. The final agents then took the mid agents’ MMR as a baseline and played an additional
30 games. The AlphaStar supervised bots wound up with
an average MMR of 3699, which puts it above 84% of all ranked human players, and that’s without the use of reinforcement learning. However, the big talking point is that AlphaStar Final’s MMR for each race places it within Grandmaster rank on the StarCraft 2 European servers: 5835 for Zerg, 6048 for Terran and 6275 for Protoss. Of the approximately 90,000 players that play StarCraft 2 on the European servers, this places AlphaStar within the top 0.15%
of all ranked play, or roughly the top 135 accounts. All that said, there is still work to be done.
There are challenges in ensuring the learning algorithm can continue to adapt and grow with further interactions with humans, in addressing the need for human play data, and in ironing out some of the quirkier strategies that AlphaStar has exhibited. Sadly, I’m no StarCraft expert,
so I can’t really speak to that in detail, but it sounds like there are still plenty
of future avenues for this research to take. Not to mention taking the challenge to the
South Korean e-sports scene, which is significantly stronger than it is in the rest of the world.
Who knows, we may well see new experiments and innovations being stress tested against
the player base in the future. But in the meantime I hope that having watched
this video you’ve got a clearer idea of how AlphaStar works and why it’s such a big deal
for AI research. This video is – in many respects – a massive simplification of the system and
in the comments there are links to the research papers and other videos and blogs that cover
the topic in a lot more detail! If you’re new to this channel and want to
know more about StarCraft AI, be sure to check out my episode from 2018 looking at academic
research in the original StarCraft and the competition APIs you can download and use
to build your own StarCraft bot. These competitions are still ongoing and still seeking new entrants.
So while you might not be challenging AlphaStar, you can certainly try your hand with fellow
hobbyists and academics. Plus keep an eye on the channel for a new Design Dive episode
where I talk about why the technology behind AlphaStar isn’t ready to be rolled out across
the games industry just yet. And of course, don’t forget to subscribe to the channel,
given only a mere 20% of people watching these videos are subscribed, so get
clicking and you can get all my latest episodes straight to your eyeballs. Thanks for watching this episode of AI and Games. I figured for the first episode of 2020, we should start with arguably the biggest
games and AI story of 2019. As always, my work is only possible thanks to the support
from Patreon and YouTube memberships. With a special shout out to Devin Allen, James
Anfone, 3kliksphilip and Zoe Nolan. To get your name in the credits and watch
episodes first, come join us in the AI and Games Patreon by clicking the links on screen
and in the description. See ya!

76 comments

  1. I'm back! Let's kickstart 2020 with the biggest AI story of 2019: Google DeepMind's AlphaStar achieving grandmaster ranking in StarCraft 2 multiplayer.
    Sorry it's been so long, I was busy with a consultancy gig, then a conference, then I managed to be ill twice in one month! But that means I had plenty time to write new episodes for y'all to watch.

  2. Awesome! Was watching some stuff about this the other week, can’t wait until aliens use games to practice the perfect game vs humans and then skynet us

  3. Sorry, but it sounds funny to me when you've read Komincz like Kominktsch. You were so close, just drop second 'k' and will be way better 😉

  4. If AlphaStar could learn from watching human replays and from playing against other AlphaStar bot why couldn't it learn from playing against humans live?

  5. 16:40 Obviously the most important information to take from all of this is that zerg is OP and protoss needs a buff (sarcasm, obviously.)

  6. A lot of Alphastar advantages comes form it's better interface with the game world compared to humans players. Same with the dota Ai.

  7. Hey! I actually calculated that 10^1685 number for one of our survey papers. It's on page 2 here: http://www.cs.mun.ca/~dchurchill/pdf/starcraft_survey.pdf

    It's actually much, much, much higher than that. That number is just how many different ways you can place up to 400 units onto an average sized StarCraft map at the BUILD tile level. This doesn't include unit hit points, shields, energy, etc etc. It's so much higher that it's not even worth calculating at this point 😀

  8. The mention of version 1 having "normal" APM is misleading. True, it's average APM was low. But it was able to spike it's actions for few seconds. Which is why for they second version, they made APM limits even stricter, and added APS (actions per second, or was it 5 second?) limit also.

    It is obvious to anyone knowledgable about StarCraft, that in match with TLO, AlphaStar was able to micro it's Stalkers to inhumane levels. Juggling them around with Blink to the point of getting massive value from each individual unit.

  9. Alphastar already showed superhuman unit micromanagment vs Mana. I bet the latest agents that run against the grandmasters are intentionally kept on a human level for some not quite clear reason

  10. While this is exceptionally impressive. In a pro competition you know the opponent you know what they like todo and then you plan accordingly. So changing the AI every time after a match because it can’t learn form it seems a little cheap. If I was a pro player I would watch how my opponent plays before hand. Not giving that opportunity in a pro competition seems a little off it’s like being able to swap out a player at will. Like I’m going to play first round, then my buddy is going to play second round unannounced so my opponent doesn’t get to plan ahead. Then we bring in some other player to play third. All while the opponent was just planing on playing me. Now the multiplayer is exceptional more interesting and good show of the AI. But even then most player can go oh I played this guy before he likes this and this ill counter with this, so completely anonymizing it seems a unrealistic to how normally multiplayer works. Now that’s not to say it’s not exceptional cool. But not testing it like it a true real player takes away form it a little. I have no doubts that this AI would have climbed the leader boards anyway. But putting these on these helping factors to me takes always form it a little.

  11. Is it still cheating with inhuman EAPM bursts during fights despite having an acceptable average ( which means nothing )? If so… meh.

  12. I kind of want to see an inhumane version of alpha star where there are no limitations attached. I really wonder have inhumanely good it could get and what the top end of Starcraft would be like.

  13. I wonder if DOTA2 would be considered easier or harder to build an AI for than Starcraft? Less micromanagement but much more communication and coordination between different players. Different builds and selecting appropriate heroes could also add complexity. In starcraft you can pick basically any race and have a reasonable chance of victory, however DOTA requires you and other intelligent agents to select characters to balance against your opinions. Combat also requires a lot more coordination and timing

  14. Amazing video as allways!

    If theres more info now: Could you maybe do a video on Elon Musks Dota AI? in your 2018 video on MOBAs in general you said "there isnt much public information on how it works yet", did that change? Is there another MOBA AI im unaware of?

  15. How the hell can Deepmind claim that the unlimited vision of the first version was insignificant? Mana immediatly won vs AS once its vision was limited…

  16. He does know that out of the top 3 players in the world only 1 of them is Korean right, like the considered best player is finish the second is Italian and the third best is Korean

  17. I remain unconvinced by the current definition of AI learning systems.
    It could be a mental block but it feels more like a brute force approach to mirror human experience at a faster rate but with an identical method of decision making, that they need a LOT of human time and input to deal with even fixed state systems is a cutting edge research reality (you may have 100000 lines of code acting but maybe 1000 is doing the bulk of the work, good luck parsing that with a small team).

    Impressive in its own way but still very much an expensive starting block rather than an actual runner.
    Then again, it’s a field where all you need it to get it right once (and recognise so) and the methodology will branch out exponentially from there point. Assuming of course that the current methods are not in and of themselves a form of scientific dead end.

  18. 16:55 "This places alphastar in the top 99.85% of all ranked play" you mean top 0.15%. Or above 99.85%. Top 99% is the majority of all players.

  19. I always felt Alphastar won most of it's matched through superior unit control, while generally failing at long term strategy, planning and improvisation.

  20. So it needs a crap ton of professional human data to even grasp the basics?

    Call me when it can figure that part out itself in 100 years.

  21. What was found by watching replays is alphastar kept its apm low most of the time and then would jump to almost 1000 apm during a fight. This kept the overall average low but its peak was way too high

  22. I have to commend you on the way you’ve set up your ad timings. It’s rare to see a video where an ad popping up doesn’t cause an immediate desire to close the whole video. I watch on a mobile app so there’s no need for anyone to recommend ad blockers.

  23. When alpha star gets uploaded at the pentagon and takes over all the Terran military drones…do you want to play a game?

  24. Last month, the world champion Serral tried a build invented by AlphaS, he lost, but that pro players learn moves that the AI ​​develops is incredible.

  25. As amazing as Alphastars progress was it’s interesting that it still loses unlike in Go or Chess. It seems the massive solution space coupled with the lack of perfect knowledge means it’s not possible to always have the perfect strategy.

    Case in point, while there is no video I’m aware of epic cheeser Florencio took a shot at Alphastar and apparently crushed it. I believe he did a variation of his drop a nexus in your opponent base and recall your units into their base. Not a playstyle Alphastar likely saw before.

  26. Of course this isn't the AI Activision Blizzard is looking for. They are more interested in AI that makes you buy microtransactions.
    They want AI salesmen, not AI players.

  27. It would be nice to mention that at BlizzCon 2019 everyone there could try playing against AlphaStar Final. And the best player in the world, the most consistent Zerg on the planet, Serral, played 5 games against the AI and lost 4 of them. That is a big accomplishment for AlphaStar, bigger than Gary Kasparov vs DeepBlue.

  28. You forget to mention:
    1) Google hide Alpastar and did not allow people to publicly play with it.(they hide bot under barcode)
    Google Aplastar hides behind bar code, so Google was scared to play openly vs opponents.
    In opposite OpenAi open Dota bot to anyone publicly.

    2) Alphastar MMR was not calculated properly. Google Trick with formula to achieve expected result
    From news.ycombinator.com:
    There's a great video highlighting irregularities in AlphaStar MMR calculations and Blizzard matchmaking, notably, Alphastar, playing protoss, on average played with people 700 MMR below and victories for much weaker enemies gave unexpectedly huge boost to calculared MMR. When only matched with >6000MMR players, alphastar final had 6 win and 15 losses.

    There is a problem with MMR calculation used for alphastar. More specific, there is matchmaking problem when alphastar did not get matched against equal-MMR and most of his protos games were against much lower-MMR players, skewing the data for MMR calculations.

    The example given in video: you won 10 times against 5100, you lost against 7200. The calculated MMR would probably be around 6300. The problem is you wont be matched like this in real game. When matched mostly against the weaker enemies, MMR calculations have a lot of uncertainty.

    You can google "Aphastar MMR" and you will find difference between usual MMR and Aphastar special formula MMR.

    3) they expose Alpastar in some event and hidden(looks like they did not provide information about it to players) record every game.
    Then they took games vs Serral and expose them, they even hired professional commentator(Artosis) to comment games that were hidden recorded.
    Serral was not happy about this type of rat play from Google, but Google don't care.

    Looks like all statistics that they had they use to achieve "better" results then OpenAI and show this results to investors/Google parent company to obtain more money.

  29. It seems like they are going about this whole AI learning all wrong. AI should not blindly try and imitate players. Instead the AI should understand the parameters of the universe its in, aka Starcraft 2. Why it understand what a unit is, what the units stats are, what the units cost are.. Players understand the math behind everything, an AI should be able to run 100% accurate spreadsheets real live, knowing that 11 of x unit beats 15 of z units.

    It should understand building placement and micro as well.

    It's like when you see the AI's trying to learning from playing Mario or some other 2d platformer. The AI is so retarded, that it has to relearn to jump and move every single level. That's not how it should work, it should understand the parameters of the world and the character. No human would suddenly forgot how everything works, just because it's a new level.

  30. Holy moly, that's incredible I even understood something, you made a great job considering the complexity of the AI structure.

  31. Well it depend alot on what they want the AI to do (AKA if they want it to win or if they want it to play humain like)
    Doing the 2 thing would be hard to do since humain are ultra complex haveing choice made simply by how cool you fell of what you eat or stuff like that. And alot of Mistake being made not on purpuse but to learn things out. Like AI do I guess?…

  32. Alpha star last year won in unfair matches vs tlo and mana. It had unfair map vision, and unfair micro. Those two things make alphastars basic strategy of going mass stalkers easy. Had it gone vs another race, mass stalkers is easier to counter.

    Deep mind still hasn't conquered starcraft. The fact that it's gm and not #1 Gm proves that it has weaknesses that humans will take advantage.

    Why aren't there more recent games being uploaded about this? Why are we still talking about it's first wins? This is all old news.

  33. Please talk to BeastyQT (here on Youtube) about AlphaStars strategies and weaknesses. He is a grandmaster player who is very experienced and knowledgable about the game. He played some games vs AlphaStar and he would certainly enjoy sharing his thoughts with you on the matter!

  34. Man you kinda missed the point with apm, realisticly i expect bots to have much less apm but have near perfect timing and decisions when to do it, most of the pro league apm is trash apm it doesnt do anything it simply helps with focus it feels good to do, example in any micro battles between armys we as humans like to click alot like crazy even tho it doesnt matter we still go in the same direction, an ai can just click 1 time and then another if any actual change needs to be done

  35. They actually just recreated ladder. You can get to a competent level if you only focus on basics. But to rise higher you need to form coherent strategies. Then those strategies need to hold up against cheese or you need too be able to spot the cheese quick enough that you can change stratagems. This leads to thinking on the fly and changing builds according to the game state so you don't lose simply because you lost the build coin flip. Couple all of this together with the ability to optimize your strategy, excellent Micro/Macro (read: Basics), and the ability to adapt to a patch and you've got a grand master. Everything they've got going on already naturally happens, leading to the best rising to the top. Which is really cool. It's sort of a form of validation. These AI are meant to learn in a way similar to a human, so putting them in competitions similar to ladder, makes sense.

  36. Now release this AI in the wild, as a general controlling soldiers, tanks and jets on the battlefied. Or perhaps a gazillion flying and crawling drones… If it ever got a will of its own, it would mean the end of the world. Well, for humans, anyway.

  37. Now we can let the best AI protoss, zerg, and terran duke it out in order to find out which race is truly unbalanced

  38. Alphastar needs some value for committing to something. Just another value thrown in to the blender but it should help stop those situations where it is using energy to speed something along then it just cancels it when it was 95% done.

  39. I think Alphastar has the advantage of being able to macro very effectively.
    Having watched quite a few of these matches, it seems to be human like in micro management, albeit with some incredible moves sometimes, but simultaneously is able to macro, and does not fatigue like a human does.

  40. At 13:15 you say each Action is useful. Thats not true. Especially in the beginning of a match you spam quite a bit just to get yourself going and especially when it comes to TLO he spikes at over 1000 APM due to a bug/feature in the game. Therefore the effective APM(EAPM) is quite a bit lower than the APM. For the AI I'd assume that APM=EAPM

  41. alphastar was capped at a certain average apm, wich is why its so much lower than the pro players not because it was learning from players with lower apm

  42. 16:52 "This places Alphastar in the top 99.85 percent" – You realize almost everyone is in the top 99.85% right?
    But on a serious note, the English language is a bit misleading when interpreting a sentence like that.

  43. Alphastar agent's weren't that impressive, they mostly stuck with one strategy per match up which was early pressure following it with a timing attack and had completely unnatural reactions to things they haven't seen before, i am not trying to diminish the achievements of deep mind but they over hyped it. In go and chess they help the games evolve but in sc2 aplhastar didn't even put a dent in the meta and how the game is played. Also it might have lower apm than the average GM but all the actions of the AI were epm which means that no action was spamming.

  44. This was a very well produced video. However, I'm not sure how all popular analysis of A* misses the insights on its super-human EPM in the first iterations that beat TLO/Mana. Putting the "top 0.15%" in context is also necessary. AlphaZero got well above the top professional ELO rating for Chess and Go, but nowhere near that in Starcraft, despite DM throwing zillions of resources at the problem. Ignoring that distinction misses interesting features of the hardness of the game and the true challenges AI still faces even on fully computable environments.

  45. 05m00s AlphaStar doesn't receive any visual information !!! It only "sees" symbolic information about the entities, for example position, etc.
