MTGO DE Results - the post-LSG-restriction metagame



  • There has been quite a bit of change recently in the Vintage format. Wizards has been more active in managing the restricted list and printing powerful Eternal-relevant cards like the Delve spells, Dack Fayden, and Monastery Mentor. After the restriction of Lodestone Golem, I wanted to take this opportunity to look at how the metagame evolved following the removal of a key card. I felt that while players understandably will have different views on what the Vintage format should be like, we should also have as much information as possible available to us that we can use to construct informed opinions and arguments going forward. Ryan Eberhart (aka @diophan) and I have been collecting and disseminating data from MTGO Power 9 events, but we have also been collecting data on the Vintage Dailies and paper tournaments around the world. I would like to share with you now the data we have collected on the MTGO Daily Events since the Lodestone Golem restriction took effect on April 13th (paper results will be following shortly).

    We have classified decks according to the following archetypes and broken them down further into sub-archetypes in an effort to more accurately convey the metagame.

    • Gush - If Gush was a primary component of a deck's gameplan, it was put into this category. We then broke this down essentially by win condition: Delver, Mentor, Pyromancer, Combo (Doomsday and Gushbond), and Other (Thing in the Ice or Vault/Key/Tinker, mainly).
    • Shops - The Shops archetype was obviously hit hard by the restriction of Lodestone Golem and went through quite a transitional period. Over the last three months, the archetype has reestablished itself by turning to Thought-Knot Seer as a replacement for Golem. The most successful build has been the Ravager TKS deck though other lists have incorporated TKS and put up results. A third category includes the non-TKS Shops lists but these have been a minority of lists and slanted towards April.
    • Eldrazi - An archetype that emerged from the LSG restriction, the most popular variant of the archetype has been White Eldrazi which pairs the colorless creatures with White Hatebears like Thalia and Vryn Wingmare. A minority of decks have fully embraced the tribal element of Eldrazi, i.e. Jaco-Drazi.
    • Dredge - Divided by sideboard strategy, based on whether they intended to combat opposing hate head-on with Creature, Enchantment, and Artifact removal or to Transform post SB. The former approach remains the most popular.
    • Combo - Predominantly Dark Petition Storm but also a few Belcher decks and odd-balls (like Two-Card Monte and Rector Flash)
    • Blue Control - The more controlling remnants of the Mana Drain pillar like Landstill in various colors and Blue Moon.
    • Big Blue - Less controlling artifact-based combo decks like Control Slaver, Painter-Grindstone, Academy combo.
    • Oath - If it contained maindeck Oaths, it found its way here. Variants include Salvagers Oath, Control Oath (Fenton Oath with Griselbrand as the primary win condition), Combo Oath (i.e. Burning Oath), Oathstill, and other Oath (odd Oath).
    • Null Rod - The various Fish decks that have historically belonged to the Null Rod Pillar. These types of decks are almost nonexistent on MTGO but include BUG Fish, Hatebears (White Trash and 5c Humans), Merfolk, and Other (in this case, a monored 8 Moons deck).

    We kept track of 4-0 and 3-1 finishes and used these to create a category called Total Wins (# of 4-0 finishes * 4 + # of 3-1 finishes * 3). This more heavily weighted the 4-0 finishes, from which we calculated the % of Total Wins for that archetype/subarchetype. Comparing the totals reflects performance - a positive Delta % Total means the deck disproportionately put up 4-0 finishes. However, the sample size is not really large enough to infer much from this.
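A minimal sketch of the Total Wins weighting described above. The deck names and finish counts are hypothetical placeholders, not values from the spreadsheet, and "Delta % Total" is assumed here to mean the share of Total Wins minus the share of winning lists.

```python
def total_wins(four_oh: int, three_one: int) -> int:
    """Total Wins = 4 * (# of 4-0 finishes) + 3 * (# of 3-1 finishes)."""
    return 4 * four_oh + 3 * three_one


def percent_shares(values: dict[str, int]) -> dict[str, float]:
    """Return each entry's share of the grand total, in percent."""
    grand_total = sum(values.values())
    return {name: 100 * v / grand_total for name, v in values.items()}


# Hypothetical (4-0 count, 3-1 count) pairs, for illustration only.
finishes = {"Deck A": (6, 10), "Deck B": (2, 12), "Deck C": (1, 9)}

wins = {name: total_wins(f, t) for name, (f, t) in finishes.items()}
win_share = percent_shares(wins)
list_share = percent_shares({name: f + t for name, (f, t) in finishes.items()})

# Assumed reading of "Delta % Total": share of Total Wins minus share of lists.
delta = {name: win_share[name] - list_share[name] for name in finishes}

for name in finishes:
    print(f"{name}: {wins[name]} total wins, delta {delta[name]:+.2f} points")
```

Under this reading, a deck whose finishes skew towards 4-0 ends up with a positive delta, and the deltas across all decks sum to zero.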

    There is a function in Google Sheets that allows you to count unique entries within a data set. We used this to calculate the number of unique players both overall and within archetypes/subarchetypes. Over time, you would expect the majority of MTGO Vintage players to put up a finish so this is a rough indicator of the total pool of MTGO players that participate in these events. It also helps to remove repeat performers like Rich Shay or Montolio as they can potentially skew results for certain archetypes. It should be noted that players can switch archetypes/subarchetypes so some players will be counted twice or more as you break down the data.
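The unique-player count can be sketched outside the spreadsheet as well (the Sheets function referred to is presumably COUNTUNIQUE). The finish log below is hypothetical; the point is that a player who switches archetypes is counted once overall but once in each archetype.

```python
from collections import defaultdict

# Hypothetical finish log of (player, archetype) pairs; names are placeholders.
finishes = [
    ("player_a", "Gush"),
    ("player_a", "Gush"),
    ("player_a", "Shops"),
    ("player_b", "Gush"),
    ("player_c", "Dredge"),
]

# Unique players across the whole log.
unique_overall = len({player for player, _ in finishes})

# Unique players within each archetype.
players_by_archetype = defaultdict(set)
for player, archetype in finishes:
    players_by_archetype[archetype].add(player)
unique_by_archetype = {a: len(p) for a, p in players_by_archetype.items()}

print(unique_overall, unique_by_archetype)
```

Here the per-archetype counts add up to more than the overall count because player_a appears under both Gush and Shops.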

    That out of the way, let's get to the results.


    The true value of this data in my opinion is how the different archetypes and sub-archetypes have changed over time. Ryan and I broke down these results by week and displayed them on several graphs.


    As we can see, the trend of a declining metagame prevalence for Gush has not continued (did anyone aside from @Smmenen think this would be the case?). Metagames tend to be cyclical by nature - people build their decks to combat specific decks and that focus shifts with time. Gush was the clear target that emerged from the Lodestone Golem restriction and decks adapted to combat Gush, with a surge in Sudden Shocks, Sulfur Elemental, Thorns, and Defense Grids. As the field diversified, the narrower hate-cards were supplanted by broader removal (you don't want to be holding a Sudden Shock against a resolved Thought-Knot Seer) and Gush decks themselves diversified to dodge the hate, turning to Tendrils, Pyromancer, and Thing in the Ice. At its heart though, Gush is a control deck with a powerful card advantage engine - it just needs to draw into the right cards for the field. A key development was the adoption of Cabal Therapy and Baleful Strix by Grixis Pyromancer (and ultimately Esper Mentor) as a means of competing with Eldrazi, Cavern of Souls, and the broader field. This has led to a resurgence in Gush, a decline in Shops and Eldrazi, and ironically the metagame percentages have returned to roughly where they were at the start of April. It remains to be seen how the metagame will adapt but I hope this look at it has been interesting. Keep in mind, all statistical work is subject to variance and the sample sizes are low (though we have a comparable number of lists to Paper over the same time span). Questions? Comments? Suggestions? Have at them and I hope we can get a good discussion going.

    Correction 1: We noticed an error in our calculations that affected the Sum of 3-1 Finishes (it did not count Eldrazi) and as a result, the percentages were high. We've had an issue with Google Sheets where the formulas we write do not appear to "fill" properly, randomly skipping certain cells...This could be an issue with us simultaneously trying to edit a sheet. This specific instance could have been human error (aka I screwed up), but we really don't know. In any case, the best thing to do is post a correction explaining the error and fixing the data. The first table has been updated and should be correct now. Other charts were unaffected as they did not use the "Sum of 3-1 Finishes" in the calculation.



  • @ChubbyRain Thanks a lot for this interesting piece. I totally agree with your write-up too that these things move in circles. One statistic I missed (and thus calculated myself) was the total wins to unique players ratio per archetype. This basically would tell us more about how likely you are to win with a deck of each archetype (disregarding player skill etc). Here's that breakdown:
    Gush: 390/53= 7.36
    Shops 103/13 = 7.92
    Dredge 54/12 = 4.5
    Combo 90/17 = 5.3
    Blue Control 77/15 = 5.1
    Big Blue 26/6 = 4.3
    Oath 37/8 = 4.6
    Null Rod 15/5 = 3
    Eldrazi 68/16 = 4.25

    So the archetypes which seem likeliest to bring success are Shops and Gush (this is of course a major simplification just for illustrative purposes). The rest are basically all the same since the N (sample size) is so small that you can't really claim anything else.
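The ratios above can be reproduced with a few lines. The (total wins, unique players) pairs are copied from the breakdown in this post; nothing else is assumed.

```python
# (total wins, unique players) per archetype, as quoted in the post above.
data = {
    "Gush": (390, 53),
    "Shops": (103, 13),
    "Dredge": (54, 12),
    "Combo": (90, 17),
    "Blue Control": (77, 15),
    "Big Blue": (26, 6),
    "Oath": (37, 8),
    "Null Rod": (15, 5),
    "Eldrazi": (68, 16),
}

# Total wins divided by unique players for each archetype.
ratios = {name: wins / players for name, (wins, players) in data.items()}

# Print from highest to lowest ratio.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {ratio:.2f}")
```

Note that the denominator counts each player once however many finishes they posted, so a few prolific players can lift an archetype's ratio on their own.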

    Edit: I also calculated these per subarchetype although that is even more shaky due to even lower n (sample sizes). But here's the top five decks according to total wins to unique players ratios:
    Control Oath 11.0
    Other TKS 11.0
    TKS Ravager 7.5
    Mentor 7.2
    Other Oath 7.0



  • @ChubbyRain said:

    As we can see, the trend of a declining metagame prevalence for Gush has not continued (did anyone aside from @Smmenen think this would be the case?).

    Except, I was actually right, using the measure we were discussing.

    The statistic we were discussing was % of DE/ month.

    Using the daily event breakdown by week, your data obscures this trend:

    Here is the percentage of Gush decks in the DE's during the time period you covered, but organized by month:

    Gush as % of DEs by Month

    Gush was 61.66% of the April (post-restriction) metagame, but 41.4% of May, but declined to 38% in June. The declines are even more steep if you slice up the data in other ways, like focusing on 4-0s, etc.

    I applaud this kind of detailed analysis, and really appreciate your hard work here - but it's not only tasteless (and rude) to throw digs at me (or anyone) like you did with your parenthetical jab, but it's embarrassing when what you are trying to critique me for isn't even accurate, using the metrics from the original context.

    Suggestions: in the future, I recommend doing breakdowns both by week and by month. The second half of June had far fewer dailies fire. Also, dis-aggregating 4-0 and 3-1 decks is worth doing, just to monitor the results, if nothing else.



  • @ChubbyRain Do you believe that "metagame prevalence," as you put it, is the most important thing when discussing these numbers? Personally I am far more interested in seeing how the actual decks perform in head to head matches against other archetypes, as you and @diophan have thankfully provided when the data is available (by your own hard work!). For example, we are admittedly working with small data sets (which is better than nothing), but for as much as Gush decks have been played according to the data above, there seems to be wild variance in overall win percentage for the Gush decks when broken down by large events. If I'm reading the data that you've posted in previous results threads correctly, this is what I'm seeing in terms of Gush decks' actual win percentages, by event:
    MTGO June Vintage Premier Event 61.3%
    Eternal Extravaganza 4 50%
    MTGO May Vintage Premier Event 51.5%
    NYSE IV 46.5%
    MTGO April Premier Event 50.3%

    My question (for everyone) is why do you think Gush decks are so heavily played (in Daily Events, as shown in this thread), when they seem to be averaging about a 50% match win percentage in these recent larger events? That doesn't seem like a great metagame choice at the moment, just based on simple results.



  • @kistrand You're welcome. I am a bit confused though about what the ratio of total wins to unique players means. Please correct me if I'm wrong but total wins/unique players is a representation of how many repeat performances there are within an archetype by individual players. If decks were equally likely to see play, this could represent the likelihood of you winning with a specific archetype. However, you also have to disregard player skill and preference which has a very large effect on the outcome. Gush and Shops are commonly played by several of the best MTGO players: Diophan on Gush, Montolio on Shops, and Rich who plays both. It does not surprise me that these decks would have the highest ratio of wins/unique player. Now, Shops and Gush are in my opinion 2 of the best 3 decks right now in the format (the third is Dredge, but there isn't really a player that has taken up that mantle on MTGO) so the intrinsic strength of the archetype could be a very real component here. But the impact of these frequent and skilled players is a pretty significant confounder here.



  • @kistrand Although it's a bit late, I don't think what you are calculating is meaningful. The problem is that the same person can contribute many wins from 4-0 and 3-1s, but only contributes one person to the denominator. To take this to an extreme, if only one person played dredge and put up several finishes, the number you are computing for dredge would be incredibly high, but that doesn't necessarily say anything about the winningness of the archetype. No disrespect intended, of course.

    @JACO To interject my own opinion, many players (myself included) don't particularly want to play a non-blue deck. Gush has been the best performing blue deck for a while (although Oath on occasion can do well), so it is reasonable for a large number of players to play gush, even if in the abstract they might be better served playing Shops, Dredge, or Eldrazi. I don't have a strong opinion for whether winrate or metagame prevalence is a more important metric. To be honest, I don't even have a strong opinion on whether gush should be restricted.



  • @diophan and @ChubbyRain You are certainly right and good that you pointed this out. That's what I meant with it being an over-simplification of reality and we can never know either if the most skilled players always tend to play certain archetypes etc. Still, though, I hope it served some descriptive purpose (just like the other low-N descriptive data here). Was fun calculating at least : )



  • @diophan said:

    @JACO To interject my own opinion, many players (myself included) don't particularly want to play a non-blue deck.

    I think this is actually a huge problem, not just because non-blue decks are pretty interesting right now, but because there are more viable options for non-blue decks than perhaps any time in the history of the format. It's like we've opened a frontier or font of non-blue strategies.

    It's not just that Shops, Dredge and Eldrazi are great, which they are, but there are a range of non-blue strategies that are intriguing, starting with decks like Humans (which won the BOM) and all the way to 2-card Monte. The full range of Humans/Hatebears decks remains, imo, inadequately explored because Humans isn't, to far too many players, an interesting strategy.

    I was so impressed last year when long-time blue players, like Rich Shay and Brian DeMars, demonstrated a willingness to play Shop decks at the Vintage Championship, etc. One of the long-time weaknesses of the format is the tendency for some players to play the same deck or strategy regardless of strengths/weaknesses or metagame position. If non-blue decks are both more diverse and a larger part of the field, it's a problem for the format if players behave as if there is a redline.



  • @JACO I find metagame prevalence (which this isn't exactly - it's more a prevalence of winning lists similar to the collected data on Paper Top 8's) to be much more important. There are too many factors that affect the outcome of a game of Magic: luck (draws, coin flip, Mana Crypt), player skill, individual card choices within an archetype, tournament circumstances (the NYSE had a giveaway that incentivized people to stay in the event rather than dropping - some people racked up impressively bad records as a result), alcohol consumption... We don't have anywhere near the sample size in which these factors could be "washed out". I also think that win percentage changes quite a bit from tournament to tournament. A deck being more prevalent both serves as a target for the other decks in the format and makes it more likely to be "netdecked" by inexperienced or infrequent players who may be playing the deck in the wrong metagame (for instance, not changing the removal suite around to address Eldrazi). The effect of both of these is that it suppresses the archetype's win percentage. And this is part of a cycle - as the deck gets targeted and the win percentage goes down, fewer people play the deck, and the metagame changes to target the new best deck. Eldrazi was the breakout deck of the May P9 - posting a 65% MWP against the field. As it became known, its win % dropped to 58% at the NYSE and 52%. Over the same time, Gush's win percentage rose to 61% in June. This is simply a part of a dynamic metagame. For these reasons, I look to MWPs as evidence for further investigation - if Gush did well against Storm, why was that so - but put very little stock in the actual win percentages.



  • @Smmenen said:

    Except, I was actually right, using the measure we were discussing.

    The statistic we were discussing was % of DE/ month.

    Using the daily event breakdown by week, your data obscures this trend:

    Here is the percentage of Gush decks in the DE's during the time period you covered, but organized by month:

    Gush as % of DEs by Month

    Gush was 61.66% of the April (post-restriction) metagame, but 41.4% of May, but declined to 38% in June. The declines are even more steep if you slice up the data in other ways, like focusing on 4-0s, etc.

    I applaud this kind of detailed analysis, and really appreciate your hard work here - but it's not only tasteless (and rude) to throw digs at me (or anyone) like you did with your parenthetical jab, but it's embarrassing when what you are trying to critique me for isn't even accurate, using the metrics from the original context.

    Suggestions: in the future, I recommend doing breakdowns both by week and by month. The second half of June had far fewer dailies fire. Also, dis-aggregating 4-0 and 3-1 decks is worth doing, just to monitor the results, if nothing else.

    You are actually hilarious.



  • So what can be gathered from this data is that half of the people in the metagame like to play Gush decks. Time to break out the Chains!


  • Is it entirely likely that the powers that be chose to restrict lodestone golem and chalice without even having access to data like this?

    It's fantastic work, obviously. It is amazing that you guys can process all of this data; it's really giving the community access to a tool that was never available back in the day. The closest we could get was Dr. Sylvan's metagame breakdown, which did not collect nearly as many results.

    I don't really look at this data as a small sample size, this is all of the vintage played online over three months.

    So what is the proper answer to this warped vintage state? Restrict preordain and gush and call it a day?

    Vintage might be due for another apocalyptic restriction of blue draw spells, it's only been 9 years since the last?



  • @Smmenen

    I AM genuinely interested in whether other people believed the metagame prevalence of Gush would decrease and why. You and I have said our parts on the matter and I don't think we need to rehash this. I welcome others to chime in.

    I AM NOT interested in debating whether or not you are "right". Nobody cares about that but you. I will say that our data conflicts with yours (we have the percentages of Gush at 64.2%, 43.8%, and 42.7%, respectively). This discrepancy is likely due to differences in classification but that emphasizes the fact that the difference is not statistically significant - you take this data to any scientist and try to argue that a downward trend exists between May and June and he will laugh you out of his lab.

    Thank you for the suggestions. I included the number of decks per week to reflect on the lower number of dailies that fired. I thought about including monthly breakdowns but I concluded it would make the post too long and people can get a rough idea of those values looking at the weekly graphs, which do a better job of conveying the evolution of the metagame. We've talked about the 4-0 results before and again, I do not believe they are statistically relevant - there were 12 4-0 finishes in April, 22 in May, and 14 in June - it would be like trying to make metagame inferences from 1 or 2 top 8's.

    Edit: As mentioned in the OP, there was an error in our equations that seems to be a bug...I've left the errant data in the post but the actual percentages we have are 61.4%, 40.0%, and 38.1%. The point that a ~2-3% difference (out of 84 lists in June and 115 in May) is not indicative of a trend stands. There is simply too much variation in the outcome of a game of Magic for this small of a percent difference to be statistically relevant.



  • @ChubbyRain As a scientist I agree with you that if one were to run a chi-square test on these distributions per month, the difference between May and June would turn out insignificant. However, any scientist would know that tests of significance are pointless when you are dealing with the total data of something, because then you are not using a small sample to say something about the whole. You know the whole and its trends. So what are we doing here? Are we looking at a sample of the entire Vintage metagame, or at the entire metagame? The former requires tests of significance; the latter does not, statistically speaking that is.
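For anyone who wants to check the May versus June comparison, here is a rough sketch of a Pearson chi-square test on a 2x2 table, written with the standard library only and without a continuity correction. The Gush counts are an assumption, back-computed from the corrected figures quoted earlier in the thread (40.0% of 115 May lists and 38.1% of 84 June lists); the thread itself does not state them.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Assumed counts: 40.0% of 115 May lists = 46 Gush; 38.1% of 84 June lists = 32 Gush.
may_gush, may_total = 46, 115
june_gush, june_total = 32, 84

stat, p = chi_square_2x2(
    may_gush, may_total - may_gush,
    june_gush, june_total - june_gush,
)
print(f"chi-square = {stat:.3f}, p = {p:.3f}")
```

With these counts the p-value comes out far above 0.05. Whether a significance test is the right tool at all depends on the sample-versus-population question raised in the post above.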



  • Bear in mind that a majority of my Vintage experience is in Paper, and that my conjecture is based less on the quantitative data presented here (and again, thanks to both Ryan and Matt for all the diligent work) and more on the qualitative data of discussions with other Vintage players on their thoughts and perceptions of the Vintage metagame.

    (As a longer aside, I understand that this sort of thing is highly nebulous and to those in harder science fields smacks of crockery....but at the end of the day it does reflect "feel bads" and other emotional responses that seemingly Wizards does take into account with looking at things such as the B/R decision.)

    I did not expect Gush to diminish for the following reasons.

    1.) Vintage players seemingly shift decks less frequently. Again, this is reflective of my emphasis on paper Vintage, but if shifting your deck type requires a $500+ outlay just to purchase two lands, things have a tendency to move slower. Additionally, some are not interested in chasing the flavor of the month; they'd rather instead stick to a deck they know well and feel comfortable piloting.

    2.) Many players associate Blue with Vintage. Bearing in mind that the average cognitive development of most adults hovers in the 3-4 spectrum on Kegan's cognitive development framework, these tendencies will likely be reflected in the attitudes and predilections of the populace. As a result, we end up with the purported notions that "blue is the color for smart players" and since we all like to imagine ourselves as clever, we make choices reflective of the thoughts we are subject to. (Incidentally, this also explains the tendency towards irritation when decks/strategies are attacked, due to how we make meaning of what our choices mean to us and how others perceive them).

    3.) Related to number 2, there is a book on Gush. Nothing tempts people who love a challenge (and Vintage players have decided they want to play with the largest available cardpool) more than telling them it is difficult to do correctly. So they want to see for themselves. They want to play the decks that afford them the opportunity to test their abilities.

    I've rambled enough, but there is my two-cents.


  • I really wish they'd (WotC) publish all the DE results.
    It wouldn't really affect the results that much, but I cashed 3 of my last 4 Dailies, with one 4-0 finish, all with Oath decks.
    Anyway, thanks to Matthew and Ryan for all the hard work you guys do. I really appreciate it, and if there's anything I can do to help let me know.



  • If anything, I think MTGO lends itself to more players shifting decks than we could ever see in paper. The cost, the physical ease of switching up decks (no need to de-sleeve or rotate cards among decks, etc.), and the speed with which you can play meaningful matches over a short period of time lead me to believe that shifts happen on MTGO much faster than in paper. It's possible that they happen so much faster that looking at aggregate monthly data is far too slow.



  • @kistrand This is a sampling of winning lists and somewhat stochastic as a result (since Magic does have a significant luck component). Similar to rolling dice, you would expect variation from the "true probability" as a result.
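To put a rough number on the dice-rolling comparison, the sketch below computes the binomial standard error of an observed archetype share. It rests on loud assumptions that are not in the thread: a fixed "true" Gush share of 40%, and every winning list treated as an independent draw (repeat finishes by the same player are not independent, so this likely understates the noise).

```python
import math


def share_standard_error(p: float, n: int) -> float:
    """Standard error of an observed share when n lists are independent draws with true share p."""
    return math.sqrt(p * (1 - p) / n)


# Assumption: a fixed "true" Gush share of 40%; list counts of 115 (May) and 84 (June).
true_share = 0.40
for month, n in (("May", 115), ("June", 84)):
    se = share_standard_error(true_share, n)
    print(f"{month}: +/- {100 * se:.1f} percentage points (one standard error)")
```

Under these assumptions, a month-to-month swing of two or three points sits well inside one standard error for samples of this size.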



  • I didn't realize this until now, but have the Doomsday decks fallen off almost completely? Or are they classified under Combo Gush?



  • @letseeker Combo Gush



  • @ChubbyRain OK, thank you. Have there been any Gush Tendrils lists lately? I haven't seen that deck in a long time.

    Edit: also, what decks do you have under "Other Gush"?

