Understanding Vintage Metagame Data



  • I'm new. At least to the current Vintage scene. I picked up Magic as an undergraduate circa 1996, shortly after Alliances released. I came back as a grad student circa summer 2004 on MTGO, just ahead of Champions, playing a little Legacy and proxy-Vintage in paper. Took another break, had a couple kids, and returned this year to find 'functional' vintage online. Hurray!

    I picked up Shops (played White Stax in Legacy) and have had fun in the TP room (and the occasional 2-man queue). The restriction of Golem has me down. I think there are lots of answers, so I'm not worried that Shops is dead, but I still don't understand the decision. The official notice that "Shops-based decks continue to be significantly overrepresented, reducing the competitive metagame" doesn't seem justified. I've gone through Steven's data and the data from the P9 Events, and I just don't see it, especially using more rigorous data analysis methods.

    I'm asking for help. Is the decision really as simple as 'it's justified'? I think the data from the MTGO dailies has to be earmarked as the 'smoking gun', as the P9 Challenge metagame analysis (@diophan and @ChubbyRain) doesn't really suggest 'over-representation'. Yes, the T8/T16 data shows Shops is well-represented, but it is not 'over-represented' overall. The same is true of the paper data: Shops isn't the most-played deck. There are, of course, other potential factors at play, but the bottom line is that the data does not support a restriction.

    To that end, I've built the data from all the daily events in 2016 (I'll post my spreadsheet soon). I found a couple interesting things which I feel may have been ignored, but I'd like to hear from folks who have much more experience in looking at the data.

    I started by aggregating the data, as Steven suggests, and found**:

    Shops - 27.74%
    Gush - 23.87%
    Blue - 19.03%
    Combo - 18.06%
    Oath - 9.68%
    Hatebears - 1.29%
    Red/Burn - 0.32%
    *I separated Oath out by itself as the archetype has won 2 P9 Challenges as well as the recent Asian Vintage Championship and the last Vintage Championship. Shops is certainly strong.

    The same trends continue in looking at 'Wins (4-0s)' and Match Wins (aggregate). The trend breaks down when you sort based on unique pilots. Instead, we see:
    Shops - 21.53%
    Gush - 18.06%
    Blue - 25.00%
    Combo - 22.22%
    Oath - 9.72%
    Hatebears - 2.78%
    Red/Burn - 0.69%
    I find this to be a significant result of the data analysis. Shops is a winning deck, but the archetype is 3rd among unique pilots. It's interesting to point out, too, that Gush drops similarly: for both decks, the share of results is about 6 percentage points higher than the share of play (unique pilots).
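A minimal sketch of why the two rankings can disagree, assuming the results are stored as (event, pilot, archetype) rows. The rows and pilot names below are invented for illustration and are not taken from the spreadsheet; "unique pilots" is read here as distinct pilot/archetype pairs.

```python
from collections import Counter

# Hypothetical 3-1/4-0 results: (event_id, pilot, archetype).
# These rows are made up purely to illustrate the two metrics.
results = [
    (1, "pilot_a", "Shops"),
    (2, "pilot_a", "Shops"),
    (3, "pilot_a", "Shops"),
    (1, "pilot_b", "Blue"),
    (2, "pilot_c", "Blue"),
    (3, "pilot_d", "Gush"),
    (3, "pilot_b", "Gush"),  # same pilot finishing with a second archetype
]


def share_by_decklist(rows):
    """Percentage of finishing decklists per archetype."""
    counts = Counter(archetype for _, _, archetype in rows)
    total = sum(counts.values())
    return {a: 100 * n / total for a, n in counts.items()}


def share_by_unique_pilot(rows):
    """Percentage of distinct (pilot, archetype) pairs per archetype."""
    pairs = {(pilot, archetype) for _, pilot, archetype in rows}
    counts = Counter(archetype for _, archetype in pairs)
    total = sum(counts.values())
    return {a: 100 * n / total for a, n in counts.items()}


decklist_share = share_by_decklist(results)
pilot_share = share_by_unique_pilot(results)
for archetype in sorted(decklist_share):
    print(f"{archetype}: {decklist_share[archetype]:.2f}% of decklists, "
          f"{pilot_share[archetype]:.2f}% of unique pilots")
```

In this toy data one pilot accounts for every Shops finish, so Shops leads by decklists but trails by unique pilots, which is the same kind of gap described above.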

    I should say that I understand that the DCI/Wizards makes its decisions nearly a month ahead of a B/R announcement, so they did not have the benefit of this fuller data set. That makes me question why a decision 'had' to be made. In fact, if we look at the month-to-month variation (Jan-Feb-Mar), we see something else interesting...

    Shops - 27.74% +/- 7.30%
    Gush - 23.87% +/- 5.62%
    Blue - 19.03% +/- 3.99%
    Combo - 18.06% +/- 3.00%
    Oath - 9.68% +/- 3.09%
    Hatebears - 1.29% +/- 0.58%
    Red/Burn - 0.32% +/- 0.49%
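A rough sketch of how a "+/-" figure like the Shops one could be computed, assuming it is a sample standard deviation of the three monthly shares. The Feb and Mar values are the Shops shares quoted elsewhere in this post; the Jan value is an assumption (a back-calculated guess, not a number stated in the post).

```python
from statistics import mean, stdev

# Monthly Shops share of 3-1/4-0 decklists, in percent.
# Feb and Mar come from the post; Jan is an ASSUMED value for illustration.
shops_share = {"Jan": 27.72, "Feb": 37.50, "Mar": 23.23}

values = list(shops_share.values())
monthly_mean = mean(values)
monthly_sd = stdev(values)  # sample standard deviation (n - 1 denominator)

print(f"mean of monthly shares: {monthly_mean:.2f}%")
print(f"sample std dev: {monthly_sd:.2f} percentage points")
```

With only three months, the spread is a coarse indicator at best. Note also that a pooled figure across all decklists is a weighted average, so it need not equal the simple mean of the monthly shares.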

    Shops' representation among the archetypes is the most 'variable'. This ultimately leads to a question - why? If we look at the variation in actual player numbers we find:

    Shops - +/-2.65 players
    Gush - +/-3.51 players
    Blue - +/-3.21 players
    Combo - +/-3.06 players
    Oath - +/-2.65 players
    Hatebears - +/-0.58 players
    Red/Burn - +/-0.58 players

    Shops sees the lowest variability in its player base among the major archetypes (tied with Oath), i.e. it shows up with roughly the same number of players month-to-month. But wait, how can it be 'over-represented' if the same number of pilots are bringing it month-to-month? February is the month where the data appears to be in line with the 'over-represented' comment. Here is that data:

    Archetype - %Decks - %Players - #UniquePlayers (ChangeFromJan) (Change to March)
    Shops - 37.50% - 32.69% - 17 (+1)
    Gush - 15.28% - 15.38% - 8 (-3)
    Blue - 15.28% - 19.23% - 10 (-5)
    Combo - 20.83% - 17.31% - 9 (-4)
    Oath - 9.72% - 13.46% - 7 (-1)
    Hatebears - 1.39% - 1.92% - 1 (-1)
    Red/Burn - nul% - nul% - nul (nul)

    17 unique Shops pilots in 15 events in February (45 different total players). Compare that to January, when 16 unique Shops pilots competed in 19 events (58 different players). But in 'raw' numbers, one fewer Shops deck made the 3-1/4-0 list in February compared to January. So the question that I really want to ask is, did the non-Shops pilots play as much? From the daily data, it doesn't look like it. If we look at the P9 Challenge metagame data, it looks like they showed up for that event - just not the dailies.
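As a check on the February table, the %Players column can be reproduced from the unique-pilot counts. A small sketch using only the posted counts: the percentages match when the denominator is the sum of per-archetype pilots (52) rather than the 45 distinct players, which suggests a player who finished with two archetypes is counted once under each.

```python
# Unique pilots per archetype in February, as posted in the table.
feb_unique_pilots = {
    "Shops": 17, "Gush": 8, "Blue": 10, "Combo": 9, "Oath": 7, "Hatebears": 1,
}

# Sum of per-archetype pilots; exceeds the 45 distinct players because a
# pilot with finishes on two archetypes appears under both.
total_pairs = sum(feb_unique_pilots.values())

pct_players = {
    archetype: round(100 * n / total_pairs, 2)
    for archetype, n in feb_unique_pilots.items()
}
for archetype, pct in pct_players.items():
    print(f"{archetype}: {pct:.2f}%")

# Unique Shops pilots per event fired, January vs February (counts from the post).
jan_shops_per_event = 16 / 19
feb_shops_per_event = 17 / 15
print(f"Shops pilots per event: Jan {jan_shops_per_event:.2f}, "
      f"Feb {feb_shops_per_event:.2f}")
```

The per-event ratio is only a crude normalisation, since the same pilot can enter many events in a month.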

    Steven, in his recent podcast, suggests that the restriction of Golem was 'justified' based on the data. I'm not sure I agree with 'justified' unless you either (a) add in something more subjective (the non-interactive argument) or (b) ignore other data (P9 metagame data, paper data, # of unique players, etc.). I struggle with the subjective because I'm not as familiar with the format as many here. I feel like turn 1 Orchard/Mox/Oath is similarly non-interactive. It's, obviously, not played as much, but it appears to be fairly dominant when played. This seems like a poor measuring stick, especially to use to determine restrictions. As far as point (b) goes, I'm a data hog, so I don't like to ignore data; I prefer to explain it or ask it different questions. When data points to 2 different conclusions, isn't gathering more data the right approach?

    This seems especially true when we look at how the Vintage metagame developed into March and has begun to develop into April.

    Archetype - Feb - Mar - April (pre-restriction)
    Shops - 37.50% - 23.23% - 21.05%
    Gush - 15.28% - 25.25% - 34.21%
    Combo - 20.83% - 21.21% - 10.53%
    Blue - 15.28% - 23.23% - 15.79%

    I, for one, am interested to see how the second half of April develops once the restriction becomes the rule online. Several early Shops decks have already shifted to 1 Golem, but even more have not...

    I'd also like some feedback on my analysis. What am I missing? I know I don't know the format well enough to draw perfect conclusions and would love to hear other input rather than the same restriction rhetoric.

    Thanks!

    ** I've grouped decks rather broadly, not for effect, but likely more due to ignorance. 'Shops' is any deck which played 4x Shops. 'Oath' is any deck with 4x Oath. 'Combo' is dredge and storm. 'Gush' is Delver/Mentor/Doomsday. And 'Blue' is anything with 4x FoW but 0-1x Gush, mostly Blue Moon, BUG, Tinker/Colossus, Key/Vault, Standstill, etc.



  • I find the player-specific aspect of the MTGO data to be underrepresented in our overall narrative of data mining. I really wanted to discuss this in more detail during the show, but we'd already been arguing for an hour and I didn't want to add too much more of it.

    Suffice it to say, I believe that one of the critical flaws in the MTGO results is how few players it represents, given that a very short list of players make up a high percentage of the 3-1/4-0 results.



  • MTGO dailies, for all the promise of a global platform to play Magic, seem like one medium-size store. What is the breadth of the player base? You highlight unique players by archetype. What is the total number of unique pilots for a given month of dailies, maybe with a minimum of 2 entries? Does anyone know? To contrast, we've fired a sanctioned Vintage FNM every month since December; it's a fixed 4-round event. We've gotten turnout of 16-22. The total player pool is about 30-35 over the course of 5 events. Is the breadth of players in the dailies 100? 500? 40? I don't put much stock in them as a metric without knowing if it's just 20-30 people essentially at a local store. No more than I consider posting our 4-0's from Vintage FNM (maybe I should :| ) to be particularly metagame defining.



  • @nedleeds I would guess about 100, though some names you see much more often than others. In any case, the events themselves tend to fluctuate between the minimum of 12 people and 20 or 30 on weekends with not much else going on. The Power 9s are much more valuable IMO and that's why I spend the time working with Ryan on those reports.



  • @CHA1N5 said:

    I find the player-specific aspect of the MTGO data to be underrepresented in our overall narrative of data mining. I really wanted to discuss this in more detail during the show, but we'd already been arguing for an hour and I didn't want to add too much more of it.

    Suffice it to say, I believe that one of the critical flaws in the MTGO results is how few players it represents, given that a very short list of players make up a high percentage of the 3-1/4-0 results.

    This is 100% true based on the data mining I did.

    There is about a 25-40% repeat rate by players across archetypes and about 10-15% event-to-event using the 3-1/4-0 data. If we assume that each event is ~16 players (1 4-0 and 4 3-1 decks is the median), the total number of unique players over a month is between 150 and 200. [For example, in January, 101 decklists made 3-1/4-0. These were piloted by 65 unique pilots by archetype, but there were only 58 unique Vintage players. If the numbers hold for non-3-1/4-0 decks, there were about 186 unique Vintage players.]
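One reading of the bracketed January arithmetic that lands on about 186, written out as a sketch. The post does not spell out the calculation, so the scaling step below is a reconstruction and should be treated as an assumption.

```python
# Figures quoted in the post for January.
unique_players_observed = 58   # distinct players with a 3-1 or 4-0 finish
assumed_event_size = 16        # ~16 players per daily (the post's assumption)
reported_per_event = 5         # median: one 4-0 plus four 3-1 decklists

# ASSUMPTION: players who missed 3-1/4-0 repeat at the same rate as those who
# made it, so unique players scale with the fraction of entrants reported.
reported_fraction = reported_per_event / assumed_event_size
estimated_unique_players = unique_players_observed / reported_fraction

print(round(estimated_unique_players))
```

The estimate is sensitive to the assumed event size: a smaller typical event gives a smaller estimate, which is presumably why a range of 150 to 200 is quoted rather than a single number.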

    The comment was made in the most recent podcast that the 3-1/4-0 data was similar to a Top 8 of a 5-round event, but that's really not true, is it? These daily events require 12 players to 'fire'. The median data indicates that these usually get ~16 players, with the majority of the data representing 16 or fewer. The 4-0 decks would be decks that are guaranteed to Top 8 a 5-round event, but a 3-1 deck has only a 30% chance - a much smaller sample size.

    I agree with @ChubbyRain, the P9 data seems much more consistent.



  • @ChubbyRain makes sense. Agree with Power 9s. 80-100 people would be considered a massive turnout in the world of sanctioned paper. Thanks for your work on this, we leaned on it heavily in our latest podcast.



  • I think everyone agrees that the P9 data is far more useful than Daily data. I voiced my concerns about this to Steve on that other thread. I play online and I know how small the player base is. Steve believes we should look at all of the data, but I think that the dailies amplify their results disproportionately because of how many events there are. In reality, dailies are more like a local metagame of 100 players than the wide Vintage metagame. That is why I believe we should, for the time being, assign more weight to paper and P9 events.



  • I listened to SMIP in the car yesterday, and noted when Kevin mentioned offhandedly the unique players issue. My immediate (somewhat egotistical) thought was that if I played my Delver deck in more than 1 daily a month the statistics from the dailies would noticeably change. I've tracked how frequently I 3-1 or 4-0 and it's over half the time. I imagine Rich has an even better conversion rate. If one person playing consistently on modo could change the statistics that much, it makes me very suspicious of using that data at face value.


  • @diophan please don't get Gush restricted.



  • @DeaTh-ShiNoBi @diophan I give more weight to Dailies (collectively) compared to your usual monthly Vintage event. As far as unique players go, I think this is only somewhat helpful. Many paper Top 8s have the same players represented and I would certainly not dismiss the finishes of Brian Kelly or Montolio because of the frequency of their success.

    Why do I think Dailies are superior? The metagame is much more dynamic with a higher concentration of tier 1 decks. Paper metagames change much more slowly, people play their pet decks, and this really isn't that useful in "solving" the format or determining the "best deck".



  • @ChubbyRain on the other hand, many, many more players play paper. And what wins changes there just as fluidly as online. Plus, if we want to talk about weighting... why the heck is this decision being based on what people are playing anyway? It should be based on what is winning and nothing else. If everyone in Vintage wants to play Ghazban Ogres, is the DCI going to restrict them? (The second part seems possible, actually.) Just manage the format if something wins too much and there isn't a good way to fight it; otherwise, butt out, DCI.

    But this whole discussion is verging on the self defeating anyway. There isn't any need to weight small amounts of data, when the decision isn't even being made based on all the data that's available. Take it all collectively and make the best decision. More data is better data. This is all stuff everyone already knows. This is high school stuff. The DCI should playtest the format and take all the data collectively, and anything else is a half-assed half-measure. That's just all there is. It's hard to respect decisions like this as good, when they aren't trying very hard to make them good.



  • @Topical_Island More data is not better data... That's why meta-analyses attempt to qualify and rate the value of data, looking for bias and confounding factors. The DCI does not have even close to the resources necessary to playtest all formats...It might be half-assed but at least it's practical. What I would certainly like is a much more in depth explanation for their Vintage decisions. They dedicated much more space to a much less controversial ban of Eye of Ugin in Modern.



  • @ChubbyRain Yes. Metadata. I am familiar...I mean, I'm not going to get lost in a side debate about the validity of a truism. I think we pretty much agree here, and anyone who thinks that the DCI did a good job here and went out of their way to get robust data for this decision... I would just restate that as being their own opinion and rest my case. I don't think that's what you're saying...

    As to the resources... to do a better job making this decision the DCI would have had to marshal the resources to write a scraper program aimed at any of a number of websites already aggregating the very data they need, or even easier, to play a single game of vintage. Can that really be true? Even as I type it I find myself saying... woah woah, pump the brakes Bill... internet rant guy.

    If that is true, it's just negligence. (Which isn't actually the name of a Magic card, though it seems like it should be... 2WW, enchantment. Pay 1W and some attention: do your frigging job.) I just can't believe that this is the governing body for our format.



  • @Topical_Island said:

    to play a single game of vintage

    There are people who self identify as Vintage players working within Wizards of the Coast. There are also people play testing Vintage by playing Vintage who are adept and whom the DCI is paying attention to.



  • @Aaron-Patten I hope so. The more the better. ;)



  • @Topical_Island said:

    Yes. Metadata. I am familiar...I mean, I'm not going to get lost in a side debate about the validity of a truism. I think we pretty much agree here, and anyone who thinks that the DCI did a good job here and went out of their way to get robust data for this decision... I would just restate that as being their own opinion and rest my case. I don't think that's what you're saying...

    One of the main topics of discussion here has been the role of Online and Paper results. You mention taking the data collectively, but how you interpret that data is an essential component that you seem to be glossing over.

    As to the resources... to do a better job making this decision the DCI would have had to marshal the resources to write a scraper program aimed at any of a number of websites already aggregating the very data they need, or even easier, to play a single game of vintage. Can that really be true? Even as I type it I find myself saying... woah woah, pump the brakes Bill... internet rant guy.

    A) I think you are vastly overestimating the competency of WotC given the abomination that is Magic Online.
    B) I think you are vastly underestimating the difficulty of creating a virtual Vintage metagame.
    C) They have a much more cost effective system of letting the players play the game....Heck, we even pay them to do that.



  • @ChubbyRain said:

    .

    Yeah. I felt bad after I wrote that. I got myself worked up into an emotional response there. I don't think that WotC is incompetent at all. I love the game and think that the cards from a design standpoint have been great. The thing that gets me grumpy is the impression I get that vintage is just enough of a DCI afterthought that a good effort to understand the format isn't really getting made (for whatever reason... be it resources or whatever. To be clear, I do not think these are stupid people. I just think it isn't a priority at all.) Yet not enough of an afterthought to be left alone.

    But that was probably too far in that post. I really don't want to be the rant guy. I guess I'll just calmly re-ask... do they have a scraper? That could be pretty easily done. Most third year CS students can do this. An intern could do this. And shouldn't there just be more thought given to the system of restrictions itself? (I wrote a diatribe on this already under the Community section.) Honestly, I play a lot online and in paper. But I've never played a game on MTGO because I don't trust the decision making process enough to get that committed directly. I say that with complete calm. I think the folks at WotC are awesome. I play a lot of Chess and Go and Poker... but I think Vintage MTG is far more interesting as a game than all those. But I don't believe that WotC has shown enough care and interest in the format, for me to buy their version of it. That's where I guess I have to leave it. Other people who love MTGO, keep making vids. I love them, and thank you. I just think this process could be a lot a lot better if these very intelligent people were trying. And data and metadata are great, if somebody wanted to do a big data masters thesis on this I would actually read the entire thing. I promise you. But the solution for my concerns isn't numeric; it's systemic.

    Sorry again for the burnout before.



  • @Topical_Island said: Honestly, I play a lot online and in paper. But I've never played a game on MTGO because I don't trust the decision making process enough to get that committed directly. I say that with complete calm. I think the folks at WotC are awesome.

    Can you explain what you mean here? Do you mean that you don't trust your capacity to make decisions on MTGO? Because of the platform/software or because of the mechanics of the game of Magic?

    I play a lot of Chess and Go and Poker... but I think Vintage MTG is far more interesting as a game than all those. But I don't believe that WotC has shown enough care and interest in the format, for me to buy their version of it. That's where I guess I have to leave it. Other people who love MTGO, keep making vids. I love them, and thank you. I just think this process could be a lot a lot better if these very intelligent people were trying.

    This is very interesting to me.

    I played Chess before I played Magic, and played Go heavily in college and graduate school, but haven't played much since because of lack of good real life opponents.

    I started playing Magic in late 1993, but was a serious chess player before that point. I think coming to Magic from Chess or with a Chess mindset actually brings a very different mindset. Is that what you are speaking about?

    One of the things that I dislike most about Magic, and dislike less about Magic Online, is the pressure to make decisions quickly. This causes Magic players to get into a pattern where they rely far more on pattern recognition and intuition than carefully thought out plays. This makes the game much less interesting than it would be if people were far more deliberate. Is that what you are speaking about?

    I believe this tendency contributes to an almost ADD mindset in Magic that is truly unfortunate. With more deliberate play, I believe games would be far more interesting and intelligent. Regular Magic is basically the equivalent of speed chess. Even commentators rarely take time to analyze lines of play methodically. I would prefer online matches of Magic with bigger clocks as the default. Unfortunately, since Magic players' brains are now used to the faster clock management and mindset, Magic players become quickly bored if decisions aren't made fast. That leads to worse in-game play than if players were far more deliberate. I would like to see Magic played more like Chess, where moves are made no faster than one play per minute, as opposed to that being the ceiling.



  • @Smmenen said:

    @Topical_Island said: Honestly, I play a lot online and in paper. But I've never played a game on MTGO because I don't trust the decision making process enough to get that committed directly. I say that with complete calm. I think the folks at WotC are awesome.

    Can you explain what you mean here? Do you mean that you don't trust your capacity to make decisions on MTGO? Because of the platform/software or because of the mechanics of the game of Magic?

    Pretty sure he means he doesn't trust the MTGO management enough to invest a thousand or so dollars into the program. I can't really blame him.



  • @Smmenen Normally I really hate getting off on tangents. I really try hard to stay "on thread" as it were. But to hell with that this time, you want to talk games? I am so there! Let's do this!

    Firstly, what I meant to express was that I don't actually believe that the folks at the DCI/WotC are, in fact, dunderheads. I think the game design is incredible. And that in terms of complexity, I'd place Magic alongside the very best games that humans have devised. (My tactic in trying to express that was to literally place it alongside those best, and most cachet-worthy games I could think of in the first four seconds after I began trying. I include Go in that list because I personally think it's the best game of its kind, even though I doubt many people have played it.) I include MtG, because the ecosystemic pressures give it a dynamism that is unparalleled in the others. (The idea that one deck can beat another because of the existence of a third is a pretty amazing example of causality.) I'm a teacher, and I actually use both MtG in class, because, logic. And language. So that's what I was going for anyway.

    In terms of MTGO, I just don't want to pay WotC money that directly. I'm happy to provide some downstream value to their product when I buy my wife a retail set of Mentors for her birthday, but yeah, @ChubbyRain was right on. I just don't want to get that financially entangled when I don't at all feel that the format - THE format, that I've always played and pretty much the only one I have any interest in playing (the occasional temptation to go slumming it in Legacy aside) - is being looked after. It kinda feels like those Hasbros making the decisions are thinking, "hey, we could squeeze a little money outta that format without too much work on our part... so why not? This is a business right?" For me, it's not a business. It's a game, a chance to improve my mind and get at some of the underlying logos of competition. If they want to try to "run it like a business", which I find is usually just a palatable euphemism for squeezing money in the short run without much care-taking or long term investment going on... (I'm from the state where they ran Flint Water like a business... saved money in the short run too. Now I gotta schlep my ass over there with cases of Poland Spring.) If MtgO wants to do it that way, that is completely within their rights. They get to choose their own business strategy. But running it like a business isn't very good for business, in my case at least. That could easily just be paranoia on my part, but that's where I'm at.

    SO! Games! They are great!... I love GO, and am really pretty terrible at it. I don't think I'll ever make Shodan, which is kinda a bitter pill. But man what a great game. It has so many wide ranging and applicable principles. I'm even worse at Chess, since I'm much better at creating than calculating. My most serious games were all sports growing up. (I'm a girls Volleyball coach now.) And there are so many principles (I want to say logoi here, but I kinda think that would lose the larger audience) that I've bumped into playing either Go or MtG, that when followed, are both completely tangible in a statistical and rational sense, but feel mystical in the heat of competition when there isn't time to think, but when followed with courage seem to put one on the happy side of randomness again and again. Games are really beautiful.

    As for MtG, now that I'm married to a fellow game junkie, I play other folks much less than I used to, because I literally play two or three matches a night, of whatever match-up we want to try. Which is incredible good fortune. If people ever want things honestly play-tested, we really do just crank games like fiends. And we both despise losing, so we always play both sides of each match-up.

    I hear what you're saying about the ADD nature of some plays and players. I think that's present at least to some extent in all games; as a real wood pusher on the chess board, I'll fess up to that. But I do think that Magic maybe attracts more than its average number of people who aren't hardened in a lot of forms of competition? Maybe? That's purely speculative. I'll say it this way. The culture of each game is different, and Magic has maybe an above average number of players who are bad at the skill of losing. As a population, I think losing is done very poorly by our players in fact. I'd hypothesize that because the game is beautiful, its art is beautiful, and the stories and mythology tap into that symbology we collectively find compelling, players get drawn in who have never competed anywhere else in a dedicated way, and now find themselves in a competitive environment. (Which is great, the more the merrier.) The feature I notice the most among young players isn't speed of play (though now that you mention it...), it's that they lose badly. In general, a lot of conclusions get created around protecting against the sting of fault. People do cursory postmortems, and just abdicate choice generally. That's how I was when I was young. (Still am sometimes... We are all guilty of this to some extent. Or to answer your earlier question: of course I don't trust myself to make decisions on MTGO, but that has more to do with epistemology than user interface.) I got a lot better at losing over the years. Practice makes perfect.

    That's also why I put Poker up there with other games. During college and the Poker craze I paid my rent playing online, and the experience of losing a month's rent in a day, and then having to get up the next morning and get back at it like a job... well, that certainly changes how you feel when you Oath for your last creature with 15 cards in your library. Live or die, it doesn't matter. The only thing is the best choice, and nothing else. People in MtG, in general, play too fast and talk about luck too much. When it's pretty clear that being superstitious is about the most unlucky thing a person can do. (NO people, I don't run a Memory's Journey. Cause seriously, you lose somewhere between 0-4% more games with it in... at least in my build.)

    Loved the discussion about Thing in the Ice on your podcast too... Think it will replace Restoration Angel in those kind of decks?

