[Free Article] What Now


  • TMD Supporter

    @DeaTh-ShiNoBi said:

    @Smmenen said:

    Shops are 30% of Q1, 2016 Daily 3-1 or 4-0 decks, and 31% of Top 16, 30% of Top 8, 42% of Top 4, and 50% of Top 2 MTGO P9 challenge tournaments in Q1.

    Gush is 21% of Q1 Daily 3-1 or 4-0 decks, 21% of Top 16, 22% of Top 8, 33% of Top 4, and 17% of Top 2 MTGO P9 challenge tournaments in Q1.

    The statement made in this article - that Gush decks are, in aggregate, performing about the same as Shops in aggregate - is simply not true. Gush decks are significantly behind Shops by almost every metric in the MTGO results.

    If you don't want to "give weight" to MTGO results, that's fine. But it seems pretty obvious that Wizards does. I frankly think they should give weight to both.

    That said, I also believe that the single best data point every month is the P9 Challenge, as it is much larger than most paper events, more competitive with stronger players, and global. So to ignore MTGO seems foolish.

    As flawed as MTGO is, paper has many flaws as well, such as proxies, budget decks, etc., that don't exist to nearly the same extent on MTGO.

    I would very much like to know where the author of this article got his data.

    The reason I don't give too much weight to MTGO is that there are so few players playing it.

    A lot of the paper tournaments are tiny as well. I mean, we literally have a number of 13-player events in our paper tournament data (which is why one of the tabs weights by tournament size). Also, lots of paper tournaments don't permit proxies, so there are many fewer Workshops than there would otherwise be. People are forced, in Europe for example, to play budget decks instead.

    Both MTGO and paper data have their flaws. That's why I think you have to look at both.



  • @Smmenen said:

    @DeaTh-ShiNoBi said:

    @Smmenen said:

    Shops are 30% of Q1, 2016 Daily 3-1 or 4-0 decks, and 31% of Top 16, 30% of Top 8, 42% of Top 4, and 50% of Top 2 MTGO P9 challenge tournaments in Q1.

    Gush is 21% of Q1 Daily 3-1 or 4-0 decks, 21% of Top 16, 22% of Top 8, 33% of Top 4, and 17% of Top 2 MTGO P9 challenge tournaments in Q1.

    The statement made in this article - that Gush decks are, in aggregate, performing about the same as Shops in aggregate - is simply not true. Gush decks are significantly behind Shops by almost every metric in the MTGO results.

    If you don't want to "give weight" to MTGO results, that's fine. But it seems pretty obvious that Wizards does. I frankly think they should give weight to both.

    That said, I also believe that the single best data point every month is the P9 Challenge, as it is much larger than most paper events, more competitive with stronger players, and global. So to ignore MTGO seems foolish.

    As flawed as MTGO is, paper has many flaws as well, such as proxies, budget decks, etc., that don't exist to nearly the same extent on MTGO.

    I would very much like to know where the author of this article got his data.

    The reason I don't give too much weight to MTGO is that there are so few players playing it.

    A lot of the paper tournaments are tiny as well. I mean, we literally have a number of 13-player events in our paper tournament data (which is why one of the tabs weights by tournament size). Also, lots of paper tournaments don't permit proxies, so there are many fewer Workshops than there would otherwise be. People are forced, in Europe for example, to play budget decks instead.

    Both MTGO and paper data have their flaws. That's why I think you have to look at both.

    I agree both MTGO and paper data have their flaws that we have to put up with, and I'm not advocating throwing out MTGO data, but MTGO amplifies its own data due to the number of events one player can participate in. Skilled players like Rich Shay, Brian Schlossburg, and Montolio can contribute a huge number of 3-1 or 4-0 finishes for Shops and skew the results. You pointed this out in your podcast: Doomsday on MTGO is a great example of this, as there's really only one guy who plays it, but he puts up enough results to make up a significant portion of the metagame. To me, putting a lot of weight on MTGO Dailies is like putting a lot of weight on the results of a small metagame. There's value to MTGO Dailies, but they don't describe the big picture. That's my view.



  • A small meta is somewhat interesting in that it shows you what's difficult to adapt to even given a high chance of seeing a particular player/deck.



  • @Smmenen

    Hi Steve,

    Now that I'm home and done with all of my non-Magic obligations, let me give you and everyone else a more proper response.

    I believe that your point breaks down into two separate issues. The first is the claim that I said Mentor's performance was equal to every Shops deck combined. That's actually not what I said. I said they occupied "basically" the same percentage. I could have used "around", "nearly", "almost", or a litany of other synonyms; at that moment I felt like saying what I did. One's opinion on what constitutes a close enough percentage to use that kind of adjective/adverb may differ, but that's a pointless argument since it's a matter of opinion.

    The second issue is the data I'm using and how I'm getting it. For the paper data I stuck to your Q1 results because I couldn't find the other IRL results from the archived TMD to go any further back. The IRL data showed Mentor actually having a .65% edge on Shops, so I don't believe we disagree about those results. For the MTGO data, I used all results between October 8th, 2015 and April 3rd, 2016 provided by mtgo.com. I went with these dates because they were both after the Chalice/Dig/Thirst change and after Daily Events became four rounds again. I then broke up the data into Mentor, Shops, Decks with Gush (Gush for short), and Other. For the sake of simplification, I defined "Mentor" as any deck with Monastery Mentor and Gush. The reason I chose that definition is that in my eyes those are the defining elements of the deck, and any extra Dragonlords serve as an alternative threat. Here is what I found:

    Mentor Total = 85
    Shops Total = 123
    Gush Total = 64
    Other = 272

    Metagame Total = 544

    With these numbers Mentor adds up to 16% and Shops adds up to 22%. For me personally, 6% is close enough that I feel comfortable using the word "basically", but I can understand if some people disagree. What's interesting, though, is if we dig into this further. Stacking up decks with Gush vs decks with Mishra's Workshop in this time period, we have these numbers:

    Combined Gush Total = 149
    Shops = 123

    That's 27% of the field for Gush vs 22% for Workshops. Again, a 5% difference would be close enough for me to feel OK with saying "basically", but this range of dates clearly indicates that Gush had a larger metagame presence than Mishra's Workshop. What about the most played archetype in each of those categories during this time period? For the whole of Gush that archetype is Mentor, and for Mishra's Workshop it's Ravager Shops. Analyzing just those, we now get these numbers:

    Mentor = 85
    Ravager Shops = 61

    That's now 16% vs 11% in favor of Mentor, which makes Mentor more played than any Shops archetype from October 8th, 2015 to April 3rd, 2016. Since we're continuing the deeper exploration of data, what happens if you change the parameters? Instead of just Mentor with Gush, what if you talk about all decks that use the card Monastery Mentor? Also, for Shops, what if you discount the Workshop decks that aren't based on locking people out of the game under numerous Sphere effects? That condition on Shops may seem somewhat arbitrary, but I believe that the vast dislike for Shops comes from being put in a position where you literally can't play Magic. If all the deck was doing was powering aggressive artifact creatures while letting you interact with it, I'm not sure we'd even need to entertain the idea of Shops being "bad for the format". In my classifications I found five such decks for Mentor and 14 for Shops. With these shifts in mind, we now have this:

    Adjusted Mentor = 90
    Adjusted Shops = 109

    That's now 17% to 20%, a 3% difference. I definitely feel comfortable using the word "basically" here, but I can understand disagreeing both with 3% being close enough and the classification I used to get those numbers.
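
    If anyone wants to check my arithmetic, the shares can be reproduced directly from the counts above. A quick sketch (the one-decimal figures show where the rounded 16%, 22%, 27%, 11%, 17%, and 20% come from):

    ```python
    # Shares of the 544-deck total, from the counts listed above.
    TOTAL = 544

    counts = {
        "Mentor (with Gush)": 85,
        "All Shops": 123,
        "Combined Gush": 149,
        "Ravager Shops": 61,
        "Adjusted Mentor": 90,
        "Adjusted Shops": 109,
    }

    for name, n in counts.items():
        print(f"{name}: {n}/{TOTAL} = {100 * n / TOTAL:.1f}%")

    # Mentor (with Gush): 85/544 = 15.6%
    # All Shops: 123/544 = 22.6%
    # Combined Gush: 149/544 = 27.4%
    # Ravager Shops: 61/544 = 11.2%
    # Adjusted Mentor: 90/544 = 16.5%
    # Adjusted Shops: 109/544 = 20.0%
    ```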

    Now I don't consider these numbers to be infallible. I am human, and as a single person with learning disabilities gathering all of this data without someone to check my work, it's entirely possible I may have messed up somewhere while adding all of my numbers together. Still, no data was intentionally manipulated, and these were the numbers I had in front of me while I was writing. Between all of this and the fact that there are some major differences between IRL and MTGO Vintage - things like infinite combos, the ease of changing decks compared to paper, and card availability that affect the respective metagames - I felt comfortable saying that Mentor and Shops were "basically" the same percentage. From now on I will definitely be more clear with my words in order to prevent misunderstandings like this, and I thank you for your feedback.

    On a personal level Steve, in the future I would really appreciate the chance to give a real explanation before my credibility is attacked. I'm sorry that I didn't go into this much detail in my Facebook response to you, but I was not in the physical or mental place to give this in-depth a response (I also couldn't actually provide my data, since I was on my phone at school and the files were saved to my computer at home). I know we haven't interacted all that much, but I personally respond better when I don't feel like I'm being confronted.

    I hope this answers any questions you may have. Thank you for taking the time to read this, and thank you for taking the time to comment on my article.

    Danny Batterman


  • TMD Supporter

    @DBatterskull said:

    With these numbers Mentor adds up to 16% and Shops adds up to 22%. For me personally, 6% is close enough that I feel comfortable using the word "basically", but I can understand if some people disagree.

    According to the data you just presented, with a total count of 544 decklists reported in the dailies you surveyed (the denominator), a 6% difference is equal to roughly 33 decklists.

    That means that there were roughly 33 more Shops decks in reported dailies than Mentor decks. No, "basically the same" is not an accurate description. Nor is any synonym.

    To put that in context, in historical Magic metagame reports, there were many occasions in which only the top 3 or so decks even constituted more than 6% of the overall metagame. In other words, in some quarters that percentage would cover every Dredge deck in the metagame. That's more than many archetypes combined. I don't think there is any reasonable definition of "basically the same" that can bridge that gap.
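
    For reference, converting a percentage-point gap into a deck count is straightforward (a sketch; 544 is the total from Danny's data set):

    ```python
    # How many decks a gap in metagame share corresponds to, for a given total.
    def gap_in_decks(gap_points: float, total_decks: int) -> float:
        return gap_points / 100 * total_decks

    print(gap_in_decks(6, 544))  # 32.64 -> roughly 33 decklists
    # Note: the raw counts (123 Shops vs. 85 Mentor) differ by 38 decks;
    # ~33 is what the rounded 16%-vs-22% gap implies.
    ```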

    My concern with your article is not simply the factual inaccuracy which I pointed out here, and which you now acknowledge, having presented your data, but the impression conveyed in what may reasonably be read as a critique of the DCI's decision.

    I don't always agree with the DCI, and I am on record saying I probably wouldn't have handled this restriction the same way, but I've been incredibly disappointed with some of the vitriol, venom and false characterizations of the DCI.

    I must also add that I suspect that there are further errors in your data that undercount the number of Shops decks in your sample. In both the Daily sample and Premier Event sample I compiled with Kevin, we found that Shops were consistently no less than 31% of all reported decks. So for you to find that Shops are only 22% is a pretty significant discrepancy that I am skeptical can be explained by simply going back to October. Since Shops are 31% of both the Premier Top 16s and the reported dailies in our sample (Jan 1 to March 20th, when we did our podcast), Shops would have to be about 12% of the reported decklists in October, November, and December to average out to 22%. That seems very unlikely to me.
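
    The back-of-the-envelope behind that 12% figure, assuming for illustration that the two windows contain roughly equal numbers of reported decks (an assumption, not a figure from either data set):

    ```python
    # If the Jan-Mar Shops share is 31% and the Oct-Apr average is 22%,
    # the Oct-Dec share x must satisfy (x + 31) / 2 = 22 for equal windows.
    jan_mar_share = 31.0
    overall_share = 22.0
    oct_dec_share = 2 * overall_share - jan_mar_share
    print(oct_dec_share)  # 13.0 -> "about 12%" once the windows are weighted
    ```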

    It's possible that people jumped off of Shops hastily following Chalice's restriction, and that could explain a brief dip in Shops numbers, but Kevin and I had previously found Shops to be 50% of reported daily decks prior to Chalice's restriction. In any case, I think my data - starting in January (or, in the case of the premier events, November) - makes more sense, as it is more proximate.

    I should also add, lest you feel picked on, that my concern here is not you or any particular individual, but the overly casual claims made by far too many Vintage players (including VSLers on Dark Petition) regarding prevalence, dominance, or representation.

    On a personal level Steve, in the future I would really appreciate the chance to give a real explanation before my credibility is attacked.

    My first post in this thread merely says that a statement in your article was false/incorrect. Not every dispute of an empirical statement should be interpreted as an "attack on someone's credibility."

    I was watching a little bit of the replay of Rich Shay's Twitch stream this evening, and I was actually astonished that some people felt that, by posting data here and asserting that a quote in your article was a false statement, I was "attacking your credibility."

    Moreover, my credibility was actually repeatedly attacked because I "wrote a book about Gush." There was also a good deal of venom and vitriol. Sad.

    More importantly, your article, and others like it, could be read as an attack on the DCI's credibility. I consider that far more serious than any individual's pride. The legitimacy of the format depends on it, not to mention the risk of an overreaction.

    Second and more importantly, I did give you a "chance to give a real explanation."

    I replied to your post linking this article with the comment:

    ""They occupied basically the same percentage of the metagame as all of the Mishra’s Workshop decks combined" This statement is not true. https://docs.google.com/.../1cj99OKyaTn7zLvyh3OND.../edit... https://docs.google.com/.../1cj99OKyaTn7zLvyh3OND.../edit... FYI."

    To which you replied:

    "This is all I'm going to say in this regard as I candidly don't want to get into this debate: I used a different data set than this, and mine took different things into consideration."

    You had a chance to explain, you responded, and I found your response troubling not only on account of its vagueness and unwillingness to provide specifics, but more importantly because of your stated unwillingness to discuss it at all. Although I found your reply a bit suspicious, I never felt that you were being deceitful - just a bit fast and loose. That impression appears well founded, by your own admission.

    I often enjoy your work, so I appreciate your willingness to try to strive for greater clarity. I think we can all aspire to that.



    @Smmenen could any of your reaction to this article also be due to your bias for Gush, and possibly your fear that it might be on the chopping block next B&R announcement? Just curious. No offense intended.



  • @Smmenen said:

    Shops are 30% of Q1, 2016 Daily 3-1 or 4-0 decks, and 31% of Top 16, 30% of Top 8, 42% of Top 4, and 50% of Top 2 MTGO P9 challenge tournaments in Q1.

    Gush is 21% of Q1 Daily 3-1 or 4-0 decks, 21% of Top 16, 22% of Top 8, 33% of Top 4, and 17% of Top 2 MTGO P9 challenge tournaments in Q1.

    The statement made in this article - that Gush decks are, in aggregate, performing about the same as Shops in aggregate - is simply not true. Gush decks are significantly behind Shops by almost every metric in the MTGO results.

    If you don't want to "give weight" to MTGO results, that's fine. But it seems pretty obvious that Wizards does. I frankly think they should give weight to both.

    That said, I also believe that the single best data point every month is the P9 Challenge, as it is much larger than most paper events, more competitive with stronger players, and global. So to ignore MTGO seems foolish.

    As flawed as MTGO is, paper has many flaws as well, such as proxies, budget decks, etc., that don't exist to nearly the same extent on MTGO.

    I would very much like to know where the author of this article got his data.

    @Smmenen But what is the actual quality of the data? The data is presented as a 'random sample', but is it? I'm not accusing you of bias, but I believe the data you present is. You make a compelling argument based on it, but it's not really the 'whole story'. The Q1 data says that Shops was 30%, but the March data shows a drop to ~22% (if my cursory review of the data is correct). That implies that Shops was as much as ~34% of the pre-March meta. It seems disingenuous to use Q1 when Jan-Feb is so different from March. You criticize the author for treating a 6% difference as approximately equivalent, but a 12-point month-to-month swing is represented as non-significant...

    The second bias is not looking at or accounting for non-randomness. Dailies (and to an extent P9 Challenges) fire on the same days at the same times - you can 'count' on them. It's like looking at data from a single shop and expecting new people to come in every day - it just doesn't happen. Montolio has 11 finishes on Shops in Q1, BlackLotusT1 has 11 finishes on Shops in Q1, and The Atog Lord has 7 finishes on Shops in Q1 - that's almost 40% of the Q1 Shops decks in the data! Are good players the problem, or is the deck really overpowered?

    I would say, in defense of the article in question, that presenting a 6% difference in decks as approximately equal is probably more reasonable than you argue, given the quality of the underlying statistics (the error bars are likely enormous). If the MTGO data were a true random sample (it's not), or were pared down to represent a more random sample (a bad idea, because the data set would get even smaller and the error bars larger), or included all decks played (the best idea, but is that possible at this point?), Gush is pretty close to on par with Shops over Q1. To say otherwise makes the data seem more robust than it really is.
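
    For what it's worth, here is one way to make "the error bars are likely enormous" concrete. This is only a sketch: it uses a normal approximation and treats each reported deck as an independent draw from a stable metagame, which, as argued above, the daily results are not. The counts are the ones from Danny's data set (85 Mentor and 123 Shops out of 544):

    ```python
    import math

    def share_with_interval(successes: int, total: int, z: float = 1.96):
        """Rough 95% interval for a share, via the normal approximation."""
        p = successes / total
        se = math.sqrt(p * (1 - p) / total)
        return p - z * se, p, p + z * se

    for name, n in [("Mentor", 85), ("Shops", 123)]:
        lo, p, hi = share_with_interval(n, 544)
        print(f"{name}: {100 * p:.1f}% (roughly {100 * lo:.1f}% to {100 * hi:.1f}%)")

    # Mentor: 15.6% (roughly 12.6% to 18.7%)
    # Shops: 22.6% (roughly 19.1% to 26.1%)
    ```

    Even under those generous independence assumptions the two intervals barely touch; whether that counts as "basically the same" is exactly the judgment call at issue.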



  • @Ten-Ten said:

    @Smmenen could any of your reaction to this article also be due to your bias for Gush, and possibly your fear that it might be on the chopping block next B&R announcement? Just curious. No offense intended.

    I'd prefer we not make this about something it isn't. It's pretty clear that both sides of this most recent restriction are emotionally charged - to say the least.

    From a neutral standpoint, it feels as though Steve has been relatively transparent as to his stance both pre- and post-restriction. For what it's worth, I'm not seeing this transparency from the majority of naysayers.

    On a more aggressive note: data can be interpreted many ways - something I've learned from working in the medical field. That being said, do this blog post or the many outrageous, vitriol-induced posts that have been slung over the past week back up any of their claims with direct citation? Nope.

    I'm no fan of Steven, but this seems pretty clear-cut to me.


  • TMD Supporter

    @Ten-Ten said:

    @Smmenen could any of your reaction to this article also be due to your bias for Gush, and possibly your fear that it might be on the chopping block next B&R announcement? Just curious. No offense intended.

    It is possible that my sensitivity to Gush could make me more inclined to respond to false or unfounded statements regarding it. I probably would not have responded to a comment made about Dredge in the same way.

    But it's more the fact that Vintage commentators and pundits on websites and social media have a really bad habit of making completely unsupported claims about data far too often, and it's long been a pet peeve of mine. That's why I started collecting data more than ten years ago.

    The idea, suggested in the twitch stream, that my "data" was biased because I "wrote a book about Gush" makes no sense unless you think I was selectively omitting data or lying about the calculations.

    @Fred_Bear said:

    @Smmenen But what is the actual quality of the data?

    That's a very odd question. Do you think Wizards has lied about the daily results?

    The data is the daily reported MTGO decklists and the MTGO P9 decklists. That is, every day after a daily fires, Wizards of the Coast posts the decklists on their website here. Kevin and I went through every single daily in Q1 2016 up to our recording date, and then compiled them. So did the author here.

    The raw data is here.

    The cleaned data is here.

    And the aggregate data is here.

    While it's possible that Wizards is lying about the decks that actually performed as they claimed, that seems very unlikely. Asking about the quality of the data is odd because it's the information that Wizards publishes. We simply collected it from their website.

    The data is presented as a 'random sample', but is it?

    Huh? No it's not presented as a random sample. It's almost the entire population. You don't sound like you understand what is presented here.

    Sampling is a statistical technique to draw inferences about a population when the population is too large to count. In this case, we have almost the entire population, so sampling is unnecessary. The only results missing are from days on which two dailies fired on the same day. According to Wizards, that is less than 20% of dailies.

    I'm not accusing you of bias, but I believe the data you present is.

    I think you misunderstand the data. You believe I was presenting a sample rather than the whole population of data.

    You make a compelling argument based on it, but it's not really the 'whole story'. The Q1 data says that Shops was 30%, but the March data shows a drop to ~22% (if my cursory review of the data is correct). That implies that Shops was as much as ~34% of the pre-March meta. It seems disingenuous to use Q1 when Jan-Feb is so different from March. You criticize the author for treating a 6% difference as approximately equivalent, but a 12-point month-to-month swing is represented as non-significant...

    First of all, that's the "opposite" of the whole story. Removing data makes it less than "the whole story."

    In any case, there is tremendous month-to-month variance in Vintage and always has been, going back to the earliest data sets we've ever collected. Look at the % of Gush decks in the Premier events. There was only one in the January Top 16, 2 in the February Top 16, and 7 in the March Top 16. That doesn't make the data biased. It just means that there is tremendous variance.

    That could easily have explained why the author's data differed from the data I collected, except that when he shared his data set, it became clear that that's not the case here.

    The selection of quarterly data is not "disingenuous" even remotely. First of all, it's consistent with historical approaches:
    http://themanadrain.com/topic/138/vintage-metagame-data-archive

    Secondly, we know the DCI makes its decisions a month in advance, so they largely didn't have the benefit of March data when making their decision; if anything, Jan-Feb is the most relevant period.

    In any case, my criticism of the author here has nothing to do with variance or the date range for selected input - it has to do with the fact that, according to his own data, Mentor and Shops decks are not even close to "basically the same" amount of the metagame. 16% v. 22% is a pretty enormous metagame representation difference, equal to roughly 33 decks in his data set and a larger percentage than most archetypes in the metagame.

    The second bias is not looking at or accounting for non-randomness. Dailies (and to an extent P9 Challenges) fire on the same days at the same times - you can 'count' on them. It's like looking at data from a single shop and expecting new people to come in every day - it just doesn't happen. Montolio has 11 finishes on Shops in Q1, BlackLotusT1 has 11 finishes on Shops in Q1, and The Atog Lord has 7 finishes on Shops in Q1 - that's almost 40% of the Q1 Shops decks in the data! Are good players the problem, or is the deck really overpowered?

    The same thing happens in paper data. If Brian Kelly plays in 5 tournaments out of a 20-tournament data set in a single quarter and makes Top 8 each time, he shows up 5 times in the paper data set. To my knowledge, no one has ever objected that we should be concerned about this problem in Vintage or Magic paper data sets.

    I would say, in defense of the article in question, that presenting a 6% difference in decks as approximately equal is probably more reasonable than you argue, given the quality of the underlying statistics (the error bars are likely enormous). If the MTGO data were a true random sample (it's not)

    That's right. It's not a random sample. It's almost the entire population. Sampling is a statistical technique to draw inferences about a population when the population is too large to count. In this case, we have almost the entire population, so sampling is unnecessary.

    If I were sampling, the idea that the reported data was more "biased" would have merit. But these aren't samples.

    Gush is pretty close to on par with Shops over Q1. To say otherwise makes the data seem more robust than it really is.

    Um, no, actually, it's not. In paper, yes, that's true.

    But on Magic Online, it's consistently clear that Shops are a substantially larger part of the reported results - about a third of them.

    Just to be clear, here is the breakdown of Q1 Dailies by archetype:

    1. Shops - 30% of all reported daily decks (72 of the 241 reported decks)

    2. Dredge - 15%

    3. Oath - 10%

    4. Mentor - 10%

    5. Delver - 7%

    And then everything else is under 5%.

    But if you add up all of the Gush decks in our population (all Mentor, Delver, Doomsday, etc.), as we did in the tab, you get to 20.74% (21%).

    Which happens to be the same percentage as all of the Gush decks in the Top 16 of the premier events.

    So, no, Gush decks are not "pretty close to par" with Shops in Q1. Shops are 30% and Gush is 21%. That's not even close to "pretty close." They're not even in the same galaxy, let alone the same vicinity.

    It has nothing to do with "robustness." A 6% difference is a huge difference when you consider that it is almost as large as all of the Delver decks, for example, in the population. There is no world in which Mentor is "basically" the same amount as Shops.
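
    To translate those percentages back into deck counts (a sketch: only the Shops count of 72 and the total of 241 are stated explicitly, so the other counts are implied by rounding rather than taken from the raw spreadsheet):

    ```python
    TOTAL = 241  # reported Q1 daily decks (3-1 or better)

    shares = {
        "Shops": 72 / TOTAL,   # 29.9% - the one count stated explicitly
        "Dredge": 0.15,
        "Oath": 0.10,
        "Mentor": 0.10,
        "Delver": 0.07,
        "All Gush decks": 0.2074,
    }

    for name, share in shares.items():
        print(f"{name}: {share:.1%} ~ {share * TOTAL:.0f} decks")

    # Shops: 29.9% ~ 72 decks; All Gush decks: 20.7% ~ 50 decks;
    # Delver: 7.0% ~ 17 decks, and so on - implied counts only.
    ```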


  • TMD Supporter

    If I may indulge in a slightly off-topic aside, what I love about this Vintage community, and TMD in particular, is the level of rigor of the debate. I am grateful for the analysis that Steve, Kevin, Danny, and several others contributed to this particular debate, and many others.

    The Vintage metagame is an incredibly complex system, and reasonable human minds, which are inevitably biased to some extent, can disagree as to how best to analyze and distill meaning from incongruous sets of data, particularly with respect to such ill-defined concepts as the dominance of a particular archetype or the appropriateness of restricting a card.

    In sum, this thread, stripped of the personal attacks and ruffled feathers, is representative of what I think makes this such a great forum. Keep up the good work, gentlemen, and keep it clean.



  • @Smmenen said:

    Do you think Wizards has lied about the daily results?

    The data is the daily reported MTGO decklists and the MTGO P9 decklists. That is, every day after a daily fires, Wizards of the Coast posts the decklists on their website here. Kevin and I went through every single daily in Q1 2016 up to our recording date, and then compiled them. So did the author here.

    Quick question, then... The data you present represents EVERY decklist or every decklist which went 3-1 and 4-0 in the dailies? I am under the impression that your data was ONLY 3-1 and 4-0 decks from the dailies - the data shared with us on mtgo.com and mtggoldfish.com.


  • TMD Supporter

    @Fred_Bear said:

    @Smmenen said:

    Do you think Wizards has lied about the daily results?

    The data is the daily reported MTGO decklists and the MTGO P9 decklists. That is, every day after a daily fires, Wizards of the Coast posts the decklists on their website here. Kevin and I went through every single daily in Q1 2016 up to our recording date, and then compiled them. So did the author here.

    Quick question, then... The data you present represents EVERY decklist or every decklist which went 3-1 and 4-0 in the dailies? I am under the impression that your data was ONLY 3-1 and 4-0 decks from the dailies - the data shared with us on mtgo.com and mtggoldfish.com.

    Your question doesn't make sense. What's the difference between "every decklist that went 3-1 or better" and decklists that "only went 3-1 or better"? That's the same thing.

    The world "only" and "every" perform the same work in each part of your question by excluding decks that performed worse than 3-1.

    In any case, if you clicked the links I provided for you, you would have seen the answer. It is the complete population of decklists that performed at 3-1 or better. It's not a sample of 3-1 or better decks.

    Again:

    The raw data is here.

    The cleaned data is here.

    And the aggregate data is here.

    Wizards of the Coast asked MTGGoldfish to cease and desist collecting data, so our data was taken directly from the Wizards website. Had you actually looked at the tabs or read my previous post more carefully, I think that would have been clear.


  • Administrators

    I think maybe he's talking about the fact that we don't have the metagame breakdowns (which isn't your fault, we just don't have that data.)

    If 75% of people are playing Shops decks and making up only 30% of wins, that tells a different story than if 1% are playing Shops decks and make up 30% of the wins. Of course, both of those situations would be a problem.

    I'd love to see if/how the data on Shops trended down over time, as I suspect a lot of early Shops wins came from players assuming the deck was dead and seriously underpreparing for it. Many people were super excited to play Storm Combo or Doomsday after the Chalice restriction, and most of those people lost. Of course, WotC made the decision before they could have identified any trend, which is a different problem, but one that every format has to deal with.

    I don't think we have consensus as a format on what a problem metagame even looks like. Personally I would love the top deck to be around 25-30% (even if, in this case, I don't really enjoy playing the top deck), so I looked at the same numbers and said "obviously not a problem!" If you think an optimal metagame has a top deck at 15-20% of wins, those very same numbers say "obviously a problem!" Without any pre-discussed target for what a healthy metagame looks like, it's too easy to post-rationalize what the data means, even if the data is completely accurate, which in this case I have to believe. (Note that I'm not saying anyone in this thread is doing that; it's just a peril of the sort of discussion we've been having.)


  • TMD Supporter

    @Brass-Man said:

    I think maybe he's talking about the fact that we don't have the metagame breakdowns (which isn't your fault, we just don't have that data.)

    Ah.

    If that's the point he's making, then that would render Danny's statement at issue fundamentally unsupportable - since there is no way that anyone could know that "Mentor decks are basically the same percentage of the metagame as all Mishra's Workshop decks combined."

    The assumption in this discussion is that by "metagame," we are referring to Top X metagame (either 3-1/4-0 decks or Top 16/8).

    In fact, if you go back and look at every single metagame report ever, that's what we are talking about - the Top performing decks.

    That said, although we don't have the entire metagame breakdown for most events, there are many for which we do. For example, the NYSEs, Waterburys, many of the Vintage Championships, and at least three of the MTGO P9 Challenge events are data points for which someone (in some cases me, Jaco, or Matt & Ryan) has gone in and counted every single deck in the metagame.

    For example: http://themanadrain.com/topic/146/january-and-february-mtgo-p9-challenge-data

    And:

    http://www.eternalcentral.com/so-many-insane-plays-magic-online-p9-challenge-metagame-analysis/

    From those data points, we've been able to see what % of the metagame these decks tend to be. In my experience from having closely observed this data over time, Workshops are often around 20-25% of the metagame. I think it was about 22.5% at NYSE 3 last year. It was 22% of the Feb MTGO P9 but about 20% of the January MTGO P9 event.



  • @Smmenen said:

    If you clicked the links I provided for you, you would have seen the answer. It is the complete population of decklists that performed at 3-1 or better. It's not a sample of 3-1 or better decks.

    Steven, I appreciate the condescension, but, by definition, your data is a sample of the full population. More decks than what went 3-1 or 4-0 were played at each event. I didn't misunderstand anything and I do believe it is disingenuous to present only those decks as the full population. A daily requires a minimum of 12 participants - 48 events (in your sheets - I did look) x 12 decks = 576+ decks in the full population. Your data includes 241 "reported decks". The analysis done by @diophan and @ChubbyRain was a full population.

    My issue with the data still exists.
    #1 - The data does not represent a random sample and results in data with huge variation. What I mean by this is that you are looking at many snapshots rather than a continuous stream of data. This causes high variance on its own. Add to that the high variance (which you acknowledge) in a deck's play month-to-month, and add to that the high variance of a small data set, and you have data that is probably +/- 1 deck (on the conservative side) in every event. What does that mean? You show 72 Shops decks over 48 events. The high variance means that over the next 48 events, we should see 72 +/- 48 decks in your data - just based on the variance in the data. That seems about right, too. The data was weighted higher in Jan/Feb, and a drop-off was seen in the March data (and at the March P9): 57 decks in 35 Jan/Feb events and 15 decks in 13 Mar events. High variance, but within expectations. [Note: this is why I have no problem with the author saying 16% = 22% in his article. Ultimately, Shops' level of play over 48 daily events could be anywhere from 24 to 120 decks. That's what the 'variance' means in real numbers.]
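
    To be fair, a naive binomial model - each of the 241 reported decks independently Shops with probability 72/241 - gives a much narrower spread; the difference between that floor and the +/- 48 above is exactly the non-independence being described (a sketch, not a verdict):

    ```python
    import math

    n, shops = 241, 72
    p = shops / n
    sd = math.sqrt(n * p * (1 - p))  # binomial standard deviation of the count
    print(f"sd = {sd:.1f} decks; 95% range ~ {shops - 2 * sd:.0f} to {shops + 2 * sd:.0f}")
    # sd = 7.1 decks; 95% range ~ 58 to 86 - far narrower than 24 to 120
    ```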

    #2 - Deck and Win % are not the only correlated variables. 'Huh?' Pilots matter over this data set. The premise of your data is that the deck is the independent variable that drives win percentage. It would not hold up to a stronger analysis of correlation. From your data set, we can point to 3 Shops pilots - Montolio, The Atog Lord, & BlackLotusT1 - who account for nearly 40% of the decks in the Shops population. If 3 players can distort the data to the point of getting a card restricted, we should be able to agree that the data set is too small to base decisions on.

    @Brass-Man is right. The data you put together is missing a key component and maybe someone at Wizards or the DCI has that information and has done a more in-depth analysis. Based on their explanation that came with the restriction, I'm doubtful.

    I also agree that a top deck being in the range of ~30% should be fine and with the small samples that we are subject to, 30% probably means 20-40%. If it starts to creep from there, we have issues.

    Looking at your most recent post, if Shops is historically 20-25% of large events, to use your terms, the DCI's explanation that Lodestone Golem was 'over-represented' is "fundamentally unsupportable" unless they are willing to outline what the ideal metagame looks like...


  • TMD Supporter

    @Fred_Bear said:

    @Smmenen said:

    If you clicked the links I provided for you, you would have seen the answer. It is the complete population of decklists that performed at 3-1 or better. It's not a sample of 3-1 or better decks.

    Steven, I appreciate the condescension, but, by definition, your data is a sample of the full population. More decks than what went 3-1 or 4-0 were played at each event. I didn't misunderstand anything and I do believe it is disingenuous to present only those decks as the full population.

    First and foremost, if you are defining the full metagame as every deck played, then that renders Danny's claim not only unsubstantiated and unsupported, but fundamentally unsupportable as unknowable.

    The issue being debated is my disputing the claim that Mentor decks are "basically" the same portion of the metagame as Shops. I find that provably false, as an empirical matter.

    So, if you wish to redefine the "metagame" as every deck played, then that only strengthens my critique.

    But, based upon Danny's data set and my response, neither Danny nor I was defining "the metagame" as every deck played. Rather, the "population" was the top performing decks. In MTGO dailies, this was defined as 3-1 or 4-0 decks. In my larger data set, this included Top 8 paper tournament results and Top 16 MTGO premier event results.

    But, to reiterate, the "population" in our data was not every deck, but only the top performing decks. That's how we were both defining the population - as the top performing decks.

    This is not a novel concept.

    Take a look at the metagame report archive: http://themanadrain.com/topic/138/vintage-metagame-data-archive

    When, in 2004, Phil Stanton posted an article titled "April 2004 Type One Metagame Breakdown," he looked only at Top 8 data.

    Or when, in 2011, Matt Elias posted an article titled "The Q1 Vintage Metagame Report," he looked only at Top 8 data.

    In both cases, the titles of the articles and the discussions used the term "Vintage metagame." Not "Vintage Top 8 Metagame."

    In the context of these discussions, it's well understood that we are discussing top performing decks, not the entire set of decks played. Danny's own data set makes that clear.

    Now, if I had advanced a claim that you were now disputing using that logic, then maybe you would have a leg to stand on. But my only reason for participating in this thread is to dispute a very specific claim presented in the article that this thread is about.

    #1 - The data does not represent a random sample and results in data with huge variation.

    That's not what random sampling is. Random sampling is a statistical method that is used to try to understand a population that is too large to count feasibly. So, instead of polling every possible voter, campaign pollsters use samples.

    Not only is it not a sample, there is nothing "random" about this data. It's a complete population of top performing decks.

    What I mean by this is that you are looking at many snapshots rather than a continuous stream of data.

    Uh, no. I'm limiting most of my data to Q1, but within that set, it's a fairly continuous stream of data. Yes, I imposed some parameters on it (Q1 of 2016), but you have to do that with any data. The notion that it's "merely a snapshot" suggests that it's some sort of inherently biased sample, when it's the exact same methodology that Vintage metagame reporters have used since 2003.

    #2 - Deck and Win % are not the only correlated variables. 'Huh?' Pilots matter over this data set. The premise of your data is that the deck is the independent variable that drives win percentage. It would not hold up to a stronger analysis of correlation. From your data set, we can point to 3 Shops pilots - Montolio, The Atog Lord, & BlackLotusT1 - who account for nearly 40% of the decks in the Shops population. If 3 players can distort the data to the point of getting a card restricted, we should be able to agree that the data set is too small to base decisions on.

    This is off-topic to the issue I was debating here, but you are calling for a standard for restriction that Wizards is in no way obligated to follow.

    By saying that "a data set is too small to base a decision" you are explicitly saying that Wizards either should not restrict or is unjustified in restricting unless they have a certain quality of data.

    That's just false. That's not to say that Wizards shouldn't use data in making decisions - I've been arguing that they should for years. In fact, I argued as much in 2003 in one of my earliest SCG articles.

    But Wizards, like many real-world policymakers, has imperfect data sets when making policy decisions.

    Do you think the Federal Reserve has every data set it would like in setting the federal funds rate? Do you think the President has every data set he wants in making military policy? (See this month's issue of the Atlantic and the tremendous uncertainties in his Syria policy.)

    As I already pointed out, the problem of "individuals" skewing data sets already exists in paper Magic, and it's just as true on MTGO. But that doesn't mean that the data can't be used to make banned and restricted list decisions or that doing so is somehow less valid. Wizards is perfectly justified in using imperfect data to make policy decisions, just as much as any other real-world policymaker.

    I also agree that a top deck being in the range of ~30% should be fine and with the small samples that we are subject to, 30% probably means 20-40%. If it starts to creep from there, we have issues.

    Looking at your most recent post, if Shops is historically 20-25% of large events, to use your terms, the DCI's explanation that Lodestone Golem was 'over-represented' is "fundamentally unsupportable" unless they are willing to outline what the ideal metagame looks like...

    Complete nonsense. Wizards has no duty to outline what an "ideal" metagame looks like. Moreover, Wizards has access to all of the MTGO data. It certainly is the case that they can look at the overall composition of Workshops in these metagames, and then see how they are performing relative to their metagame presence. There is not a shred of tangible evidence to doubt that's exactly what they did here.

    In any case, that part of the discussion is a non sequitur. I'm not debating the validity of Wizards' decisions. I'm debating the validity of Danny's claim regarding Mentor and Shops.


  • TMD Supporter

    On a different but related topic, I am curious, Danny, about your different suggested sideboard choices between Storm and Doomsday, and especially the different anti-Dredge, anti-Workshop, and "insulation" packages (Defense Grid/City of Solitude/Xantid Swarm).

    It isn't obvious to me why you wouldn't play more similar sideboards, particularly when the effects are similar. What were your thoughts?

    P.S. I'm annoyed at you for mentioning City of Solitude. I've been thinking about that technology for a while, and was looking forward to catching folks off guard.


  • TMD Supporter

    Great datasets and breakdowns. I'm sure we all appreciate the efforts people put into it.

    I do wonder about the value of tracking "Top 4" and "Top 2." In such small datasets, is this really relevant? I think it leads to a warped perception. Top 16 or Top 8, yes. But Top 2/4 seems much less relevant and invites reading too much into the data.

    I also question the dismissal of @Fred_Bear's point about how three players made up 40% of the success of one deck. You can't say that 6% is incredibly relevant in one area and then dismiss the enormous impact that these three players have had on the MTGO 3-1/4-0 population numbers. I think this level of repeat/consistent success is a rarer phenomenon in paper.

    I just don't see paper results and MTGO results being directly comparable (drops, lack of proxies on MTGO, tournament times, tournament prizes, etc.). That said, with what is available, I think everyone is doing admirable work.



  • Look, you're obviously a smart guy and I'm not trying to dispute that, but just as you accuse others of hyperbole, you seem unwilling to relent on your own use of it...

    @Smmenen said:

    First and foremost, if you are defining the full metagame as every deck played, then that renders Danny's claim not only unsubstantiated and unsupported, but fundamentally unsupportable as unknowable.

    No it doesn't. Statistics are used to draw comparisons and conclusions. Danny is looking at the available data and drawing a conclusion based on the expected variance in the data. Can it be known 100%? No. Can it be known for the sake of an editorial? Absolutely.

    By your argument, the statistics are definitions, and that's unreasonable in terms of the original article's intent (at least by my understanding). It should be obvious to a reader that 16 does not equal 22, but the difference between 16 and 22 is not quite as vast as you want us to believe. [32 decks over 48 events is well within the variance between Mentor and Shops]

    The issue being debated is my disputing the claim that "mentor decks are basically" the same portion of the metagame as Shops. I find that provably false, as an empirical matter.

    By this argument, I could look at the data for 3/13 and claim Shops is 0% of the meta as an empirical fact. It's a crap argument. Statistics work when combined and interpreted with some common sense, and you know that. What happened on 3/13 must be viewed within a larger discussion, just as March or February or January or Q1 must be.

    There is nothing "random" about this data. It's a complete population of top performing decks.

    For a given time, you mean. This is the problem with statistics: population vs. sample. You've said before that it's the complete population, except when two dailies fire on the same day - so it's really a pretty comprehensive sample.

    What I mean by this is that you are looking at many snapshots rather than a continuous stream of data.

    Uh, no. I'm limiting most of my data to Q1, but within that set, it's a fairly continuous stream of data. Yes, I imposed some parameters on it (Q1 of 2016), but you have to do that with any data. The notion that it's "merely a snapshot" suggests that it's some sort of inherently biased sample, when it's the exact same methodology that Vintage metagame reporters have used since 2003.

    It may be the same methodology, but is that warranted? Look at February 2016 in your spreadsheet as an example. You have paper data - 8 events ranging from 8 to 43 participants. The online data is 15 events with a minimum of 12 participants, all with fewer than 32 (which would result in 2 4-0 decks). The 8 paper events have maybe 2-3 of the same players appearing in the top decks (60), while the 15 online events have many of the same players appearing multiple times - for example, BlackLotusT1 had 5 top finishes out of 72 total decks. The data is fundamentally different, and those differences should be accommodated. To simply try to view it through the same lens as paper Magic has been viewed for the last decade doesn't seem right.

    I mean, you can certainly do it, but what are the implications?

    By saying that "a data set is too small to base a decision" you are explicitly saying that Wizards either should not restrict or is unjustified in restricting unless they have a certain quality of data.

    That's just false. That's not to say that Wizards shouldn't use data in making decisions, but Wizards, like many real-world policymakers, has imperfect data sets.

    Seriously? Again, I think you know what I mean. The MTGO data can be manipulated to present any number of "empirically correct" arguments (e.g. anything involving FoW, Mental Misstep, and Ingot Chewer). That doesn't make for useful decision making. The Federal Reserve may not have every data set it would like in setting rates, but they don't look at hockey shooting percentages or baseball batting averages. They try to make sense of the data set that they have and work to make it as strong as they can. If a data point doesn't fit, I guarantee that the Federal Reserve doesn't key on that data point to drive decisions.

    This is the issue that many people seem to have. 12/60 (20%) Shops decks in paper (February) compared to 27/72 (37.5%) decks on MTGO (February dailies) could lead to much different decision making. The question becomes: which is more accurate? It's up to them, and that's fine, but then either they can explain the reasoning or they can live with us questioning the methodology. I doubt they lose any sleep over whether or not I question something, so they will continue to do what they want...
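
    A standard two-proportion z-test puts a number on that paper-vs-MTGO gap (a sketch under the usual independence assumptions, which, as this whole thread shows, both data sources strain):

    ```python
    import math

    def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
        """z-statistic for comparing two independent proportions."""
        p1, p2 = x1 / n1, x2 / n2
        pooled = (x1 + x2) / (n1 + n2)
        se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
        return (p1 - p2) / se

    # February 2016: 12 of 60 paper top decks vs. 27 of 72 MTGO daily decks.
    print(f"z = {two_proportion_z(12, 60, 27, 72):.2f}")  # z = -2.19
    # A gap that size is larger than routine sampling noise would explain,
    # consistent with the two sources measuring different things.
    ```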

    As I already pointed out, the problem of "individuals" skewing data sets already exists in paper Magic, and it's just as true on MTGO. But that doesn't mean that the data can't be used to make banned and restricted list decisions or that doing so is somehow less valid. Wizards is perfectly justified in using imperfect data to make policy decisions, just as much as any other real-world policymaker.

    To use your terminology, this is empirically false. Using February 2016 as an example, the paper Magic data is not nearly as skewed by individuals as the MTGO data over the same time period.

    And you are right, Wizards/DCI can use whatever data they like, but we are also free to question their interpretations and decisions when they conflict with alternate data analysis.

    I appreciate the analysis that you put together. I just don't think it's the whole story because you can easily be led to alternative conclusions. I understand that you are looking at it with the same historical perspective as has always been done, but I think that's skewed by the type of data MTGO generates.

    As for the original argument, using your definitions, Danny probably overstated his case. If we read it and assume Danny is extrapolating the data to describe the metagame in broader terms (as Wizards does in their B/R announcement), I think he's justified. Again, based on the expected variance in the daily events data and the level of play Workshops see in larger events (paper, P9 challenges), it is reasonable to believe that Shops is played about the same amount as Mentor. I guess I should clarify that this can never be 'known' 100%, but based on the data available, I would've bet that for the next P9 Challenge Shops and Mentor would be within 5 decks of one another (post-Golem restriction, I expect Mentor to go up in comparison to Shops).


  • TMD Supporter

    @Fred_Bear & @Smmenen

    I'm not sure what you are even debating anymore. The data speaks for itself. We can dissect it and criticize analytical methodologies ad nauseam, but at some point the head of the pin becomes overcrowded with angels.

