[Free Article] What Now



  • I think this disagreement is caused by a difference in intent on what is being reported. My understanding is that Steve is measuring Top X finishes in reported tournaments, with an eye towards the following: if a certain archetype is more than some percentage of these reported finishes, then it may warrant a restriction. There is no predictive element in what he is doing; he is saying that Shops was this percentage, Mentor was this percentage, and so forth.

    Fred_Bear, from my understanding, is interested in at least one of the two following questions:
    (1) If an event were to happen tomorrow (ignoring the B&R announcement): what can we predict the Top X metagame will look like, and with what error?
    (2) Using the Top X data that we have available, can we extrapolate what the entire meta looked like?

    All of these are fine things to care about. Personally, I find the extrapolation in question (2) problematic since, as Fred_Bear has mentioned, assuming the Top X is a random sample is flawed. The data that Matt and I have published makes that clear.

    Anyway, as far as I can tell, you are both talking past each other because you are after different aims. What exactly Danny meant when he said "they occupied basically the same percentage of the metagame as all of the Mishra’s Workshop decks combined" is another question. Personally, I feel it was an imprecise statement that has been given more attention than it warrants in this discussion.

    Anyway, probably not going to reply again since I find this back and forth a bit silly.



  • Agree with Ryan @diophan. There is a difference between "percentage of the metagame" and "percentage of winning decks" (whatever your criteria for winning might be, i.e. top X or X-1).

    If you look at the "percentage of the metagame" for the P9 events, there were a total of 58 Gush decks and 55 Shops decks in the January, February, and March events. There were 261 decks overall, so you are comparing 22% to 21%, which is "basically" the same. If you look at "percentage of winning decks", it's a different story...Shops decks have largely been more successful than Gush decks, at least on MTGO, as Steve's data shows.

    What this means is up to interpretation...but this discussion does seem to be much ado about nothing.
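    As a side note, the field-share arithmetic above is just deck counts divided by total decks; a minimal sketch in Python (using the 58 Gush / 55 Shops / 261 total figures quoted above):

    ```python
    # Field-share arithmetic from the January-March P9 event counts cited above
    gush_decks = 58
    shops_decks = 55
    total_decks = 261

    gush_share = gush_decks / total_decks    # about 0.222
    shops_share = shops_decks / total_decks  # about 0.211

    print(f"Gush:  {gush_share:.0%}")   # prints "Gush:  22%"
    print(f"Shops: {shops_share:.0%}")  # prints "Shops: 21%"
    ```

    Either way, the point stands: nearly identical field shares and quite different winning-deck shares are both consistent with the same underlying data.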



  • I feel that enough has been said on the subject such that readers are free to make their own interpretation. I don't think anything more has to be discussed about the disagreement.


  • @diophan said:

    I think this disagreement is caused by a difference in intent on what is being reported. My understanding is that Steve is measuring Top X finishes in reported tournaments, with an eye towards the following: if a certain archetype is more than some percentage of these reported finishes, then it may warrant a restriction.

    If this were a discussion in the thread about the restriction, you would be correct. It is also true that that is why I measured the Top X finishes.

    But my intent for posting in this thread has nothing to do with whether the DCI was justified or warranted in restricting Golem, whether Shops or Gush or Mentor or anything else would have performed at the same levels or different levels in the future, or anything like that at all.

    My primary concern here has nothing to do with prediction, the evolving metagame, or even the DCI.

    Rather, all I was trying to point out was something very simple and much more granular: that Danny's particular statement regarding the overall representation of Mentor v. Shops in the metagame was factually untrue.

    Yes, I believe you are correct that FredBear is interested in something different. And, on that point, I don't necessarily disagree. But I'm trying to avoid getting off-topic. There are already threads for those kinds of discussions.

    My concern is with respect to how Vintage pundits, commentators, and analysts here, on social media, on the VSL, or, frankly, anywhere, are too quick to make empirical claims that are either untrue or unsupportable. (In this particular case, I felt (and still feel) that Danny was playing a bit fast and loose with the facts. By his own admission regarding the data he used, this critique appears to be well founded.)

    It was that very concern that led me to monitor and publish Vintage metagame results in my quarterly Metagame reports from 2007 to 2011. Since the DCI largely left the format alone from 2009 to 2014 (save for an annual unrestriction), the kinds of conversations in which such claims were likely to arise occurred less frequently, and therefore the impetus for that kind of data collection diminished.

    @ChubbyRain said:

    Agree with Ryan @diophan. There is a difference between "percentage of the metagame" and "percentage of winning decks" (whatever your criteria for winning might be, i.e. top X or X-1).

    If you look at the "percentage of the metagame" for the P9 events, there were a total of 58 Gush decks and 55 Shops decks in the January, February, and March events. There were 261 decks overall, so you are comparing 22% to 21%, which is "basically" the same. If you look at "percentage of winning decks", it's a different story...Shops decks have largely been more successful than Gush decks, at least on MTGO, as Steve's data shows.

    I agree with your statements here. And if Danny had been using this data when making his claim, it would have had much more validity.

    But Danny presented his data sources that he relied on, and it wasn't the overall MTGO Premier metagame. In fact, he specifically pointed out that he didn't look at the premier events at all.

    In addition, as I mentioned earlier in this thread, there is an idiomatic shorthand that is broadly employed when discussing the "vintage metagame," and that is that we are generally referring to Top performing decks.

    As I said:

    When, in 2004, Phil Stanton, posted an article titled the "April 2004 Type One Metagame Breakdown," he looked only at Top 8 data. Or, when in 2011, Matt Elias posted an article titled "The Q1 Vintage Metagame Report," he looked only at Top 8 data. In both cases, the titles of the articles and the discussions used the term "Vintage metagame." Not "Vintage Top 8 Metagame." In the context of these discussions, it's well understood that we are discussing top performing decks, not the entire set of decks played. Danny's own data set makes that clear.

    What this means is up to interpretation

    Even if that were true, and I don't believe it is, then that means we should be debating the merits of each interpretive methodology, which, to some extent, has been occurring, but not as fully as perhaps could be or should be.

    ...but this discussion does seem to be much ado about nothing.

    It may seem excessively picayune to focus on such a specific statement in the context of a much broader and more wide-ranging article, let alone the broader discussion of DCI policy, but I don't think so.

    Holding authors (myself included) accountable for the veracity of their claims is actually a vital function for the Vintage (and Magic) community. It's clear that, according to the data that Danny relied upon to support his statement, his claim is a fairly significant exaggeration at best (and that's offering a very generous reading).

    Whether we characterize Danny's statement as exaggeration or falsity is, in some sense, not actually what matters most. What matters is that readers of Danny's article could easily walk away with a very incorrect impression by relying on Danny's statement, including members of the DCI (even more so if they understand, as most readers probably do, that by "metagame," Danny meant "top performing decks").

    Indeed, the DCI, by its own admission, relies heavily on Vintage community members for data and information, which means that the responsibility of Vintage authors like Danny to "get it right" is even greater. See, for example, Ethan's comments on Danny's article.

    Beyond the simple principle of holding authors accountable for the claims they make, there is also another reason to care. Most of the DCI B&R discussions seem to take on factional lines, as people form on one side of an issue or another (despite the fact that there are often fence sitters, like myself).

    To the extent that factual claims like those presented and debated here are used as ammunition in these partisan debates, then the principle of veracity becomes even more important.

    As I mentioned before, one reading of Danny's article is a kind of critique of the DCI. Although Danny frames the beginning of his article as a kind of "c'est la vie" acceptance, the statement at issue and its surrounding context actually function, and could reasonably be interpreted by readers, much more as a critique of the DCI. (His proposal for a liaison committee does nothing to dispel this reading.)

    In the other thread, there are many posters who attack the DCI's "over-representation" claim. Danny's statement feeds that critique. If it's false, then it should be called out as such, because it unfairly and unjustifiably diminishes the credibility, and therefore, legitimacy of the DCI.

    It should not escape notice that, while some people unjustifiably interpreted my critique of Danny's statement as "an attack on his credibility," his statement, to the extent that it is false or even an exaggeration, functions as an attack on the credibility of the DCI.

    Lest we think that I am overly concerned with the esteem of some faceless and shadowy corporate entity, let's not forget that, as I said before, "the legitimacy of the format depends on the credibility of the DCI" as its authority and policymaker. I, for one, prefer to have Vintage respected rather than not.

    Danny said:

    With these numbers Mentor adds up to be 16% and Shops adds up to be 22%. For me personally, 6% is close enough where I feel comfortable using the word "basically", but I can understand if some people disagree. From now on I will definitely be more clear with my words in order to prevent misunderstandings like this, and I thank you for your feedback.

    Danny, since you acknowledge that this was a misunderstanding, and you propose to try to avoid misunderstandings like this in the future, I respectfully request that you issue some sort of correction in this article so that future readers will not misunderstand your meaning, even unintentionally. Thanks.



  • I just wanted to point out that, as someone very new to Vintage, @Smmenen 's deconstruction of "shops and mentor are the same" seems very important. Reading the article, I left with the impression that the restriction was unjustifiable, at least for the reasons the DCI cited. Looking at the data, I got an entirely different impression - that we simply don't have a complete enough picture of the entire metagame to really be making judgment calls here. Given the available data that @Smmenen presented....yea, the DCI's choice makes some sense. I may still disagree with it, but it's hard to argue with data we can't see.



  • I'm given to understand that the DCI does no playtesting of Vintage. Really? A huge outside point we're missing from this discussion is: why the heck doesn't the DCI just do some playtesting? Big data is great and all, but if this is the governing body of the format, oughtn't they know the format? Having card company folks who don't actually test the format be the ones deciding which cards are too powerful is like FIDE being made up of the carpenters who carve chess sets, or letting the komi in Go be set by a select group of stonemasons. The DCI really needs to do some non-zero amount of testing, if for no other reason than to demonstrate that a minimal effort at real understanding has been made.



  • I volunteer to play "bad cop" on the team.



  • @OurLadyInRed said:

    I just wanted to point out that, as someone very new to Vintage, @Smmenen 's deconstruction of "shops and mentor are the same" seems very important. Reading the article, I left with the impression that the restriction was unjustifiable, at least for the reasons the DCI cited. Looking at the data, I got an entirely different impression - that we simply don't have a complete enough picture of the entire metagame to really be making judgment calls here. Given the available data that @Smmenen presented....yea, the DCI's choice makes some sense. I may still disagree with it, but it's hard to argue with data we can't see.

    I'm not new to Vintage and I also agree totally with this.

    What really is weird, though, is that Steve presented his data, and Danny presented his data, the point was made that Steve had more data and this put Danny's conclusion into perspective, and that was all done in the first few posts of this thread.

    Since then, this thread seems to have spiraled rapidly into a battle over semantics. I learned nothing new after the posts that presented the actual data. Things just got insufferable. Despite what the Objectivists may say, debates are NOT always better conducted by digressing to first principles.

    @Topical_Island said:

    I'm given to understand that the DCI does no playtesting of Vintage. Really? A huge outside point we're missing from this discussion is: why the heck doesn't the DCI just do some playtesting? Big data is great and all, but if this is the governing body of the format, oughtn't they know the format? Having card company folks who don't actually test the format be the ones deciding which cards are too powerful is like FIDE being made up of the carpenters who carve chess sets, or letting the komi in Go be set by a select group of stonemasons. The DCI really needs to do some non-zero amount of testing, if for no other reason than to demonstrate that a minimal effort at real understanding has been made.

    They don't. They literally just announced this recently:

    http://magic.wizards.com/en/articles/archive/making-magic/odds-ends-shadows-over-innistrad-part-1-2016-04-11
    @MarkRosewater said:

    One of the challenges of making a Magic set is there are so many different formats for which we must make cards. Different players want different things, so we work hard to make sure that every set has something to offer most players. The three formats we spend the most time focusing on are Sealed, Booster Draft, and Standard. We focus on Sealed because that's the format of the Prerelease and we want to make sure the set creates a good first impression. We focus on Booster Draft as that's the most-played Limited format and is the way a lot of people play (especially on Magic Online). We focus on Standard as that's the most popular Constructed format.

    We're aware of other formats and definitely think about them on a card-by-card basis as we design and develop the sets. For instance, when designing legendary cards, we often will think about their impact on Commander. (I should note, though, that there are other fans of legendary creatures who don't play Commander, so not every legendary creature is optimized for that format.) We do try to ensure that each set has cards that will have a chance of seeing play in older formats. The larger (and older) the card pool, the harder that task is.

    What we don't do, however, is playtest formats outside of the three I named above. We have a limited number of playtesters and a limited amount of time, so we have to focus our energies. Older formats have a banned list (and a restricted list for Vintage), which we can use as a tool to adjust those formats if something gets broken. So how much thought was given to Eternal formats when designing Shadows over Innistrad? A little, but not a lot.

    That makes sense to me. They think about eternal but they have a limited amount of people they are willing to pay to play Magic (can you blame them?) so they want those people using all of their time to make the biggest markets the best play experience.

    EDIT: I recently asked Floral Spuzzem for advice on another topic, so I asked him what his opinion was on the quality of data used to support the restriction of Lodestone Golem.

    [image]

    I think his perspective is inherently biased, however, given his role as artifact removal. I suggest we now have 10 posts with data-driven analyses supporting and refuting this position.



  • @Topical_Island I actually prefer that they not test for eternal formats. I'd rather we build metagames out here in the wild than have Wizards attempting to artificially create what they think Vintage should be; and, as they stated, if something ends up being too busted they can just ban/restrict it.



  • @themonadnomad I love your phrase... in the WILD. It is a feral format! I guess I'm just thinking that if they are going to be interfering (restricting things) anyway... it would be nice for them to at least send some scouts into our ecosystem once in a while. Otherwise I don't see how any of their wildlife management choices can be justified. But yeah, I'm all for letting the predators roam free as much as possible.


 
