# Measuring Archetype "Inequality" or metagame balance

• @zias Exactly, Gini-Simpson is just another name for 1 minus the Simpson index.

Just not sure why you want a measurement that entangles two different ideas.

Richness seems pretty easily defined simply by the number of available archetypes, no?

So you'd have a Gini index for the inequality among the top decks, and a "richness" index that is just the number of unique archetypes.

Because both elements matter. If we just measured 'inequality', we wouldn't actually be measuring diversity. So, if you had only two 'species,' but they were equal, that's not actually reflective of a diverse format, even though it may be balanced. I wanted a measure that captured both diversity and balance.

As I said in post 5 of this thread, "what I like about the Simpson Diversity Index is that it measures BOTH 1) Richness, and 2) Evenness." As that biology website I linked above put it.
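To make the "both richness and evenness" point concrete, here is a minimal sketch in Python. It assumes the index in question is 1 minus the sum of squared archetype shares; naming conventions differ between sources, so the code reports the raw sum of squares as well. The three Top 8 lists are invented purely for illustration and the function name is hypothetical.

```python
from collections import Counter

def simpson_summary(decks):
    """Return (richness, sum of squared shares, 1 - sum of squared shares)."""
    counts = Counter(decks)
    total = sum(counts.values())
    shares = [n / total for n in counts.values()]
    sum_sq = sum(p * p for p in shares)
    return len(counts), sum_sq, 1 - sum_sq

# Hypothetical Top 8s (not real tournament data).
duopoly = ["A"] * 4 + ["B"] * 4          # two archetypes, perfectly even
eight_way = list("ABCDEFGH")             # eight archetypes, perfectly even
lopsided = ["A"] * 5 + ["B", "C", "D"]   # four archetypes, one dominant

for name, field in [("duopoly", duopoly), ("eight_way", eight_way), ("lopsided", lopsided)]:
    richness, sum_sq, index = simpson_summary(field)
    print(f"{name}: richness={richness}, sum_sq={sum_sq:.4f}, 1-sum_sq={index:.4f}")
```

Under these assumptions the even two-deck field scores 0.5, the even eight-deck field 0.875, and the lopsided four-deck field 0.5625, so the value moves with both the number of archetypes and how evenly they are spread.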

So, take a look at this link.

Now scroll down, and look at the value below the Simpson value, called Dominance, or (1-Simpson). The website doesn't call it Gini Impurity, although I understand that's the same calculation. Probably just semantics?

Also, look how useful this tool is, as it creates a visualization at the bottom.

• @smmenen From what I can read, "Gini-Simpson index" and "Gini coefficient" are two different things.

"Gini-Simpson index" is 1 minus "Simpson index" and is also named "Dominance index" in your link. It's the one I was referring to.

"Gini coefficient" is something else.

But at the end it's just semantics.

That's what I said.

Gini Coefficient is a value that measures inequality. Gini Impurity is a completely different thing. The Wikipedia entry even says "NOT TO BE confused with Gini Coefficient."

But what I was saying is that the web calculator doesn't call 1-Simpson Gini Impurity. It calls it the "dominance index."
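For anyone following along, the difference between the two "Gini" quantities is easy to see numerically. This is a rough sketch, assuming the Gini coefficient is computed from the mean absolute difference between deck counts and Gini impurity is 1 minus the sum of squared shares. The deck counts are made up and the function names are hypothetical.

```python
def gini_coefficient(values):
    """Gini coefficient via pairwise absolute differences (0 = perfectly equal)."""
    n = len(values)
    total = sum(values)
    diff_sum = sum(abs(x - y) for x in values for y in values)
    return diff_sum / (2 * n * total)

def gini_impurity(values):
    """Gini impurity: 1 minus the sum of squared shares."""
    total = sum(values)
    return 1 - sum((v / total) ** 2 for v in values)

# Hypothetical Top 8 deck counts per archetype (not real data).
even = [2, 2, 2, 2]
skewed = [5, 1, 1, 1]

for name, counts in [("even", even), ("skewed", skewed)]:
    print(name, gini_coefficient(counts), gini_impurity(counts))
```

With these inputs the coefficient rises from 0 to 0.375 as the field gets more lopsided, while the impurity falls from 0.75 to 0.5625. They respond in opposite directions and are not interchangeable.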

Again, that's why you have two measurements. One is the number of unique decks, another is a measure of evenness. By using a measurement that entangles them, how are you determining whether the problem in a particular metagame is too few decks or not enough spread among decks?

You supplement the primary measure with secondary measures. But my goal, as clearly explained in the OP, was to find "a formula that could detect, and scale, a metagame that is more 'equal' and balanced, and one that is more 'unequal' and imbalanced."

This does that perfectly.
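The exchange above can be illustrated with a small sketch that reports richness alongside the index. It assumes the index is 1 minus the sum of squared shares and uses exact fractions to avoid rounding noise. The two fields are hypothetical, chosen so the single number comes out identical.

```python
from fractions import Fraction

def richness_and_index(counts):
    """Return (number of archetypes, 1 - sum of squared shares) exactly."""
    total = sum(counts)
    sum_sq = sum(Fraction(c, total) ** 2 for c in counts)
    return len(counts), 1 - sum_sq

# Hypothetical deck counts per archetype (not real data).
two_even = [3, 3]         # two archetypes, evenly split
three_skewed = [4, 1, 1]  # three archetypes, one dominant

print(richness_and_index(two_even))      # richness 2, index 1/2
print(richness_and_index(three_skewed))  # richness 3, index 1/2
```

Both fields score exactly 1/2 here, so the index alone cannot say whether the issue is too few decks or too little spread. Reporting richness next to it as a secondary measure separates the two cases.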

• Honestly, I think "balance" is really a shorthand for saying "win rates". It's how I use it. It's basically how Wizards has used it in their most recent B&R reports.

Number of times used in the Temur Energy bannings:
Balance: 0
Win percentage: 14
Win rate: 0

Number of times used in the Aetherworks Marvel ban:
Balance: 0
Win percentage: 3
Win rate: 3

I think what you are seeing is a refinement of the terminology used as WotC has moved to more data analysis and away from top 8 results. "Balance" was used as a subjective measure of a deck's win rate, often derived from top 8 results, which is what they relied on before.

So for me, the interesting thing will be to see how closely your statistical analysis of the challenges matches the data we would have from champs (if the data can be compared).

• @ChubbyRain

A few things:

1. Everyone agrees that win % and win rates are the best possible metric for assessing deck performance, but there are two issues with this:

a) We lack this on a regular basis. Your Vintage Champs analysis and my mid-October Vintage Challenge analysis are the exceptions that prove the rule, where we actually have win % by archetype.

b) Win % or win rates don't actually tell us that much about the overall shape and scope of the metagame. They don't tell us about diversity. They aren't a metagame metric, per se. You could have a large or small number of decks, and the win rate of any particular deck wouldn't tell us much about that.

2. Language used by Wizards: I agree with your point that Wizards is constantly refining their terminology. BUT, and this is a big caveat, the last time they restricted cards in Vintage, they specifically cited Vintage Challenge Top 8 data, not win rates or win percentages.

https://magic.wizards.com/en/articles/archive/august-28-2017-banned-and-restricted-announcement-2017-08-28

"Data from twelve recent Vintage Challenges reinforces this, with 40% of the Top 8 decks being Shops and 30% being Mentor. Both decks feature strategies that are powerful, stifle diversity, and can be frustrating to play against."
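As a back-of-the-envelope check, the two percentages in that quote are enough to bracket the index even without the full breakdown. This sketch assumes the index is 1 minus the sum of squared shares, treats 40% and 30% as exact shares, and makes no claim about how the remaining 30% was actually distributed. The function name is hypothetical.

```python
def index_bounds(known_shares):
    """Bracket 1 - sum(p_i^2) when only some archetype shares are known."""
    known_sq = sum(p * p for p in known_shares)
    rest = 1 - sum(known_shares)
    # Least diverse case: the whole remainder is a single archetype.
    lowest = 1 - (known_sq + rest * rest)
    # Most diverse case: the remainder is split into vanishingly small
    # shares, so its squared terms approach zero (a limit, never reached).
    highest = 1 - known_sq
    return lowest, highest

low, high = index_bounds([0.40, 0.30])
print(f"index is between {low:.2f} and {high:.2f}")
```

Under those assumptions the value for that twelve-Challenge window would sit somewhere between roughly 0.66 and 0.75.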

Before I read up on the Simpson Diversity Index, I was thinking about creating a "Menendian Index" that would be a mashup index of different indicators; possibly 1/3 the range of decks in Top 8s, 1/3 a Gini Coefficient-like variable that measures inequality, and 1/3 perhaps something else.

But when I read up on the Simpson Diversity Index, realizing that it is sensitive to BOTH the range of strategies in a metagame AND the relative proportions of those strategies in the field, I realized it was the perfect holistic measure for what I was looking for.

Balance is obviously a metaphor that we are applying to Magic metagames, but balance by itself refers primarily to inequality. The primary image associated with balance is a scale or teeter-totter. The problem with balance, by itself, is that the metaphor doesn't include the range of decks. So a 2-deck metagame could be balanced, even though such a duopoly is bad for the format. My OP had two hypotheticals that illustrate two different extremes.

The Simpson Diversity Index is perfect because it accounts for both 'inequality' and for 'diversity.' Both matter.

TLDR: terminology is tricky here. We don't just care about one thing: we care about diversity AND balance, evenness and abundance, inequality AND range. And all of these concepts and terms are conceptually related, but also different.

In any case, I will do a write-up of my findings.

WAF/WHF