ArrestedDevelopment
2020-02-04 23:58:41
26 votes, rating 5.3
CR musings
This blog is prompted by the trend of people complaining about CR loss on ties, or about massive losses on a loss using an inferior race. It's not an attempt to rubbish the CR system, because a lot of thought has gone into it and, frankly, it functions about as well as you could want it to, albeit in a way the vast majority of people chasing high CR rankings seem to be completely ignorant of, or deliberately ignore when it suits them.

First of all, let's actually look at the CR calculation (direct quote from Christer):

Definitions
CR = Coach Rating. A starting coach has a CR value of 15000 (no fractions allowed), scaled by 100 on display to 150.00.
CR' = New CR (ie, the CR after a match has been processed)
k = An amplification factor designating the effect of a match result on CR
S = Factor for the actual result in the game.
p = Win probability of the match.

The Math
The core of the rating system is based on the Elo system, which has been heavily modified for the unique parts of Blood Bowl (primarily that teams are not equal). Note that this is calculated twice per match (and CR type): Once for each coach involved in the match.

We start with calculating the win probability (p) for a given match:

CR_coach = CR of one of the coaches
CR_opponent = CR of the opponent

We then compute a weighted CR difference:

CR_diff = CR_linear * f_1 + (CR_linear * f_3) ^ 3
CR_linear = CR_opponent - CR_coach
f_1 = 1.0
f_3 = 0.275

We also compute the normalized TV difference for each team:

TV_min = min(TV_coach, TV_opponent)
TV_diff = 100 * (TV_opponent - TV_coach) / TV_min

Finally, we get to the win probability:

p = 1 / (10 ^ (CR_diff / f_CR + TV_diff / f_TV) + 1)
f_CR = 400
f_TV = 70
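
The win probability steps above can be sketched in a few lines of Python. This is only an illustration of the quoted formulas, not the site's actual code: the function name and argument names are made up for the example, and it assumes CR is fed in using the displayed scale (150.00 rather than 15000), which the quote does not state explicitly.

```python
def win_probability(cr_coach, cr_opponent, tv_coach, tv_opponent,
                    f_1=1.0, f_3=0.275, f_cr=400.0, f_tv=70.0):
    """Win probability p for the coach rated `cr_coach`, per the quoted formulas.

    Assumption: CR values use the displayed scale (e.g. 150.0).
    """
    # Weighted CR difference: a linear term plus a cubic term.
    cr_linear = cr_opponent - cr_coach
    cr_diff = cr_linear * f_1 + (cr_linear * f_3) ** 3

    # TV difference as a percentage of the smaller team's TV.
    tv_min = min(tv_coach, tv_opponent)
    tv_diff = 100.0 * (tv_opponent - tv_coach) / tv_min

    return 1.0 / (10.0 ** (cr_diff / f_cr + tv_diff / f_tv) + 1.0)


# Equal coaches with equal teams give an even match.
print(win_probability(150.0, 150.0, 1000, 1000))  # 0.5

# A 20-point CR edge at equal TV: the cubic term dominates the linear one.
print(round(win_probability(175.0, 155.0, 1500, 1500), 3))
```

Because both differences flip sign when the coaches are swapped, the two probabilities computed for a match add up to 1.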

Next, we define the S value, which is the actual result of the match:
S = 0.0 for a loss
S = 0.5 for a tie
S = 1.0 for a win

Then, we calculate the basic amplification factor k:

k = 2 - This is the base k value that is used unless the exceptions below are in play.

We then evaluate the match:
outcome = (S-p)
bracket_diff = bracket_coach - bracket_opponent. This is a numerical value between 1 and 6 for the different CR brackets (experienced through legendary). Additional detail about these brackets is posted below.

k = 2 + abs(bracket_diff) / 2, if bracket_diff < 0 and outcome > 0 or if bracket_diff > 0 and outcome < 0
This means that if the coach did better than expected vs an opponent in a higher bracket, or worse than expected vs a coach in a lower bracket, the k value is amplified to somewhere between 2.5 to 5.5. In short, we amplify the result if there's a clear upset.

k = 2 - 2/(5 - abs(bracket_diff) / 2), if bracket_diff > 0 and outcome > 0 or if bracket_diff < 0 and outcome < 0
This means that if the coach did better than expected vs an opponent in a lower bracket or worse than expected vs a coach in a higher bracket, the k value is dampened to somewhere between 1.2 and 1.56. In short, we dampen the result if there's a clear non-upset.

After all this, we are ready to calculate the CR change:

CR' = CR + k * (S - p)
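
As a rough sketch of the k rules and the final update, assuming the same displayed CR scale and taking p as already computed: the function names are hypothetical, and the fallback to k = 2 when the brackets are equal (or the result matches expectation exactly) is my reading of "the base k value that is used unless the exceptions below are in play".

```python
def amplification_factor(bracket_diff, outcome):
    """k for one coach.

    bracket_diff = bracket_coach - bracket_opponent, outcome = S - p.
    """
    upset = (bracket_diff < 0 and outcome > 0) or (bracket_diff > 0 and outcome < 0)
    non_upset = (bracket_diff > 0 and outcome > 0) or (bracket_diff < 0 and outcome < 0)
    if upset:
        return 2.0 + abs(bracket_diff) / 2.0
    if non_upset:
        return 2.0 - 2.0 / (5.0 - abs(bracket_diff) / 2.0)
    return 2.0  # base value: same bracket, or result exactly as expected


def updated_cr(cr, s, p, bracket_diff):
    """CR' = CR + k * (S - p)."""
    outcome = s - p
    return cr + amplification_factor(bracket_diff, outcome) * outcome


# A coach two brackets above the opponent, expected to win 75% of the time.
print(updated_cr(175.0, 1.0, 0.75, 2))  # win:  175.375 (dampened, k = 1.5)
print(updated_cr(175.0, 0.5, 0.75, 2))  # tie:  174.25  (amplified, k = 3)
print(updated_cr(175.0, 0.0, 0.75, 2))  # loss: 172.75  (amplified, k = 3)
```

Note the asymmetry: with these inputs the favourite gains 0.375 for the expected win but drops 0.75 for a tie.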

For each match, this is repeated a total of 8 times (overall, twice for the overall divisional one and the division/race specific one; all computed for each coach in the match).


and here are the coach brackets

So basically, we have a system that is measuring your outcomes vs expected outcomes. Or simplified, your ability to win vs the expected probability you should.

All well and good. The first thing you should notice is that the TV difference is normalized. This means that when you play a TV gap game with a significantly developed team, it doesn't really matter if the gap is 300-500k TV+ if you are already around 1800k TV. To the algorithm, a 300k uphill gap with an 1800k TV team isn't any different from a 160k gap at 1000k TV, which, while it may be a bit rough, most would agree is far from insurmountable, and in a game with a significant coach gap is almost entirely irrelevant. Even so, it should be noted that the matching rules on the site in R and the scheduler algorithm in B will actually prevent such a game being made at low TV. This is not to be taken to mean that such games are mathematically or logically undesirable; it is simply a byproduct of the fact that we have many old teams of varying TVs that could produce a game that would be mentally upsetting for a coach to face.
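
To put numbers on that, here is a small sketch of the normalized TV difference for the two gaps mentioned. The helper name is made up for the example; the formula is the TV_diff one quoted earlier.

```python
def tv_diff(tv_coach, tv_opponent):
    """Normalized TV difference, as a percentage of the smaller team's TV."""
    return 100.0 * (tv_opponent - tv_coach) / min(tv_coach, tv_opponent)


print(round(tv_diff(1800, 2100), 1))  # 300k uphill at 1800k TV -> 16.7
print(round(tv_diff(1000, 1160), 1))  # 160k uphill at 1000k TV -> 16.0
```

The two values are nearly identical, so the TV term in the win probability treats the games almost the same way.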

Next, we notice the starting CR is much, much more heavily weighted than the TV of the teams. Again, this is completely logical - your ability to win with a team is dependent on several factors, all of which impact a game much more significantly than the actual TV of the teams involved in the game itself. Especially since if you are winning to the extent you have a very high CR, we can safely assume you are not making inefficient skill choices and have a finely tuned team. We can also assume you are skilled enough at the game to select the "proper" inducements to have the largest direct effect upon the match.

S value is very simple and should not need explaining.

k is where the wailing and gnashing of teeth may come from for those with an unwillingness to accept that a tie in a game they may mentally dislike was an improbable result. Effectively, if the coach gap in a game is large enough, the amplification factor will cause a relatively large drop in CR upon a tie. Because the expected outcome is that you #winanyway.
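
A worked sketch of that effect, under stated assumptions: equal TV (so the TV term drops out), CR on the displayed scale, and illustrative CR values and bracket gaps that are not taken from the real bracket table. The function name is hypothetical.

```python
def tie_cr_change(cr_coach, cr_opponent, bracket_diff):
    """CR change for the coach rated `cr_coach` after a tie at equal TV."""
    cr_linear = cr_opponent - cr_coach
    cr_diff = cr_linear + (cr_linear * 0.275) ** 3
    p = 1.0 / (10.0 ** (cr_diff / 400.0) + 1.0)
    outcome = 0.5 - p  # S = 0.5 for a tie

    if bracket_diff * outcome < 0:    # upset: amplified
        k = 2.0 + abs(bracket_diff) / 2.0
    elif bracket_diff * outcome > 0:  # non-upset: dampened
        k = 2.0 - 2.0 / (5.0 - abs(bracket_diff) / 2.0)
    else:                             # base value
        k = 2.0
    return k * outcome


# A 175 coach ties a 155 coach who sits two brackets lower (illustrative values).
print(round(tie_cr_change(175.0, 155.0, 2), 2))   # the favourite loses CR
print(round(tie_cr_change(155.0, 175.0, -2), 2))  # the underdog gains the same amount
```

With these inputs the favourite was expected to win roughly three games in four, so the tie counts as an upset and costs about three quarters of a CR point.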

Lastly we will note that there is zero compensation anywhere for racial tiering. This is by intent - the system is not trying to rate your ability to play the game with goblins, it is measuring your ability to win in general, and thus the calculation is going to heavily punish you if you use a stunty race as a legend and tie vs an experienced coach. Some of you may question this, but it is, again logical - over a prolonged stretch of using stunties your CR will normalise: that is to say, as you lose more games with them, your CR will drop to reflect your lowered ability to win games. If you do not lose with them your CR will change in accordance to the probability that you beat your opponents, which may, or may not, still result in an overall net loss in CR.
In effect, you should see stunty races for what they are - actively handicapping yourself in pursuit of harder games.


But AD, you don't care about CR, why are you posting this?

Well, I don't care about CR in the sense I am not actively trying to game the system to earn it, but I do have a look at it from time to time because it does actually reflect how you are playing. Which is to say in context: it's a good form guide - am I meeting expectations?

As you rise up the CR rankings, the expectation that you win any one-off game vs someone of lower rank is high. The expectation that you win a game of the same rank but lower CR is also not insignificant. There are always external factors you can use to "justify" any significant loss of form or any loss in a single game - but these are either things we simply accept, or things that without being too harsh, may stem from user entitlement.

If you consistently make the correct decisions, build your teams efficiently, and select the proper inducements in each match, your CR will rise, because if you're consistently doing this, you'll be winning consistently too. It is that simple. Deviating from such things - taking every stat you see early, using poor races consistently, making errors in games, will of course result in CR losses (some of them temporary, sub-optimal initial skill choices may be optimal long-term after all), but these are decisions you make that are, simply put, affecting your ability to win.

If you find yourself consistently losing significant CR in games you tied, instead of questioning the algorithm, perhaps the first question you should ask is if you are undervaluing your equity in each game. Because it should also be noted, given the statistics on win% in TV-gap games on FUMBBL, the weighting on the TV-gap is in fact probably too great.

Lastly of course, if your opponent is, in fact, very good, but currently has a much lower CR than normal due to a protracted period of losses (for whatever reason)... well, that's life, a longer-term algorithm might remove that side effect, but would also potentially leave FUMBBL with a relatively static top 50. Such matches are somewhat infrequent and rapidly recovered from.


The top 10 on FUMBBL is not "who is best at bloodbowl"; it is related to skill, but subtly and suitably different given the environment. And of course, at particularly high CRs, it becomes quite difficult to maintain given the variety of opponent. That is by design.
Comments
Posted by spinball on 2020-02-05 00:36:30
That is very in depth, I think I am going to have to reread this a few times...
Posted by erased000047 on 2020-02-05 02:20:07
this was a "musing".. wow lolz thanks for the info AD :) cheers
Posted by cdassak on 2020-02-05 08:26:58
Thanks for the time and effort AD, good job! Only the SR Rankings matter, it is known :D
Posted by dorfeus on 2020-02-05 09:00:36
Every time I peak with CR I only accept games like this:
https://fumbbl.com/p/match?id=3991274
Posted by Rawlf on 2020-02-05 10:51:33
"Nothing to lose". Good idea dorfeus, bad example though. That must have been a tournament draw. ;)
Posted by Strider84 on 2020-02-05 11:09:14
so if I want to increase my CR I have to win! Got it :-)
Posted by Rawlf on 2020-02-05 11:17:48
That was a very interesting read, AD, much appreciated! Not much news to me but still enlightening when it is all spelled out like that.

My personal discord with CR is how races are disregarded. Because of this I used to refer to CR as 'Choice Ranking', though that is an exaggeration. A good coach with a bad race will be higher in CR than a bad coach with a good race.
Still, from my understanding, Christer left races out of the equation not so much because he did not want them in, but because his data showed that races have an insignificant effect on a match's expected outcome. Which is something I could never get my head around, especially now, playing Ogres. But I made my peace with it, resting on the fact that Christer is several times smarter than me and knows very well what he is doing.

A subject that imo has more meat for agitation than CR is TV though. Or more precisely people's misconception of what it is. "Fair game" - *shudder*. Maybe I should write out my own TV musings, but I guess it would quickly degenerate into TV rantings. :)
Posted by koadah on 2020-02-05 11:24:28
VoodooMike once put forward a formula for 'more accurate' CR. With his formula, you could actually lose points while winning if you didn't beat a weak opponent heavily enough. :)
Posted by SzieberthAdam on 2020-02-05 11:38:28
Good write. Not long ago, Christer toyed with a machine learning way of calculating the win probability. The results showed that CR is the most significant factor, while TV difference has almost no effect. Can't remember whether races were considered by the AI.
Posted by Augustine on 2020-02-05 12:28:24
Using comparative coach ability and racial strength as variables to calculate the outcome of CR is the obvious and correct way to do it. And I think in this respect the score calculation works fine the majority of the time.

However, the anomaly, and the obvious time it fails, is in representing the ability of the world class coaches who predominantly use stunty, in which case the CR is in no way representative. In this event, when calculating match 'scores', one of the variables (stunty coach CR) is way off and the algorithm falls down.

If you cared enough for this to be a problem then I imagine you would need to start by working out an algo to better reflect a stunty coach's CR rating on the site in general (a multiplier of CR based on % of games played with stunty?).

I doubt it is worth the effort of thinking of all the considerations.
Posted by MattDakka on 2020-02-05 12:28:30
I don't know what other coaches think, but the possibility of big CR drop when losing with a tier 3 team vs a poor coach with a tier 1 team is a reason I don't play anymore tier 3 teams in the Box.
The effort is not worth the CR gain/loss risk.
Losing using tier 3 is acceptable to me, but losing 2-3 CR points on top of that is not.
Also, claiming that the TV difference has almost no effect on the outcome of the game sounds really weird to me.
As far as I know, in my experience, assuming both coaches are evenly matched in coaching skills and the teams are same tier and not bloated by silly choices, the highest TV team has an edge and tends to win.
People trying to minimize the impact of TV gap on match outcome look wrong to me.
No wonder many strong coaches here adopt the cycling way and play top tier 1 teams for 15 games then retire them, that way they almost surely avoid the TV gaps thanks to rookie protection.
Posted by ArrestedDevelopment on 2020-02-05 12:44:51
@SzA: Christer's initial inputs were CR, TV, Race and division. After several runs he found little to no difference R/B, size of TV gap was a non-factor (although being the bigger team *did* help, the win% was mostly unchanged by gap size) and that CR was the most effective factor. I believe he thought an improvement would be to strip out TV from the formula and actually feed in the rosters/skills/treasury.
Posted by PurpleChest on 2020-02-05 12:50:15
Numbers are there to look pretty, not mean things. If only we stopped attaching meaning to the numbers and started seeing them like flowers. But then some people insist on competing for the prettiest flowers don't they?
Posted by FinnDiesel on 2020-02-05 13:03:04
Yes, maths.
Posted by ArrestedDevelopment on 2020-02-05 13:04:08
Matt there is enough data from various studies that TV gap size is not an accurate predictor of match outcome, this should itself be obvious because raw TV is an inherently bad predictor of a "fair" game anyway. The very coaches cycling teams at low TV can still play gaps of 100tv+, which statistics have shown to be just as "detrimental" to winning odds as the larger TV gaps seen at higher TVs.

Posted by MattDakka on 2020-02-05 14:13:03
If TV is not an accurate predictor, then let's play in the Box with total random matchmaking, without considering the TV for suitability.
Posted by Joost on 2020-02-05 14:53:46
@Matt, I don't think that's a logical conclusion. The data is from the current meta, so it shows it is not an accurate predictor in the current meta only. Taking away matching rules might change the meta dynamic and break the relationship. For instance everybody could aim for 3000TV Chaos teams.

Let's be happy to conclude that the current matching system seems to work very well and eliminates TV as a factor!
Posted by MattDakka on 2020-02-05 15:27:49
Mine was a provocation, I think that an inaccurate TV matching is still better than not considering TV an important factor in the pairing, and I prefer to play with very small TV gap or same TV whenever I can.
If I tie with a "differently skilled" coach who used a big overdog team (let's say more than 300 TV difference) and still I lose more than 0.25 CR my gut feeling is that the CR system is not accurate.
It's already unfair that the overdog coach monoactivated (unlike me, I don't monoactivate), not offering, in this way, variety to the matchmaking, and more than that, he even gets rewarded by earning CR with a tie.
That seems suggesting to me that CR (COACH ranking) system puts too much emphasis on team building aspect over individual coaching skill.
Don't get me wrong, I'm not dismissing CR system as a bad idea nor dismissing the effort put into it by Christer, I like to earn/lose points so I have a way to evaluate myself, I just think that the system could be honed in order to reward more playing tier 3 teams and to punish less ties vs monoactivators with big overdog teams.
Posted by Joost on 2020-02-05 15:51:01
If you provoke expect a response ;-)

But it's an interesting topic which is why I responded; I think the interesting take away is that the data shows that apparently large TV gaps aren't as important as your own experience tells you (and my instinct was the same btw). If that is the case, CR is still driven by coaching, team selection and skill selection regardless of a TV gap. So no change needed, unless the purpose of CR is changed. Like you, I would love a change that takes team selection out of the equation by correcting for race strength. But it is another discussion entirely.
Posted by MattDakka on 2020-02-05 16:01:15
Yes, no problem with that, Joost :P.

Now that I think more of it, maybe the issue I have arises from this part of the formula:

bracket_diff = bracket_coach - bracket_opponent. This is a numerical value between 1 and 6 for the different CR brackets.

Let's say that the Super Star bracket is 165+ and Legend 170+ (I don't know exactly, but it's for purpose of explaining).

A 164 Star could be as good as a 169 Super Star, but temporarily under his usual score.
If a Star with a CR just a little bit under Super Star ties with a 170 Legend the difference, in terms of CR, is just 6 points, but the 2-bracket difference (i.e. Super Star, Legend) drastically increases the CR won by the Star coach.
Using only the CR value could be more accurate and granular than taking into account in the formula the 6 brackets.
I'm not complaining, just academic discussion.
Posted by Mattius on 2020-02-05 16:24:58
I think Matt makes a good point that if you do want a high CR, then you are deterred from playing Tier 3 teams. I've just been playing woodies, dark elves and High elves for the last 50 or so games due to BBT/Majors and my CR has surged. Clearly race to me is a massive factor.

Only suggestion i have on this is to have it NAF style where people are rated more on their individual race performances than their overall.

Should Christer spend any time on this? I would suggest there are far more important priorities for the site and his time, the current CR is good enough.

Oh and as always @Mattdakka, for the love of god, get involved in majors and BBT instead of worrying about CR.
Posted by Christer on 2020-02-05 16:49:23
Without spending too much time, a couple of facts based on the existing set of matches on the site (quite a large sample):

1. TV Difference
TV difference gets you to a 40% win chance at worst if you eliminate the effect of CR.

The CR system doesn't model this and allows expected win chance to go much lower than 40% at higher TV differences. And on the other end, it estimates too high if you're playing the higher team.

What does this mean? If you're playing an underdog, you will gain more CR for a win and lose less CR for a loss, thus giving you an advantage here. Conversely, you're being penalized playing the stronger team in terms of CR gain and loss. The system would simply overestimate your chance to win, and create less favourable CR difference if there's an upset.
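
The gap between the formula and that 40% floor can be sketched by setting the CR difference to zero and looking only at the TV term. The function name and the 1500k base TV are illustrative choices; f_TV = 70 is the value from the quoted formula.

```python
def tv_only_win_probability(tv_coach, tv_opponent, f_tv=70.0):
    """Estimated win probability when both coaches have the same CR."""
    tv_diff = 100.0 * (tv_opponent - tv_coach) / min(tv_coach, tv_opponent)
    return 1.0 / (10.0 ** (tv_diff / f_tv) + 1.0)


# Underdog at 1500k TV against progressively bigger opponents.
for gap in (100, 300, 500):
    p = tv_only_win_probability(1500, 1500 + gap)
    print(gap, round(p, 2), "below the 40% floor" if p < 0.40 else "above the 40% floor")
```

For the larger gaps the estimate drops well under 40%, which is the underdog-friendly bias described above.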

Given that I know this, why isn't it adjusted? Because even as it is, people question these types of results because it's tough to understand how the system works. It's also a deterrent to cherry picking for CR purposes.

2. Races
Races have an effect similar to that (40-60% win rate overall), with stunty teams and Amazons being outliers.
I made an attempt at implementing racial effect, but it's a very very hard problem to do well. Coming up with a data-driven way to integrate a factor like this isn't easy by any means. The implementation that was in place for a while had the end-result of pushing all win probability estimates closer to 50% more than anything else and wasn't adding anything significant to the system as a whole and was consequently scrapped in the last CR system revision.

The correct way to deal with this is related to the next point

3. TV
TV doesn't represent the real value of a team, which is implied by it only being able to affect win probability by 10 percentage points in either direction. Team Value is designed to be easy to do by hand for table top coaches, and for that purpose it's fine. However, in a ranking system it's simply not good enough.

The real solution here would be to re-introduce a team strength calculation that takes into account the fact that skills aren't created equal and that injuries have an effect on how good your team is on aggregate. Ideally, it'd be a relative measure between two teams rather than a standalone number (e.g. a team with tackle on every player will completely negate the effect of dodge on an opponent team). This is something I've been considering for a while, but haven't taken the time to actually do due to the incredible complexity of it.

4. Brackets
The effect of brackets on CR is there to again discourage cherry picking weak coaches, and give people who genuinely improve their game a faster way to reach their real ranking.

There's a lot more to CR than these few points, and it's a complicated system despite the relatively simple math behind it. There's no point in talking about extreme situations where coach X has had a bad streak and coach Y had a good streak to create a larger CR differential than their "true" difference, because that happens equally often as the other way around and over time that averages out.

That leads to a key takeaway that you shouldn't be looking at the actual CR number too closely at a single point in time. It's more reasonable to compare the brackets (or simply a rolling average). I would make it automatically do the average if I thought people would be happy if I did (which isn't the case). Some people like the detailed information even if it implies a higher precision than it actually holds.
Posted by Arktoris on 2020-02-05 18:17:53
It seems the TV parameter doesn't take into effect inducements.

Posted by ArrestedDevelopment on 2020-02-05 21:27:49
@Matt
In response to your "provocation" (which incidentally really does nothing but serve to make you look foolish, especially when it is something you have repeated and had explained multiple times), it should be noted we actually already have that past 30 games. But let's also acknowledge box is not a solely tv-matched division before that anyway - games played play a factor from 1-29 in the allowance of TV-gaps. In addition, it's been shown plenty of times TV is neither an accurate denominator of team strength, nor a particularly fair way to schedule games. There are plenty of other methods we could use - including cyanide-style Tv+ z-sum matching, which with an actual multi-team activation might work at a level they have been unable to produce themselves.

We, quite frankly, persist with TV because it's what we've used since TS/TR.
Posted by MattDakka on 2020-02-05 22:23:32
As far as I know, past 30 games we don't have totally random matchmaking, otherwise big TV gaps would be way more common.
Anyway, if there are better methods to pair teams, I'm open to try them.
TV-matching for sure is not the best system, and pairing teams with big TV gaps should be avoided at any cost.
1 hour is lot of time to be wasted locked in a game affected by a big TV gap which doesn't even give many CR points if you win it.
I find foolish people unable to grasp such simple concepts.
Posted by ArrestedDevelopment on 2020-02-05 22:37:39
"TV-matching for sure is not the best system" and "pairing teams with big TV gaps should be avoided at any cost". These statements are mutually exclusive.

A big TV gap barely affects the game any more than a smaller one. CR points gained/lost reflect that. You can find foolish all you like, but the statistics don't lie, they simply are.

Posted by MattDakka on 2020-02-05 22:45:53
1) TV-matching is not the best system, but my point was that is still better than totally random matchmaking.
I would be happy to have a more accurate system.

2) Talking about the system we are using, big TV gaps should be avoided.
The sentence still works even if you use a different matching system, if, for example, TV was changed and we used "FUMBBL TS" system, big "FUMBBL TS" gaps should be still avoided.
I used TV in the sentence because TV system is not going to change, no matter how much we discuss it.
Posted by ArrestedDevelopment on 2020-02-05 23:28:49
TS and other matching systems and TV are not comparable in terms of gaps, anyone with the slightest hint of intelligence would recognize this easily enough - the reason "big TV gaps" aren't an issue is because TV is an inherently *bad* matching system.
Posted by MattDakka on 2020-02-05 23:45:08
Would you play with a TV 1500 team vs 2670 TV team and consider the match up a no issue in terms of balance? Or, given the choice, would you play a close TV match, even if we both know that TV is a crude way to measure team's strength?
If you answer yes to first question then either you don't care about balanced matches or you are a masochist or you like to play as underdog for some reason.
A normal person with common sense may like to play with a slight disadvantage, but not being matched in a one-off game vs an overdog team.
If the Box scheduler arranged all the time TV gaps matches I'm sure that the division would be soon abandoned by coaches.
Posted by ArrestedDevelopment on 2020-02-06 00:06:51
You are creating hyperbolic situations "all the time tv gaps matches" to illustrate a point that is incorrect. Incidentally, the box does schedule TV gaps all the time, just not always huge ones.

You have enough experience to know even the largest TV gaps are far from insurmountable. Indeed, some of my favourite games have been playing 500+ uphill vs opponents who found themselves unable to deal with my lean team and inducements.

Participation in majors has also shown that the TV gap is a terrible way to judge the balance of a game - my own chaos team got taken to penalties after overtime by a human team 1m+ TV lower. I've been on the overdog side in such games and found myself the equity underdog - elves with a wizard and other inducements spring to mind as well.

Your issue here is you attribute a value that isn't actually as terminal as you think. Especially true when many of the biggest teams in box are carrying rerolls and a bench they seldom get maximal value from.
Posted by Nelphine on 2020-02-06 03:42:34
So, I agree that there should be a racial factor.

My logic is this:

CR professes to describe if you are good at winning. However, it is modified by a tv gap, and it is modified by the CR of the opponent. Both of these things are determined prior to the game starting.

Racial modifiers are specifically not included; yet this is also determined prior to the game starting.

So, CR states 'are you good at winning, assuming an equal tv, and assuming an equal opponent skill, BUT, if you happen to choose this thing called race prior to the game, that can't influence your chance to win'.

Yet we KNOW that certain racial matchups are in fact more important to winning than opponent CR. And racial matchups are chosen at exactly the same time as opponent CR.

Why does CR reward choosing one thing, but not choosing the other? If something chosen prior to match shouldn't influence the CR rating (race), then nothing chosen prior to the match should influence CR (and therefore all matches forever should reward or reduce CR equally).
Posted by Joost on 2020-02-06 09:11:40
From Christer's explanation we know CR is not meant to be taken as too precise a science. Rather, a rolling average would be a better way of estimating a player's ability to coach, build and select teams. It works to that definition with the understanding it isn't very exact.

Christer also let it be known that he'd like to include racial factors, but that it is very hard to do based on the data available. So it isn't included as a factor for a very good reason. Not because it wouldn't be a good idea per se (although the definition of what CR measures would change), but because it is beyond what can be done at the moment.
Posted by Rawlf on 2020-02-06 10:42:00
So just keep in mind that CR is only accurate if you play an even mix of races. If you stick with top (bottom) tier teams, your CR will be inaccurately high (low). Nothing wrong with it, just keep in mind. I guess the racial mix is good enough for most coaches to not matter overly much.

If you are shooting for max CR though, by all means only go with top teams. You might find this detrimental to the fun of playing, but if you are only playing for that one number on this website and consider a game that is not helping in this a wasted hour of your life, you are doing things wrong in a very fundamental way anyway.
Posted by SzieberthAdam on 2020-02-06 11:26:36
Racial factor is a hard thing and is hard to model correctly. We all know that one race performs better at certain TV levels. So RaceA at TVa vs. RaceB at TVb results in a 4D racial win probability matrix with a size of 24*24*B*B, where B is the number of TV brackets we define. It is about 25 if we bracket by 100 TV and about 50 if we bracket by 50 TV. Even with 100 TV brackets, knowing that the matrix is symmetric for every pair, the number of unique values would be 180000. FUMBBL simply does not have enough data to let us fill this matrix correctly.
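
The arithmetic can be checked with a short sketch. It assumes that "symmetric" means the entry for A vs B determines the entry for B vs A, so only unordered pairings of two different (race, TV bracket) combinations need their own value; the function name is made up for the example.

```python
def matrix_sizes(races=24, tv_brackets=25):
    """Return (full matrix size, unique values) for the race/TV matchup matrix."""
    states = races * tv_brackets         # (race, TV bracket) combinations
    full = states * states               # every ordered pairing
    unique = states * (states - 1) // 2  # unordered pairings of two different states
    return full, unique


print(matrix_sizes())        # 100 TV brackets -> (360000, 179700)
print(matrix_sizes(24, 50))  # 50 TV brackets  -> (1440000, 719400)
```

That is roughly the 180000 figure quoted for 100 TV brackets, and about four times as many values for 50 TV brackets.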

We could do a regression analysis for every race pair, which could be interesting to do.

Either way, one of the hardest issues is how to exclude the coach factor from our model, as every match result we have is affected by both the coach and the team/race factors mixed. Not to mention the luck factor!

Because of the complexity of the problem, I really think that machine learning would be the proper way. However, it might not be worth the effort.
Posted by Christer on 2020-02-06 11:41:04
The thing with racial factors is that it is simply a very crude proxy for the core problem of TV being inaccurate.

The same problem happens if someone builds an all zombie undead team, and by extension when building a team in a suboptimal way. Extending this gives you a situation where you'd want to differentiate an elf team that picks side step on their linemen as opposed to pass block, or an amazon team who exclusively plays against 0 tackle opponents (or dwarves exclusively vs high-dodge teams).

What it comes down to is an intent for the CR system to be more than a measure of actual on-pitch gameplay and the ability to build teams that are part of the current "meta". I'm opposed to Rawlf's statement that the CR system is only accurate if you play an even mix of races, because in my opinion the ability to build strong teams is very much a part of CR (and in R the ability to choose which games to take and which ones not to).

This will always be the case too, since TV calculation simply can't be perfect. A Team Strength system (as we used to have on the site in LRB4 times) could easily be better than TV but it's still practically impossible to get it perfect.
Posted by MattDakka on 2020-02-06 11:47:21
@ArrestedDevelopment:

1)"You have enough experience to know even the largest TV gaps are far from insurmountable."

I never said that TV gap matches are insurmountable; my issue with them is the mismatch in terms of skills and stats. This difference can put the underdog team on the back foot and I don't consider this fair.
If there is a race, it's not fair to compete with a 500 vs a Ferrari, even if the 500 is driven by a super pro pilot and the Ferrari by an average or bad pilot. If for some coaches the TV gap is a non-issue or a minor issue, well and good; for me it is an issue, not insurmountable, but an issue I'd like to avoid whenever possible, even if I have to fail an activation to avoid a super high TV monoactivator.


2) "Participation in majors has also shown that the TV gap is a terrible way to judge the balance of a game - my own chaos team got taken to penalties after overtime by a human team 1m+ TV lower. I've been on the overdog side in such games and found myself the equity underdog - elves with a wizard and other inducements spring to mind as well."
I agree with you, I have been there as well and I made miracles with an underdog HE team + Wizard, but playing in a Major is different from playing a one-off matchmaking game. If I join a Major I know I will face sooner or later a big overdog team, it's part of that kind of competition.
I don't want, even if I can handle it, to play big TV gaps in the Black Box: it's a TV matchmaking division, not a "totally random underdog vs overdog" division. I know that TV is crude, but it's still quite decent for evaluating the fairness of a match, assuming both coaches don't make poor choices, and making good choices is part of coaching skill. Also, please notice this important point: since I activate several teams, thus offering more "flexibility" to the scheduler in terms of possible teams to pair, I expect not to face a guy who activates only 1 super high TV team, even if bloated, because this is unfair. If I offer many options, I should not be penalized and face a guy who is offering only 1 option to the scheduler.

3) "Your issue here is you attribute a value that isn't actually as terminal as you think. Especially true when many of the biggest teams in box are carrying rerolls and a bench they seldom get maximal value from."
That can be true, but not all the overdog teams in the Box are super bloated and used by bad coaches. There are high TV teams with good skills choices and Legend coaches.
Anyway, playing vs a TV gap is not what I expect when I play in the Box. I want to play with reasonably balanced teams and, ideally, face a very good opponent. Even if I beat an overdog team of a bad coach, after 1 hour where I had to choose my inducements and play harder than expected due to the sheer difference of rosters, what I gain is a few CR points, and if I tie, probably due to the roster difference, I even lose CR points.
I prefer to play with 2 close-TV teams vs Legend coach, honestly, it's way more interesting, teaches me something and doesn't waste my time.
Posted by ArrestedDevelopment on 2020-02-06 13:15:14
Matt the key issue here is: this is all just your personal preferences, it has no real relevance to the statistical reality.

You activate a variety of teams, but each is efficiently tuned to its TV, and you have high TV teams to act as cover. In addition, you have reduced the teams you activate as you achieve metagoals with previous ones, so in reality, the "variety" you activate is not nearly as widespread as you make out.

Truth be told though, what is most interesting is you seem to fail to grasp that close TV matches would only reward the type of teambuilding you deem best, while complaining that the current status increases the effect of building on CR.

But actually NONE of this is relevant at all to the actual post - these are scheduler complaints. I am, and was, not talking about the scheduler, merely the calculation of CR. The intention of this post was not to issue forth complaints about the scheduler, but to explain to you, and others, why you lose CR when you play to a tie vs a lower ranked coach regardless of the TV gap.

And to show it is not unfair.
Posted by MattDakka on 2020-02-06 13:51:58
Even if I don't activate all the 24 races anymore, I think we can agree that activating 13 different teams at different TVs is better for the scheduling than activating a single high TV team. I'm open to activating 24 different races again if my opponents do the same and if losing with a tier 3 team doesn't make my CR drop a lot.

Again, if I wanted very close TV matches to rack CR points I would just cycle tier 1 teams, because I would benefit from rookie protection too. TV gap can still happen, but it's less likely with new teams.
I like variety to a degree, so I like to play teams that are a bit developed, because it makes the games more varied than playing rinse-repeat Undead, Norse, Dwarfs, Amazons at TV 1200.
Everybody can finely tune his teams for their TV (I'm glad to help if people want), while not everybody can have 2300 TV teams.
My finely tuned teams, though, are not immune from finding monoactivators, while a cycled team has the rookie protection.

If I tie vs a way lower CR coach than me just because he monoactivated a super high TV (350+ TV gap) and I lose more than 0.25 CR something must be wrong in the CR calculation (and I don't dare to imagine the CR lost in case of defeat).
Over time this calculation mistake will be fixed by other games played, I get that, but it takes time and it's quite frustrating (and unfair) having to play like 10 games and win/tie them all while with a single bad game (created by a low CR monoactivator) I lose CR even with a tie.
Showing me how the CR is calculated doesn't change this.
Posted by ArrestedDevelopment on 2020-02-06 14:37:56
There is nothing stopping you having a massive team - you just need to choose to make it that way. You don't. These are issues with the scheduler you have, and your issue is the meta has changed away from something you would like - ever since the change to 30+ game teams having open matching, and with the additional rules changes of BB2016 supporting maxmax rosters, the meta of the box is different from what you prefer. Incidentally, rookie protection will not really stop TV gaps at all once you pass 5 games. A coach who recycles at 15 games can still end up playing 300+ TV gaps.

Your inability (or lack of willingness) to adjust to a meta is actually a failing as a coach.

There IS something wrong with the CR calculation actually yes, you should lose MORE CR for that tie, or loss. The TV factor is too high in the calculation. You are actually being rewarded in those games with a lower loss of CR (or a greater gain) than you would otherwise get for playing a lower bracketed coach.

Posted by Nelphine on 2020-02-06 15:12:26
To those who are saying racial matchups are really just an extension of the tv problem:

I agree. HOWEVER, racial matchups, with 500tv brackets (so at most 5 brackets), are still a more reliable predictor than the current tv gap value in the formula.

If we're willing to have the TV gap formula at all, it would be better served to add a simple racial matchup table, or replace the tv gap formula entirely with a simple racial matchup formula.

Similarly, crude racial matchups are ALSO more predictive for certain races than CR of the opposing coach. So if we're willing to have CR of opposing coach then we should also at least be willing to have a racial tier (3 tiers) formula in the calculator.


For details, the simple racial matchup would be 24*24*4*4 (you could do 5 brackets, but I think it's fine to just lump the last bracket into the 4th) .
The crude racial tier matchup would be 3*3 (it would only be tier 1 race vs tier 2 race vs tier 3 race, so things like dwarf vs Amazon would be completely missing, and tv wouldn't be there at all).

You don't need an ideal perfect racial matrix to improve the current CR calculator, in terms of making it logically consistent. If picking an opponent with sub 150CR gives you less CR on a win, then picking an opponent playing halflings should give you less CR on a win.
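As a rough illustration of the two table shapes described above, here is a minimal Python sketch. The bracket boundaries (500 TV wide, counted from 0, with everything from 1500 TV upward lumped into the 4th bracket) and all names are assumptions made for the example; the tables themselves would still have to be filled from match data.

```python
RACES = 24
TIERS = 3
BRACKET_WIDTH = 500  # assumed: 500 TV per bracket, counted from 0
LAST_BRACKET = 3     # assumed: 4 brackets (0..3), higher TVs lumped into the 4th


def tv_bracket(tv: int) -> int:
    """Map a team value to one of the four assumed brackets."""
    return min(tv // BRACKET_WIDTH, LAST_BRACKET)


def simple_key(race_a: str, tv_a: int, race_b: str, tv_b: int) -> tuple:
    """Lookup key into the simple 24*24*4*4 racial matchup table."""
    return (race_a, tv_bracket(tv_a), race_b, tv_bracket(tv_b))


# Size of each table, as given in the post.
SIMPLE_CELLS = RACES * RACES * 4 * 4  # 9216
CRUDE_CELLS = TIERS * TIERS           # 9
```

For example, `simple_key("Dwarf", 1200, "Amazon", 2300)` gives `("Dwarf", 2, "Amazon", 3)`. At 9216 cells the simple table is far smaller than a per-100-TV matrix, which is the point of using wide brackets.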
Posted by Naama on 2020-02-06 16:08:08
Hmm i have a pretty limited understanding of how the calculations are actually implemented, but the whole racial matchup factor seems like a pretty huge hassle.

I just remember when Christer was improving the CR calculations a few years back, didn't it calculate CR all the way back from when the site was first created?

Now you add teams to tiers according to the current meta and ruleset, so does LRB4, LRB5, LRB6 era games get calculated with the same tier factors? Or do you somehow add some datafield which tells what ruleset the game was played on? (In case it's not possible to just calculate the CR in a certain way from a specific date forward).

Who even is responsible for tiering teams? And yeah Chaos probably is a tier 1 team at high TV, but is it a tier 1 team at low TV? No.

And say some team gets an awesome new positional or PO wouldn't need a re-roll anymore. That's a huge change in the meta and would affect tiering and there would be upkeep involved to keep racial tiers relevant.

I just feel racial tiers in CR is a long winding rabbit hole that probably brings its own problems that people wouldn't be happy with, and if you can't get it right and valid then isn't that just pushing the problem elsewhere?

Sometimes simple is better (In the voice of Judd Crandall)

Now sorry if all these worries are irrelevant. I'll be the first to admit i'm not the sharpest tool in the shed!
Posted by Nelphine on 2020-02-06 16:22:56
No I completely agree they are relevant. However, despite the worries, it's still a fact that racial matchups are still potentially more important than either tv gap or opponent CR.

The only reason I'm suggesting using this very flawed inaccurate racial matchup is because it's still better than CR and tv gap. (Although I think it should be in addition to those two things. And I would use the simple racial matrix, not the crude racial tiering, I'm just pointing out that even the crude racial tiering is still an improvement.)
Posted by MattDakka on 2020-02-06 16:57:41
@ArrestedDevelopment: about having a maxmax massive TV team, well, if winning with a 440 TV gap makes me earn 0.18, imagine if I had won with same TV team, I would have won 0 CR or negative CR maybe, XD.
In other words, as the system is, there is no benefit in maxmaxing a team in the Box, it's better to just have 1 high TV, but not too high TV team in order to deal with the super high TV monoactivators, that way there is a TV gap but not as big as it could be if I had only 1500 or lower teams.
Posted by MattDakka on 2020-02-06 17:09:30
About improving TV: I think that fan factor is too expensive for what it brings to a game, a good idea could be considering the ff half the TV value it is, so, if a team has 10 ff, it should count as 5 TV, not 10 TV in the hypothetical "FUMBBL TS" for purpose of matchmaking.
Would halving the ff fix the TV inaccuracy?
No, but in my opinion would be a step towards improving it.
Posted by garyt1 on 2020-02-06 17:35:27
"1 hour is lot of time to be wasted locked in a game affected by a big TV gap which doesn't even give many CR points if you win it."

I think this comment from Matt shows a difference in priorities from a lot of coaches. Enjoying the game is important and admittedly a large TV gap may affect that. There is then the part about trying to win the game, whatever the matchup. A win despite a bigger gap may give you a bigger sense of achievement. There is then the part of developing your team. Plus there is the dice factor which can skew individual matches anyway. CR is really just an additional factor you get out at the end of it all.

Also where people complain about overall CR being affected by using tier 3 teams, they should remember they have a racial CR. Which you can post a couple of on your profile. Put your best team and a stunty team and that shows your range.
Posted by Arktoris on 2020-02-06 17:52:07
I don't see how a 20TV gap after inducements is so big. Perhaps someone can educate me on that.
Posted by MattDakka on 2020-02-06 18:10:58
Inducements are not supposed to fully bridge the gap, but to improve the underdog chances to win. Overdog teams tend to win.
They are generally overcosted compared to the same TV invested in skills and stats on the overdog team.

@garyt1: you don't climb the ladder to rank 1 with the sense of achievement, you need CR points to reach it.
I can feel a sense of achievement (I didn't, TV gap games are generally boring in my book), but only when I play vs a Legend coach with evenly matched teams.
Playing vs bad coaches with big overdog teams is plainly boring and not challenging, the challenge could be represented by the roster/stat freaks/killstack, but this is not a true challenge.
The true challenge lies in positioning, not in having a massive roster advantage over the opponent.
I don't have fun even when I'm the overdog, in case you were wondering.
I always prefer even TV matches or very close and in order to improve the chances of even match I activate many teams at different TVs.
Posted by Arktoris on 2020-02-06 18:45:13
Yeah, but is it still fair to say the gap was 440TV after acquiring 420k of inducements?

and while you pay a premium price for inducements, you are getting to select and choose specifically what would work best for that specific game...whereas the overdog may have quite a lot invested in things that will not specifically help in this game. ie if he has 6 tacklers and you have no dodge, that's 120TV wasted. Or if he has 16 players, but only 11 on the board, the other 5 potentially mean nothing.

Meanwhile you're not going to spend 430k for Morg, if your opponent has a chaos minotaur with 6ST and claw. You're going to invest that cash in inducements that will matter.

Yes?
Posted by MattDakka on 2020-02-06 18:55:28
After the inducements have been bought, it's not correct, but when I talk about TV gap I'm referring to pre-match value.

Selecting the inducements: well, cards are not available anymore, and most of the time I hire babe/apo/igor and Wizards + fodder, no matter the opposition.
So, there is not much advantage in the tailoring of the inducements.

Well, the other 5 can mean he could foul or simply having fodder to place on LOS.
Anyway, I just prefer to play vs an even TV team; in my experience games are most often balanced that way, except for certain particular racial match-ups.
Posted by Arktoris on 2020-02-06 19:41:13
thanks Matt.

Yeah, I've played a few games where I struggled to win, only to get a whopping 0.2CR. I don't get flustered over it, I just laugh. Anyway, the low CR gain had a lot more to do with your CR vs your opponent's than the team types and TV. Christer has set up the k to work against the higher bracket coaches with the specific aim to deter cherrypicking low CR coaches.

so the big question is, can one cherrypick in the Box as easily as one can in Ranked?

If yes, then the system should stand. If no, then perhaps an appeal to Christer to ease up on the bracket factor for Box and tournament games.
Posted by MattDakka on 2020-02-06 19:58:53
In the Box you can't cherrypick as in Ranked, of course.
Yes, I don't play in Ranked at all yet I get affected by the Ranked anti-cherrypick part of the formula.
Posted by Arktoris on 2020-02-06 20:05:46
Well true, the cherrying in Ranked is different than Box, but Box still cherries...it simply does so in different ways.

In Box, you can purposely keep your TV low so you can fish the rookie and low CR coaches which tend to have low TV teams.

You can also check to see if someone who'd give you a difficult time is online before activating your teams. If he's already in a game, then all systems go!

You can also find a partner for the Box in Chat and agree to activate the same time.

But are these strategies as good as the ones in Ranked?
Posted by MattDakka on 2020-02-06 20:18:35
Check the win rate of a coach in Ranked and in Box, you will see that most of the time the Ranked win rate is higher (assuming he played enough games in both divisions).
The reason is the ability of accepting/declining good/bad offers.
If Ranked and Box cherrypicking were equally viable the win rate from the 2 divisions would be closer.
Posted by MattDakka on 2020-02-06 20:25:16
If you deliberately keep the TV low in Box without cycling you might still find a TV gap; the coaches who do it tend to cycle tier 1 teams for a few games then retire them, rather than controlling the TV (although some coaches control the TV).

You can avoid a coach if he is offline, but you risk not playing a game at all, and it's hard to avoid more than 1 coach.
There is more than 1 good coach in the Box during Euro time (don't know NA time, I don't play at night anymore).
About rigging a match with a friend: 4 people are needed, it's not assured you will play vs your friend, while in Ranked you can do it 100% sure as long as you don't break the 1/10 rule (which is not a strict rule, as far as I know, but a rule of thumb with some degree of freedom).

Posted by Arktoris on 2020-02-06 21:27:43
well if you find several others who agree with that and present Christer some data, perhaps he'll ease up the coach bracket factor for Box (and tournaments).

as for me, I don't care if a hard fought game yields low results, or vice versa. I like having a 140s CR and 160s CR.

With 140s, I don't have to waste so much time getting a game in Ranked. And with 160s comes a warm fuzzy feeling of recognition for my abilities. It's the 150s I hate. That's the "meh" zone.
Posted by badger89 on 2020-02-06 22:07:43
What a good read
Posted by Endzone on 2020-02-07 02:21:01
Some musings:

1. Generally, it has been my view that you can get more value out of the TV increase from building your team than you can from the equivalent amount of inducements. A newish team that has four players who have learned block (80k) would be advantageous v the 80k inducement, which may end up being spent simply on a Babe or card. However, other factors tend to be more significant, e.g. coaching strength. This is, in my view, good game design. We want there to be some benefit to building a team without underdog teams being unplayable.

2. One exception to the above used to be a wizard. At 150k it was a game changer. Particularly powerful on teams able to capitalise on a loose ball, like woodelves.

3. I think race is significant - but there are a lot of variables in this:
a) actual racial matchup (e.g. amazon strong but weak against dwarves, lizards strong but weak against woodelves etc.)
b) build (e.g. the all zombie Undead team is weak)
c) TV level (Chaos teams tend to get relatively stronger as they move up through TV, amazon and Norse get relatively weaker etc.)
d) coach preference. I consider my win chance to be significantly higher with Woodelves compared to High Elves, due to my experience (or lack of it) with those two races. I expect there are other coaches who like AV8 who would see that differently for them.

4. Halflings like having at least 100k inducements so they can get their hot pot.

5. If a racial factor were added into the CR calculations, it would make lower tiers teams more playable for the CR minded coach. For me, it would need to take account of the racial match up and TV level. I believe win % data by TV band and race is available though so this may be doable? e.g. Lizards v Chaos at 1000k - advantage Lizards (probably no claw) but at 2000k advantage Chaos (probably lots of claw). Dwarves (tackle) > Amazon (dodge) > Khemri (low tackle/tier) > Dwarves. (strength imbalance)

6. I have seen some stuntie coaches post before that CR is not a concern to them though so I don't know whether there is really the demand for more accurately reflecting their coaching ability in CR.


Posted by ArrestedDevelopment on 2020-02-07 07:33:10
@Endzone

1. Indeed. I tried to specifically mention the size of TV gaps being irrelevant - because simply being bigger does have an advantage, but whether it's 20tv or 700tv is actually quite meaningless statistically, it only increases your win odds to a maximum of 60%. And when we look at a CR gap game, I think that the size of the gap is even more irrelevant, in fact, some might argue a legend *should* be spotting an emerging star a bundle of TV/inducements for the game to actually be interesting. Some people might even have run a league where this happened!

2. Agreed, but even at fireball only the wizard is quite powerful - he changes the way a coach has to cage/screen, potentially reduces the impact of riot kickoffs, can be used to make space, and even at the worst of it, is a potential for some free mighty blow hits.

3. Yeah pretty much, which is why as Christer said above, TS would solve the issue: it would cover straight up the strength of a team which would account for stacks, skills, negatraits, and even roster build. Which would leave us with the coach preference unaccounted for, but the overall CR score, given time, should reflect that.

4. One of the issues here is actually halflings have a variety of factors in why they are so weak. I personally think stunties just flat out aren't intended for perpetual style play online (even in R). They're there so the boy wonder at his local league can play a race and not beat everyone 8-0. Not so "TurgidProgress" can make lowbie flings and sit in blackbox all year beating up rookie teams endlessly with a chef every game or so (from treasury or TV gap), or pick endless opponents in R who might not be able to deal with 0-1rr after a chef.
While flings in a TT league generally are guaranteed inducements every game, their opponents are also growing in skills per game. This isn't always the case online. And I do actually feel online you eventually reach a point where you'd just rather see less mb/tackle than pick up the chef.

5. It might, it might also end up making such a small difference that it still rendered CR-minded coaches unlikely to play tier 3 while creating a lot of unnecessary work for others.

6. Half the fun in stunties is taking CR from those who DO care! ;)
Posted by MattDakka on 2020-02-07 12:53:18
A question:
let's say I have a TV 1500 team and I get paired with a TV 2000 team.
I decide to buy no inducements at all.
At the end of the match, when CR points are awarded, does my team count as TV 1500?
Or the CR calculation assumes that inducements are always bought, thus considering my team as if it were TV 2000?
I'm curious, because I could decide not to buy some inducements in order to gain more CR.
Posted by Joost on 2020-02-07 19:52:04
Based on experience i think it doesn’t matter if you buy inducements or not. Which makes me realize that a coach’s ability to pick good inducements is also part of the definition of CR. Of which we should take the rolling average seriously, not any moment’s score as discussed.
Posted by Nelphine on 2020-02-07 23:29:43
Lots of talk of team strength. Obviously, with the nature of my customisable league, I'm quite interested in this topic, and I'd be willing to put a fair bit of work into it.
I'm obviously biased as to my starting point, but does anyone have the old team strength formulas? I'd love to look through them and see if I could propose an update for this edition.
Posted by Nelphine on 2020-02-07 23:37:22
Never mind, found it. I think the rules in my custom league could be used in conjunction with that old team strength to come up with a proposal.
Posted by MattDakka on 2020-12-29 19:32:50
I discovered that, if you don't buy inducements (or you just buy some, without fully bridging the TV gap), you gain more CR points if you win a TV gap game.
So, the CR calculation takes into account the TV of underdog + actually purchased inducements, it doesn't always assume that the underdog buys all the inducements given by the TV gap.
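To make that observation concrete, here is a minimal Python sketch of the TV gap the CR calculation would see if it really uses underdog TV plus purchased inducements. The function name and the clamping of inducement spending to the size of the gap are assumptions for illustration only; this is not Christer's code and the real calculation may differ.

```python
def effective_tv_gap(underdog_tv: int, overdog_tv: int, inducements_spent: int) -> int:
    """TV gap left after counting the inducements actually bought.

    All values are in TV points (thousands of gold). Spending is clamped
    to the size of the gap, which is an assumption of this sketch.
    """
    gap = overdog_tv - underdog_tv
    spent = max(0, min(inducements_spent, gap))
    return gap - spent
```

Using the TV 1500 vs TV 2000 question from earlier in the thread: buying nothing leaves `effective_tv_gap(1500, 2000, 0)` at 500, while spending the full 500 brings it to 0. The 440 TV gap with 420k of inducements mentioned above leaves a gap of 20.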