Formula for a weighted average difference

In my application I have instances where $m_1 \ge 1$ models and $m_2 \ge 1$ models have produced activity values (after some other calculations) $a_1, a_2 \in (0,1)$ respectively ($0$ means inactive, $1$ means totally activated); the $m_1$ models represent the "good" ones and the $m_2$ models the "bad" ones, so to speak. I always take the difference: $$d = a_1 - a_2 \in (-1,1)$$ where values closer to $-1$ indicate that, in general, the good models are more deactivated than the bad ones of that grouping ($m_1$ vs. $m_2$ models), values closer to $d = 1$ indicate that they are more activated, and values closer to $0$ indicate that there is no activation difference between the two groups of models.

That works fine, but I wanted to take the numbers of models $m_1, m_2$ into account to counteract the bias being introduced, as you can see in the following example: $a_1 = 0.9$, $a_2 = 0.1$, $d = 0.8$, with $m_1 = 5$, $m_2 = 1000$. Here I would like to get a smaller difference than $0.8$, since $m_1 \ll m_2$. So I took a weighted average (of sorts): $$d_{w} = \frac{m_1 a_1 - m_2 a_2}{m_1 + m_2}$$
The problem is that the penalty is now too large; e.g., for the previous example, $d_w \approx -0.095$, which is wrong, since I would never expect it to drop below zero in this case.
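To make the arithmetic concrete, here is a minimal Python sketch of the example above (my own illustration, not part of the original setup; the variable names are just placeholders):

```python
# Example from above: 5 "good" models at activity 0.9, 1000 "bad" models at 0.1.
m1, a1 = 5, 0.9
m2, a2 = 1000, 0.1

d = a1 - a2                            # plain difference
d_w = (m1 * a1 - m2 * a2) / (m1 + m2)  # weighted (too harsh) version

print(d)    # 0.8
print(d_w)  # approx. -0.095 -- negative, although the good models are clearly more active
```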

So I want to reduce the penalty by making the weights smaller and closer to each other, and what better way to do exactly that than $\log_{10}$: $$d_{lw} = \frac{\log(m_1)\, a_1 - \log(m_2)\, a_2}{\log(m_1) + \log(m_2)}$$

Now the previous example yields $d_{lw} \approx 0.0889$. Much more sensible! The good models are more active, but since I only have $5$ of them vs. $1000$ bad ones, the estimate of the activity difference is penalized.

For $m_1 = 1$ or $m_1 = 2$ models, however, $d_{lw}$ comes out negative in the example! And if we set $m_1 = m_2$ I would expect the result to equal the original difference, $d_{lw} = d$, but it does not 🙁 (the weights cancel and it comes out as $d/2$). I tried doubling that last difference, $d_2 = 2 d_{lw}$, which fixed this, but now of course $d_2 \in (-2,2)$, which I don't want.
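Here is a short Python sketch (my own, assuming base-10 logs as in the formula above) that reproduces the $0.0889$ figure and both failure modes:

```python
from math import log10

def d_lw(m1, m2, a1, a2):
    """Log-weighted activity difference, as defined above (base-10 logs)."""
    w1, w2 = log10(m1), log10(m2)
    return (w1 * a1 - w2 * a2) / (w1 + w2)

print(d_lw(5, 1000, 0.9, 0.1))   # approx. 0.089 -- the sensible case quoted above

# Failure mode 1: very small m1 flips the sign even though a1 > a2.
print(d_lw(2, 1000, 0.9, 0.1))   # approx. -0.0088
print(d_lw(1, 1000, 0.9, 0.1))   # approx. -0.1, since log10(1) = 0 kills the first term

# Failure mode 2: equal group sizes halve the plain difference instead of preserving it.
print(d_lw(50, 50, 0.9, 0.1))    # approx. 0.4 = d/2, not d = 0.8
```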

I'm looking for a difference function $f(m_1, m_2, a_1, a_2) \in (-1,1)$ for which the following properties hold:

  • $f(m, m, a_1, a_2) = a_1 - a_2 = d$
  • $\lim_{m_1 \ll m_2} f(m_1, m_2, a_1, a_2) = 0$ (as in the previous example). The same if $m_1 \gg m_2$.
  • The transition from $m_1 = m_2$ to the extreme cases (where the model numbers differ greatly) should not be steep. I don't know how to express this in a single word, but what I mean is that as the model numbers start to move away from equality $m_1 = m_2$, the difference $d$ should be hard to change, and only near the extremes should we begin to see a notable change… I have also called this last property the "mountain" property (it would be interesting to see what it means in mathematical terms), since equality is like the top of the mountain and to the right and left are the slopes, which in my case I want to be passable, i.e. not steep (see the sketch below).
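To make these requirements testable, here is a small Python sketch (entirely my own framing; `f` is whatever candidate function one wants to try, and the numeric thresholds are arbitrary placeholders) that probes the three properties numerically:

```python
def check_properties(f, a1=0.9, a2=0.1):
    """Numerically probe the three desired properties of f(m1, m2, a1, a2)."""
    d = a1 - a2

    # Property 1: equal model counts must recover the plain difference.
    assert abs(f(10, 10, a1, a2) - d) < 1e-9

    # Property 2: the value should approach 0 as the counts diverge (either way).
    assert abs(f(1, 10**9, a1, a2)) < 0.01
    assert abs(f(10**9, 1, a1, a2)) < 0.01

    # Property 3 ("mountain"): a small step away from m1 == m2 should barely
    # change the value; the drop toward 0 should happen only near the extremes.
    assert abs(f(10, 12, a1, a2) - d) < 0.05

    print("all three properties hold for this candidate")
```

For instance, the plain difference `lambda m1, m2, a1, a2: a1 - a2` passes properties 1 and 3 but fails property 2, while the $d_{lw}$ above fails property 1 (it returns $d/2$ at equality).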