In my application I have $m_1 \ge 1$ *good* models and $m_2 \ge 1$ *bad* models (so to speak) that have produced (after some other calculations) activity values $a_1, a_2 \in (0,1)$ respectively, where $0$ means *inactive* and $1$ means fully *activated*. I always take the difference: $$d = a_1 - a_2 \in (-1,1)$$ where values closer to $-1$ indicate that the *good* models are, in general, more deactivated than the *bad* ones of that grouping ($m_1$ vs $m_2$ models), values closer to $d = 1$ indicate that they are more activated, and values closer to $0$ indicate that there is no activation difference between the two groups of models.

That works fine, but I wanted to **take into account the number of models** $m_1, m_2$ to counteract the *bias* that is being introduced, as you can see in the following example: $a_1 = 0.9, a_2 = 0.1, d = 0.8$, with $m_1 = 5, m_2 = 1000$. I would like a smaller difference than $0.8$, since $m_1 \ll m_2$. So, I did a weighted average (of sorts): $$d_w = \frac{m_1 a_1 - m_2 a_2}{m_1 + m_2}$$
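To make the example concrete, here is a small sketch (plain Python, variable names are my own) computing both the plain difference and this weighted variant for the numbers above:

```python
# Plain difference vs. the weighted "average" difference from above.
m1, m2 = 5, 1000      # number of "good" and "bad" models
a1, a2 = 0.9, 0.1     # their activity values in (0, 1)

d = a1 - a2                              # plain difference
d_w = (m1 * a1 - m2 * a2) / (m1 + m2)    # size-weighted difference

print(round(d, 3))    # 0.8
print(round(d_w, 3))  # -0.095
```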

The problem is that the penalty is now too large, e.g. for the previous example $d_w = -0.095$, which is wrong, since I would never expect it to drop below zero in this case.

So I want to **reduce the penalty** by making the numbers smaller and *closer together*, and what better than $\log_{10}$ to do exactly that: $$d_{lw} = \frac{\log(m_1)\,a_1 - \log(m_2)\,a_2}{\log(m_1) + \log(m_2)}$$
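The log-weighted formula above can be sketched directly (a minimal version, assuming base-10 logs as written):

```python
import math

def d_lw(m1, m2, a1, a2):
    """Log-weighted activity difference from the question (base-10 logs)."""
    w1, w2 = math.log10(m1), math.log10(m2)
    return (w1 * a1 - w2 * a2) / (w1 + w2)

print(d_lw(5, 1000, 0.9, 0.1))  # ≈ 0.0889
```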

Now, the previous example produces $d_{lw} = 0.0889$. Much more sensible! The *good* models are more active, but since I only have $5$ of them vs $1000$ *bad* ones, the estimate of the activity difference is penalized.

But for $m_1 = 1$ or $2$ models, $d_{lw}$ turns out negative in the example! And if we set $m_1 = m_2$, I would expect the result to equal the original difference, $d_{lw} = d$, but it does not 🙁 (it comes out as $d/2$). I tried doubling the difference, $d_2 = 2\,d_{lw}$, which solves this, but now of course $d_2 \in (-2,2)$, which I don't want.
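Both failure modes are easy to check numerically (a sketch, restating the log-weighted formula from above):

```python
import math

def d_lw(m1, m2, a1, a2):
    # Log-weighted difference from the question (base-10 logs).
    w1, w2 = math.log10(m1), math.log10(m2)
    return (w1 * a1 - w2 * a2) / (w1 + w2)

# Small m1 flips the sign even though the "good" models are more active:
print(round(d_lw(1, 1000, 0.9, 0.1), 3))  # -0.1 (log10(1) = 0 kills a1)
print(d_lw(2, 1000, 0.9, 0.1) < 0)        # True

# Equal group sizes give d/2 instead of d:
print(round(d_lw(50, 50, 0.9, 0.1), 3))   # 0.4, i.e. (a1 - a2) / 2
```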

I'm looking for a difference function $f(m_1, m_2, a_1, a_2) \in (-1,1)$ for which the following properties hold:

- $f(m, m, a_1, a_2) = a_1 - a_2 = d$
- $\lim_{m_1 \ll m_2} f(m_1, m_2, a_1, a_2) = 0$ (as in the previous example). The same if $m_1 \gg m_2$.
- The transition from $m_1 = m_2$ towards the extreme cases (where the model numbers differ greatly) should be **not steep**. I don't know how to express this in a single word, but what I mean is that when the model counts start to move away from equality, $m_1 = m_2$, the difference $d$ should be hard to change, and only near the extremes should we begin to see a notable difference… I have also called this last property the **mountain** property (it would be interesting to see what it means in mathematical terms), since equality is like the top of the mountain and to the right and to the left are the slopes, which in my case I want to be *passable*, that is, not steep.