Let’s focus on Lemma A.1 of Paper S4 that shows the number bound indicating the chart we made last time. To show (4) there, the formulation of Lemma A.2 in Appendix of Paper 3 is not quite relevant although it is ok for Lemma 4.1 on p. 12 to refer to. It’s good if it’s changed into the formulation of Lemma A.1, covering a broader range of applications such as showing (4) to resolve our initial core growth problem.
Before breaking down its proof, let’s get convinced that \( {m \choose q}{n-m \choose d-q} \) regarded as a function of \( q \in [0, m] \) takes maximum around \( q \simeq \frac{md}{n} = \frac{m}{r} \) given that \( m \ll d \ll n \). This is because \[ {m \choose q+1} = \frac{m-q}{q+1} {m \choose q}, \quad \textrm{and} \quad {n-m \choose d-q-1} =\frac{d-q}{n-m+q+1}{n-m \choose d-q}, \] so \( {m \choose q+1}{n-m \choose d-q-1} = t {m \choose q} {n-m \choose d-q} \) where \[ t= \frac{(m-q)(d-q)}{(q+1)(n-m+q+1)} = \frac{md}{qn} \frac{\left[1-O(q/m) \right]\left[1-O(q/d) \right]}{\left[1 + O(1/q) \right] \left[1 – O(m/n) \right]} \simeq \frac{md}{qn}, \] Here the approximate equality is due to the extra condition \( 1 \ll q \ll m \) added to \( m \ll d \ll n \). So, if \( q > m/r + o(1) \), then \( {m \choose q}{n-m \choose d-q} \) increases when \( q\) is incremented, and decreases if \( q < m/r – o(1) \). (Asymptotic notations here work just as well as 8/11/23. )
If \( q = O(1) \), the value of \(t \) is \( \frac{m}{O(r)} \) so it has the same asymptotic behavior upon the incrementation. If \( q \ge \epsilon m \), then \( {m \choose q} \) increases by less than \( \epsilon^{-1} \), and \( {n-m \choose d-q} \) reduces by an approximate factor \( r^{-1} \) not affecting the asymptotic decrease rate. Therefore, we conclude that \( {m \choose q}{n-m \choose d-q} \) takes maximum at either \( \lfloor m/r \rfloor \) or \( \lceil m/r \rceil \) when \( m \ll d \ll n \): if \(q \le \lfloor m/r \rfloor -1 \) the function still increases, and if \(q \ge \lceil m/r \rceil +1 \), it decreases. There’s a subtlety at \( q= \lceil m/r \rceil \), but you can check that \( q= \lceil m/r \rceil~ \Rightarrow~ t<1 \) by a bit more involved argument, so you know it can’t take maximum at \( q \ge \lceil m/r \rceil +1 \).
As said before, we are analyzing the distribution of the summand \( {m \choose q}{n-m \choose d-q} \) of the Vandermode’s identity \( \sum_{q=0}^m {m \choose q}{n-m \choose d-q} = {n \choose d} \). This is equivalent to the following visualization involving random \(d \)-sets:
It’s good to know that random \( d \)-sets are distributed in \( X \) in the above sense. Each element of \( U \) is chosen almost independently so it gets in \( X’ \) with probability \( \simeq m/n \), and in \( X- X’ \) with probability \( \simeq (n-m)/n \). (Element placements are not completely independent because you can’t place one where it’s already taken. But \(n \) is much larger than \( d \) giving a small difference from \(d \) independent trials.) So a typical \(U \) must have \( \frac{|U \cap X’|}{|U- X’|} \) around \( \frac{m}{n-m} \). This insight allows us to derive (2) on 5/27/24 and Remark 4 on 6/10/24, where we can just consider typical \(U \) with \( m_*/2 < q < 2m_* \) because all others have a negligible occurrence probability.
A lot of statistical distributions, not just combinatorial ones, have a similar tendency that event occurrence probabilities are concentrated in a small range of a random variable. This is often due to the nature of some binomial coefficients.
Here is our formulation of the lemma:
So if the lemma is true, we have (4) last time stepping forward to resolve our problem. We’ll start seeing its proof next time.
Links for mobile devices