The Push-Up Lemma (2)

Please recall the three methods mentioned on 11/13/23. We are now familiar with the 1st and 3rd ones and will be with all the three after this subchapter, ready to verify the three-disjoint-set claim. We discussed the necessity of the push-up lemma last time – it allows for the simultaneous core growth staying on the \( g^{th} \)-collective \( \Gamma \) condition, without which accumulated constant overheads will eventually overwhelm the entire construction.

This is how the lemma is formulated:

In the rest of the post today. we overview how the lemma helps finalize the first induction step on \( X_1 \) of \( \boldsymbol{X} \) for the three-disjoint-sets claim.

We continue the overview next time

Links for mobile devices

The Push-Up Lemma (1)

We finished the subchapter to discuss splits of \(X \) last time. In this new chapter, we’ll find all the details of how we can prove the three-disjoint-sets claim exactly. We’ll eventually confirm Proposition 3 on 11/13/23 for a general \( k \) to even try pushing the bound to the limit. I’m planning to include the obtained findings in the next version of Paper 2.

Let’s figure out its proof scenario for \( k=3 \) continued from 4/29/24. Initially, we’re just given an \( {\mathcal F} \) with the \( \Gamma(b) \)-condition where \( b=m^{\frac{1}{2} + \epsilon} \) for an \( m \) sufficiently larger than \( \epsilon^{-1} \). We can without loss of generality replace \( \epsilon \) by \( 2 \epsilon \) to let the constant \( c \) be the minimum possible value of \(m^{\epsilon} \). Then \[ b= c m^{\frac{1}{2} + \epsilon} \] with the large constant \(c \), for which we want to prove such an \( \mathcal{F} \) includes three mutually disjoint sets. This is what Theorem 1.1 of Paper 3 claims.

Further putting \[ r= m^{\frac{1}{2} – \epsilon^2} \] as before, apply \( \mathcal{F} \), \( b \), \( c \), \(r \) and \( k=3 \) to Theorem 6 on 7/29/24. This produces three sets \( C_i \), subfamilies \( \mathcal{F}_i \subset \mathcal{F}[C_i] \), and an \(r \)-split \( \boldsymbol{X} \) that are good for the rank \(g \)-construction. Especially, \( C_i \) are mutually disjoint meeting \( |C_i| < m^{1-\epsilon} \), and \( \mathcal{F}_i \) satisfy \( \Gamma(b/2) \), or just \( \Gamma (b) \) without loss of generality.

Now we perform the rank-\(g\) construction for all three \( \mathcal{F}_i \) on the first strip \(X_1\) of \( \boldsymbol{X} \). Please go through the three steps on p. 8-10 of Paper S4 again to recall what we did in Chapter 3 discussing collective \( \Gamma \)-conditions:

  • Step 1 as 11/27/23: a vast majority of \( \mathcal{F}_{i, Y} = \mathcal{F}_i \cap {X – X_1 \cup Y \choose m} \) meet the density requirement described there, where all \( Y \in {X_1 \choose l} \) are considered with \( l = |X_1|/4 = n/(4r) \).
  • Step 2 as 2/26/24 and 3/18/24: set the \( g \)-parameter weight \(w \) on each \( \mathcal {F}_i \) to apply the conversion lemma. Show \( {\mathcal F}_i \) satisfies a good \( \Gamma_g \)-condition.
  • Step 3 as 4/8/24: use the \( \Gamma_g \) condition for the \(g^{th} \) mark lemma to show a vast majority of \( \mathcal{F}_{i, Y} \) satisfy the exported collective \( \Gamma \)-condition.

Then like 11/6/23, consider all the triples \( (Y_1, Y_2, Y_3 ) \in {X_1 \choose l}^3 \) where \( Y_i \) are the \( Y \) of \( \mathcal{F}_i \) disjoint with the other two. We know that for a vast majority of \( (Y_1, Y_2, Y_3) \), each \( \mathcal{F}_{i, Y_i} \) satisfies both the density requirement and exported \( \Gamma \)-condition, so we can find and fix such a good \( (Y_1, Y_2, Y_3 ) \). Then:

  • This achieves the elementwise disjointness of the three \( \mathcal{F}_i \) within \( X_1 \). (See Proposition 2 on 11/6/23 for the elementwise disjointness.)
  • Each obtained \( \mathcal{F}_{i, Y_i} \) is sufficiently large for the next step on \( X_2 \).
  • By its exported \( \Gamma \)-condition, we can perform the same recursive step on \( X_2 \).

But something is missing before the process is quite complete. It’s about the constant overhead in Step 3: in the second figure of 4/8/24, the LHS of the collective \( \Gamma \)-condition is not bounded by \( |\mathcal{F}_Y|^g \) but by \( \epsilon^{-2} |\mathcal{F}_Y|^g \). If such a constant overhead is allowed accumulated for \( r \) steps, the LHS of the final collective \( \Gamma \)-condition would be bounded by \( \epsilon^{-2 r} |\mathcal{F}_Y|^g \), which compromises the recursive process totally.

We could think of reconstructing the collective \( \Gamma \)-condition for the second step: from the exported \( \Gamma \)-condition, extract a plain \( \Gamma(b_2) \)-condition for some \( b_2 \) close to \( b \) to do the simultaneous core growth like we did on 7/29/24 and after. Then reconstruct a collective condition from the obtained \( \Gamma (b_2 ) \)-condition.

However, please see Proposition 1 on 10/23/24, where we could obtain the collective \( \Gamma( b/2, 1) \)-condition from the given \( \Gamma(b) \)-condition. A constant factor overhead can’t be helped that way either.

So we need a way to perform simultaneous core growth on a given collective \( \Gamma \)-condition. The push-up lemma is designed to achieve this goal in an interesting way. It’s practically Step 6 on p. 20-22 of Paper 3. We’ll formulate it next time.

Links for mobile devices

Partitioning Sets and Families (15)

We have resolved our initial core growth problem last time by proving Theorem 6 given on 7/29/24. With the statement, we can convert an \( {\mathcal F} \) from Proposition 3 on 11/13/23 into \( k \) families \( {\mathcal F}_i \) on the same \(r \)-split \( \boldsymbol{X} \), each satisfying almost the same \( \Gamma \)-condition as the original. This will allow for the rank \( g \)-construction \( r \) times on \( {\mathcal F}_i \) for our confirmation of the three-disjoint-sets claim. It’s been another good step toward it.

There is an interesting remark about Theorem 5 on 7/22/24 related to the complexity of computing the clique function. It was our very first topic to start this blog series given on 1/21/23 and 1/27/23. Please click the two links to recall what we discussed. The clique function \[ f = \bigvee_{C \in {[n] \choose m}} \bigwedge_{E \in {C \choose 2}} E \] detects if a given graph includes an \( m \)-vertex set with every possible edge filled, called \( m \)-clique. The graph has \( n \) vertices \( 1, 2, \ldots, n \) in \( [n] \), and some edges \( (x, y) \in [n] \times [n] \).

To be precise, edges can be undirected in \( G \) so the set of all possible edges can be \( {[n] \choose 2} \), called the edge space, rather than \( [ n] \times [n] \), whereas the vertex space \( [n] \) collects all the possible vertices. A Boolean circuit \( B \) to compute \( f \) is informed of a given \( G \) through the truth values of \( (x, y) \): each \( (x, y) \) regarded as a logical variable is set 1 if the edge exists in \(G \) and 0 otherwise.

In this setting, every node \( \alpha \) of \( B \) is assigned to a family \( {\mathcal E} \subset 2^{{[n] \choose 2} } \) of edge sets, each of which sets \( \alpha \) to be true. Assuming \( m= \sqrt[4] n \) and \( {\mathcal E} \) comprises \( p \)-edge sets where \( p = n^{19/20} \big/ 16 \), let’s consider the following problem.

What makes this problem interesting is that we’re trying to detect an \( m \)-clique in a given \( G \). If there is, any \( r \)-split \( \boldsymbol{X} \) must have an \( X_i \) containing an \( m/r \)-clique. So \(G \supset S\) includes at least \( m/r \) vertices in some \(X_i \), each connected to \( m/r -1 \) other vertices. Condition\((S, X_i ) \) inhibits this implying \( S \) contains no \( m/r \)-clique in \( X_i \). As a result, if we can always find such an \( \boldsymbol{X} \), it could lead to a way to reduce the clique detection problem to a smaller subproblem.

And our answer to the question is yes, by the same proof technique as Theorem 5. The key point is to convert each \( S \in \mathcal{E} \) into a vertex set \( U \) such that every \( x \in U \) is connected to at least \( m/(2r) \) other vertices in \(S \). In other words:

  • In the edge set \( S \), find a vertex \( x_1 \) such that the number of \( (x_1, x) \in S \) is maximum. Put it into \( U \).
  • Find another vertex \( x_2 \) with most \( (x_2, x) \in S \) to put in \( U \).
  • Continue for \( x_3, x_4, \ldots,\) to grow \(U \) until there is no \( x_i \) with \( \#(x_i, x) \ge m/(2r) \).

Then \( |U| \le m/4 \) because \( |S|=p= m^2 / (16r) \). Add any extra vertices to \( T \) to make \( |T| \) exactly \( m/4 \). (This is just to make \(U \) an \( m/4 \)-set not disturbing the construction.)

Now let \( \mathcal{F} \) be the family of such \(U \in {X \choose m/4} \), so we can use the same method as Theorem 5. The only difference comes from more than one \( S \in \mathcal{E} \) possibly mapped to a single \( U \in \mathcal{F} \). But it gives us no problem, because we can just allow multiplicity in \( \mathcal{F} \) so the sparsity could be negative. It can be checked on 5/13/24 and 7/22/24 that the same arguments work anywhere with multiplicity and negative sparsity of \( \mathcal{F} \).

  • It’s so because we count \( (U, \boldsymbol{X} ) \) for each fixed \( U \).
  • \( |{\mathcal F} | = | {\mathcal E} | \), meaning a small minority of \( \mathcal{F} \) correspond that of \( \mathcal{E} \) with the same ratio.

Consequently, by this variation of Theorem 5, there exist an \( \boldsymbol{X} \) and a vast majority \( \mathcal{F}’ \subset \mathcal{F} \) such that \( |U \cap X_i| < m / (2r) \) for every \( U \in \mathcal{F}’ \) and \( X_i \).

This mean that for almost every \( S \), its subset \( S \cap {X_i \choose 2} \) includes no \( m/r \)-clique for any \( X_i \): we can view \( U \) as a label of \(S \) to list possible members of an \( m/r\)-clique in any \( X_i \). As there are less than \( m/(2r) \) such vertices in \( U \cap X_i \), there’s no way \( S \cap {X_i \choose 2} \) can include an \( m/r \)-clique.

This could provide a good platform to show the difficulty of computing cliques by a small enough Boolean circuit \(B \). We’ll detail this technique later when we discuss the computational complexity of the clique function.

Links for mobile devices

Partitioning Sets and Families (14)

We’re in the middle of the induction step \( \Psi(j-1) \Rightarrow \Psi(j) \) to prove Theorem 6, having finished growing the first core \( C_1 \) of \( {\mathcal F}_1 \). It satisfies the four conditions of \( \Pi(j) \) presented last time.

We have proven the induction step. The property \( \Psi(r) \) immediately implies Theorem 6, so we’ve completed its proof as well.

Let me pose one question we might want to ask here: we only assumed the \( \Gamma(b) \)-condition of \( {\mathcal F} \) when it’s given. Is it possible that the final \( {\mathcal F}_i \) with \( \Psi(r) \)-ii) as the only cardinality lower bound gets near-empty? The answer is no. Because the \( \Gamma(b) \)-condition means \( 1= |{\mathcal F}[T]| < b^{-m} |{\mathcal F}| \) for every \( T \in {\mathcal F} \), so \( |{\mathcal F}| > b^m \). With this and \( \Psi(r) \)-i), we can see the final \( {\mathcal F}_i \) is far from empty.

Links for mobile devices

Partitioning Sets and Families (13)

With Theorem 5 proven last time, we can now present a general statement to resolve our initial core growth problem.

This surely offers a solution to the problem: by Proposition 3 on 11/13/23, we are given an \( {\mathcal F} \) with the \( \Gamma ( cm^{\epsilon + \frac{1}{2}} k \ln^2 k ) \) condition. The theorem provides the three objects we need to perform the rank-\(g\) construction on \(X_1 \) and other \(X_j \).

To prove it, use Theorem 5 to obtain an \( r \)-split \( \boldsymbol{X} \) and subfamily \({\mathcal F’} \). Then recursively confirm the following proposition by the simultaneous core growth.

The approximate equality is due to \( r \gg 1 \) \( \Rightarrow \) \( (1- r^{-1})^{-r} \simeq e \), simpler than (1) on 7/20/23 to see. Perform the updates \[ \textrm{(7)} \qquad C_i \leftarrow C_i \cup S, \qquad \textrm{and} \qquad {\mathcal F}_i \leftarrow {\mathcal F}’_i [S], \] for \( i=1 \). Then \( C_1 \) and \( {\mathcal F}_1 \) meet the four required conditions of \( \Psi(j) \).

We’ll continue this to finish the proof of Theorem 6 next time. Note \( \Psi(r) \) implies the theorem.

Links for mobile devices

Partitioning Sets and Families (12)

Having finished our proof of Lemma A.1 of Paper S4 last time, the bound (4) given on 6/17/24 is now available to help resolve our initial core growth problem. First, let’s find a general good \( r\)-split following the same style as Theorem 4 on 6/3/24.

To prove Theorem 5 from (5), we again count the pairs \( (U, \boldsymbol{X} )\) using the same technique as 5/13/24, 11/6/23 and the extra notes of the 2/23 post: among the \( |{\mathcal F}| \prod_{i=1}^r {n-(i-1)n/r \choose n/r} \) such pairs, less than \( m \exp \left( – \frac{m}{12 r} \right) \) of them fail to meet \( \frac{m}{2r} < |U \cap X_1| < \frac{2m}{r} \) due to (5); count them the same way as 5/13/24 fixing each \( U \). By symmetry, less than \( m \exp \left( – \frac{m}{12 r} \right) \) of them fail to meet \( \frac{m}{2r} < |U \cap X_i| < \frac{2m}{r} \) for any particular \( X_i \).

So, all the other \( (U, \boldsymbol{X} )\) that amount to \( \left[ 1- r m \exp \left( – \frac{m}{12 r} \right) \right] |{\mathcal F}| \prod_{i=1}^r {n-(i-1)n/r \choose n/r} \) or more meet the double inequality for every strip \( X_i \). Hence, there must be an \( \boldsymbol{X} \) and subfamily \( {\mathcal F’ } \subset {\mathcal F} \) meeting the two conditions given by Theorem 5, completing its proof.

The theorem helps us justify the simultaneous core growth presented on 6/17/24 for \( k=3 \) completely. Given an \( {\mathcal F} \) satisfying \( \Gamma (b) \) where \( b=m^{\frac{1}{2} + \epsilon} \), find a good \( \boldsymbol{X} \) and \( {\mathcal F}’ \) by the theorem. For \( j=1, 2, \ldots, r \), we recursively prove the existence of:

  • mutually disjoint three sets \( C_i \) with \( |C_i | = O( j r \ln m ) \),
  • and subfamilies \( {\mathcal F}_i \subset {\mathcal F}'[C_i] \) for \( i \in [3] \),
  • satisfying the \( \Gamma (b_j, 2) \)-condition in \( X – C_i \), i.e., \( | {\mathcal F}_i [S] | < 2 b_j^{-|S|} | {\mathcal F}_i| \) for every nonempty \( S \subset X – C_i \), where \( b_j = b( 1- 1/r)^j \),
  • and \( |U \cap X_{j’}| = q_{j’} \) for each \( j’ \le j \) and \( i \in [3]\), some \( q_{j’} \in \left( \frac{m}{2r}, \frac{2m}{r} \right) \), and all \( U \in {\mathcal F}_i \).

You can check the truth for each \( j \) by performing the process described there. Next time, we’ll generalize this to any given \( k \) to pose our final solution to the initial core growth problem.

Links for mobile devices

Partitioning Sets and Families (11)

In our proof of Lemma A.1 of Paper S4, we figured out last time that (A.3): \( u \le – \alpha \Delta \) suffices for the \( u \) expressed as (A.4) in the two cases of \( \alpha \).

In Case 2 where \[ \frac{y’}{x’} \ge \frac{2x}{y}, \qquad p \ge 1, \qquad \textrm{and} \qquad \alpha = \ln 4 -1> 0.386, \] we see that \( \left( \frac{1}{p} + 1 \right) \ln (1+p) \ge 2 \ln 2 \): put the LHS in \( f (p) \) to find \( f'(p) = p^{-2} [p – \ln (1+p) ] >0 \), due to \( p – \ln (1+p) >0 \) if \( p \ge 1 \). So \( f(p) \) takes minimum \( f(1) = 2 \ln 2 \) at \( p=1 \).

\( |q| \) is always less than 1 so we can still use the above upper bound. Consequently, \[ u \le -\Delta (2 \ln 2 – 1) = – \alpha \Delta, \] confirming (A.3) in Case 2 as well. We’ve completed the proof of Lemma A.1 now.

Links for mobile devices

Partitioning Sets and Families (10)

We now show (A.3): \( u \le – \alpha \Delta \) to prove Lemma A.1 of Paper S4 that is also posed on 6/25/24. We figured out last time that (A.3) implies the lemma.

We’ll use the Taylor series of \( \ln (1+x) \) in (A.4) to show \( u \le -\alpha \Delta \). But we need to check if \( |p| < 1 \) and \( |q|<1 \) before it.

We’ll continue the arguments to finish proving (A.3) next time.

Links for mobile devices

Partitioning Sets and Families (9)

We now prove Lemma A.1 of Paper S4. To devise an upper bound on \( \ln \frac{{x-x’ \choose y-y’}{x’ \choose y’}}{{x \choose y}} \), we rely on Lemma 1.1 of Paper S3 again like 5/27/24 and 6/10/24:

Please check this to get convinced, where \( u \) is from the \( y \left( \ln \frac{x}{y} + 1 \right)\) terms of (A.1) for the three binomial coefficients, \( v_j \) are from the \( y s \left( \frac{y}{x} \right) \) terms, and \( – 2^{-1} \ln 2 \pi \beta \) is from the \( \frac{1}{2} \ln \frac{x}{2 \pi y (x-y)} \) terms.

This simplifies the situation a lot allowing us to focus on just two relations: \[ \textrm{(A.2)} \qquad v_j \ge 0, \quad \textrm{for all}~ j>0, \qquad \qquad \textrm{and} \qquad \qquad \textrm{(A.3)} \qquad u \le – \alpha \Delta. \] It’s clear in the above figure that the two imply our target bound.

We start confirming (A.3) next time.

Links for mobile devices

Partitioning Sets and Families (8)

Let’s focus on Lemma A.1 of Paper S4 that shows the number bound indicating the chart we made last time. To show (4) there, the formulation of Lemma A.2 in Appendix of Paper 3 is not quite relevant although it is ok for Lemma 4.1 on p. 12 to refer to. It’s good if it’s changed into the formulation of Lemma A.1, covering a broader range of applications such as showing (4) to resolve our initial core growth problem.

Before breaking down its proof, let’s get convinced that \( {m \choose q}{n-m \choose d-q} \) regarded as a function of \( q \in [0, m] \) takes maximum around \( q \simeq \frac{md}{n} = \frac{m}{r} \) given that \( m \ll d \ll n \). This is because \[ {m \choose q+1} = \frac{m-q}{q+1} {m \choose q}, \quad \textrm{and} \quad {n-m \choose d-q-1} =\frac{d-q}{n-m+q+1}{n-m \choose d-q}, \] so \( {m \choose q+1}{n-m \choose d-q-1} = t {m \choose q} {n-m \choose d-q} \) where \[ t= \frac{(m-q)(d-q)}{(q+1)(n-m+q+1)} = \frac{md}{qn} \frac{\left[1-O(q/m) \right]\left[1-O(q/d) \right]}{\left[1 + O(1/q) \right] \left[1 – O(m/n) \right]} \simeq \frac{md}{qn}, \] Here the approximate equality is due to the extra condition \( 1 \ll q \ll m \) added to \( m \ll d \ll n \). So, if \( q > m/r + o(1) \), then \( {m \choose q}{n-m \choose d-q} \) increases when \( q\) is incremented, and decreases if \( q < m/r – o(1) \). (Asymptotic notations here work just as well as 8/11/23. )

If \( q = O(1) \), the value of \(t \) is \( \frac{m}{O(r)} \) so it has the same asymptotic behavior upon the incrementation. If \( q \ge \epsilon m \), then \( {m \choose q} \) increases by less than \( \epsilon^{-1} \), and \( {n-m \choose d-q} \) reduces by an approximate factor \( r^{-1} \) not affecting the asymptotic decrease rate. Therefore, we conclude that \( {m \choose q}{n-m \choose d-q} \) takes maximum at either \( \lfloor m/r \rfloor \) or \( \lceil m/r \rceil \) when \( m \ll d \ll n \): if \(q \le \lfloor m/r \rfloor -1 \) the function still increases, and if \(q \ge \lceil m/r \rceil +1 \), it decreases. There’s a subtlety at \( q= \lceil m/r \rceil \), but you can check that \( q= \lceil m/r \rceil~ \Rightarrow~ t<1 \) by a bit more involved argument, so you know it can’t take maximum at \( q \ge \lceil m/r \rceil +1 \).

As said before, we are analyzing the distribution of the summand \( {m \choose q}{n-m \choose d-q} \) of the Vandermode’s identity \( \sum_{q=0}^m {m \choose q}{n-m \choose d-q} = {n \choose d} \). This is equivalent to the following visualization involving random \(d \)-sets:

It’s good to know that random \( d \)-sets are distributed in \( X \) in the above sense. Each element of \( U \) is chosen almost independently so it gets in \( X’ \) with probability \( \simeq m/n \), and in \( X- X’ \) with probability \( \simeq (n-m)/n \). (Element placements are not completely independent because you can’t place one where it’s already taken. But \(n \) is much larger than \( d \) giving a small difference from \(d \) independent trials.) So a typical \(U \) must have \( \frac{|U \cap X’|}{|U- X’|} \) around \( \frac{m}{n-m} \). This insight allows us to derive (2) on 5/27/24 and Remark 4 on 6/10/24, where we can just consider typical \(U \) with \( m_*/2 < q < 2m_* \) because all others have a negligible occurrence probability.

A lot of statistical distributions, not just combinatorial ones, have a similar tendency that event occurrence probabilities are concentrated in a small range of a random variable. This is often due to the nature of some binomial coefficients.

Here is our formulation of the lemma:

So if the lemma is true, we have (4) last time stepping forward to resolve our problem. We’ll start seeing its proof next time.

Links for mobile devices