Partitioning Sets and Families (7)

We’ve finished reviewing the \(r\)-split lemma presented on 5/13/24, outlining last time how its proof technique resolves the initial core growth problem. However, our plan for the fix is not quite complete. We can see the shortcoming as follows: consider \( k=3 \) for simplicity. As noted in the last post, we can grow the core \( C_1 \) up to \( |C_1|= O (r \log m) \) for the first family \( {\mathcal F}_1 \subset {\mathcal F} \) so that it maintains its \( \Gamma(b_1) \)-condition for some \(b_1\simeq b\). This works for many \( d\)-sets \( X_1 \) but possibly not for all of them. For each good \(X_1 \), keep \( C_1 \) and delete \( {\mathcal F}[C_1] \) from \({\mathcal F} \). When we try the core growth for the second family \( {\mathcal F}_2 \), we notice that the remaining \( {\mathcal F} \) depends on both \(X_1 \) and \(C_1 \). We can try to find \(C_2 \), but the \(X_1 \) already fixed for each \( C_1 \) may not work for \( C_2 \).

Subtleties of this kind arise very often in this study, almost by default, which can be both interesting and frustrating. Fortunately, there’s a good repair for this particular case. Instead of fixing the target \( q \), we allow all \( U \in {\mathcal F}_i \) such that \[ (3) \qquad \frac{1}{2} m_* < q < 2m_*, \quad \textrm{where} \quad q = |X_{j+1} \cap U|, \quad m_* = \frac{m}{r} \simeq m^{\frac{1}{2} +\epsilon^2}, \] and \( j=0 \). It can be shown that a vast majority of \( U \) satisfy the double inequality, so we can a priori eliminate all \( X_{j+1} \) undesired for any \( {\mathcal F}_i \) with only trivial sparsity increases.

Now let’s consider the general \( j^{th} \) induction step, where we are given three \( {\mathcal F}_i \subset {\mathcal F}[C_i] \) satisfying a \( \Gamma(b_j) \)-condition. After the \( (X_{j+1}, U) \)-arguments in \( X’ \) as on 5/20/24, fix any remaining good \( X_{j+1} \) to eliminate all \( U \) such that \( \neg \) (3) anywhere. Then the \( {\mathcal F}_i\) satisfy the \( \Gamma(b_j) \)-condition approximately, because \( |{\mathcal F}_i| \) didn’t change by much.

Perform the simultaneous core growth we mentioned last time: let \( {\mathcal F}_{1, q} \) be the family of \( U \in {\mathcal F}_1 \) satisfying (3) with that \(q \). Choose a \( q \) maximizing \( |{\mathcal F}_{1, q}| \). Find a maximal \( S \subset X- C_1 \) such that \( |{\mathcal F}_{1, q}[S]| \ge b_{j+1}^{-|S|} |{\mathcal F}_{1, q}| \) for some \( b_{j+1} \) slightly smaller than \( b_j \). Then perform the updates \( C_1 \leftarrow C_1 \cup S \) and \( {\mathcal F}_1 \leftarrow {\mathcal F}_{1, q}[S] \), which satisfies the \( \Gamma (b_{j+1}) \)-condition. Because \( |S|=O(r \log m) \), we can delete \( {\mathcal F}[C_1] \) from the other two \( {\mathcal F}_i \) while keeping the \( \Gamma(b_j) \)-condition approximately true.
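To make the greedy step concrete, here is a minimal sketch of one core-growth round. It is purely illustrative: the function name `grow_core` and the toy family are my own, and I represent \( {\mathcal F}[S] \) as the subfamily of sets containing \( S \). Greedily enlarge \( S \) while \( |{\mathcal F}[S]| \ge b^{-|S|} |{\mathcal F}| \) holds; when no element can be added, \( S \) is maximal.

```python
# Hypothetical sketch of one core-growth step (illustrative names only).
# F[S] is modeled as the subfamily { U in F : S is a subset of U }.
def grow_core(F, ground, b):
    """Greedily grow S within `ground` while |F[S]| >= b^(-|S|) * |F|."""
    S = set()
    while True:
        for x in sorted(ground - S):
            T = S | {x}
            # Count the subfamily F[T] and test the Gamma-style threshold.
            if sum(1 for U in F if T <= U) >= len(F) * b ** (-len(T)):
                S = T
                break
        else:
            break  # no element can be added: S is maximal
    return S, [U for U in F if S <= U]

# Toy demonstration with b = 2 on a 3-element ground set.
F = [frozenset(s) for s in ({1, 2}, {1, 3}, {1, 2, 3}, {2, 3})]
S, FS = grow_core(F, {1, 2, 3}, 2)
```

At real scale the post's bound \( |S| = O(r \log m) \) follows because each added element costs a \( b^{-1} \) factor of the family size; the sketch only shows the stopping rule, not that bound.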

Now that \( X_{j+1} \) is fixed, we can grow the second core \( C_2 \) without worrying about errors. Perform the same core growth for \( i=2\), then eliminate \( {\mathcal F}[C_2] \) from both \( {\mathcal F}_1 \) and \( {\mathcal F}_3 \); this doesn’t disturb the approximate \( \Gamma \)-conditions since \( |C_2| \ll b_{j+1} \). Do the same for \( {\mathcal F}_3 \), and repeat recursively for \( j=0, 1, \ldots, r\) to complete the construction. On the obtained \( r \)-split, the \( m\)-sets \( U \) merely satisfy (3) rather than \( q= m_* \) exactly, but a constant factor on \( m_* \) isn’t an issue for the rank-\( g \) construction. The final \( {\mathcal F}_i \) satisfy the \( \Gamma(b_r) \)-condition for some \( b_r > \epsilon b \), and the three \( C_i \) are mutually disjoint.

This description of the construction process is almost complete for finding our final \( {\boldsymbol X} \), \( C_i \), and \( {\mathcal F}_i \), except for the justification of some numerical bounds. In particular, we need to make sure that the first step \( j=0 \) removes less than an \( \exp ( -\sqrt m ) \) fraction of \( {\mathcal F} \) when eliminating \( U \) such that \( \neg \) (3).

In other words, we want to show \[ (4) \qquad \sum_{2^{-1} m_* < q < 2 m_* } \frac{{m \choose q}{n-m \choose d-q}}{{n \choose d}} > 1 - e^{-\sqrt m}. \] Graphically, it looks like the chart below:

It’s about the distribution of the summand \( f(q) = {m \choose q}{n-m \choose d-q} \) of Vandermonde’s identity \( \sum_{q=0}^m {m \choose q}{n-m \choose d-q} = {n \choose d} \); we used it on 6/6/23 to prove the complement sparsity lemma. We see in the chart that \( f(q) \) is maximized when \( q=\frac{dm}{n}=\frac{m}{r} =m_* \), much larger than the values with \( q>20=2m_* \) or \( q<5=m_*/2 \). The function decreases more quickly when \( q> 2 m_* \).
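A quick numeric sketch of this concentration, with hypothetical toy parameters of my own choosing (the post's \( m, n, d \) are asymptotically large, so at this small scale the mass in the window is only about 0.99, well short of the \( 1 - e^{-\sqrt m} \) of (4), but the shape of the distribution is already visible):

```python
from math import comb

# Hypothetical toy parameters; here d*m/n = m/r = m_* = 20, so r = 20.
n, d, m = 10_000, 500, 400
m_star = d * m // n  # expected intersection size dm/n = m/r

# Vandermonde's identity: sum_q C(m,q) C(n-m,d-q) = C(n,d).
total = comb(n, d)
weights = [comb(m, q) * comb(n - m, d - q) for q in range(m + 1)]

# The summand f(q) peaks at q = m_*, and the window of (3) carries
# almost all of the mass.
mode = max(range(m + 1), key=lambda q: weights[q])
window_mass = sum(weights[q] for q in range(m + 1) if m_star / 2 < q < 2 * m_star)
fraction = window_mass / total  # exact big-int division, correctly rounded
```

This is just the hypergeometric distribution of \( q = |X_{j+1} \cap U| \); Lemma A.1 is what replaces this numeric observation with a quantitative tail bound.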

Lemma A.1 in the Appendix of Paper S4 proves this quantitatively, implying (4). While arranging material for the next posts, I noticed that its formulation, copied from Paper 3, was not quite suitable for confirming (4). So I fixed it, with the same proof, to meet our need. Also, the parameter of the \( \Gamma \)-condition in Proposition 3 on 11/13/23 should read \( c m^{\frac{1}{2} +\epsilon} k \ln^2 k \), changing \( \ln k \) into \( \ln^2 k = (\ln k)^2 \). This works, although it may not be the optimal bound.

Lemma A.1 is useful throughout this research; we’ll need it later when we discuss the circuit complexity of the clique function. In the next couple of posts, we’ll break down its proof for a complete understanding.
