H-NCut – Hypergeometric NCut

H-NCut is the default and recommended objective. It provides a per-vertex hypergeometric envelope using Holder-binned degree weights (Theorem 2).

Quick usage

from pgcuts import HyCut

# hyp_ncut is the default
labels = HyCut(n_clusters=10).fit_predict(X)

Objective

\[\text{H-NCut} = \sum_\ell \sum_i \frac{M_{i\ell}(P)}{d_i} \cdot \Phi_\ell(d_i)\]

where the per-vertex envelope \(\Phi_\ell(d_i)\) is a Holder product over degree bins:

\[\Phi_\ell(d_i) = \prod_j \left[ {}_{2}F_{1}\!\left(-m, 1;\, \frac{d_i}{\beta^*_j}+1;\, \bar{\alpha}_{\ell j}\right) \right]^{w_j}\]

Here \(\beta^*_j = \min_{v \in \text{bin } j} d_v\) and \(w_j\) is the bin weight.

Binning strategies

from pgcuts import equal_size_bins, log_kmeans_bins

degrees = W.sum(axis=1).A1  # from the KNN graph
bins = log_kmeans_bins(degrees, num_bins=16)
  • equal_size_bins: Simple, robust. Bins have equal vertex count.

  • log_kmeans_bins: K-Means on log-degrees. Better for heavy-tailed distributions (recommended).

Key hyperparameters

m (int, default: 512)

Polynomial degree for \({}_{2}F_{1}\). Higher = tighter bound.

num_bins (int, default: 16)

Number of degree bins. More bins = finer envelope approximation.

n_neighbors (int, default: 50)

KNN graph density. 20–100 typical range.

Low-level API

from pgcuts.algorithms import hyp_ncut_step

cut_loss, balance, alpha_ema = hyp_ncut_step(
    logits, left_idx, right_idx, w,
    alpha_ema, degrees, node_to_bin,
    q_stars, beta_stars, bin_weights,
    m=512, ema=0.9,
)
H-RCut vs H-NCut

Property

H-RCut

H-NCut

Normalization

RatioCut (\(1/n\))

NCut (\(1/\text{vol}\))

Vertex weights

Homogeneous (\(\beta = 1\))

Per-vertex (\(\beta = d_i\))

Envelope

Per-cluster

Per-vertex (Holder-binned)

Best for

Equal-sized clusters

Unbalanced clusters, varied degrees