Our paper with Sebastian Macaluso, Craig Greenberg, Nick Monath, Kyle Cranmer, Andrew McGregor, and Andrew McCallum is now published in AISTATS2021. You can find a preprint of the article on arXiv. We present novel dynamic-programming algorithms for exact inference in hierarchical clustering based on a novel trellis data structure, and we prove that we can exactly compute the partition function, maximum likelihood hierarchy, and marginal probabilities of sub-hierarchies and clusters. Our algorithms scale in time and space proportional to the powerset of N elements which is super-exponentially more efficient than explicitly considering each of the (2N-3)!! possible hierarchies. Thanks to NSF 1934846 for funding this work.
Cluster Trellis: Data Structures & Algorithms for Exact Inference in Hierarchical Clustering appears in AISTATS2021
Published on:
Apr 12, 2021