Thursday, June 30 • 10:45am - 10:50am
Clustering of Hierarchically-Linked Multivariate Datasets

We present the growclusters package for R, which implements maximum a posteriori (MAP) estimation of partitions (clusters) using a penalized optimization function. The function is derived from a Bayesian probability model with a multivariate Gaussian mixture on the mean, under either a Dirichlet process (DP) mixing measure or a hierarchical DP (HDP) mixing measure, in the limit as a function of the global variance goes to zero. We illustrate the package using data collected from a federal survey of business establishments. A special feature of these data is that they are collected under an informative sampling design, in which the probability of inclusion depends on the surveyed response. We demonstrate a feature of the growclusters package that incorporates the sampling weights to “undo” the effects of the informative design and yield asymptotically unbiased estimation of the clusters.
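
To make the idea of a penalized, weight-adjusted clustering objective concrete, here is a minimal sketch in Python of a DP-means-style hard clustering, the kind of algorithm that typically arises when the variance of a DP Gaussian mixture is taken to zero. It is not the growclusters API or the authors' algorithm: the function name `weighted_dp_means`, the `penalty` parameter, and the choice to let the sampling weights enter only the cluster-mean update are all assumptions made for illustration.

```python
# Illustrative sketch only: NOT the growclusters package or its API.
# Assumption: sampling weights (e.g. inverse inclusion probabilities)
# enter only the cluster-mean update; all weights are positive.

def weighted_dp_means(points, weights, penalty, max_iter=100):
    """DP-means-style clustering with sampling-weighted cluster means.

    points  : list of equal-length numeric sequences
    weights : positive sampling weight for each point
    penalty : squared-distance threshold for opening a new cluster
    Returns (labels, centers).
    """
    dim = len(points[0])
    total_w = sum(weights)
    # Start from a single cluster located at the weighted mean.
    centers = [[sum(w * p[d] for p, w in zip(points, weights)) / total_w
                for d in range(dim)]]
    labels = [0] * len(points)

    for _ in range(max_iter):
        changed = False
        # Assignment step: nearest center, or a new cluster if too far.
        for i, p in enumerate(points):
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            k = min(range(len(centers)), key=dists.__getitem__)
            if dists[k] > penalty:
                centers.append(list(p))
                k = len(centers) - 1
            if labels[i] != k:
                labels[i] = k
                changed = True

        # Update step: weighted means; empty clusters are dropped.
        new_centers, remap = [], {}
        for k in range(len(centers)):
            members = [i for i, z in enumerate(labels) if z == k]
            if not members:
                continue
            w_k = sum(weights[i] for i in members)
            remap[k] = len(new_centers)
            new_centers.append(
                [sum(weights[i] * points[i][d] for i in members) / w_k
                 for d in range(dim)]
            )
        centers = new_centers
        labels = [remap[z] for z in labels]
        if not changed:
            break
    return labels, centers


if __name__ == "__main__":
    # Hypothetical toy data: two well-separated groups with unequal weights.
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    wts = [1, 2, 1, 1, 1, 3]
    labels, centers = weighted_dp_means(pts, wts, penalty=9.0)
    print(labels, centers)
```

A larger `penalty` yields fewer clusters, which is the role the penalty term plays in this kind of objective; observations with larger sampling weights pull their cluster mean more strongly, which is one simple way to counteract over- or under-representation in the sample.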

Moderators
Max Kuhn

Pfizer

Thursday June 30, 2016 10:45am - 10:50am
Econ 140 579 Serra Mall, Stanford, CA 94305