Multiclass Blob Generator

A dataset generation tool by Allison Wong, Alark Joshi, and Sophie Engle

This tool generates a multiclass scatterplot dataset with a specified number of classes and blobs. A "blob" is a set of randomly generated points that fall within a given radius of a central (x, y) coordinate. The tool is inspired by the make_blobs dataset generator in scikit-learn.
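
As a rough illustration of what a blob is, the sketch below samples points uniformly inside a circle. It is not the tool's actual implementation: the function name make_blob, its parameters, and the choice of uniform sampling are all assumptions made for this example.

```python
import math
import random


def make_blob(center_x, center_y, radius, n_points, seed=None):
    """Return n_points (x, y) tuples sampled uniformly inside a circle.

    Hypothetical helper for illustration; the tool may sample differently.
    """
    rng = random.Random(seed)
    points = []
    for _ in range(n_points):
        # Taking the square root of a uniform value spreads points evenly
        # over the disk's area instead of clustering them near the center.
        r = radius * math.sqrt(rng.random())
        theta = rng.uniform(0.0, 2.0 * math.pi)
        points.append((center_x + r * math.cos(theta),
                       center_y + r * math.sin(theta)))
    return points


# Example: 100 points within 0.5 units of the center (2.0, -1.0).
blob = make_blob(center_x=2.0, center_y=-1.0, radius=0.5, n_points=100, seed=0)
```

Passing a seed makes the sample repeatable, which is convenient when comparing scatterplots built from the same configuration.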

Dataset Configuration (Required)

The following configuration applies to the entire dataset. Modifying the dataset configuration may reset the per-blob configuration.


Class Distribution Per Blob (Required)

After completing the dataset configuration above, use the sliders below to set the number of points per class for each blob. If the center x and y coordinates are left blank for every blob, they are calculated automatically. Otherwise, the center coordinates must be set for every blob.
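
The all-or-none rule for blob centers can be sketched as a small check. This is only an illustration under assumed names: the dictionary keys "x", "y", and "counts" and the function validate_centers are hypothetical and do not reflect the tool's real configuration format.

```python
def validate_centers(blobs):
    """Check the all-or-none rule for blob centers.

    `blobs` is a list of dicts with hypothetical keys "x", "y", and "counts"
    (points per class); None stands for a field left blank.
    Returns "auto" if every center is blank, "manual" if every center is set.
    """
    filled = [b.get("x") is not None and b.get("y") is not None for b in blobs]
    blank = [b.get("x") is None and b.get("y") is None for b in blobs]
    if all(blank):
        return "auto"
    if all(filled):
        return "manual"
    raise ValueError("set the center for every blob, or leave all centers blank")


# Two blobs, three classes each, with every center left blank.
blobs = [
    {"x": None, "y": None, "counts": [30, 10, 0]},
    {"x": None, "y": None, "counts": [5, 25, 20]},
]
mode = validate_centers(blobs)
```

Here every center is blank, so the check reports that the centers should be calculated automatically. Filling in the center of only one blob would raise an error instead.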


JSON Configuration (Optional)

View Configuration

Click Generate Dataset above to generate a JSON configuration from your current selections.

Alternatively, paste in a previously saved JSON configuration and click the Load Configuration button below.