What am I looking at?
Each dot is one of roughly 1,750 estimated users with 5 or more fiction-related conversations in the WildChat dataset. Users whose fiction prompts are similar appear close together.
Each user's position is the mean embedding of all their fiction prompts (all-MiniLM-L6-v2, 384 dims), reduced to 50 dims with PCA, then to 2D with UMAP (n_neighbors=15, min_dist=0.05, cosine metric).
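As a rough illustration of the embedding pipeline described above, here is a minimal sketch. The function names (`mean_by_user`, `layout_users`), the `seed` parameter, and the input format are assumptions made for this example, not the code used to build the map. It assumes the `sentence-transformers`, `scikit-learn`, `umap-learn`, and `numpy` packages are installed.

```python
from collections import defaultdict


def mean_by_user(user_ids, vectors):
    """Average the prompt vectors that belong to each user (hypothetical helper)."""
    sums = {}
    counts = defaultdict(int)
    for uid, vec in zip(user_ids, vectors):
        if uid not in sums:
            sums[uid] = [0.0] * len(vec)
        sums[uid] = [s + float(v) for s, v in zip(sums[uid], vec)]
        counts[uid] += 1
    return {uid: [s / counts[uid] for s in total] for uid, total in sums.items()}


def layout_users(user_ids, prompts, seed=0):
    """Embed prompts, average per user, then PCA to 50 dims and UMAP to 2D.

    Sketch only: `seed` and the input format are assumptions. PCA with
    50 components needs at least 50 users.
    """
    # Heavy dependencies are imported lazily so the helper above stays usable alone.
    import numpy as np
    import umap
    from sentence_transformers import SentenceTransformer
    from sklearn.decomposition import PCA

    model = SentenceTransformer("all-MiniLM-L6-v2")
    prompt_vecs = model.encode(prompts)  # one 384-dim vector per prompt

    user_means = mean_by_user(user_ids, prompt_vecs)
    users = list(user_means)
    x = np.array([user_means[u] for u in users])

    x50 = PCA(n_components=50, random_state=seed).fit_transform(x)
    xy = umap.UMAP(
        n_neighbors=15,
        min_dist=0.05,
        metric="cosine",
        n_components=2,
        random_state=seed,
    ).fit_transform(x50)
    return dict(zip(users, xy.tolist()))
```

`user_ids` and `prompts` are parallel lists, one entry per fiction prompt. The result maps each user to an `[x, y]` position.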
Topic labels come from a spatial grid over the 2D layout (4×4 coarse cells, 3×3 fine cells within each populated coarse cell), with each cell named by GPT-4o from a sample of representative prompts. Coarse labels appear at moderate zoom; fine labels appear when zoomed in further. Enable Topics in the legend to color users by their topic cluster.
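The two-level grid can be pictured as a 12×12 lattice of fine cells grouped into 4×4 coarse cells. The sketch below shows one way to assign a 2D point to its cells. The function name `grid_cell`, the `bounds` tuple, and the choice to clamp points on the outer edge into the last cell are assumptions for illustration. The GPT-4o labeling step is not shown.

```python
def grid_cell(x, y, bounds, coarse=4, fine=3):
    """Return ((coarse_col, coarse_row), (fine_col, fine_row)) for a 2D point.

    `bounds` is (xmin, ymin, xmax, ymax) of the layout. Hypothetical helper:
    the real grid's orientation and edge handling are not stated in the text.
    """
    xmin, ymin, xmax, ymax = bounds
    n = coarse * fine  # fine cells per axis across the whole layout

    def index(value, lo, hi):
        # Scale to [0, n) and clamp so points on the edges stay inside the grid.
        i = int((value - lo) / (hi - lo) * n)
        return max(0, min(i, n - 1))

    fx = index(x, xmin, xmax)
    fy = index(y, ymin, ymax)
    coarse_cell = (fx // fine, fy // fine)  # which of the 4x4 coarse cells
    fine_cell = (fx % fine, fy % fine)      # position inside that coarse cell
    return coarse_cell, fine_cell
```

Grouping users by `coarse_cell` gives the populated coarse cells. Grouping by both values gives the fine cells within each one.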
Dot size reflects a user's conversation count. Color shows the user's majority category, meaning the category that accounts for more than 50% of their conversations. Click a category to toggle it; double-click to isolate it. Click a topic to isolate it.
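The majority rule can be sketched as follows. The function name `majority_category` is hypothetical, and returning `None` when no category exceeds 50% is an assumption, since the text does not say how such users are colored.

```python
from collections import Counter


def majority_category(categories):
    """Return the category covering more than 50% of the list, else None."""
    if not categories:
        return None
    label, count = Counter(categories).most_common(1)[0]
    # Strictly more than half: an even split has no majority.
    return label if count > len(categories) / 2 else None
```

For example, a user with two romance conversations and one horror conversation is colored as romance, while an even split between two categories yields no majority.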