You’re looking at a map of ~1,750 estimated users who each had 5 or more fiction-related conversations in the WildChat dataset. Each dot is one user, placed by the average embedding of their prompts; dot size shows how many conversations they had.
User Embedding Explorer
What am I looking at?

Each dot is one of ~1,750 estimated users who had 5 or more fiction-related conversations in the WildChat dataset. Users with similar fiction prompts appear close together.

Each user's position is the mean embedding of all their fiction prompts (all-MiniLM-L6-v2, 384 dims), reduced to 50 dims with PCA, then to 2D with UMAP (n_neighbors=15, min_dist=0.05, cosine metric).
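The positioning steps described above can be sketched roughly as follows. This is a minimal illustration, not the explorer's actual code: the function names `mean_embedding_per_user` and `project_users` are hypothetical, the prompt vectors are assumed to be already computed, and whether vectors are normalized before averaging is not stated in the source. The PCA and UMAP parameters are the ones quoted above; the projection step assumes scikit-learn and umap-learn are installed.

```python
from collections import defaultdict


def mean_embedding_per_user(user_ids, vectors):
    """Average each user's prompt vectors into a single vector per user."""
    sums = {}
    counts = defaultdict(int)
    for uid, vec in zip(user_ids, vectors):
        if uid not in sums:
            sums[uid] = [0.0] * len(vec)
        acc = sums[uid]
        for i, x in enumerate(vec):
            acc[i] += x
        counts[uid] += 1
    return {uid: [x / counts[uid] for x in acc] for uid, acc in sums.items()}


def project_users(user_vectors):
    """Reduce per-user vectors to 2D: PCA to 50 dims, then UMAP.

    Third-party imports are local so the pooling helper above works
    without them. Assumes scikit-learn and umap-learn are installed and
    that there are at least 50 users with at least 50-dim vectors.
    """
    import numpy as np
    import umap
    from sklearn.decomposition import PCA

    ids = list(user_vectors)
    X = np.asarray([user_vectors[u] for u in ids])
    X50 = PCA(n_components=50).fit_transform(X)
    reducer = umap.UMAP(n_neighbors=15, min_dist=0.05, metric="cosine")
    xy = reducer.fit_transform(X50)
    return {u: (float(x), float(y)) for u, (x, y) in zip(ids, xy)}
```

In a full pipeline, the input vectors would come from encoding each prompt with all-MiniLM-L6-v2 (for example through the sentence-transformers library) before pooling.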

Topic labels come from a spatial grid over the 2D layout (4×4 coarse cells, 3×3 fine cells within each populated coarse cell), with each cell named by GPT-4o from a sample of representative prompts. Coarse labels appear at moderate zoom; fine labels appear when zoomed in further. Enable Topics in the legend to color users by their topic cluster.
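One way to picture the two-level grid is the sketch below. It assumes a uniform grid over the bounding box of the 2D layout, which the source does not confirm; `grid_cell` and `populated_cells` are hypothetical names, and the GPT-4o naming step is left out.

```python
from collections import defaultdict


def grid_cell(x, y, bounds, coarse=4, fine=3):
    """Return ((coarse_col, coarse_row), (fine_col, fine_row)) for a point.

    bounds is (xmin, xmax, ymin, ymax) of the layout. Points on the
    upper edge are clamped into the last cell.
    """
    xmin, xmax, ymin, ymax = bounds
    total = coarse * fine  # fine-cell resolution along each axis

    def index(v, lo, hi):
        t = (v - lo) / (hi - lo)
        return max(0, min(int(t * total), total - 1))

    gx = index(x, xmin, xmax)
    gy = index(y, ymin, ymax)
    return (gx // fine, gy // fine), (gx % fine, gy % fine)


def populated_cells(points, bounds):
    """Group point indices by coarse cell, then by fine cell within it."""
    cells = defaultdict(lambda: defaultdict(list))
    for i, (x, y) in enumerate(points):
        c, f = grid_cell(x, y, bounds)
        cells[c][f].append(i)
    return cells
```

Only coarse cells that actually contain users show up as keys in the result, which matches the idea of subdividing populated cells only.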

Dot size reflects conversation count. Color shows each user's majority category, meaning the category that accounts for more than 50% of their conversations. Click a category to toggle; double-click to isolate. Click a topic to isolate it.
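The majority rule can be illustrated with a short sketch. The function name `majority_category` is hypothetical, and returning `None` when no category exceeds 50% is an assumption; the source does not say how such users are colored.

```python
from collections import Counter


def majority_category(categories):
    """Return the category covering more than half of the conversations.

    categories holds one label per conversation. Returns None when the
    list is empty or no label is strictly above 50%.
    """
    if not categories:
        return None
    label, count = Counter(categories).most_common(1)[0]
    return label if count > len(categories) / 2 else None
```

Under this reading, a user split exactly evenly between two categories has no majority category.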

