Comprehensive Guide: Creating a Symmetrical Beeswarm Plot in Stata

This document walks step by step through building a curved, symmetrical vertical beeswarm plot of log(GDP) for 200 countries in Stata.


Step 1: The Standard Horizontal Stacking Plot

User Query

how do I make a stripplot in stata so it looks like a beeswarm. 200 countries with log(gdp)

Solution

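To make a beeswarm-style plot in Stata, use the stripplot command with the stack option. stack offsets points that share the same value (after any rounding requested with width()), so they pile up next to each other instead of overplotting. On its own, stack builds each pile out in one direction; the center option, used from Step 2 onward, balances the piles around a central line to give a symmetrical “swarm” shape.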

First, install the package from the SSC:

ssc install stripplot, replace

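Assuming your variable for log(GDP) is named log_gdp, run the syntax below. Note that over(country) draws a separate strip for each country, so with one observation per country nothing actually stacks; Step 2 removes it to get a single swarm.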

stripplot log_gdp, over(country) stack width(0.1) mlabel(country) ///
    msymbol(circle_hollow) msize(small) yscale(off)
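
To try the commands in this guide without real data, the following minimal sketch builds a hypothetical 200-observation dataset. The variable names log_gdp and country match the ones assumed above; the simulated values (normal with mean 8.5 and standard deviation 1.4) and the placeholder country names are assumptions for illustration only, not real GDP figures.

```stata
* Hypothetical test data: 200 simulated "countries" (not real GDP values)
clear
set obs 200
set seed 1234
generate double log_gdp = rnormal(8.5, 1.4)
generate country = "Country " + string(_n)

* Quick check of the range before plotting
summarize log_gdp
```

Run this before any of the stripplot commands if no dataset is in memory; replace it with your own data once the plot looks right.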

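Key Options Explained:

over(country): draws a separate strip for each country.
stack: offsets points that share the same binned value instead of overplotting them.
width(0.1): rounds log_gdp into bins 0.1 wide before stacking.
mlabel(country): labels each marker with the country name.
msymbol(circle_hollow) msize(small): draws small hollow circles.
yscale(off): hides the y axis, which in the horizontal layout carries the country categories.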


Step 2: Transitioning to a Single Vertical Swarm

User Query

starting with one swarm. Want it vertically

Solution

To draw a single vertical swarm for all 200 countries, add the vertical option and drop over() entirely. Without over(), stripplot plots every observation of log_gdp as one distribution. Adding center balances each stack around the middle of the strip.

Run this exact syntax:

stripplot log_gdp, vertical stack width(0.1) center ///
    msymbol(circle_hollow) msize(small)     ///
    xscale(off) xtitle("") ylabel(#10)

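Key adjustments for a vertical swarm:

vertical: puts log_gdp on the y axis, so the stacks build sideways.
No over(): all 200 countries form one strip.
center: balances each stack around the middle of the strip.
xscale(off) xtitle(""): hides the x axis, which carries no information for a single strip.
ylabel(#10): asks for about 10 labelled ticks on the y axis.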

💡 Pro Tip for Country Labels:

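With 200 data points, labelling every dot will make the plot unreadable. To highlight specific countries, create a label variable that is filled in only for the countries of interest and pass that to mlabel(). An if qualifier on the stripplot command itself would not work here: it would drop the other countries from the plot, not just their labels.

* Label only countries whose log_gdp is exceptionally high or low
* Assumes country is a string variable; if it is numeric with value labels, decode it first
generate country_lab = country if log_gdp < 6 | (log_gdp > 11 & !missing(log_gdp))
stripplot log_gdp, vertical stack width(0.1) center ///
    msymbol(circle_hollow) msize(small) xscale(off) xtitle("") ///
    mlabel(country_lab) mlabposition(12)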

Step 3: Replicating a Curved Shape (Moving Away from Rigid Lines)

User Query

not now. Many beeswarms have a curved shape while you plot countries on horizontal lines. Can we replicate the curved?

Solution

Yes, that continuous shape can be approximated.

The rigid “horizontal line grid” look comes from two options working together: width(0.1) rounds log_gdp into bins 0.1 wide, and stack lines up the points in each bin at evenly spaced offsets. To avoid the grid, drop the rounding and stacking and use random horizontal jitter with semi-transparent markers instead. Note that jitter does not compute non-overlapping positions the way a true beeswarm algorithm does; it scatters points randomly, so the result approximates the look.
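
The binning behind the grid can be inspected directly. The sketch below assumes that width(0.1) rounds values as round(log_gdp, 0.1), which is how the stripplot help describes its default, and that log_gdp is already in memory; binned, stack_size and row_tag are hypothetical variable names introduced for illustration.

```stata
* Assumption: width(0.1) bins values as round(log_gdp, 0.1)
generate double binned = round(log_gdp, 0.1)

* Number of countries sharing each bin = size of each stack
bysort binned: generate stack_size = _N

* How many countries sit in stacks of each size
tabulate stack_size

* Number of distinct rows the stacked plot will show
egen row_tag = tag(binned)
count if row_tag
```

Each distinct value of binned becomes one row of points, and stack_size is the number of points in that row. Smaller width() values give more rows with fewer points each.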

Use this syntax to get a non-gridded vertical swarm:

stripplot log_gdp, vertical jitter(3) jitterseed(1234) ///
    msymbol(circle) msize(small) mcolor(navy%40)      ///
    xscale(off) xtitle("") ylabel(#10)

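Why this creates the “Curved” look:

No width() or stack: values are not rounded, so points are no longer snapped to rows.
jitter(3): adds random noise so that points with similar values spread out; jitterseed(1234) makes the pattern reproducible.
mcolor(navy%40): markers at 40% opacity make dense regions look darker, so the shape of the distribution shows through the overlap.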


Step 4: Expanding Plot Width

User Query

works, but there is hardly any width in the graph

Solution

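The argument of jitter() is the size of the noise as a percentage of the graph area, so with jitter(3) the swarm stays narrow. To widen it, increase the jitter value, set the range of the hidden x axis explicitly, and use aspect() to make the plot region wider than it is tall. Stata documents jitter() as adding spherical random noise, so a large value can also displace points along the log_gdp axis.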

Run this modified code to get a wider, more expansive swarm:

stripplot log_gdp, vertical jitter(15) jitterseed(1234) ///
    msymbol(circle) msize(small) mcolor(navy%40)       ///
    xscale(range(0 2) off) xtitle("") ylabel(#10)       ///
    aspect(0.8)

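What changed to fix the width:

jitter(15): raised from 3, so the random noise covers a larger share of the graph area.
xscale(range(0 2) off): sets the hidden x axis to run from 0 to 2, leaving room on both sides of the strip.
aspect(0.8): makes the plot region wider than it is tall (height is 0.8 times width).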


Step 5: Achieving True Left/Right Symmetry

User Query

works. It is projecting to the right. Can it be symmetrical pos. and neg. on x-axis

Solution

In the Step 4 plot the points spread to one side of the axis. stripplot's center option balances points around a central line, but it applies to stacked points, not to jitter.

The workaround is therefore to return to stack with center and use a very small bin width, so that the rounding is barely visible. There is a trade-off: stack only offsets points that fall in the same bin, so the smaller the width(), the fewer points per bin and the narrower the swarm. If the plot collapses toward a single column, increase width().

Run this updated syntax to center your swarm:

stripplot log_gdp, vertical stack width(0.01) center    ///
    msymbol(circle) msize(small) mcolor(navy%40)        ///
    xscale(off) xtitle("") ylabel(#10)                  ///
    aspectratio(1.5) xsize(4) ysize(6)
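
An alternative that avoids binning altogether is to compute the horizontal offsets by hand and draw them with twoway scatter. This is a different technique from the stripplot call above and not a feature of the stripplot package: a minimal sketch of density-scaled jitter, in which each point gets a random offset, symmetric around zero, whose maximum size is proportional to the estimated density at its value. It assumes log_gdp is in memory; gx, dens and x_offset are hypothetical variable names, and the seed is arbitrary.

```stata
* Sketch: symmetric, density-scaled horizontal jitter (not a stripplot feature)
set seed 1234

* Kernel density estimated at each observed value of log_gdp
kdensity log_gdp, at(log_gdp) generate(gx dens) nograph

* Scale the density to a maximum of 1, then draw offsets in [-0.5, 0.5)
summarize dens, meanonly
generate double x_offset = (runiform() - 0.5) * dens / r(max)

* Only the horizontal position is randomized; log_gdp itself is plotted unchanged
twoway scatter log_gdp x_offset,                        ///
    msymbol(circle) msize(small) mcolor(navy%40)        ///
    xscale(range(-0.6 0.6) off) xtitle("") ylabel(#10)  ///
    aspectratio(1.5) xsize(4) ysize(6)
```

Because the offsets are random, points can still overlap, and the swarm is symmetric on average rather than point by point. It is widest where the data are densest, which gives the curved outline.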

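Why this forces symmetry:

center: places each stack symmetrically around the central line rather than building it out to one side.
width(0.01): keeps the rounding fine enough that the rows are barely visible.
aspectratio(1.5) xsize(4) ysize(6): gives a tall plot region in a graph 4 inches wide and 6 inches high.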


Final Micro-Adjustment Cheat Sheet