And, by adding noise ("jitter" or "dither") to each point, you can still use a plain scatterplot even for many kinds of overlapping data.

It's simple to do, and it mimics reversing the effect of truncating or rounding the data (at least for continuous quantities). Just add uniformly distributed noise whose range is one bin width.
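As a rough sketch of what that looks like in Python (the function name `jitter`, its parameters, and the sample values are made up for illustration): if the data were rounded to the nearest bin, center the noise on each value; if they were truncated downward, spread it across the bin above.

```python
import random

def jitter(values, bin_width, centered=True, seed=None):
    """Add uniform noise spanning one bin width to each value.

    centered=True  -> noise in [-bin_width/2, bin_width/2), for rounded data
    centered=False -> noise in [0, bin_width), for data truncated downward
    """
    rng = random.Random(seed)  # seed only makes the sketch reproducible
    offset = -bin_width / 2 if centered else 0.0
    return [v + offset + rng.random() * bin_width for v in values]

# Hypothetical example: values recorded to the nearest whole unit,
# so many points would land exactly on top of each other.
xs = [3, 3, 3, 4, 4, 5]
xs_jittered = jitter(xs, bin_width=1.0, seed=0)
```

With matplotlib, you could then jitter both coordinates and pass something like `alpha=0.3` to `plt.scatter` so that dense regions show up darker.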

For most purposes, I prefer adding dither and then using transparency over moving to a density plot, for exactly the reason you mention -- the density plot introduces another parameter, the smoothing method, which puts another layer between you and the data.