A histogram is not a bar chart with vertical bars.
In a bar chart, the bar provides the graphical representation of a particular value. So the bar might be marked '1995' and the length of the bar would be the GDP for 1995.
In a histogram, the bar represents the number of samples that fall between two values. The important word is 'between'. In a vertical bar chart, axis labels are the value associated with each bar and they're located under each bar; in a histogram, each bar is about what's happening between two values, so labeling each bar with one value is at best ambiguous and usually confusing.
If the value is under the bar, which value is it? Is it the low value, the high value, or the average?
A single label can't say. The labels should sit on the lines between the bars so that the low and high values of each bar are obvious. Then the only thing left to interpretation is which of the two boundaries is 'or equal'. Because we're going to include a cumulative probability line that expresses the probability of 'this value or less', it makes sense for each bar to express the same relationship, so the high side is the 'or equal' side.
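To make the 'or equal' convention concrete, here is a minimal sketch in Python of counting samples into bars where the high side is inclusive. The function name `count_right_closed` and the sample numbers are made up for illustration; it assumes the edges are sorted and that the lowest edge sits below every sample.

```python
from bisect import bisect_left

def count_right_closed(samples, edges):
    """Count samples per bar, where bar i covers edges[i] < x <= edges[i + 1]."""
    counts = [0] * (len(edges) - 1)
    for x in samples:
        # bisect_left finds the first edge >= x, so a sample that sits
        # exactly on an edge lands in the bar that ends at that edge.
        i = bisect_left(edges, x) - 1
        if 0 <= i < len(counts):  # ignore samples outside the edge range
            counts[i] += 1
    return counts

# Illustrative data only: costs in millions of dollars.
edges = [2.0, 2.5, 3.0, 3.5]
samples = [2.1, 2.5, 2.6, 3.0, 3.0, 3.4]
counts = count_right_closed(samples, edges)
print(counts)
```

Here the sample at exactly 2.5 is counted in the bar from 2.0 to 2.5, not in the bar from 2.5 to 3.0, which matches the 'this value or less' reading of the cumulative line.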
A picture demonstrates the principle:
There are a number of things about this presentation that make it ideal:
- The x-axis markings are on the lines between the bars. Near the center of the chart, for example, we can see that the value has about a 5% probability of falling between 2.6 and 2.8.
- In the interests of legibility, we haven't tried to mark every grid line.
- We're usually less interested in the probability that the project will cost exactly $3 million than in the probability that it will cost at most $3 million. That's what the cumulative probability plot is about. We can see that there's an 80% probability that the project will cost $3.4 million or less.
- The axis labels are in the middle of the chart. This puts them closer to interesting points on the chart.
- Centering the labels also puts the median value (the even-money bet, and usually the most interesting value) near the center of the crosshairs formed by the two sets of axis labels. We can see immediately that the odds are the same that the project will cost more than or less than $2.8 million.
- It's pretty to look at.
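The cumulative probability line can be derived directly from the bar counts. The following is a rough sketch, not the method used to produce the chart: the helper names `cumulative_probability` and `value_at_probability` are hypothetical, and the counts are invented so that the median and the 80% point land on the values quoted above. The individual bar heights do not match the figure.

```python
from itertools import accumulate

def cumulative_probability(counts):
    """Probability of 'this bar's upper edge or less', one value per bar."""
    total = sum(counts)
    return [running / total for running in accumulate(counts)]

def value_at_probability(edges, cumulative, p):
    """Smallest upper edge whose cumulative probability reaches p."""
    for upper, c in zip(edges[1:], cumulative):
        if c >= p:
            return upper
    return edges[-1]

# Invented data: 100 simulated costs, in millions of dollars.
edges = [2.2, 2.4, 2.6, 2.8, 3.0, 3.2, 3.4, 3.6, 3.8]
counts = [5, 15, 30, 12, 10, 8, 12, 8]

cumulative = cumulative_probability(counts)
median = value_at_probability(edges, cumulative, 0.5)
p80 = value_at_probability(edges, cumulative, 0.8)
print(median, p80)
```

Each cumulative value belongs to the upper edge of its bar, which is why the line is plotted against the edges rather than against bar centers.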
It takes a little extra effort to do one of these but it's worth it. A couple of notes:
First, there's one more axis label than there are bars. Histogram widgets usually expect the same number of each.
Second, the chart is much easier to read if the axis labels run from just below the minimum to just above the maximum. You can do this in concert with choosing nicely rounded interval values so you aren't marking axes with 10-digit fractions.
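Both notes can be sketched in a few lines of Python. This is one possible approach under stated assumptions: the function name `nice_edges`, the step of 0.2, and the minimum and maximum are all hypothetical, and the step is chosen by hand, not computed.

```python
import math

def nice_edges(lo, hi, step):
    """Edges at multiples of step, from strictly below lo to at or above hi."""
    first = math.floor(lo / step)
    if first * step >= lo:
        first -= 1  # keep the lowest edge below the minimum sample
    last = math.ceil(hi / step)
    # Round away floating-point noise so labels stay short.
    return [round(k * step, 10) for k in range(first, last + 1)]

# Hypothetical minimum and maximum of the samples, in millions of dollars.
edges = nice_edges(2.23, 3.71, 0.2)
labels = [f"{e:.1f}" for e in edges]
bar_count = len(edges) - 1
print(labels, bar_count)
```

With these inputs the edges run from 2.2 to 3.8, giving nine labels for eight bars. A widget that expects one label per bar will need the labels drawn separately, positioned on the bar boundaries.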