python - Matplotlib cumulative histogram - vertical line placement bug or misinterpretation?
I am not sure whether this is a bug or whether I am misinterpreting the output of matplotlib's cumulative histogram. For example, I expect that at a given x value, the corresponding y value tells me how many samples are <= x.
    import matplotlib.pyplot as plt

    x = [1.1, 3.1, 2.1, 3.9]
    # normed= is the matplotlib 1.x name; newer releases call it density=
    n, bins, patches = plt.hist(x, normed=False, histtype='step', cumulative=True)
    plt.ylim([0, 5])
    plt.grid()
    plt.show()
I see the 2nd vertical line at x=1.9. Shouldn't it be at 2.1, given the data in x? E.g., at x=3 I read "3 samples have a value x <= 3.1" ...
So, I would expect something similar to this step plot.
    # x is the list defined above
    plt.step(sorted(x), range(1, len(x)+1), where='post')
    plt.ylim([0, 5])
    plt.grid()
    plt.show()
Edit: I am using Python 3.4.3 and matplotlib 1.4.3.
If you do not set the bins parameter yourself, plt.hist will choose the bins for you (10 by default):
    In [58]: n, bins, patches = plt.hist(x, normed=False, histtype='step', cumulative=True)

    In [59]: bins
    Out[59]: array([ 1.1 ,  1.38,  1.66,  1.94,  2.22,  2.5 ,  2.78,  3.06,  3.34,  3.62,  3.9 ])
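The return value bins shows the edges of the bins that matplotlib chose. The sample 2.1 lies between the edges 1.94 and 2.22, so the second step rises at 1.94, which reads as roughly 1.9 on the plot.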
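The default binning can be reproduced with NumPy alone, which makes it easier to see where each step rises. This is only a minimal sketch, assuming plt.hist delegates to np.histogram with its default of 10 equal-width bins; the variable names are illustrative.

```python
import numpy as np

x = [1.1, 3.1, 2.1, 3.9]

# np.histogram defaults to 10 equal-width bins spanning [min(x), max(x)].
counts, edges = np.histogram(x)
cumulative = np.cumsum(counts)

# Index of the bin that contains the sample 2.1.
i = np.searchsorted(edges, 2.1, side="right") - 1

print("bin edges:        ", edges)
print("cumulative counts:", cumulative)
print("2.1 falls in bin", i, "with edges", edges[i], "and", edges[i + 1])
```

The cumulative count first reaches 2 at the left edge of that bin (about 1.94), not at 2.1 itself.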
It sounds like you want the values in x to serve as the bin edges. You can get that by using bins=sorted(x)+[np.inf]:
    import numpy as np
    import matplotlib.pyplot as plt

    x = [1.1, 3.1, 2.1, 3.9]
    # use the data values themselves as bin edges, plus an open-ended last bin
    bins = sorted(x) + [np.inf]
    n, bins, patches = plt.hist(x, normed=False, histtype='step',
                                cumulative=True, bins=bins)
    plt.ylim([0, 5])
    plt.grid()
    plt.show()
This yields a plot whose steps rise at the values in x.
The [np.inf] makes the right edge of the final bin extend to infinity. Matplotlib is smart enough not to try to draw non-finite values, so all you see is the left edge of the last bin.
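As a quick sanity check without plotting, the same edges can be passed to np.histogram and compared against a direct count of samples <= each data value. This is a sketch under the assumption that plt.hist bins the data the same way np.histogram does; `expected` is just an illustrative name.

```python
import numpy as np

x = [1.1, 3.1, 2.1, 3.9]

# Data values as bin edges, with an open-ended final bin.
edges = sorted(x) + [np.inf]

counts, _ = np.histogram(x, bins=edges)
cumulative = np.cumsum(counts)

# Direct count of samples <= each data value, for comparison.
expected = [sum(v <= e for v in x) for e in sorted(x)]

print("cumulative histogram:", cumulative.tolist())
print("samples <= edge:     ", expected)
```

If the two lists agree, the height of each step matches the "how many samples are <= x" reading from the question.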