python - Matplotlib cumulative histogram - vertical line placement bug or misinterpretation?


I'm not sure if this is a bug or if I'm misinterpreting the output of matplotlib's cumulative histogram. For example, I expect that "at a given x value, the corresponding y-value tells me how many samples are <= x."

    import matplotlib.pyplot as plt

    x = [1.1, 3.1, 2.1, 3.9]
    # normed=False is the default (later matplotlib releases use density instead)
    n, bins, patches = plt.hist(x, normed=False, histtype='step', cumulative=True)
    plt.ylim([0, 5])
    plt.grid()
    plt.show()

[image: cumulative step histogram produced by the code above]

See the 2nd vertical line at x=1.9? Shouldn't it be at 2.1, given the data in x? E.g., at x=3 I'd read "3 samples have a value x <= 3.1" ...

So, I'd expect something similar to this step plot.

    # One step up at each sorted sample value
    plt.step(sorted(x), range(1, len(x)+1), where='post')
    plt.ylim([0, 5])
    plt.grid()

[image: step plot produced by the code above]

Edit:

I'm using Python 3.4.3 and matplotlib 1.4.3.

If you do not set the bins parameter yourself, plt.hist will choose the bins for you (by default, 10 of them):

    In [58]: n, bins, patches = plt.hist(x, normed=False, histtype='step', cumulative=True)

    In [59]: bins
    Out[59]:
    array([ 1.1 ,  1.38,  1.66,  1.94,  2.22,  2.5 ,  2.78,  3.06,  3.34,
            3.62,  3.9 ])

The return value bins shows the edges of the bins matplotlib chose.
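To see why the second vertical line lands near 1.9 rather than at 2.1, here is a minimal sketch that reproduces the binning with np.histogram alone. It assumes the default behaviour described above (10 equal-width bins between the smallest and largest sample); the name jump_edges is just an illustrative variable, not part of any API.

```python
import numpy as np

x = [1.1, 3.1, 2.1, 3.9]

# Ten equal-width bins spanning min(x)..max(x), the default described above
counts, edges = np.histogram(x, bins=10)
cumulative = np.cumsum(counts)

# The cumulative count rises at the left edge of the bin that holds a sample,
# not at the sample value itself
jump_edges = edges[:-1][counts > 0]

print(np.round(edges, 2))       # bin edges
print(cumulative)               # cumulative count per bin
print(np.round(jump_edges, 2))  # x positions where the step plot rises
```

The sample 2.1 falls into the bin that starts at 1.94, so that is where the step is drawn.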

It sounds like you want the values in x to serve as the bin edges. Using bins=sorted(x)+[np.inf]:

    import numpy as np
    import matplotlib.pyplot as plt

    x = [1.1, 3.1, 2.1, 3.9]
    # Use the data values as bin edges; np.inf leaves the last bin open on the right
    bins = sorted(x) + [np.inf]
    n, bins, patches = plt.hist(x, normed=False, histtype='step', cumulative=True,
                                bins=bins)
    plt.ylim([0, 5])
    plt.grid()
    plt.show()

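This yields a step plot whose vertical lines sit at the values in x.

The [np.inf] makes the right edge of the final bin extend to infinity. Matplotlib is smart enough not to try to draw non-finite values, so you only see the left edge of the last bin.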
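As a cross-check on the reading "the y-value tells me how many samples are <= x", the same counts can be computed without any plotting. This is only a sketch: count_at_most is a hypothetical helper introduced here for illustration, built on np.searchsorted.

```python
import numpy as np

def count_at_most(samples, q):
    """Return how many samples have a value <= q (hypothetical helper)."""
    ordered = np.sort(samples)
    # side="right" gives the insertion index after any entries equal to q,
    # which is the number of entries <= q
    return int(np.searchsorted(ordered, q, side="right"))

x = [1.1, 3.1, 2.1, 3.9]
for q in (1.0, 2.1, 3.0, 3.1, 4.0):
    print(q, count_at_most(x, q))
```

With the data values used as bin edges, the heights of the cumulative histogram should agree with these counts at each sample.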

