Saturday, May 16, 2020

Histogram

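Part of the "Statistics Materials for Lectures" series. This time: histograms, that is, frequency distribution plots in their plain, relative, cumulative, and cumulative relative forms. The same material can also be used to explain what happens when the number of bins is increased.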

In [46]:
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as st

Data

In [47]:
x = np.random.normal(170, 10, 100)
print(x)
[158.24320433 162.94838568 162.11133712 175.35537947 172.6962708
 179.10338098 155.81699752 158.92002963 156.46826019 172.28567808
 197.95079012 188.07917114 181.00968357 170.99783736 184.54482046
 175.00446723 170.74687268 166.18332842 170.2128389  146.3574405
 163.562939   169.95986657 156.32694669 169.15644232 155.84344875
 170.1999929  166.31515079 166.86144516 166.37166196 170.32741908
 147.78055693 170.16736071 164.02677852 194.21864022 177.42429086
 168.74330465 174.8634415  184.46847934 165.53672366 171.22202865
 171.03201133 186.5275664  163.61939686 163.42799341 172.28811819
 170.93431017 182.48670153 161.72179879 181.63220132 163.37448657
 163.00334978 162.65161134 154.19043857 170.13033343 141.61066959
 165.69358082 186.09074608 179.57656712 169.06850193 169.06331382
 167.8413889  189.75613997 170.85406415 183.85982523 162.17183487
 165.10150045 162.23758274 176.634447   166.36371228 166.1648098
 183.78525623 167.21273418 191.1178987  179.76105503 171.90026591
 179.01265344 173.68042536 184.00147874 177.38163259 157.01653991
 175.72816863 162.82530166 151.41549064 167.48216197 155.58627431
 168.86557917 164.49105832 177.12232003 176.06627892 182.35356304
 171.68729297 182.33701318 174.93332109 181.28981897 188.47039026
 178.66431214 172.49978129 164.12885896 164.49599365 163.61077239]

Classes (bins)

In [3]:
step = 10
bins = range(140, 210, step)
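The `range` above only produces the class boundaries, so it may help to spell out what they are. A minimal sketch, assuming NumPy's documented convention that every class is half-open `[left, right)` except the last, which also includes its right edge. The `sample` values are made up for illustration.

```python
import numpy as np

step = 10
bins = range(140, 210, step)

edges = list(bins)            # class boundaries: 140, 150, ..., 200
n_classes = len(edges) - 1    # 7 boundaries delimit 6 classes

# Hypothetical values placed on and around the boundaries.
sample = np.array([140, 149.9, 150, 199.9, 200, 205])
counts, _ = np.histogram(sample, bins=edges)

print(edges)      # [140, 150, 160, 170, 180, 190, 200]
print(n_classes)  # 6
print(counts)     # [2 1 0 0 0 2]  (200 falls in the last class, 205 is dropped)
```

Note that values outside 140 to 200 are silently ignored, so the counts add up to the sample size only when all data lie inside the boundaries.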

Frequency distribution

In [67]:
plt.figure(figsize=(4,4))
plt.ylim(0, 25)
n, bins_, patches  = plt.hist(x, bins, rwidth=0.8)

plt.savefig('plot_out.svg')
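To show what happens when the number of bins is increased, one option is to count the same data with progressively narrower classes. This is only a sketch: the fixed seed and the bin counts 6, 12, and 60 are assumptions chosen for illustration, and it uses `np.histogram` rather than a plot so the numbers can be compared directly.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed: an assumption, only for reproducibility
x = rng.normal(170, 10, 100)     # same distribution as the data cell

summary = {}
for n_bins in (6, 12, 60):
    # Same range every time, so only the class width changes.
    counts, edges = np.histogram(x, bins=n_bins, range=(140, 200))
    summary[n_bins] = (counts.sum(), counts.max(), int((counts == 0).sum()))
    print(f"{n_bins:2d} bins of width {edges[1] - edges[0]:4.1f}: "
          f"max count {counts.max():2d}, empty bins {(counts == 0).sum()}")
```

With only 100 observations, narrower classes hold fewer values each, so the bars get lower and the outline becomes more ragged, while the total number of counted values stays the same.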

Cumulative frequency distribution

In [60]:
plt.figure(figsize=(4,4))
plt.ylim(0, 100)
n, bins, patches  = plt.hist(x, bins, rwidth=0.8, cumulative=True)
plt.savefig('plot_out.svg')
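The cumulative counts drawn by `cumulative=True` are the running sums of the per-class counts. A minimal sketch of the same calculation without plotting, assuming a fixed seed for reproducibility (the seed is not part of the original notebook):

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed: an assumption
x = rng.normal(170, 10, 100)
bins = np.arange(140, 210, 10)   # same boundaries as range(140, 210, 10)

counts, _ = np.histogram(x, bins)
cum_counts = np.cumsum(counts)   # running total, class by class

print(counts)
print(cum_counts)
```

The last cumulative value equals the number of observations that fall inside the class boundaries, which is why the y-axis limit of 100 fits this data set.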

Relative frequency distribution

In [65]:
plt.figure(figsize=(4,4))
plt.ylim(0, 0.25)
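#n, bins, patches  = plt.hist(x, bins, rwidth=0.8, density=True) # density=True normalizes the total area to 1, so the bar heights themselves do not sum to 1
n, bins  = np.histogram(x, bins, density=True) # n holds densities (height * class width sums to 1), not relative frequencies
plt.bar(bins[:-1]+step/2, n/sum(n), width=step*0.8) # dividing by the sum rescales the heights to relative frequencies that sum to 1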
plt.savefig('plot_out.svg')
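The difference between a density and a relative frequency can be checked numerically. A minimal sketch, assuming a fixed seed for reproducibility: with `density=True` the heights times the class width sum to 1, whereas relative frequencies are the counts divided by their total.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed: an assumption
x = rng.normal(170, 10, 100)
step = 10
bins = np.arange(140, 210, step)

density, _ = np.histogram(x, bins, density=True)  # area is normalized to 1
counts, _ = np.histogram(x, bins)
rel_freq = counts / counts.sum()                  # heights are normalized to 1

print(density.sum())          # about 1/step, not 1
print(density.sum() * step)   # total area
print(rel_freq.sum())         # total relative frequency
```

For equal-width classes the two differ only by the factor `step`, so `density * step` gives the same values as `rel_freq`.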

Cumulative relative frequency distribution

In [62]:
plt.figure(figsize=(4,4))
plt.ylim(0, 1.0)
n, bins, patches  = plt.hist(x, bins, rwidth=0.8, density=True, cumulative=True)
#plt.bar(bins[:-1]+step/2, n, width=step*0.9)
plt.savefig('plot_out.svg')
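When `density=True` and `cumulative=True` are combined, matplotlib normalizes the histogram so that the last bar equals 1. The same values can be obtained by hand as the running sum of the relative frequencies. A minimal sketch, again assuming a fixed seed that is not part of the original notebook:

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed: an assumption
x = rng.normal(170, 10, 100)
bins = np.arange(140, 210, 10)

counts, _ = np.histogram(x, bins)
cum_rel_freq = np.cumsum(counts) / counts.sum()   # rises class by class up to 1

print(cum_rel_freq)
```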