
# Differences

This shows you the differences between two versions of the page.

man:stat:statistics [2013/07/04 21:52] admin
man:stat:statistics [2014/04/04 21:32] (current) admin [Setting limits]


Run this script and you will get very detailed information about this distribution (rather self-explanatory):

<hidden Click here to see the result of this code>

<code>

Size: 1000

...

Distinct elements & frequencies not printed (too many).

</code>


</hidden>


This will print the following values:

<hidden Click here to see the output>

<code>

error 0.996592835069

...
</code>
</hidden>

====== Comparing two histograms ======

A comparison of two histograms tests the hypothesis that the two histograms represent identical distributions.

Both [[/scavis/api/doc.php/jhplot/H1D | H1D]] and [[/scavis/api/doc.php/jhplot/H2D | H2D]] histograms have the method "compareChi2(h1,h2)".
It calculates Chi2 between the two histograms, taking into account the errors on the bin heights. The ratio chi2/ndf gives the estimate: values smaller than or close to 1 indicate that the two histograms are similar.


<code python>

d=compareChi2(h1,h2)   # h1, h2 are H1D or H2D histograms defined above
chi2=d[0]              # chi2
ndf =d[1]              # number of degrees of freedom
p   =d[2]              # probability (p-value)

</code>


Two histograms are identical if chi2=0. Make sure that both histograms have errors (or set them to small values).
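The bin-by-bin chi2 comparison described above can be sketched in plain Python. This is a minimal illustration of the idea only; ''compare_chi2'' here is a hypothetical helper, not the jhplot method, and the histogram filling mimics the Gaussian examples used elsewhere on this page.

```python
import math
import random

def compare_chi2(bins1, err1, bins2, err2):
    """Bin-by-bin chi2 between two histograms.

    bins1/bins2: bin heights; err1/err2: errors on the heights.
    Bins where both errors vanish are skipped (no error information).
    Returns (chi2, ndf).
    """
    chi2, ndf = 0.0, 0
    for h1, e1, h2, e2 in zip(bins1, err1, bins2, err2):
        sigma2 = e1 * e1 + e2 * e2
        if sigma2 == 0.0:
            continue  # no error information for this bin
        chi2 += (h1 - h2) ** 2 / sigma2
        ndf += 1
    return chi2, ndf

# Fill two histograms (20 bins on [-2, 2]) with independent Gaussian samples
random.seed(1)
nbins, lo, hi = 20, -2.0, 2.0
b1 = [0] * nbins
b2 = [0] * nbins
for _ in range(10000):
    for b, x in ((b1, random.gauss(0, 1)), (b2, random.gauss(0, 1))):
        if lo <= x < hi:
            b[int((x - lo) / (hi - lo) * nbins)] += 1

# Poisson errors on the bin counts: sqrt(N)
e1 = [math.sqrt(n) for n in b1]
e2 = [math.sqrt(n) for n in b2]
chi2, ndf = compare_chi2(b1, e1, b2, e2)
print("chi2/ndf = %.3f" % (chi2 / ndf))  # close to 1 for similar histograms
```

For two samples drawn from the same distribution, chi2/ndf fluctuates around 1; a value much larger than 1 signals that the histograms differ beyond their statistical errors.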


A similar method also exists for [[/scavis/api/doc.php/jhplot/P1D | P1D]] data points. The comparison is done for Y-values, assuming symmetric errors on Y.
However, the data should be ordered in X for a correct comparison.
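The point-by-point version can be sketched the same way in plain Python. ''compare_points_chi2'' below is an illustrative helper, not the P1D API: it orders both point sets in X first, then forms chi2 from the Y differences and their symmetric Y-errors.

```python
def compare_points_chi2(points1, points2):
    """Chi2 for two point sets given as (x, y, ey) tuples.

    Points are first ordered in x, as required for a correct
    comparison; y-errors are assumed symmetric.
    Returns (chi2, ndf).
    """
    p1 = sorted(points1)  # order in x
    p2 = sorted(points2)
    chi2, ndf = 0.0, 0
    for (_, y1, ey1), (_, y2, ey2) in zip(p1, p2):
        sigma2 = ey1 ** 2 + ey2 ** 2
        if sigma2 == 0.0:
            continue  # skip points without error information
        chi2 += (y1 - y2) ** 2 / sigma2
        ndf += 1
    return chi2, ndf

a = [(0.0, 1.0, 0.1), (1.0, 2.1, 0.1), (2.0, 2.9, 0.1)]
b = [(2.0, 3.0, 0.1), (0.0, 1.1, 0.1), (1.0, 2.0, 0.1)]  # unordered in x
chi2, ndf = compare_points_chi2(a, b)
print(chi2, ndf)  # → 1.5 3, i.e. chi2/ndf = 0.5: similar point sets
```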


====== Linear regression analysis ======


{{lin_reg.png|}}
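As an illustration of what such a fit computes, here is a minimal plain-Python least-squares sketch. It assumes the standard ordinary least-squares estimator for a straight line; ''linear_fit'' is a hypothetical helper, not a SCaVis function.

```python
def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (intercept, slope)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)                      # sum of squares in x
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))    # cross term
    b = sxy / sxx          # slope
    a = my - b * mx        # intercept
    return a, b

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # exactly y = 1 + 2*x
a, b = linear_fit(xs, ys)
print(a, b)  # → 1.0 2.0
```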



<ifauth @member,@admin,@editor>

Please go to [[man:stat:slimits]]

</ifauth>


<hidden Click to read more> A complete description of how to use Java, Jython and SCaVis for scientific analysis is given in the book [[/scavis/book/|Scientific data analysis using Jython and Java]], published by [[http://www.springer.com/computer/book/978-1-84996-286-5| Springer Verlag, London, 2010]] (by S.V.Chekanov). </hidden>
