SCaVis manual

Differences

This shows you the differences between two versions of the page.

man:stat:statistics [2013/07/04 21:52]
admin
man:stat:statistics [2014/12/13 20:33] (current)
admin
Line 6: Line 6:
  
  
-The package ​[[/​scavis/​api/​doc.php/​jhplot/​stat/​|jhplot.stat]] can be used for descriptive ​+The package ​<javadoc sc>jhplot/​stat/​package-summary|jhplot.stat</​javadoc> ​can be used for descriptive ​
 analysis of random distributions.
-Similarly, ​[[/​scavis/​api/​doc.php/​cern/​jet/​stat/​Descriptive | cern.jet.stat.Descriptive]]+Similarly, ​<javadoc sc>cern.jet.stat.Descriptive</​javadoc>​
 package contains descriptive methods to calculate many statistical characteristics.
  
 Consider also several other packages:
  
-  * [[/​scavis/​api/​doc.php/​org/​apache/​commons/​math3/​stat/​StatUtils | Descriptive statistics]] +  * <javadoc sc>org/​apache/​commons/​math3/​stat/​StatUtils| Descriptive statistics</​javadoc>​ package 
-  * [[/​scavis/​api/​doc.php/​org/​apache/​commons/​math3/​distribution/​package-summary.html | Major statistical distributions]]+  * <javadoc sc>org/​apache/​commons/​math3/​distribution/​package-summary| Major statistical distributions</​javadoc>​ 
 +  * <javadoc sc>​cern/​jet/​stat/​package-summary| Colt statistics</​javadoc>​ package ​
  
  
 But before using such packages, check again the data containers such as
-[[/​scavis/​api/​doc.php/jhplot/P1D | P1D]] or  [[/​scavis/​api/​doc.php/jhplot/H1D | H1D]]. They already have many useful methods to access statistical information on data.+<javadoc sc>​jhplot.P1D</javadoc> or <javadoc sc>jhplot.H1D</javadoc> ​. They already have many useful methods to access statistical information on data.
  
  
Line 40: Line 41:
 Run this script to get detailed information about this distribution (the output is largely self-explanatory):
  
-<hidden Click here to see the output ​of this script>+<hidden Click here to see the result ​of this code>
 <​code>​ <​code>​
 Size: 1000
Line 80: Line 81:
 Distinct elements & frequencies not printed (too many).
 </​code>​ </​code>​
- 
 </​hidden>​ </​hidden>​
  
-One can access all such values using the method "​getStat()"​ which returns a Java Map (or Jython dictionary) with the key representing statistical characteristics ​ 
-of this array. 
  
 +Let us continue with this example and return all statistical characteristics
 +of the sample as a dictionary. We can do this by appending the following lines, which
 +1) create a dictionary "stat" with key/value pairs; 2) retrieve the variance of the sample using the key "variance".
  
-You can also visualize the random numbers in the form of a histogram:+<code python>​ 
 +stat=p0.getStat() 
 +print "​Variance=",​stat["​variance"​] 
 +</​code>​ 
 + 
 +which prints "Variance= 757.3". If you are unsure of the key names, simply print the whole dictionary with 
 +"print stat". 
 + 
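As a rough illustration of what such a dictionary holds, the following plain-Python sketch builds a comparable map for a small list. The helper name get_stat, every key name other than "variance", the sample values, and the n-1 variance convention are assumptions made for this example; the keys actually returned by getStat() should be checked with "print stat".

```python
import math

def get_stat(values):
    """Return basic descriptive statistics as a dictionary (hypothetical helper)."""
    n = len(values)
    mean = sum(values) / float(n)
    # Sample variance with the n-1 denominator: an assumption, the library may differ
    variance = sum((v - mean) ** 2 for v in values) / float(n - 1)
    return {
        "size": n,
        "mean": mean,
        "variance": variance,
        "stddev": math.sqrt(variance),
        "min": min(values),
        "max": max(values),
    }

stat = get_stat([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print("Variance= %s" % stat["variance"])

# Print every key/value pair when the key names are not known in advance
for key in sorted(stat):
    print("%s %s" % (key, stat[key]))
```

Looking a value up by key is convenient when only one characteristic is needed, while looping over the sorted keys mirrors what printing the whole dictionary shows.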
 +One can create histograms that capture the most basic 
 +characteristics of data.  This is especially important if there is no particular reason 
 +to deal with complete data arrays. We can easily do this with the above Fibonacci sequence: 
 + 
 +<code python>​ 
 +h=p0.getH1D(10,​ 0, 100) 
 +print h.getStat() 
 +</​code>​ 
 + 
 +The code converts the array into a histogram with 10  equidistant bins in the range 0-100, and then 
 +it prints the map with statistical characteristics.  
 + 
 + 
 + 
 +You can also visualize the random numbers in the form of a histogram, as shown in the detailed example below. 
 +We create random numbers, convert them to histograms and plot them. 
 +<ifauth !@member>​ 
 +<note important>​ 
 +Unregistered users have limited access to this section. 
 +You can unlock advanced pages after  becoming [[/​scavis/​members/​selock| a full member]].  
 +You can also request to edit this manual and insert comments.  
 +</​note>​ 
 +</​ifauth>​ 
 +<ifauth @member,​@admin,​@editor>​
  
 <file python example.py>​ <file python example.py>​
Line 103: Line 135:
 c1.draw(h)
 </​file>​ </​file>​
 +
 +</​ifauth>​
 +
 +
 +
 +
  
 ====== Statistics with P1D ======
Line 121: Line 159:
 This will print the following values:
  
-<hidden Click here to see the output ​of this script>+<hidden Click here to see the output>
 <​code>​ <​code>​
 error 0.996592835069
Line 132: Line 170:
  
  
 +====== Comparing two histograms======
  
- +A comparison of two histograms tests the hypothesis that the two histograms represent identical distributions. 
- +Both <javadoc sc>jhplot.H1D</javadoc> and <javadoc sc>jhplot.H2D</javadoc> histograms have a method called "compareChi2(h1,h2)". 
-====== Statistical tests====== +It calculates Chi2 between 2 histograms, taking into account the errors on the heights of the bins. The number chi2/ndf gives the estimate: values smaller than or close to 1 indicate similarity between the 2 histograms. 
-Two distributions (1D and 2D histograms, P1D data points) can be compared by applying several +
-statistical tests. The following statistical comparisons are available +
- +
-  * Chi2 +
-  * Anderson-Darling +
-  * Kolmogorov-Smirnov +
-  * Goodman +
-  * Kuiper +
-  * Tiku +
- +
- +
-Consider a simple statistical test: compare 2 histograms. You can generate 2 similar histograms using this code snippet: +
  
 <code python> <code python>
-from java.awt import Color +d=compareChi2(h1,h2) # h1, h2 are H1D or H2D histograms defined above 
-from java.util import Random +chi2=d[0] # chi2 
-from jhplot  import * +ndf =d[1] # number of degrees of freedom 
- +p   =d[2] # probability (p-value)
-c1 = HPlotJa("Canvas") +
-c1.setGTitle("Statistical comparisons") +
-c1.visible() +
-c1.setAutoRange() +
- +
-h1 = H1D("Histo1",20, -2, 2.0) +
-h1.setColor(Color.blue) +
-h2 = H1D("Histo2",20, -2, 2.0) +
-r = Random() +
-for i in range(10000): +
-   h1.fill(r.nextGaussian()) +
-   h2.fill(r.nextGaussian()) +
-   if (i<100): h2.fill(2*r.nextGaussian()+2) +
-h1.setErrAll(1) +
-h2.setErrAll(0) +
-c1.draw(h1) +
-c1.draw(h2)+
 </​code>​ </​code>​
-Here we show statistical uncertainties only for the first (blue) histogram (see the method setErrAll(0)). 
-The output of this code is shown below 
  
-<hidden Click here to see the output of this script> +Two histograms are identical if chi2=0. Make sure that both histograms have errors (or set them to small values).
-{{statistical_comparison.png | Two similar ​histograms}} +
-</​hidden>​+
  
-Now we can perform several tests to calculate the degree of similarity of these distributions (including their uncertainties). +A similar method also exists for <javadoc sc>jhplot.P1D</javadoc> data points. The comparison is done for Y-values, assuming symmetric errors on Y. 
-Below we show code which compares these two histograms and calculates Chi2 per degree of freedom: +However, the data should be ordered in X for a correct comparison.
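The idea behind a Chi2 comparison can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the helper name compare_chi2, the per-bin formula (squared difference divided by the sum of squared errors), the rule of skipping bins with zero total error, and the sample bin contents are all choices made for this example, and the exact definition used by the library may differ.

```python
import math

def compare_chi2(heights1, errors1, heights2, errors2):
    """Return (chi2, ndf) for two histograms with identical binning.

    Hypothetical helper: each bin contributes (h1 - h2)^2 / (e1^2 + e2^2).
    Bins where both errors are zero are skipped, which is why the errors
    should be set before comparing.
    """
    chi2 = 0.0
    ndf = 0
    for h1, e1, h2, e2 in zip(heights1, errors1, heights2, errors2):
        sigma2 = e1 * e1 + e2 * e2
        if sigma2 > 0:
            chi2 += (h1 - h2) ** 2 / sigma2
            ndf += 1
    return chi2, ndf

# Assumed bin contents with Poisson-like errors (square root of the content)
a = [100.0, 210.0, 95.0, 0.0]
b = [110.0, 190.0, 105.0, 0.0]
ea = [math.sqrt(v) for v in a]
eb = [math.sqrt(v) for v in b]

chi2, ndf = compare_chi2(a, ea, b, eb)
print("chi2/ndf = %s" % (chi2 / ndf))
```

Comparing a histogram with itself gives chi2=0 in this sketch, and the empty last bin is excluded from ndf because its total error is zero.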
  
-<ifauth !@member>​ 
-<note important>​ 
-Unregistered users have a limited access to this section. One can unlock this example after becoming [[/​scavis/​members/​selock| a full member]]. ​ 
-</​note>​ 
-</​ifauth>​ 
-<ifauth @member,​@admin,​@editor>​ 
-<code python 1|t  stat_comparisons.py>​ 
-extern> stat_comparisons.py 
-</​code>​ 
-</​ifauth>​ 
- 
- 
-The output of this script is shown here: 
-<​code>​ 
-AndersonDarling method= 2.21779532164 / 20 
-Chi2 method= 0.786556311893 / 20 
-Goodman method= 0.624205522632 / 20 
-KolmogorovSmirnov ​ method= 0.419524135727 / 20 
-</​code>​ 
  
 ====== Linear regression analysis ======
Line 225: Line 211:
  
 {{lin_reg.png|}} {{lin_reg.png|}}
- 
- 
- 
-====== Distribution functions ====== 
-Many useful distribution functions can be found in the [[/scavis/api/doc.php/cern/jet/stat/Probability.html|cern.jet.stat.Probability]] package. The package contains
-numerical integration of certain probability distributions. Below we show how to use
-the normal distribution, which is very useful in many statistical analyses. 
- 
-In the example below we will compute the probability that our random outcome is within a specified interval using the normal distribution. 
- 
-The code below returns the area under the normal probability density function, integrated from minus infinity to -1.17 (assumes mean is zero, variance is one).  
-<code python> 
-from cern.jet.stat.Probability import * 
-print normal(-1.17) ​ 
-</​code>​ 
-For the two-sided case, one can multiply the result by 2. 
- 
- 
-<ifauth !@member>​ 
-<note important>​ 
-Unregistered users have a limited access to this section. One can unlock this example after becoming [[/​scavis/​members/​selock| a full member]]. ​ 
-</​note>​ 
-</​ifauth>​ 
-<ifauth @member,​@admin,​@editor>​ 
- 
- 
-One can also calculate the inverse function: 
-This returns the value, x, for which the area under the Normal (Gaussian) probability density function (integrated from minus infinity to x) is equal to the argument y (assumes mean is zero, variance is one): 
-<code python> 
-from cern.jet.stat.Probability import * 
-print  normalInverse(0.12) ​ 
-</​code>​ 
- 
-This is especially important in statistics: assume that in 12% of cases a value X can be as large as X(max) by chance.
-Then the above code gives the number of "sigma" that corresponds to such an excess of events (1.174 sigma).
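For readers without the Colt library at hand, the two quantities discussed above can be approximated in plain Python. This is a minimal sketch, not the cern.jet.stat.Probability implementation: the helper names normal_cdf and normal_inverse are hypothetical, and the inverse is found by simple bisection, which is an assumption chosen for clarity rather than speed.

```python
import math

def normal_cdf(x):
    """Area under the standard normal density from minus infinity to x."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def normal_inverse(y, lo=-10.0, hi=10.0):
    """Invert normal_cdf by bisection (a simple sketch, not a fast algorithm)."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if normal_cdf(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(normal_cdf(-1.17))       # one-sided area below -1.17
print(2 * normal_cdf(-1.17))   # two-sided case: multiply by 2
print(normal_inverse(0.12))    # x whose lower-tail area is 0.12
```

The inverse comes out negative because the 12% area lies in the lower tail; its absolute value is the number of "sigma" quoted in the text.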
- 
- 
- 
-</​ifauth>​ 
- 
  
  
Line 357: Line 303:
 <ifauth @member,​@admin,​@editor>​ <ifauth @member,​@admin,​@editor>​
  
-Please go to [[statistics_limit]]+Please go to [[man:​stat:​slimits]]
  
 </​ifauth>​ </​ifauth>​
Line 366: Line 312:
  
  
-<​hidden ​click here if you want to know more> A complete description of how to use Java, Jython and SCaVis for scientific analysis is described in the book [[/​scavis/​book/​|Scientific data analysis using Jython and Java]] published by [[http://​www.springer.com/​computer/​book/​978-1-84996-286-5| Springer Verlag, London, 2010]] (by S.V.Chekanov) </hidden+<​hidden ​Click to read more> A complete description of how to use Java, Jython and SCaVis for scientific analysis is described in the book [[/​scavis/​book/​|Scientific data analysis using Jython and Java]] published by [[http://​www.springer.com/​computer/​book/​978-1-84996-286-5| Springer Verlag, London, 2010]] (by S.V.Chekanov) </​hidden>​
- +
- +
- +
- +
-<ifauth !@member>​ +
-<note important>​ +
-One can comment and discuss this section after becoming  +
-[[/​scavis/​members/​selock| a full member]].  +
-</​note>​ +
-</​ifauth>​ +
-<ifauth @member,​@admin,​@editor>​ +
-~~DISCUSSION~~ +
-</ifauth>+
  
  
man/stat/statistics.1372996359.txt.gz · Last modified: 2013/07/04 21:52 by admin
CC Attribution-Share Alike 3.0 Unported