Fitting data with functions
SCaVis offers many rich classes for linear and non-linear regressions.
In particular, SCaVis supports:
- Linear-regression fit (fit with a straight line)
- Non-linear regression (fitting with more complex functions)
- Fitting data with shapes (circles, ellipses). See the section Fitting with shapes
For fits with analytic functions, the fit minimisation can include statistical errors on the input data points. In addition, several minimisation methods can be used.
Linear regression
- Snippet from Wikipedia: Linear regression
In statistics, linear regression is an approach for modeling the relationship between a scalar dependent variable y and one or more explanatory variables denoted X. The case of one explanatory variable is called simple linear regression. For more than one explanatory variable, the process is called multiple linear regression.
To perform a linear regression, use the class LinReg. The example below shows how to use this class:
from jhplot import *
from jhplot.stat import LinReg
from java.awt import Color
from java.util import Random

c1 = HPlot("Canvas",600,400)
c1.visible(1)
c1.setGTitle("Linear regression")
c1.setAutoRange()

p1 = P1D("data")
rand = Random()
for i in range(200):
    x = rand.nextGaussian()
    y = rand.nextGaussian()
    p1.add(0.2*x, y)
c1.draw(p1)

r = LinReg(p1)
print "Intercept=",r.getIntercept(), "+/-",r.getInterceptError()
print "Slope=",r.getSlope(),"+/-",r.getSlopeError()

# get predictions as a band
pP = r.getPredictionBand(Color.red,0.5)
c1.draw(pP)
# draw F1D function with the result
c1.draw(r.getResult())
# draw P1D[2] with predictions
c1.draw(r.getPrediction())
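To make the quantities printed above more concrete, here is a minimal plain-Python sketch of an ordinary least-squares straight-line fit. It is not the LinReg implementation: the function name `linear_regression` is hypothetical, and the assumption is that the intercept, slope and their errors follow the standard textbook formulas for simple linear regression.

```python
import math

def linear_regression(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope, intercept_error, slope_error).
    Hypothetical helper; assumes textbook formulas, not LinReg internals.
    """
    n = len(xs)
    if n < 3:
        raise ValueError("need at least 3 points to estimate errors")
    mean_x = sum(xs) / float(n)
    mean_y = sum(ys) / float(n)
    # sums of squares around the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # residual variance with n - 2 degrees of freedom
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s2 = rss / (n - 2)
    slope_error = math.sqrt(s2 / sxx)
    intercept_error = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, slope, intercept_error, slope_error
```

Calling `linear_regression([0, 1, 2, 3], [1, 3, 5, 7])` on points that lie exactly on a line recovers an intercept of 1 and a slope of 2 with vanishing errors.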
Linear regression with HFitter
Now we can do a somewhat more flexible fit by defining a linear function analytically. We will use HFitter, which allows you to define any fit function. We simulate a linear dependence of the variable Y on X using random numbers:
from jhplot import *
from java.util import Random

tTpoints = P1D("Data")
rand = Random()
for i in range(10):
    x = rand.nextGaussian()
    y = 2*x+10*(1+0.01*rand.nextGaussian())
    tTpoints.add(x,y)

fitter = HFitter("leastsquares")
fitter.setFunc("linear", 1, "a+b*x[0]","a,b")
fitter.setPar("a", 10)
fitter.setPar("b", 10)
fitter.fit(tTpoints)
fitresult = fitter.getResult()
a = fitresult.fittedParameter("a")
b = fitresult.fittedParameter("b")
print("a = {0}, b = {1}".format(a,b))

c1 = HPlot("Canvas")
c1.visible()
c1.setAutoRange()
ff = fitter.getFittedFunc()
max_X = max(tTpoints.getArrayX())
min_X = min(tTpoints.getArrayX())
c1.draw(tTpoints)
f1 = F1D("fit",ff,min_X,max_X)
c1.draw(f1)
The fit returns a and b values close to those used to generate the data (a = 10, b = 2):
a = 9.99526486454, b = 1.99636040719
Non-linear fitting
Data fits can be done using the Java class HFitter. Before fitting, one needs to construct a function and set initial values for its free parameters. Fitting can be done either in interactive mode or using a script.
By default, HFitter fits data using chi2 methods. It is important to specify errors on Y-values of input data. You can print the available methods as:
from jhplot import *
f = HFitter()
print f.getFitMethod()
which prints the “leastsquares”, “cleverchi2”, “chi2” and “bml” fitting methods. You can set the fit method when initializing the fitter:
from jhplot import *
f = HFitter("leastsquares")
In this case, the “leastsquares” fit method is used instead of “chi2”, and you can fit “X-Y” data arrays without specifying errors on Y.
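The difference between the two methods can be illustrated in plain Python. This is only a sketch under the assumption that “chi2” and “leastsquares” correspond to the standard definitions (residuals weighted by 1/error squared versus unweighted residuals); the function names `chi2` and `chi2_per_ndf` are hypothetical and are not part of HFitter.

```python
def chi2(xs, ys, func, errors=None):
    """Sum of squared residuals between data and func(x).

    With errors given, each residual is divided by its error (chi2).
    Without errors, all weights are 1 (plain least squares).
    """
    if errors is None:
        errors = [1.0] * len(xs)  # unweighted least squares
    total = 0.0
    for x, y, e in zip(xs, ys, errors):
        if e <= 0:
            raise ValueError("errors must be positive")
        total += ((y - func(x)) / e) ** 2
    return total

def chi2_per_ndf(xs, ys, func, errors, n_params):
    """Fit quality: chi2 divided by the number of degrees of freedom."""
    ndf = len(xs) - n_params
    if ndf <= 0:
        raise ValueError("need more data points than free parameters")
    return chi2(xs, ys, func, errors) / ndf
```

A point with a small error contributes more to the sum than one with a large error, which is why the chi2 method needs meaningful errors on Y while the unweighted form does not.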
Let's make a simple chi2 fit of experimental data using an analytic function, where the parameters Tu, Ta and kk need to be determined by fitting data stored as an array. The data have experimental (statistical or systematic) uncertainties, which are mandatory for chi2 minimization. The resulting fit is shown below (the example is provided by Klaus Rohe).
This image is generated by the code given below, where we use a P1D container to store the input data. You can access the fit errors and the fit quality (chi2/ndf) as described by the HFitter class.
Let us use a somewhat different approach and fit a Gaussian using the chi2 fit:
The code which performs this fit is shown below:
Below we illustrate how to perform a rather complicated fit using the chi2 method. The fit is done in several steps. In this example we fit data that can be described by multiple Gaussians, where each new Gaussian fit starts from the parameter values of the previous fit:
from jhplot import *
from jhplot.io import *
from java.awt import *
from java.util import *
from jhplot.math.StatisticSample import *

xmin = 0
xmax = 20
h1 = H1D('Data',100,xmin,xmax)
r = Random()

# linear background plus three Gaussian peaks
f = F1D('10+10*x',xmin,xmax)
p = f.getParse()
fmax = f.eval(xmax)
for i in range(10000):
    a = randomRejection(10,p,fmax,xmin,xmax)
    h1.fill(a)
    h1.fill(0.3*r.nextGaussian()+4)
    h1.fill(0.6*r.nextGaussian()+10)
    h1.fill(0.8*r.nextGaussian()+15)

c1 = HPlot('Canvas')
c1.setRange(0,20,0,4000)
c1.visible()
c1.draw(h1)

# first step: background plus one Gaussian
f = HFitter()
f.setFunc('p1+g')
func = f.getFunc()
f.setPar('mean',4); f.setPar('amplitude',100)
print func.parameterNames()
f.setRange(0,7)
f.fit(h1)

ff = f.getFittedFunc()
r = f.getResult()
fPars = r.fittedParameters()

# next Gaussian
f.setFunc('p1+g+g')
func = f.getFunc()
func.setParameters(fPars.tolist()+[500,10,0.5])
f.setRange(0,13)
f.fit(h1)
ff = f.getFittedFunc()
r = f.getResult()
fPars = r.fittedParameters()
print func.parameterNames(), func.parameters()
# next Gaussian
f.setFunc('p1+g+g+g')
func = f.getFunc()
func.setParameters(fPars.tolist()+[500,15,0.5])
print func.parameterNames(), func.parameters()
f.setRange(0,20)
f.fit(h1)
ff = f.getFittedFunc()
# plot all
f2 = F1D('Gaussians+background',ff,0,20)
f2.setPenWidth(1)
f2.setColor(Color.blue)
c1.draw(f2)
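The stepwise strategy above, where the fitted parameters of one step seed the next, can be sketched in plain Python. This is an illustration only: the helper names are hypothetical, and the parameter layout (two polynomial coefficients followed by amplitude, mean and sigma for each Gaussian) and the Gaussian form are assumptions inferred from the values appended in the script, not a statement about how 'p1+g' is evaluated internally.

```python
import math

def gaussian(x, amplitude, mean, sigma):
    """Unnormalised Gaussian (assumed form of the 'g' term)."""
    return amplitude * math.exp(-0.5 * ((x - mean) / sigma) ** 2)

def background_plus_gaussians(x, params):
    """Evaluate p0 + p1*x plus any number of Gaussians.

    params = [p0, p1, amp1, mean1, sigma1, amp2, mean2, sigma2, ...]
    (assumed ordering, for illustration only).
    """
    if len(params) < 2 or (len(params) - 2) % 3 != 0:
        raise ValueError("expected 2 + 3*k parameters")
    value = params[0] + params[1] * x
    for i in range(2, len(params), 3):
        value += gaussian(x, params[i], params[i + 1], params[i + 2])
    return value

def extend_with_gaussian(fitted, amplitude, mean, sigma):
    """Start values for the next step: previous fit result plus one new peak."""
    return list(fitted) + [amplitude, mean, sigma]
```

For example, `extend_with_gaussian(previous, 500, 10, 0.5)` mirrors the line `fPars.tolist()+[500,10,0.5]` in the script: the already fitted peaks keep their values, and only the new peak starts from a rough guess.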
The output is shown here:
Numerical interpolation
One can also describe data numerically, using interpolation and smoothing. One example is shown in this figure:
where we attempted to smooth data using a non-analytical approach. Such an approach is often considered for predictions when finding an appropriate analytical function is difficult or impossible. SCaVis provides several flexible methods to smooth data and perform interpolation. The code of this example is given below:
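As a generic illustration of the idea (this is not the SCaVis listing referred to above), the plain-Python sketch below shows piecewise-linear interpolation and a centred moving-average smoother. Both function names are hypothetical, and the choice of these two simple techniques is an assumption made for the example; the SCaVis classes may use different algorithms.

```python
def interpolate_linear(xs, ys, x):
    """Piecewise-linear interpolation at x.

    xs must be strictly increasing and x must lie inside [xs[0], xs[-1]].
    """
    if x < xs[0] or x > xs[-1]:
        raise ValueError("x is outside the data range")
    for i in range(1, len(xs)):
        if x <= xs[i]:
            # fraction of the way between the two neighbouring points
            t = (x - xs[i - 1]) / float(xs[i] - xs[i - 1])
            return ys[i - 1] + t * (ys[i] - ys[i - 1])

def smooth_moving_average(ys, window=3):
    """Centred moving average; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(ys)):
        lo = max(0, i - half)
        hi = min(len(ys), i + half + 1)
        out.append(sum(ys[lo:hi]) / float(hi - lo))
    return out
```

Neither function assumes any analytic form for the data: interpolation passes through the measured points, while smoothing trades fidelity to individual points for a less noisy curve.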
Interactive fit
Data can be fitted using many predefined functions in a more interactive way. Let's create a few data containers (a 1D array and a 1D histogram) and start the HPlotJas plotter, which is based on the JAS2 program from SLAC:
from java.util import Random, ArrayList
from jhplot import *

h1 = H1D("1D histogram",100, -2, 2.0)
p1 = P1D("data with errors")
rand = Random()
for i in range(600):
    h1.fill(rand.nextGaussian())

p0 = P0D("Normal distribution")
p0.randomNormal(1000,0.0,10.0)
p1.add(1,10,3)
p1.add(2,5,1)
p1.add(3,12,2)

# pass all objects to the interactive JAS canvas
a = ArrayList([h1,p1,p0])
c = HPlotJas("JAS",a)
This will bring up a “JAS” program with all objects shown on the left side.
You can expand the tree (left) and click on each object to plot it on the canvas. Then try to fit the data: using the mouse pop-up dialog, press “Add function” (for example, a Gaussian function) and click on “Fit”. The program will perform chi2 minimisation and you will see a Gaussian line on top of the data. You can adjust the initial values of the function by dragging 3 points of this function.
Look at the example Fit video.
In the above example we used pre-built functions to perform fits. You can also add your own custom fit function to the menu and use it for fitting. You can do this directly in the Jython script and pass this function to the HPlotJas canvas. Below we show an example in which we create 2 custom functions (a new Gaussian and a parabola) and pass them to HPlotJas for interactive fitting.
Advanced fitting of data
Section Advanced fitting discusses how to perform a non-linear fit in Java using data in many dimensions and any complex function that is defined not as a string but entirely programmatically.
Symbolic regression
You can also perform symbolic regression using a genetic programming approach.
Fitting shapes
See the section Fitting with shapes.
More examples
Read the book "Scientific data analysis using Jython scripting and Java" for more details.
— Sergei Chekanov 2010/03/07 16:37