# Curve Fitting

This post goes through the basics of fitting functions to data sets.  There are many tricks to optimizing fitting foutines for speed, but I will only present a general method here in the interest of keeping things simple and easy to follow.  We start out with some generated data, and finish by fitting curves to a real data set.

We will also explore the possibilities of giving certain parameters limits in our fits.  Python has some great tools to help you with all your fitting needs!

For this tutorial, I am using:

•  python v2.7
•  numpy v1.5.1
•  matplotlib v1.1.0
•  scipy v0.8.0
•  lmfit v0.4 – for constraining parameters
•  pyfits 2.3.1 – for I/O of the .fits file

The basic method I will outline below:

• Load or define the data array that you want to fit
• Define a function for your desired equation
• Fit with scipy.optimize.leastsq  (or with lmfit.minimize, if you want to constrain certain parameters)

Import the modules we will use

import numpy as np
from matplotlib import pyplot as plt
from scipy.optimize import leastsq
from lmfit import minimize, Parameters
import pyfits

Let’s start with a simple example: we’ll generate some data with random noise, and fit a quadratic curve. Then we’ll plot the data, the fit, and residuals.

Creating the example data.  X-range from -3 to 10 in increments of 0.2.  Our y-data will start out as x2

xarray1=np.arange(-3,10,.2)
yarray1=xarray1**2

Now add gaussian random noise to each element.

yarray1+=np.random.randn(yarray1.shape[0])*2

There is a cheap and easy way to perform fits with little setup if you are only interested in polynomials.  scipy.polyfit() allows you to give an x-array and a y-array as input, with the order of polynomial to fit, then spits out an array of the fit coefficients with little fuss.  To fit a quadratic to our data generated above, for example:

from scipy import polyfit

fitcoeffs=polyfit(xarray1,yarray1,2)

print fitcoeffs
# --> Returns array (<coeff. order 2>, <coeff. order 1>, <coeff. order 0>)

If we want to fit an arbitrary expression, though, we must define a python function which will compute our desired equation.  Our function will take two arrays as inputs: the list of coefficiencts (the parameters that we will fit – I will label them “p”), and the array of x-axis data points. Later we will define the initial guesses for the parameters, called “p0”.Our function will take the form

def quadratic(p,x):
y_out=p[0]*(x**2)+p[1]*x+p[2]
return y_out

There is a simpler way to do this for short/simple expressions: python’s lambda functions are essentially one-line function definitions.

quadratic = lambda p,x: p[0]*(x**2)+p[1]*x+p[2]

We will also need to define a function of the difference between the fitted function with the current iteration’s parameters and the original data.

quadraticerr = lambda p,x,y: quadratic(p,x)-y

Define our initial guesses of the quadratic’s coefficients (I’m making bad guesses here on purpose, just to show that it still works)

p0=[2,2,2]

Now for the actual fitting.  We will tell scipy.optimize.leastsq to minimize the DIFFERENCE between the original data and iterated set of data given by the defined equation with the current coefficients.  So we tell it to perform a least-squares fit on “quadraticerr”, and give it the x- and y-data arrays as arguments.

fitout=leastsq(quadraticerr,p0[:],args=(xarray1,yarray1))
paramsout=fitout[0]
#covar=out[1] #This is the covariance matrix output
print('Fitted Parameters:\na = %.2f , b = %.2f , c = %.2f' % (paramsout[0],paramsout[1],paramsout[2]))

Now let’s plot our results. I will create a smoother curve for the fit on the fly within pyplot.  I will also add a text box with the fit parameters and plot the residuals just below the data.

plt.rc('font',family='serif')
fig1=plt.figure(1)
frame1=fig1.add_axes((.1,.3,.8,.6))
#xstart, ystart, xwidth, yheight --> units are fraction of the image from bottom left

Plot the data in frame1

xsmooth=np.linspace(xarray1[0],xarray1[-1])
plt.plot(xarray1,yarray1,'r.')
plt.plot(xsmooth,quadratic(paramsout,xsmooth),'b')
frame1.set_xticklabels([]) #We will plot the residuals below, so no x-ticks on this plot
plt.title('Quadratic Fit Example')
plt.ylabel('y-data')
plt.grid(True)
frame1.annotate('$y$ = %.2f$\cdot x^2$+%.2f$\cdot x$+%.2f'%(paramsout[0],paramsout[1],paramsout[2]), \
xy=(.05,.95),xycoords='axes fraction',ha="left",va="top",bbox=dict(boxstyle="round", fc='1'))

If we want to keep the plot tidy:

from matplotlib.ticker import MaxNLocator
plt.gca().yaxis.set_major_locator(MaxNLocator(prune='lower')) #Removes lowest ytick label

Now plot the residuals in frame2

frame2=fig1.add_axes((.1,.1,.8,.2))

plt.plot(xarray1,quadratic(paramsout,xarray1)-yarray1,'k')
plt.ylabel('Residuals')
plt.grid(True)

plt.savefig('quadratic_fit_example.png',dpi=100)
#plt.show()

plt.clf()

Here is the result:

### ==== CONSTRAINING PARAMETERS ====

Now we will set limits for what values the coefficients can be fit to. This is particularly useful if you know (or want to impose) on the physics of the phenomenon represented by the data. scipy.optimize.minimize() has this functionality, and I will probably add an example using this in the future.  Another useful module is lmfit, which I will use from here on out.  Here we will fit a power law to data for the M-Sigma relation (data from Gultekin et al. 2009 – http://arxiv.org/abs/0903.4897 )

dat=np.genfromtxt('M-SigmaData.csv',skiprows=2)
M=dat[:,0]      #The y-data
Sigma=dat[:,1]  #The x-data

The general (simple) form for a power law is .  We can do this simply enough directly, or we can be clever and note that we could fit a line for .

lmfit requires us to define the coefficients as Parameters() objects. We then define their initial guesses, whether or not they can vary, and what their min/max values can be.  Finally, when we define our equation functions, we must call the parameters with p0[‘par_x’].value :

params=Parameters()
params.add('alpha',value=8,vary=True)
params.add('beta',value=4,vary=True,max=5.0) #,min=...

power=lambda p,x: params['alpha'].value+params['beta'].value*x
powererr=lambda p,x,y: power(p,x)-y

Perform the fit with lmfit.minimize.  This output gives us a whole suite of info, such as reduced chi-square values.  With lmfit, the parameters will be updated within the Parameters() object

fitout2=minimize(powererr,params,args=(np.log10(Sigma),np.log10(M)))

print('Fitted Parameters:\nalpha = %.2f , beta = %.2f' % (params['alpha'].value,params['beta'].value))

xsmooth2=np.linspace(np.min(Sigma),np.max(Sigma))
fit2=10**params['alpha'].value*xsmooth2**params['beta'].value

Plot the data and the fit

fig2=plt.figure(2)
sp1=fig2.add_subplot(1,1,1)

plt.plot(Sigma,M,'b*',linewidth=0)
plt.plot(xsmooth2,fit2,'g')
plt.yscale('log')
plt.xscale('log')
plt.xlim(np.min(Sigma),np.max(Sigma))
plt.xlabel('$\sigma$ (km s$^{-1}$)')
plt.ylabel('M (M$_{\odot}$)')
plt.title('Power Law Fit Example')
sp1.annotate('log($y$) = %.2f+%.2f$\cdot$log($x$)\n$y$ = %.2f$\cdot x^{%.2f}$' \

%(params['alpha'].value,params['beta'].value,10**params['alpha'].value,params['beta'].value),xy=(.05,.95), \
xycoords='axes fraction',ha="left", va="top",bbox=dict(boxstyle="round", fc='1'))

plt.savefig('power_fit_example.png')
#plt.show()

plt.clf()

Here is the result:

Now is a good time to mention that lmfit.minimize() gives you some freebie goodness-of-fit diagnostics:

 nfev number of function evaluations success boolean (True/False) for whether fit succeeded. errorbars boolean (True/False) for whether uncertainties were estimated. message message about fit success. ier integer error value from scipy.optimize.leastsq lmdif_message message from scipy.optimize.leastsq nvarys number of variables in fit ndata number of data points: nfree degrees of freedom in fit: residual residual array (return of func(): chisqr chi-square: redchi reduced chi-square:

Call with, e.g., output.chisqr or output.residual

Again, see http://cars9.uchicago.edu/software/python/lmfit/ for details.

### ==== FITTING GAUSSIANS TO DATA FROM A .FITS FILE ====

Next, let’s read in data from a .fits file and perform a fit.  This particular  data cube is a neutral hydrogen VLA observation of dwarf galaxy WLM.  The cube  can be downloaded (~430MB) from  https://science.nrao.edu/science/surveys/littlethings/data/wlm.html  –> I have reproduced the ‘stackspec’ and ‘vels’ arrays at the end of this tutorial. Specifically, I will sum up all of the x-y data to make one big stacked z-array, or ‘Super Profile’.  Then I will fit various gaussian functions (Gaus + Hermite  polynomials and double gaussian) and plot them with their residuals.  Of course,  in real life you would want to ‘Shuffle’ the spectra and account for other factors, but this is just an example.

I am using the default Marquardt-Levenberg algorithm.  Note that fitting results will depend quite a bit on what you give as initial guesses – ML finds LOCAL  extrema quite well, but it doesn’t necessarily find the global extrema.  In short, do your best to provide a good first guess to the fit parameters.

Load in the data. We don’t care about the 4th axis, so cut it out. Also get the header and specify the velocity parameters.

cube=pyfits.getdata('WLM_NA_ICL001.FITS')[0,:,:,:]
cubehdr=pyfits.getheader('WLM_NA_ICL001.FITS')

cdelt3=cubehdr['CDELT3']/1000.; crval3=cubehdr['CRVAL3']/1000.; crpix3=cubehdr['CRPIX3'];
minvel=crval3+(-crpix3+1)*cdelt3; maxvel=crval3+(cube.shape[0]-crpix3)*cdelt3
chanwidth=abs(cdelt3)

#minvel=20.452084230999986; maxvel=-252.452084231
#cdelt3=-2.574567627; chanwidth=2.574567627

Stack all the spectra

stackspec=np.sum(np.sum(cube,axis=2),axis=1)

vels=np.arange(minvel,maxvel+int(cdelt3),cdelt3)

The velocity array in the cube goes from positive to negative, so let’s reverse it to make the fitting go smoother.

vels=vels[::-1]; stackspec=stackspec[::-1]

Alternatively, to skip using pyfits and load the velocity and spectrum data from the .csv file provided below:

#dat=np.genfromtxt('GausFitData.csv',skiprows=1)
#vels=dat[:,0]
#stackspec=dat[:,1]
#chanwidth=2.574567627

Next, we define our fit functions:

Gaussian plus Hermite Polynomials H3/H4   (labeled below as ~_gh)
, where

p_gh=Parameters()
p_gh.add('amp',value=np.max(stackspec),vary=True);
p_gh.add('center',value=vels[50],min=np.min(vels),max=np.max(vels));
p_gh.add('sig',value=3*chanwidth,min=chanwidth,max=abs(maxvel-minvel));
p_gh.add('skew',value=0,vary=True,min=None,max=None);
p_gh.add('kurt',value=0,vary=True,min=None,max=None);

def gaussfunc_gh(paramsin,x):
amp=paramsin['amp'].value
center=paramsin['center'].value
sig=paramsin['sig'].value
c1=-np.sqrt(3); c2=-np.sqrt(6); c3=2/np.sqrt(3); c4=np.sqrt(6)/3; c5=np.sqrt(6)/4
skew=paramsin['skew'].value
kurt=paramsin['kurt'].value

gaustot_gh=amp*np.exp(-.5*((x-center)/sig)**2)*(1+skew*(c1*((x-center)/sig)+c3*((x-center)/sig)**3)+kurt*(c5+c2*((x-center)/sig)**2+c4*((x-center)/sig)**4))
return gaustot_gh

Double Gaussian   (labeled below as ~_2g)

Bounds–> amp: 10% of max to max   center: velocity range   disp: channel width to velocity range

p_2g=Parameters()
p_2g.add('amp1',value=np.max(stackspec)/2.,min=.1*np.max(stackspec),max=np.max(stackspec));
p_2g.add('center1',value=vels[50+10],min=np.min(vels),max=np.max(vels));
p_2g.add('sig1',value=2*chanwidth,min=chanwidth,max=abs(maxvel-minvel));
p_2g.add('amp2',value=np.max(stackspec)/2.,min=.1*np.max(stackspec),max=np.max(stackspec));
p_2g.add('center2',value=vels[50-10],min=np.min(vels),max=np.max(vels));
p_2g.add('sig2',value=3*chanwidth,min=chanwidth,max=abs(maxvel-minvel));

def gaussfunc_2g(paramsin,x):
amp1=paramsin['amp1'].value; amp2=paramsin['amp2'].value;
center1=paramsin['center1'].value; center2=paramsin['center2'].value;
sig1=paramsin['sig1'].value; sig2=paramsin['sig2'].value;

gaus1=amp1*np.exp(-.5*((x-center1)/sig1)**2)
gaus2=amp2*np.exp(-.5*((x-center2)/sig2)**2)
gaustot_2g=(gaus1+gaus2)
return gaustot_2g

And now the functions that compute the difference between the fit iteration and data. In addition, define a function for a simple single gaussian.

gausserr_gh = lambda p,x,y: gaussfunc_gh(p,x)-y
gausserr_2g = lambda p,x,y: gaussfunc_2g(p,x)-y
gausssingle = lambda a,c,sig,x: a*np.exp(-.5*((x-c)/sig)**2)

We will minimize with lmfit again, in order to keep limits on parameters

fitout_gh=minimize(gausserr_gh,p_gh,args=(vels,stackspec))
fitout_2g=minimize(gausserr_2g,p_2g,args=(vels,stackspec))

Let’s make some shorthand definitions to keep things tidy

pars_gh=[p_gh['amp'].value,p_gh['center'].value,p_gh['sig'].value,p_gh['skew'].value,p_gh['kurt'].value]
pars_2g=[p_2g['amp1'].value,p_2g['center1'].value,p_2g['sig1'].value,p_2g['amp2'].value,p_2g['center2'].value,p_2g['sig2'].value]

Aside:
If you want to use the simple fitter, without defining bounds, revert  to scipy.optimize.leastsq().  Note, however, that with complex functions such as two gaussians, you run the risk of getting nonphysical results, for example one gaussian being negative.  That said, here is the code:

pars1=[np.max(stackspec),vels[50],2*abs(cdelt3),0,0]
pars2=[np.max(stackspec),vels[50],2*abs(cdelt3),np.max(stackspec),vels[50],2*cdelt3]

def gaussfunc_gh(p,x):
c1=-np.sqrt(3); c2=-np.sqrt(6); c3=2/np.sqrt(3); c4=np.sqrt(6)/3; c5=np.sqrt(6)/4
return p[0]*np.exp(-.5*((x-p[1])/p[2])**2)*(1+p[3]*(c1*((x-p[1])/p[2])+c3*((x-p[1])/p[2])**3)+p[4]*(c5+c2*((x-p[1])/p[2])**2+c4*((x-p[1])/p[2])**4))

def gaussfunc_2g(p,x):
gaus1=p[0]*np.exp(-.5*((x-p[1])/p[2])**2)
gaus2=p[3]*np.exp(-.5*((x-p[4])/p[5])**2)
return gaus1+gaus2

fitout_gh=leastsq(gausserr_gh,pars1[:],args=(vels,stackspec))
pars_gh=fitout_gh[0]
fitout_2g=leastsq(gausserr_2g,pars2[:],args=(vels,stackspec))
pars_2g=fitout_2g[0]

print('Fitted Parameters (Gaus+Hermite):\nAmp = %.2f , Center = %.2f , Disp = %.2f\nSkew = %.2f , Kurt = %.2f' % (pars_gh[0],pars_gh[1],pars_gh[2],pars_gh[3],pars_gh[4]))
print('Fitted Parameters (Double Gaussian):\nAmp1 = %.2f , Center1 = %.2f , Sig1 = %.2f\nAmp2 = %.2f , Center2 = %.2f , Sig2 = %.2f' % (pars_2g[0],pars_2g[1],pars_2g[2],pars_2g[3],pars_2g[4],pars_2g[5]))

[End aside]

fit_gh=gaussfunc_gh(pars_gh,vels)
fit_2g=gaussfunc_2g(pars_2g,vels)
resid_gh=fit_gh-stackspec
resid_2g=fit_2g-stackspec

Print the fitted parameters

print('Fitted Parameters (Gaus+Hermite):\nAmp = %.2f , Center = %.2f , Disp = %.2f\nSkew = %.2f , Kurt = %.2f' \
%(pars_gh[0],pars_gh[1],pars_gh[2],pars_gh[3],pars_gh[4]))

print('Fitted Parameters (Double Gaussian):\nAmp1 = %.2f , Center1 = %.2f , Sig1 = %.2f\nAmp2 = %.2f , Center2 = %.2f , Sig2 = %.2f' \
%(pars_2g[0],pars_2g[1],pars_2g[2],pars_2g[3],pars_2g[4],pars_2g[5]))

Now let’s plot the results

fig3=plt.figure(3)
f1=fig3.add_axes((.1,.3,.8,.6))
#xstart, ystart, xwidth, yheight --> units are fraction of the image from bottom left

Plot the data in frame1

plt.plot(vels,stackspec,'k.')
pgh,=plt.plot(vels,fit_gh,'b')
p2g,=plt.plot(vels,fit_2g,'r')
p2ga,=plt.plot(vels,gausssingle(pars_2g[0],pars_2g[1],pars_2g[2],vels),'-.',color='orange')
p2gb,=plt.plot(vels,gausssingle(pars_2g[3],pars_2g[4],pars_2g[5],vels),'-.',color='green')
f1.set_xticklabels([]) #We will plot the residuals below, so no x-ticks on this plot
plt.title('Multiple Gaussian Fit Example')
plt.ylabel('Amplitude (Some Units)')
f1.legend([pgh,p2g,p2ga,p2gb],['Gaus-Hermite','2-Gaus','Comp. 1','Comp2'],prop={'size':10},loc='center left')

from matplotlib.ticker import MaxNLocator
plt.gca().yaxis.set_major_locator(MaxNLocator(prune='lower')) #Removes lowest ytick label

Let’s also put in some text boxes with the fit results

f1.annotate('Gauss-Hermite:\nAmp = %.2f\nCenter = %.2f\n$\sigma$ = %.2f\nH3 = %.2f\nH4 = %.2f' \
%(pars_gh[0],pars_gh[1],pars_gh[2],pars_gh[3],pars_gh[4]),xy=(.05,.95), \
xycoords='axes fraction',ha="left", va="top", \
bbox=dict(boxstyle="round", fc='1'),fontsize=10)
f1.annotate('Double Gaussian:\nAmp$_1$ = %.2f\nAmp$_2$ = %.2f\nCenter$_1$ = %.2f\nCenter$_2$ = %.2f\n$\sigma_1$ = %.2f\n$\sigma_2$ = %.2f' \
%(pars_2g[0],pars_2g[3],pars_2g[1],pars_2g[4],pars_2g[2],pars_2g[5]),xy=(.95,.95), \
xycoords='axes fraction',ha="right", va="top", \
bbox=dict(boxstyle="round", fc='1'),fontsize=10)

Now plot the residuals in frame2

f2=fig3.add_axes((.1,.1,.8,.2))

resgh,res2g,=plt.plot(vels,resid_gh,'k--',vels,resid_2g,'k')

plt.ylabel('Residuals')
plt.xlabel('Velocity (km s$^{-1}$)')
f2.legend([resgh,res2g],['Gaus-Hermite','2-Gaus'],numpoints=4,prop={'size':9},loc='upper left')

plt.savefig('gaussian_fit_example.png',dpi=100)
#plt.show()

plt.clf()

Finally, here is the result:

### ==== EVERYTHING CONDENSED ====

Here is a condensed version of the commands.

Condensed Code
#!/usr/bin/python

import numpy as np
from matplotlib import pyplot as plt
from scipy.optimize import leastsq
from lmfit import minimize, Parameters
import pyfits

### Example 1: Quadratic fitting with scipy.optimize.leastsq() ###
xarray1=np.arange(-3,10,.2)
yarray1=xarray1**2
yarray1+=np.random.randn(yarray1.shape[0])*2

quadratic = lambda p,x: p[0]*(x**2)+p[1]*x+p[2]
quadraticerr = lambda p,x,y: quadratic(p,x)-y

p0=[2,2,2] #Initial guesses

fitout=leastsq(quadraticerr,p0[:],args=(xarray1,yarray1))
paramsout=fitout[0]
print('Fitted Parameters:\na = %.2f , b = %.2f , c = %.2f' % (paramsout[0],paramsout[1],paramsout[2]))

#Plotting
plt.rc('font',family='serif')
fig1=plt.figure(1)
frame1=fig1.add_axes((.1,.3,.8,.6)) #xstart, ystart, xwidth, yheight --> units are fraction of the image from bottom left

xsmooth=np.linspace(xarray1[0],xarray1[-1])
plt.plot(xarray1,yarray1,'r.')
plt.plot(xsmooth,quadratic(paramsout,xsmooth),'b')
frame1.set_xticklabels([]) #We will plot the residuals below, so no x-ticks on this plot
plt.title('Quadratic Fit Example')
plt.ylabel('y-data')
plt.grid(True)
frame1.annotate('$y$ = %.2f$\cdot x^2$+%.2f$\cdot x$+%.2f' \
%(paramsout[0],paramsout[1],paramsout[2]),xy=(.05,.95), \
xycoords='axes fraction',ha="left", va="top", \
bbox=dict(boxstyle="round", fc='1'))

from matplotlib.ticker import MaxNLocator
plt.gca().yaxis.set_major_locator(MaxNLocator(prune='lower')) #Removes lowest ytick label

frame2=fig1.add_axes((.1,.1,.8,.2))
plt.plot(xarray1,quadratic(paramsout,xarray1)-yarray1,'k')
plt.ylabel('Residuals')
plt.grid(True)

plt.savefig('quadratic_fit_example.png',dpi=100)
#plt.show()
plt.clf()

### Example 2: power law with lmfit.minimize() ###

dat=np.genfromtxt('M-SigmaData.csv',skiprows=2)
M=dat[:,0]
Sigma=dat[:,1]

params=Parameters()
params.add('alpha',value=8,vary=True)
params.add('beta',value=4,vary=True,max=5.0) #,min=...

power=lambda p,x: params['alpha'].value+params['beta'].value*x
powererr=lambda p,x,y: power(p,x)-y

fitout2=minimize(powererr,params,args=(np.log10(Sigma),np.log10(M)))
print('Fitted Parameters:\nalpha = %.2f , beta = %.2f' % (params['alpha'].value,params['beta'].value))

xsmooth2=np.linspace(np.min(Sigma),np.max(Sigma))
fit2=10**params['alpha'].value*xsmooth2**params['beta'].value

# Plot the data and the fit
fig2=plt.figure(2)
sp1=fig2.add_subplot(1,1,1)
plt.plot(Sigma,M,'b*',linewidth=0)
plt.plot(xsmooth2,fit2,'g')
plt.yscale('log')
plt.xscale('log')
plt.xlim(np.min(Sigma),np.max(Sigma))
plt.xlabel('$\sigma$ (km s$^{-1}$)')
plt.ylabel('M (M$_{\odot}$)')
plt.title('Power Law Fit Example')
sp1.annotate('log($y$) = %.2f+%.2f$\cdot$log($x$)\n$y$ = %.2f$\cdot x^{%.2f}$' \
%(params['alpha'].value,params['beta'].value,10**params['alpha'].value,params['beta'].value),xy=(.05,.95), \
xycoords='axes fraction',ha="left", va="top", \
bbox=dict(boxstyle="round", fc='1'))

plt.savefig('power_fit_example.png')
#plt.show()
plt.clf()

### Example 3: multiple Gaussians from data cube with parameter limits ###

#cube=pyfits.getdata('WLM_NA_ICL001.FITS')[0,:,:,:]
#cubehdr=pyfits.getheader('WLM_NA_ICL001.FITS')

#cdelt3=cubehdr['CDELT3']/1000.; crval3=cubehdr['CRVAL3']/1000.; crpix3=cubehdr['CRPIX3'];
#minvel=crval3+(-crpix3+1)*cdelt3; maxvel=crval3+(cube.shape[0]-crpix3)*cdelt3
#chanwidth=abs(cdelt3)

#stackspec=np.sum(np.sum(cube,axis=2),axis=1) # Stack all the spectra
#vels=np.arange(minvel,maxvel+int(cdelt3),cdelt3)
#vels=vels[::-1]; stackspec=stackspec[::-1] #Reversing the arrays to go from - to +

dat=np.genfromtxt('GausFitData.csv',skiprows=1)
vels=dat[:,0]
stackspec=dat[:,1]
chanwidth=2.574567627

# Gaussian plus Hermite Polynomials H3/H4   (labeled below as ..._gh)
p_gh=Parameters()
p_gh.add('amp',value=np.max(stackspec),vary=True);
p_gh.add('center',value=vels[50],min=np.min(vels),max=np.max(vels));
p_gh.add('sig',value=3*chanwidth,min=chanwidth,max=abs(maxvel-minvel));
p_gh.add('skew',value=0,vary=True,min=None,max=None);
p_gh.add('kurt',value=0,vary=True,min=None,max=None);

def gaussfunc_gh(paramsin,x):
amp=paramsin['amp'].value
center=paramsin['center'].value
sig=paramsin['sig'].value
c1=-np.sqrt(3); c2=-np.sqrt(6); c3=2/np.sqrt(3); c4=np.sqrt(6)/3; c5=np.sqrt(6)/4
skew=p_gh['skew'].value
kurt=p_gh['kurt'].value
return amp*np.exp(-.5*((x-center)/sig)**2)*(1+skew*(c1*((x-center)/sig)+c3*((x-center)/sig)**3)+kurt*(c5+c2*((x-center)/sig)**2+c4*((x-center)/sig)**4))

# Double Gaussian   (labeled below as ..._2g)
p_2g=Parameters()
p_2g.add('amp1',value=np.max(stackspec)/2.,min=.1*np.max(stackspec),max=np.max(stackspec));
p_2g.add('center1',value=vels[50+10],min=np.min(vels),max=np.max(vels));
p_2g.add('sig1',value=2*chanwidth,min=chanwidth,max=abs(maxvel-minvel));
p_2g.add('amp2',value=np.max(stackspec)/2.,min=.1*np.max(stackspec),max=np.max(stackspec));
p_2g.add('center2',value=vels[50-10],min=np.min(vels),max=np.max(vels));
p_2g.add('sig2',value=3*chanwidth,min=chanwidth,max=abs(maxvel-minvel));

def gaussfunc_2g(paramsin,x):
amp1=paramsin['amp1'].value; amp2=paramsin['amp2'].value;
center1=paramsin['center1'].value; center2=paramsin['center2'].value;
sig1=paramsin['sig1'].value; sig2=paramsin['sig2'].value;
gaus1=amp1*np.exp(-.5*((x-center1)/sig1)**2)
gaus2=amp2*np.exp(-.5*((x-center2)/sig2)**2)
return (gaus1+gaus2)

gausserr_gh = lambda p,x,y: gaussfunc_gh(p,x)-y
gausserr_2g = lambda p,x,y: gaussfunc_2g(p,x)-y
gausssingle = lambda a,c,sig,x: a*np.exp(-.5*((x-c)/sig)**2)

fitout_gh=minimize(gausserr_gh,p_gh,args=(vels,stackspec))
fitout_2g=minimize(gausserr_2g,p_2g,args=(vels,stackspec))

pars_gh=[p_gh['amp'].value,p_gh['center'].value,p_gh['sig'].value,p_gh['skew'].value,p_gh['kurt'].value]
pars_2g=[p_2g['amp1'].value,p_2g['center1'].value,p_2g['sig1'].value,p_2g['amp2'].value,p_2g['center2'].value,p_2g['sig2'].value]
fit_gh=gaussfunc_gh(pars_gh,vels)
fit_2g=gaussfunc_2g(pars_2g,vels)
resid_gh=fit_gh-stackspec
resid_2g=fit_2g-stackspec

print('Fitted Parameters (Gaus+Hermite):\nAmp = %.2f , Center = %.2f , Disp = %.2f\nSkew = %.2f , Kurt = %.2f' \
%(pars_gh[0],pars_gh[1],pars_gh[2],pars_gh[3],pars_gh[4]))
print('Fitted Parameters (Double Gaussian):\nAmp1 = %.2f , Center1 = %.2f , Sig1 = %.2f\nAmp2 = %.2f , Center2 = %.2f , Sig2 = %.2f' \
%(pars_2g[0],pars_2g[1],pars_2g[2],pars_2g[3],pars_2g[4],pars_2g[5]))

#Plotting
fig3=plt.figure(3)
f1=fig3.add_axes((.1,.3,.8,.6))
plt.plot(vels,stackspec,'k.')
pgh,=plt.plot(vels,fit_gh,'b')
p2g,=plt.plot(vels,fit_2g,'r')
p2ga,=plt.plot(vels,gausssingle(pars_2g[0],pars_2g[1],pars_2g[2],vels),'-.',color='orange')
p2gb,=plt.plot(vels,gausssingle(pars_2g[3],pars_2g[4],pars_2g[5],vels),'-.',color='green')
f1.set_xticklabels([]) #We will plot the residuals below, so no x-ticks on this plot
plt.title('Multiple Gaussian Fit Example')
plt.ylabel('Amplitude (Some Units)')
f1.legend([pgh,p2g,p2ga,p2gb],['Gaus-Hermite','2-Gaus','Comp. 1','Comp2'],prop={'size':10},loc='center left')

from matplotlib.ticker import MaxNLocator
plt.gca().yaxis.set_major_locator(MaxNLocator(prune='lower')) #Removes lowest ytick label

f1.annotate('Gauss-Hermite:\nAmp = %.2f\nCenter = %.2f\n$\sigma$ = %.2f\nH3 = %.2f\nH4 = %.2f'%(pars_gh[0],pars_gh[1],pars_gh[2],pars_gh[3],pars_gh[4]),xy=(.05,.95),xycoords='axes fraction',ha="left", va="top",bbox=dict(boxstyle="round", fc='1'),fontsize=10)
f1.annotate('Double Gaussian:\nAmp$_1$ = %.2f\nAmp$_2$ = %.2f\nCenter$_1$ = %.2f\nCenter$_2$ = %.2f\n$\sigma_1$ = %.2f\n$\sigma_2$ = %.2f'%(pars_2g[0],pars_2g[3],pars_2g[1],pars_2g[4],pars_2g[2],pars_2g[5]),xy=(.95,.95),xycoords='axes fraction',ha="right", va="top",bbox=dict(boxstyle="round", fc='1'),fontsize=10)

f2=fig3.add_axes((.1,.1,.8,.2))
resgh,res2g,=plt.plot(vels,resid_gh,'k--',vels,resid_2g,'k')
plt.ylabel('Residuals')
plt.xlabel('Velocity (km s$^{-1}$)')
f2.legend([resgh,res2g],['Gaus-Hermite','2-Gaus'],numpoints=4,prop={'size':9},loc='upper left')
plt.savefig('gaussian_fit_example.png',dpi=100)
#plt.show()
plt.clf()

Here is the sample data for the Gaussian fitting, for those who opt not to download the data cube.  Formatted velocities on the left, stacked spectra values on the right.

Click here for GausFitData.csv
-252.452084231    7.0471
-249.877516604    5.41091
-247.302948977    -0.0805578
-244.72838135    9.80676
-242.153813723    -2.64569
-239.579246096    3.44144
-237.004678469    -0.797753
-234.430110842    -2.29659
-231.855543215    -0.593432
-229.280975588    5.25571
-226.706407961    8.75715
-224.131840334    9.0627
-221.557272707    -6.08199
-218.98270508    -3.14464
-216.408137453    -2.99775
-213.833569826    -4.07002
-211.259002199    -2.21975
-208.684434572    -2.19285
-206.109866945    -3.57909
-203.535299318    -1.38921
-200.960731691    -2.03753
-198.386164064    -9.93365
-195.811596437    -1.14918
-193.23702881    3.11781
-190.662461183    -1.26111
-188.087893556    -7.38781
-185.513325929    2.30879
-182.938758302    0.976938
-180.364190675    -4.54513
-177.789623048    -2.34182
-175.215055421    0.835287
-172.640487794    0.578699
-170.065920167    7.0345
-167.49135254    9.30069
-164.916784913    13.3074
-162.342217286    21.4498
-159.767649659    39.6186
-157.193082032    61.6891
-154.618514405    71.2766
-152.043946778    84.5251
-149.469379151    92.3035
-146.894811524    103.288
-144.320243897    103.722
-141.74567627    105.536
-139.171108643    110.039
-136.596541016    119.542
-134.021973389    112.635
-131.447405762    111.95
-128.872838135    133.203
-126.298270508    135.486
-123.723702881    145.488
-121.149135254    156.818
-118.574567627    176.232
-116    190.496
-113.425432373    198.189
-110.850864746    195.464
-108.276297119    184.057
-105.701729492    170.15
-103.127161865    149.234
-100.552594238    118.705
-97.978026611    101.411
-95.403458984    92.2328
-92.828891357    73.3222
-90.25432373    67.2127
-87.679756103    41.8572
-85.105188476    37.7619
-82.530620849    20.5096
-79.956053222    13.5301
-77.381485595    15.9504
-74.806917968    13.5154
-72.232350341    5.63028
-69.657782714    3.38025
-67.083215087    -2.62706
-64.50864746    2.74693
-61.934079833    1.86001
-59.359512206    1.59868
-56.784944579    2.25844
-54.210376952    -0.369037
-51.635809325    0.334506
-49.061241698    -0.196296
-46.486674071    -1.56728
-43.912106444    -1.1896
-41.337538817    1.94682
-38.76297119    0.268496
-36.188403563    7.77246
-33.613835936    -5.04572
-31.039268309    -4.57877
-28.464700682    5.39095
-25.890133055    -0.00106338
-23.315565428    4.24355
-20.740997801    -2.42964
-18.166430174    -3.30451
-15.591862547    -0.803972
-13.01729492    -8.57761
-10.442727293    -14.549
-7.868159666    -16.3109
-5.293592039    10.3016
-2.719024412    16.9872
-0.144456785    14.6043
2.430110842    2.62755
5.004678469    -0.388005
7.579246096    -10.0996
10.153813723    -3.00706
12.72838135    2.11268
15.302948977    5.23242
17.877516604    4.40532
20.452084231    -1.07435

## One thought on “Curve Fitting”

1. Simply want to say your article is as astounding. The clearness in your post is just cool and i could assume you’re an expert on this subject. Well with your permission allow me to grab your feed to keep updated with forthcoming post. Thanks a million and please keep up the enjoyable work.