Curve Fitting

quadratic_fit_example power_fit_example gaussian_fit_example


This post goes through the basics of fitting functions to data sets.  There are many tricks to optimizing fitting foutines for speed, but I will only present a general method here in the interest of keeping things simple and easy to follow.  We start out with some generated data, and finish by fitting curves to a real data set.

We will also explore the possibilities of giving certain parameters limits in our fits.  Python has some great tools to help you with all your fitting needs!

For this tutorial, I am using:

  •  python v2.7
  •  numpy v1.5.1
  •  matplotlib v1.1.0
  •  scipy v0.8.0
  •  lmfit v0.4 – for constraining parameters
  •  pyfits 2.3.1 – for I/O of the .fits file

The basic method I will outline below:

  • Load or define the data array that you want to fit
  • Define a function for your desired equation
  • Fit with scipy.optimize.leastsq  (or with lmfit.minimize, if you want to constrain certain parameters)


Import the modules we will use

import numpy as np
from matplotlib import pyplot as plt
from scipy.optimize import leastsq
from lmfit import minimize, Parameters
import pyfits

Let’s start with a simple example: we’ll generate some data with random noise, and fit a quadratic curve. Then we’ll plot the data, the fit, and residuals.


Creating the example data.  X-range from -3 to 10 in increments of 0.2.  Our y-data will start out as x2

xarray1=np.arange(-3,10,.2)
yarray1=xarray1**2

Now add gaussian random noise to each element.

yarray1+=np.random.randn(yarray1.shape[0])*2

There is a cheap and easy way to perform fits with little setup if you are only interested in polynomials.  scipy.polyfit() allows you to give an x-array and a y-array as input, with the order of polynomial to fit, then spits out an array of the fit coefficients with little fuss.  To fit a quadratic to our data generated above, for example:

from scipy import polyfit

fitcoeffs=polyfit(xarray1,yarray1,2)

print fitcoeffs
# --> Returns array (<coeff. order 2>, <coeff. order 1>, <coeff. order 0>)

If we want to fit an arbitrary expression, though, we must define a python function which will compute our desired equation.  Our function will take two arrays as inputs: the list of coefficiencts (the parameters that we will fit – I will label them “p”), and the array of x-axis data points. Later we will define the initial guesses for the parameters, called “p0”.Our function will take the form     $\em y = a x ^2 + b x + c \em$

def quadratic(p,x):
    y_out=p[0]*(x**2)+p[1]*x+p[2]
    return y_out

There is a simpler way to do this for short/simple expressions: python’s lambda functions are essentially one-line function definitions.

quadratic = lambda p,x: p[0]*(x**2)+p[1]*x+p[2]

We will also need to define a function of the difference between the fitted function with the current iteration’s parameters and the original data.

quadraticerr = lambda p,x,y: quadratic(p,x)-y

Define our initial guesses of the quadratic’s coefficients (I’m making bad guesses here on purpose, just to show that it still works)

p0=[2,2,2]

Now for the actual fitting.  We will tell scipy.optimize.leastsq to minimize the DIFFERENCE between the original data and iterated set of data given by the defined equation with the current coefficients.  So we tell it to perform a least-squares fit on “quadraticerr”, and give it the x- and y-data arrays as arguments.

fitout=leastsq(quadraticerr,p0[:],args=(xarray1,yarray1))
paramsout=fitout[0]
#covar=out[1] #This is the covariance matrix output
print('Fitted Parameters:\na = %.2f , b = %.2f , c = %.2f' % (paramsout[0],paramsout[1],paramsout[2]))

Now let’s plot our results. I will create a smoother curve for the fit on the fly within pyplot.  I will also add a text box with the fit parameters and plot the residuals just below the data.

plt.rc('font',family='serif')
fig1=plt.figure(1)
frame1=fig1.add_axes((.1,.3,.8,.6))
    #xstart, ystart, xwidth, yheight --> units are fraction of the image from bottom left

Plot the data in frame1

xsmooth=np.linspace(xarray1[0],xarray1[-1])
plt.plot(xarray1,yarray1,'r.')
plt.plot(xsmooth,quadratic(paramsout,xsmooth),'b')
frame1.set_xticklabels([]) #We will plot the residuals below, so no x-ticks on this plot
plt.title('Quadratic Fit Example')
plt.ylabel('y-data')
plt.grid(True)
frame1.annotate('$y$ = %.2f$\cdot x^2$+%.2f$\cdot x$+%.2f'%(paramsout[0],paramsout[1],paramsout[2]), \
                xy=(.05,.95),xycoords='axes fraction',ha="left",va="top",bbox=dict(boxstyle="round", fc='1'))

If we want to keep the plot tidy:

from matplotlib.ticker import MaxNLocator
plt.gca().yaxis.set_major_locator(MaxNLocator(prune='lower')) #Removes lowest ytick label

Now plot the residuals in frame2

frame2=fig1.add_axes((.1,.1,.8,.2))

plt.plot(xarray1,quadratic(paramsout,xarray1)-yarray1,'k')
plt.ylabel('Residuals')
plt.grid(True)

plt.savefig('quadratic_fit_example.png',dpi=100)
#plt.show()

plt.clf()

Here is the result:

quadratic_fit_example



==== CONSTRAINING PARAMETERS ====

Now we will set limits for what values the coefficients can be fit to. This is particularly useful if you know (or want to impose) on the physics of the phenomenon represented by the data. scipy.optimize.minimize() has this functionality, and I will probably add an example using this in the future.  Another useful module is lmfit, which I will use from here on out.  Here we will fit a power law to data for the M-Sigma relation (data from Gultekin et al. 2009 – http://arxiv.org/abs/0903.4897 )

dat=np.genfromtxt('M-SigmaData.csv',skiprows=2)
M=dat[:,0]      #The y-data
Sigma=dat[:,1]  #The x-data

The general (simple) form for a power law is    $\em y = a \cdot  x^b \em$ .  We can do this simply enough directly, or we can be clever and note that we could fit a line for    $\em log( y ) = log( a ) + log( x^b ) = \alpha + \beta \cdot log( x ) \em$ .

lmfit requires us to define the coefficients as Parameters() objects. We then define their initial guesses, whether or not they can vary, and what their min/max values can be.  Finally, when we define our equation functions, we must call the parameters with p0[‘par_x’].value :

params=Parameters()
params.add('alpha',value=8,vary=True)
params.add('beta',value=4,vary=True,max=5.0) #,min=...

power=lambda p,x: params['alpha'].value+params['beta'].value*x
powererr=lambda p,x,y: power(p,x)-y

Perform the fit with lmfit.minimize.  This output gives us a whole suite of info, such as reduced chi-square values.  With lmfit, the parameters will be updated within the Parameters() object

fitout2=minimize(powererr,params,args=(np.log10(Sigma),np.log10(M)))

print('Fitted Parameters:\nalpha = %.2f , beta = %.2f' % (params['alpha'].value,params['beta'].value))

xsmooth2=np.linspace(np.min(Sigma),np.max(Sigma))
fit2=10**params['alpha'].value*xsmooth2**params['beta'].value

Plot the data and the fit

fig2=plt.figure(2)
sp1=fig2.add_subplot(1,1,1)

plt.plot(Sigma,M,'b*',linewidth=0)
plt.plot(xsmooth2,fit2,'g')
plt.yscale('log')
plt.xscale('log')
plt.xlim(np.min(Sigma),np.max(Sigma))
plt.xlabel('$\sigma$ (km s$^{-1}$)')
plt.ylabel('M (M$_{\odot}$)')
plt.title('Power Law Fit Example')
sp1.annotate('log($y$) = %.2f+%.2f$\cdot $log($x$)\n$y$ = %.2f$\cdot x^{%.2f}$' \

%(params['alpha'].value,params['beta'].value,10**params['alpha'].value,params['beta'].value),xy=(.05,.95), \
xycoords='axes fraction',ha="left", va="top",bbox=dict(boxstyle="round", fc='1'))

plt.savefig('power_fit_example.png')
#plt.show()

plt.clf()

Here is the result:

power_fit_example


Now is a good time to mention that lmfit.minimize() gives you some freebie goodness-of-fit diagnostics:

nfev number of function evaluations
success boolean (True/False) for whether fit succeeded.
errorbars boolean (True/False) for whether uncertainties were estimated.
message message about fit success.
ier integer error value from scipy.optimize.leastsq
lmdif_message message from scipy.optimize.leastsq
nvarys number of variables in fit    $\em N_\mathrm{varys}\em$
ndata number of data points:    $\em N \em$
nfree degrees of freedom in fit:    $\em N - N_\mathrm{varys}\em$
residual residual array (return of func()   $\em \ \mathrm{Resid} \em$
chisqr chi-square:    $\em \chi^2 = \Sigma_i^N \left[ \mathrm{Resid}_i \right] \em$
redchi reduced chi-square:    $\em \chi^2_\nu = \chi^2 / \left(  N - N_\mathrm{varys} \right)\em$

Call with, e.g., output.chisqr or output.residual

Again, see http://cars9.uchicago.edu/software/python/lmfit/ for details.



==== FITTING GAUSSIANS TO DATA FROM A .FITS FILE ====

Next, let’s read in data from a .fits file and perform a fit.  This particular  data cube is a neutral hydrogen VLA observation of dwarf galaxy WLM.  The cube  can be downloaded (~430MB) from  https://science.nrao.edu/science/surveys/littlethings/data/wlm.html  –> I have reproduced the ‘stackspec’ and ‘vels’ arrays at the end of this tutorial. Specifically, I will sum up all of the x-y data to make one big stacked z-array, or ‘Super Profile’.  Then I will fit various gaussian functions (Gaus + Hermite  polynomials and double gaussian) and plot them with their residuals.  Of course,  in real life you would want to ‘Shuffle’ the spectra and account for other factors, but this is just an example.

I am using the default Marquardt-Levenberg algorithm.  Note that fitting results will depend quite a bit on what you give as initial guesses – ML finds LOCAL  extrema quite well, but it doesn’t necessarily find the global extrema.  In short, do your best to provide a good first guess to the fit parameters.

Load in the data. We don’t care about the 4th axis, so cut it out. Also get the header and specify the velocity parameters.


cube=pyfits.getdata('WLM_NA_ICL001.FITS')[0,:,:,:]
cubehdr=pyfits.getheader('WLM_NA_ICL001.FITS')

cdelt3=cubehdr['CDELT3']/1000.; crval3=cubehdr['CRVAL3']/1000.; crpix3=cubehdr['CRPIX3'];
minvel=crval3+(-crpix3+1)*cdelt3; maxvel=crval3+(cube.shape[0]-crpix3)*cdelt3
chanwidth=abs(cdelt3)

#minvel=20.452084230999986; maxvel=-252.452084231
#cdelt3=-2.574567627; chanwidth=2.574567627

Stack all the spectra

stackspec=np.sum(np.sum(cube,axis=2),axis=1)

vels=np.arange(minvel,maxvel+int(cdelt3),cdelt3)

The velocity array in the cube goes from positive to negative, so let’s reverse it to make the fitting go smoother.

vels=vels[::-1]; stackspec=stackspec[::-1]

Alternatively, to skip using pyfits and load the velocity and spectrum data from the .csv file provided below:

    #dat=np.genfromtxt('GausFitData.csv',skiprows=1)
    #vels=dat[:,0]
    #stackspec=dat[:,1]
    #chanwidth=2.574567627

Next, we define our fit functions:


Gaussian plus Hermite Polynomials H3/H4   (labeled below as ~_gh)
   $\em f(x) = A  \; e^{-\frac{g^2}{2}} \left[ 1 + h_3 \left( -\sqrt{3} \; g + \frac{2}{\sqrt{3}} \; g^3 \right) + h_4 \left( \frac{\sqrt{6}}{4} - \sqrt{6} \; g^2 + \frac{\sqrt{6}}{3} \; g^4 \right) \right] \em$ , where    $\em g=(x-x_c)/\sigma} \em$

p_gh=Parameters()
p_gh.add('amp',value=np.max(stackspec),vary=True);
p_gh.add('center',value=vels[50],min=np.min(vels),max=np.max(vels));
p_gh.add('sig',value=3*chanwidth,min=chanwidth,max=abs(maxvel-minvel));
p_gh.add('skew',value=0,vary=True,min=None,max=None);
p_gh.add('kurt',value=0,vary=True,min=None,max=None);

def gaussfunc_gh(paramsin,x):
    amp=paramsin['amp'].value
    center=paramsin['center'].value
    sig=paramsin['sig'].value
    c1=-np.sqrt(3); c2=-np.sqrt(6); c3=2/np.sqrt(3); c4=np.sqrt(6)/3; c5=np.sqrt(6)/4
    skew=paramsin['skew'].value
    kurt=paramsin['kurt'].value

    gaustot_gh=amp*np.exp(-.5*((x-center)/sig)**2)*(1+skew*(c1*((x-center)/sig)+c3*((x-center)/sig)**3)+kurt*(c5+c2*((x-center)/sig)**2+c4*((x-center)/sig)**4))
    return gaustot_gh

Double Gaussian   (labeled below as ~_2g)
   $\em F(x) =  \sum\limits_{i}  f_i (x) = A_1 \;  e^{-\frac{1}{2} (\frac{x-x_{c,1}}{\sigma_1})^2} +  A_2 \;  e^{-\frac{1}{2} (\frac{x-x_{c,2}}{\sigma_2})^2} + \ldots \em$
Bounds–> amp: 10% of max to max   center: velocity range   disp: channel width to velocity range

p_2g=Parameters()
p_2g.add('amp1',value=np.max(stackspec)/2.,min=.1*np.max(stackspec),max=np.max(stackspec));
p_2g.add('center1',value=vels[50+10],min=np.min(vels),max=np.max(vels));
p_2g.add('sig1',value=2*chanwidth,min=chanwidth,max=abs(maxvel-minvel));
p_2g.add('amp2',value=np.max(stackspec)/2.,min=.1*np.max(stackspec),max=np.max(stackspec));
p_2g.add('center2',value=vels[50-10],min=np.min(vels),max=np.max(vels));
p_2g.add('sig2',value=3*chanwidth,min=chanwidth,max=abs(maxvel-minvel));

def gaussfunc_2g(paramsin,x):
    amp1=paramsin['amp1'].value; amp2=paramsin['amp2'].value;
    center1=paramsin['center1'].value; center2=paramsin['center2'].value;
    sig1=paramsin['sig1'].value; sig2=paramsin['sig2'].value;

    gaus1=amp1*np.exp(-.5*((x-center1)/sig1)**2)
    gaus2=amp2*np.exp(-.5*((x-center2)/sig2)**2)
    gaustot_2g=(gaus1+gaus2)
    return gaustot_2g

And now the functions that compute the difference between the fit iteration and data. In addition, define a function for a simple single gaussian.

gausserr_gh = lambda p,x,y: gaussfunc_gh(p,x)-y
gausserr_2g = lambda p,x,y: gaussfunc_2g(p,x)-y
gausssingle = lambda a,c,sig,x: a*np.exp(-.5*((x-c)/sig)**2)

We will minimize with lmfit again, in order to keep limits on parameters

fitout_gh=minimize(gausserr_gh,p_gh,args=(vels,stackspec))
fitout_2g=minimize(gausserr_2g,p_2g,args=(vels,stackspec))

Let’s make some shorthand definitions to keep things tidy

pars_gh=[p_gh['amp'].value,p_gh['center'].value,p_gh['sig'].value,p_gh['skew'].value,p_gh['kurt'].value]
pars_2g=[p_2g['amp1'].value,p_2g['center1'].value,p_2g['sig1'].value,p_2g['amp2'].value,p_2g['center2'].value,p_2g['sig2'].value]


Aside:
If you want to use the simple fitter, without defining bounds, revert  to scipy.optimize.leastsq().  Note, however, that with complex functions such as two gaussians, you run the risk of getting nonphysical results, for example one gaussian being negative.  That said, here is the code:

pars1=[np.max(stackspec),vels[50],2*abs(cdelt3),0,0]
pars2=[np.max(stackspec),vels[50],2*abs(cdelt3),np.max(stackspec),vels[50],2*cdelt3]

def gaussfunc_gh(p,x):
    c1=-np.sqrt(3); c2=-np.sqrt(6); c3=2/np.sqrt(3); c4=np.sqrt(6)/3; c5=np.sqrt(6)/4
    return p[0]*np.exp(-.5*((x-p[1])/p[2])**2)*(1+p[3]*(c1*((x-p[1])/p[2])+c3*((x-p[1])/p[2])**3)+p[4]*(c5+c2*((x-p[1])/p[2])**2+c4*((x-p[1])/p[2])**4))

def gaussfunc_2g(p,x):
    gaus1=p[0]*np.exp(-.5*((x-p[1])/p[2])**2)
    gaus2=p[3]*np.exp(-.5*((x-p[4])/p[5])**2)
    return gaus1+gaus2

fitout_gh=leastsq(gausserr_gh,pars1[:],args=(vels,stackspec))
pars_gh=fitout_gh[0]
fitout_2g=leastsq(gausserr_2g,pars2[:],args=(vels,stackspec))
pars_2g=fitout_2g[0]

print('Fitted Parameters (Gaus+Hermite):\nAmp = %.2f , Center = %.2f , Disp = %.2f\nSkew = %.2f , Kurt = %.2f' % (pars_gh[0],pars_gh[1],pars_gh[2],pars_gh[3],pars_gh[4]))
print('Fitted Parameters (Double Gaussian):\nAmp1 = %.2f , Center1 = %.2f , Sig1 = %.2f\nAmp2 = %.2f , Center2 = %.2f , Sig2 = %.2f' % (pars_2g[0],pars_2g[1],pars_2g[2],pars_2g[3],pars_2g[4],pars_2g[5]))

[End aside]


fit_gh=gaussfunc_gh(pars_gh,vels)
fit_2g=gaussfunc_2g(pars_2g,vels)
resid_gh=fit_gh-stackspec
resid_2g=fit_2g-stackspec

Print the fitted parameters

print('Fitted Parameters (Gaus+Hermite):\nAmp = %.2f , Center = %.2f , Disp = %.2f\nSkew = %.2f , Kurt = %.2f' \
%(pars_gh[0],pars_gh[1],pars_gh[2],pars_gh[3],pars_gh[4]))

print('Fitted Parameters (Double Gaussian):\nAmp1 = %.2f , Center1 = %.2f , Sig1 = %.2f\nAmp2 = %.2f , Center2 = %.2f , Sig2 = %.2f' \
%(pars_2g[0],pars_2g[1],pars_2g[2],pars_2g[3],pars_2g[4],pars_2g[5]))


Now let’s plot the results

fig3=plt.figure(3)
f1=fig3.add_axes((.1,.3,.8,.6))
    #xstart, ystart, xwidth, yheight --> units are fraction of the image from bottom left

Plot the data in frame1

plt.plot(vels,stackspec,'k.')
pgh,=plt.plot(vels,fit_gh,'b')
p2g,=plt.plot(vels,fit_2g,'r')
p2ga,=plt.plot(vels,gausssingle(pars_2g[0],pars_2g[1],pars_2g[2],vels),'-.',color='orange')
p2gb,=plt.plot(vels,gausssingle(pars_2g[3],pars_2g[4],pars_2g[5],vels),'-.',color='green')
f1.set_xticklabels([]) #We will plot the residuals below, so no x-ticks on this plot
plt.title('Multiple Gaussian Fit Example')
plt.ylabel('Amplitude (Some Units)')
f1.legend([pgh,p2g,p2ga,p2gb],['Gaus-Hermite','2-Gaus','Comp. 1','Comp2'],prop={'size':10},loc='center left')

from matplotlib.ticker import MaxNLocator
plt.gca().yaxis.set_major_locator(MaxNLocator(prune='lower')) #Removes lowest ytick label

Let’s also put in some text boxes with the fit results

f1.annotate('Gauss-Hermite:\nAmp = %.2f\nCenter = %.2f\n$\sigma$ = %.2f\nH3 = %.2f\nH4 = %.2f' \
    %(pars_gh[0],pars_gh[1],pars_gh[2],pars_gh[3],pars_gh[4]),xy=(.05,.95), \
    xycoords='axes fraction',ha="left", va="top", \
bbox=dict(boxstyle="round", fc='1'),fontsize=10)
f1.annotate('Double Gaussian:\nAmp$_1$ = %.2f\nAmp$_2$ = %.2f\nCenter$_1$ = %.2f\nCenter$_2$ = %.2f\n$\sigma_1$ = %.2f\n$\sigma_2$ = %.2f' \
    %(pars_2g[0],pars_2g[3],pars_2g[1],pars_2g[4],pars_2g[2],pars_2g[5]),xy=(.95,.95), \
    xycoords='axes fraction',ha="right", va="top", \
    bbox=dict(boxstyle="round", fc='1'),fontsize=10)

Now plot the residuals in frame2

f2=fig3.add_axes((.1,.1,.8,.2))

resgh,res2g,=plt.plot(vels,resid_gh,'k--',vels,resid_2g,'k')

plt.ylabel('Residuals')
plt.xlabel('Velocity (km s$^{-1}$)')
f2.legend([resgh,res2g],['Gaus-Hermite','2-Gaus'],numpoints=4,prop={'size':9},loc='upper left')

plt.savefig('gaussian_fit_example.png',dpi=100)
#plt.show()

plt.clf()

Finally, here is the result:

gaussian_fit_example


==== EVERYTHING CONDENSED ====

Here is a condensed version of the commands.

Condensed Code
#!/usr/bin/python

import numpy as np
from matplotlib import pyplot as plt
from scipy.optimize import leastsq
from lmfit import minimize, Parameters
import pyfits

### Example 1: Quadratic fitting with scipy.optimize.leastsq() ###
xarray1=np.arange(-3,10,.2)
yarray1=xarray1**2
yarray1+=np.random.randn(yarray1.shape[0])*2

quadratic = lambda p,x: p[0]*(x**2)+p[1]*x+p[2]
quadraticerr = lambda p,x,y: quadratic(p,x)-y

p0=[2,2,2] #Initial guesses

fitout=leastsq(quadraticerr,p0[:],args=(xarray1,yarray1))
paramsout=fitout[0]
print('Fitted Parameters:\na = %.2f , b = %.2f , c = %.2f' % (paramsout[0],paramsout[1],paramsout[2]))

#Plotting
plt.rc('font',family='serif')
fig1=plt.figure(1)
frame1=fig1.add_axes((.1,.3,.8,.6)) #xstart, ystart, xwidth, yheight --> units are fraction of the image from bottom left

xsmooth=np.linspace(xarray1[0],xarray1[-1])
plt.plot(xarray1,yarray1,'r.')
plt.plot(xsmooth,quadratic(paramsout,xsmooth),'b')
frame1.set_xticklabels([]) #We will plot the residuals below, so no x-ticks on this plot
plt.title('Quadratic Fit Example')
plt.ylabel('y-data')
plt.grid(True)
frame1.annotate('$y$ = %.2f$\cdot x^2$+%.2f$\cdot x$+%.2f' \
%(paramsout[0],paramsout[1],paramsout[2]),xy=(.05,.95), \
xycoords='axes fraction',ha="left", va="top", \
bbox=dict(boxstyle="round", fc='1'))

from matplotlib.ticker import MaxNLocator
plt.gca().yaxis.set_major_locator(MaxNLocator(prune='lower')) #Removes lowest ytick label

frame2=fig1.add_axes((.1,.1,.8,.2))
plt.plot(xarray1,quadratic(paramsout,xarray1)-yarray1,'k')
plt.ylabel('Residuals')
plt.grid(True)

plt.savefig('quadratic_fit_example.png',dpi=100)
#plt.show()
plt.clf()

### Example 2: power law with lmfit.minimize() ###

dat=np.genfromtxt('M-SigmaData.csv',skiprows=2)
M=dat[:,0]
Sigma=dat[:,1]

params=Parameters()
params.add('alpha',value=8,vary=True)
params.add('beta',value=4,vary=True,max=5.0) #,min=...

power=lambda p,x: params['alpha'].value+params['beta'].value*x
powererr=lambda p,x,y: power(p,x)-y

fitout2=minimize(powererr,params,args=(np.log10(Sigma),np.log10(M)))
print('Fitted Parameters:\nalpha = %.2f , beta = %.2f' % (params['alpha'].value,params['beta'].value))

xsmooth2=np.linspace(np.min(Sigma),np.max(Sigma))
fit2=10**params['alpha'].value*xsmooth2**params['beta'].value

# Plot the data and the fit
fig2=plt.figure(2)
sp1=fig2.add_subplot(1,1,1)
plt.plot(Sigma,M,'b*',linewidth=0)
plt.plot(xsmooth2,fit2,'g')
plt.yscale('log')
plt.xscale('log')
plt.xlim(np.min(Sigma),np.max(Sigma))
plt.xlabel('$\sigma$ (km s$^{-1}$)')
plt.ylabel('M (M$_{\odot}$)')
plt.title('Power Law Fit Example')
sp1.annotate('log($y$) = %.2f+%.2f$\cdot $log($x$)\n$y$ = %.2f$\cdot x^{%.2f}$' \
%(params['alpha'].value,params['beta'].value,10**params['alpha'].value,params['beta'].value),xy=(.05,.95), \
xycoords='axes fraction',ha="left", va="top", \
bbox=dict(boxstyle="round", fc='1'))

plt.savefig('power_fit_example.png')
#plt.show()
plt.clf()

### Example 3: multiple Gaussians from data cube with parameter limits ###

#cube=pyfits.getdata('WLM_NA_ICL001.FITS')[0,:,:,:]
#cubehdr=pyfits.getheader('WLM_NA_ICL001.FITS')

#cdelt3=cubehdr['CDELT3']/1000.; crval3=cubehdr['CRVAL3']/1000.; crpix3=cubehdr['CRPIX3'];
#minvel=crval3+(-crpix3+1)*cdelt3; maxvel=crval3+(cube.shape[0]-crpix3)*cdelt3
#chanwidth=abs(cdelt3)

#stackspec=np.sum(np.sum(cube,axis=2),axis=1) # Stack all the spectra
#vels=np.arange(minvel,maxvel+int(cdelt3),cdelt3)
#vels=vels[::-1]; stackspec=stackspec[::-1] #Reversing the arrays to go from - to +

dat=np.genfromtxt('GausFitData.csv',skiprows=1)
vels=dat[:,0]
stackspec=dat[:,1]
chanwidth=2.574567627

# Gaussian plus Hermite Polynomials H3/H4   (labeled below as ..._gh)
p_gh=Parameters()
p_gh.add('amp',value=np.max(stackspec),vary=True);
p_gh.add('center',value=vels[50],min=np.min(vels),max=np.max(vels));
p_gh.add('sig',value=3*chanwidth,min=chanwidth,max=abs(maxvel-minvel));
p_gh.add('skew',value=0,vary=True,min=None,max=None);
p_gh.add('kurt',value=0,vary=True,min=None,max=None);

def gaussfunc_gh(paramsin,x):
 amp=paramsin['amp'].value
 center=paramsin['center'].value
 sig=paramsin['sig'].value
 c1=-np.sqrt(3); c2=-np.sqrt(6); c3=2/np.sqrt(3); c4=np.sqrt(6)/3; c5=np.sqrt(6)/4
 skew=p_gh['skew'].value
 kurt=p_gh['kurt'].value
 return amp*np.exp(-.5*((x-center)/sig)**2)*(1+skew*(c1*((x-center)/sig)+c3*((x-center)/sig)**3)+kurt*(c5+c2*((x-center)/sig)**2+c4*((x-center)/sig)**4))

# Double Gaussian   (labeled below as ..._2g)
p_2g=Parameters()
p_2g.add('amp1',value=np.max(stackspec)/2.,min=.1*np.max(stackspec),max=np.max(stackspec));
p_2g.add('center1',value=vels[50+10],min=np.min(vels),max=np.max(vels));
p_2g.add('sig1',value=2*chanwidth,min=chanwidth,max=abs(maxvel-minvel));
p_2g.add('amp2',value=np.max(stackspec)/2.,min=.1*np.max(stackspec),max=np.max(stackspec));
p_2g.add('center2',value=vels[50-10],min=np.min(vels),max=np.max(vels));
p_2g.add('sig2',value=3*chanwidth,min=chanwidth,max=abs(maxvel-minvel));

def gaussfunc_2g(paramsin,x):
 amp1=paramsin['amp1'].value; amp2=paramsin['amp2'].value;
 center1=paramsin['center1'].value; center2=paramsin['center2'].value;
 sig1=paramsin['sig1'].value; sig2=paramsin['sig2'].value;
 gaus1=amp1*np.exp(-.5*((x-center1)/sig1)**2)
 gaus2=amp2*np.exp(-.5*((x-center2)/sig2)**2)
 return (gaus1+gaus2)

gausserr_gh = lambda p,x,y: gaussfunc_gh(p,x)-y
gausserr_2g = lambda p,x,y: gaussfunc_2g(p,x)-y
gausssingle = lambda a,c,sig,x: a*np.exp(-.5*((x-c)/sig)**2)

fitout_gh=minimize(gausserr_gh,p_gh,args=(vels,stackspec))
fitout_2g=minimize(gausserr_2g,p_2g,args=(vels,stackspec))

pars_gh=[p_gh['amp'].value,p_gh['center'].value,p_gh['sig'].value,p_gh['skew'].value,p_gh['kurt'].value]
pars_2g=[p_2g['amp1'].value,p_2g['center1'].value,p_2g['sig1'].value,p_2g['amp2'].value,p_2g['center2'].value,p_2g['sig2'].value]
fit_gh=gaussfunc_gh(pars_gh,vels)
fit_2g=gaussfunc_2g(pars_2g,vels)
resid_gh=fit_gh-stackspec
resid_2g=fit_2g-stackspec

print('Fitted Parameters (Gaus+Hermite):\nAmp = %.2f , Center = %.2f , Disp = %.2f\nSkew = %.2f , Kurt = %.2f' \
%(pars_gh[0],pars_gh[1],pars_gh[2],pars_gh[3],pars_gh[4]))
print('Fitted Parameters (Double Gaussian):\nAmp1 = %.2f , Center1 = %.2f , Sig1 = %.2f\nAmp2 = %.2f , Center2 = %.2f , Sig2 = %.2f' \
%(pars_2g[0],pars_2g[1],pars_2g[2],pars_2g[3],pars_2g[4],pars_2g[5]))

#Plotting
fig3=plt.figure(3)
f1=fig3.add_axes((.1,.3,.8,.6))
plt.plot(vels,stackspec,'k.')
pgh,=plt.plot(vels,fit_gh,'b')
p2g,=plt.plot(vels,fit_2g,'r')
p2ga,=plt.plot(vels,gausssingle(pars_2g[0],pars_2g[1],pars_2g[2],vels),'-.',color='orange')
p2gb,=plt.plot(vels,gausssingle(pars_2g[3],pars_2g[4],pars_2g[5],vels),'-.',color='green')
f1.set_xticklabels([]) #We will plot the residuals below, so no x-ticks on this plot
plt.title('Multiple Gaussian Fit Example')
plt.ylabel('Amplitude (Some Units)')
f1.legend([pgh,p2g,p2ga,p2gb],['Gaus-Hermite','2-Gaus','Comp. 1','Comp2'],prop={'size':10},loc='center left')

from matplotlib.ticker import MaxNLocator
plt.gca().yaxis.set_major_locator(MaxNLocator(prune='lower')) #Removes lowest ytick label

f1.annotate('Gauss-Hermite:\nAmp = %.2f\nCenter = %.2f\n$\sigma$ = %.2f\nH3 = %.2f\nH4 = %.2f'%(pars_gh[0],pars_gh[1],pars_gh[2],pars_gh[3],pars_gh[4]),xy=(.05,.95),xycoords='axes fraction',ha="left", va="top",bbox=dict(boxstyle="round", fc='1'),fontsize=10)
f1.annotate('Double Gaussian:\nAmp$_1$ = %.2f\nAmp$_2$ = %.2f\nCenter$_1$ = %.2f\nCenter$_2$ = %.2f\n$\sigma_1$ = %.2f\n$\sigma_2$ = %.2f'%(pars_2g[0],pars_2g[3],pars_2g[1],pars_2g[4],pars_2g[2],pars_2g[5]),xy=(.95,.95),xycoords='axes fraction',ha="right", va="top",bbox=dict(boxstyle="round", fc='1'),fontsize=10)

f2=fig3.add_axes((.1,.1,.8,.2))
resgh,res2g,=plt.plot(vels,resid_gh,'k--',vels,resid_2g,'k')
plt.ylabel('Residuals')
plt.xlabel('Velocity (km s$^{-1}$)')
f2.legend([resgh,res2g],['Gaus-Hermite','2-Gaus'],numpoints=4,prop={'size':9},loc='upper left')
plt.savefig('gaussian_fit_example.png',dpi=100)
#plt.show()
plt.clf()


Here is the sample data for the Gaussian fitting, for those who opt not to download the data cube.  Formatted velocities on the left, stacked spectra values on the right.

Click here for GausFitData.csv
-252.452084231    7.0471
-249.877516604    5.41091
-247.302948977    -0.0805578
-244.72838135    9.80676
-242.153813723    -2.64569
-239.579246096    3.44144
-237.004678469    -0.797753
-234.430110842    -2.29659
-231.855543215    -0.593432
-229.280975588    5.25571
-226.706407961    8.75715
-224.131840334    9.0627
-221.557272707    -6.08199
-218.98270508    -3.14464
-216.408137453    -2.99775
-213.833569826    -4.07002
-211.259002199    -2.21975
-208.684434572    -2.19285
-206.109866945    -3.57909
-203.535299318    -1.38921
-200.960731691    -2.03753
-198.386164064    -9.93365
-195.811596437    -1.14918
-193.23702881    3.11781
-190.662461183    -1.26111
-188.087893556    -7.38781
-185.513325929    2.30879
-182.938758302    0.976938
-180.364190675    -4.54513
-177.789623048    -2.34182
-175.215055421    0.835287
-172.640487794    0.578699
-170.065920167    7.0345
-167.49135254    9.30069
-164.916784913    13.3074
-162.342217286    21.4498
-159.767649659    39.6186
-157.193082032    61.6891
-154.618514405    71.2766
-152.043946778    84.5251
-149.469379151    92.3035
-146.894811524    103.288
-144.320243897    103.722
-141.74567627    105.536
-139.171108643    110.039
-136.596541016    119.542
-134.021973389    112.635
-131.447405762    111.95
-128.872838135    133.203
-126.298270508    135.486
-123.723702881    145.488
-121.149135254    156.818
-118.574567627    176.232
-116    190.496
-113.425432373    198.189
-110.850864746    195.464
-108.276297119    184.057
-105.701729492    170.15
-103.127161865    149.234
-100.552594238    118.705
-97.978026611    101.411
-95.403458984    92.2328
-92.828891357    73.3222
-90.25432373    67.2127
-87.679756103    41.8572
-85.105188476    37.7619
-82.530620849    20.5096
-79.956053222    13.5301
-77.381485595    15.9504
-74.806917968    13.5154
-72.232350341    5.63028
-69.657782714    3.38025
-67.083215087    -2.62706
-64.50864746    2.74693
-61.934079833    1.86001
-59.359512206    1.59868
-56.784944579    2.25844
-54.210376952    -0.369037
-51.635809325    0.334506
-49.061241698    -0.196296
-46.486674071    -1.56728
-43.912106444    -1.1896
-41.337538817    1.94682
-38.76297119    0.268496
-36.188403563    7.77246
-33.613835936    -5.04572
-31.039268309    -4.57877
-28.464700682    5.39095
-25.890133055    -0.00106338
-23.315565428    4.24355
-20.740997801    -2.42964
-18.166430174    -3.30451
-15.591862547    -0.803972
-13.01729492    -8.57761
-10.442727293    -14.549
-7.868159666    -16.3109
-5.293592039    10.3016
-2.719024412    16.9872
-0.144456785    14.6043
2.430110842    2.62755
5.004678469    -0.388005
7.579246096    -10.0996
10.153813723    -3.00706
12.72838135    2.11268
15.302948977    5.23242
17.877516604    4.40532
20.452084231    -1.07435

Plotting Sky Maps with the Kapteyn Package

mapexample


This script demonstrates simple map visualization of astronomical data with the Kapteyn package.  There are other excellent modules for displaying astronomical maps, such as pywcsgrid2, which I will cover in later tutorials.
The optical fits file (of M101) and examples used in this script come from the Kapteyn package website, http://www.astro.rug.nl/software/kapteyn/
The VLA fits file can be found on NED.



First, import the pertinent python modules.


import numpy
import pyfits
from matplotlib import pyplot as plt
from kapteyn import maputils


The simplest way to view a fits file is to load the data into an array with pyfits, then plot the image with pyplot.imshow

* Note I’m only doing the bare minimum to load.  Much more sophisticated options are available.


dat=pyfits.getdata('m101.fits')

numpy arrays are indexed as [ydata,xdata] for 2D, or [zdata,ydata,xdata] for 3D…  By default, plt.imshow tries to plot from the top left, but pyfits reads from the bottom left.  So we specify that our first data points should be plotted at the bottom of the image


plot1=plt.imshow(dat,origin='lower')

plt.show()

plt.clf() #Clearing the plotter for the next example.

The Kapteyn package has a great wealth of built-in routines for dealing with wcs coordinates, defining colorbars, annotations, etc., and many standard matplotlib commands will work, too.  The basic steps for displaying with kapteyn.maputils:

  1.  Load the data into a maputils.FITSreader() object
  2.  Create an Annotatedimage()
  3.  display the data as Image() or Contours()
  4.  Add colorbar, graticule, etc…
  5.  Annotatedimage().plot()
m101dat=maputils.FITSimage('m101.fits')

* Note: We can also use an external dataset and header with data=maputils.FITSimage(externaldata=…,externalheader=…) There are options for loading certain slices for 3D cubes, e.g.: data.set_imageaxes(axnr1=1,axnr2=2,slicepos=17)


Set up the figure and subplot

fig1=plt.figure(1)
fig1.suptitle('Mapping Demo Using the Kapteyn Package',fontsize=16)

sub1=fig1.add_subplot(2,1,1)
#In a grid of 2 rows, 1 column, add the 1st subplot
plt.title('Subplot 1 - Simple Map',fontsize=12)

Here we create the plottable image with maputils.  Let’s give it the ‘jet’ colormap for fun.  Standard matplotlib.cm maps are understood

annim=m101dat.Annotatedimage(sub1,cmap='jet')
im1=annim.Image() #kwargs include interpolation commands, etc.

If we want to display the wcs coordinates, we create a Graticule() This will automatically generate coords from a valid header

grat1=annim.Graticule()

Add a colorbar.  Note the capital “C” in the call. Specifying a matplotlib Axes frame adjusts the size.

cbar1=annim.Colorbar(fontsize=8)
      #...,frame=fig1.add_axes((0.65,0.58,.02,.32))
cbar1.set_label(label='Some Units',fontsize=12)

Note that we are not using pyplot.plot here, but rather the Kapteyn built-in routine

annim.plot()

==== OVERLAYS ====

Here we will overlay optical contours on a VLA 21cm background image.  This requires regridding the second image onto the first’s wcs coordinates.

sub2=fig1.add_subplot(2,1,2) #In a grid of 2 rows, 1 column, add the 2nd subplot
plt.title('Subplot 2 - Grayscale (VLA 21cm) \nwith Heat Contours (Optical)',fontsize=12)

Load up the data for the background (optical) and overlay (VLA)

bg=maputils.FITSimage('m101_VLA.fits')
ol=maputils.FITSimage('m101.fits')

Create the re-projection object

Reprojfits=ol.reproject_to(bg)

Create the background image the same way as before, this time in grayscale

bgim=bg.Annotatedimage(sub2,cmap='gist_yarg')
bgim.Image()
grat2=bgim.Graticule()
grat2.setp_gratline(visible=False)
  #This turns off the grid lines, but keeps the ticks

Now we make an Annotatedimage for the overlay based on the bg

olim=bg.Annotatedimage(sub2,cmap='gist_heat',boxdat=Reprojfits.boxdat)

Let’s define the contour levels explicitly and create the Contours() object

contlevs=numpy.arange(3e3,15e3,2e3)
cont=olim.Contours(filled=False,levels=contlevs,alpha=0.3)

We could add a beam, a skyobject polygon, a ruler, etc. here…


Now plot both the background and overlay images

bgim.plot()
olim.plot()

Let’s add a text object to sub2 with the pyplot.subplot built-in text annotator: We will print ‘M101’, in a box anchored at x=.05,y=.95 (based on the axes of the subplot), aligned to the top left of that

sub2.text(.05,.95,'M101',ha='left',va='top',transform=sub2.transAxes,size=10,bbox=None)

Tweaking the spacing of the subplots

plt.subplots_adjust(hspace=.5)

Making the default font ‘serif’

plt.rc('font',family='serif')

Finally, view the end result

plt.savefig('mapexample.png',dpi=150)
plt.show()
plt.clf()

mapexample



==== CONDENSED COMMANDS ====

This produces the same results, just without the comments from above.

Condensed Code

#!/usr/bin/python

import numpy
import pyfits
from matplotlib import pyplot as plt
from kapteyn import maputils

dat=pyfits.getdata('m101.fits')
plot1=plt.imshow(dat,origin='lower')
plt.show()
plt.clf()

#============================================================

m101dat=maputils.FITSimage('m101.fits')
fig1=plt.figure(1)
fig1.suptitle('Mapping Demo Using the Kapteyn Package',fontsize=16)
sub1=fig1.add_subplot(2,1,1) #In a grid of 2 rows, 1 column, add the 1st subplot
plt.title('Subplot 1 - Simple Map',fontsize=12)

annim=m101dat.Annotatedimage(sub1,cmap='jet')
im1=annim.Image() #kwargs include interpolation commands, etc.
grat1=annim.Graticule()
cbar1=annim.Colorbar(fontsize=8) #...,frame=fig1.add_axes((0.65,0.58,.02,.32))
cbar1.set_label(label='Some Units',fontsize=12)

annim.plot()

sub2=fig1.add_subplot(2,1,2) #In a grid of 2 rows, 1 column, add the 2nd subplot
plt.title('Subplot 2 - Grayscale with Heat Contours',fontsize=12)

bg=maputils.FITSimage('m101.fits')
ol=maputils.FITSimage('m101.fits') #This should be a different file, but I'll use this again for simplicity
Reprojfits=ol.reproject_to(bg)

bgim=m101dat.Annotatedimage(sub2,cmap='gist_yarg')
bgim.Image()
grat2=bgim.Graticule()
grat2.setp_gratline(visible=False) #This turns off the grid lines, but keeps the ticks

olim=bg.Annotatedimage(sub2,cmap='gist_heat',boxdat=Reprojfits.boxdat)
contlevs=numpy.arange(3e3,15e3,2e3)
cont=olim.Contours(filled=False,levels=contlevs,alpha=0.3)
bgim.plot()
olim.plot()

sub2.text(.05,.95,'M101',ha='left',va='top',transform=sub2.transAxes,size=10,bbox=None)
plt.subplots_adjust(hspace=.5)
plt.rc('font',family='serif')

plt.savefig('mapexample.png')
plt.show()
plt.clf()