ALMA Imaging Tutorial: Z-CMa

Z Canis Major (or Z CMa) is a young binary system in the constellation Canis Major. The binary consists of two pre-main-sequence stars, one of the classification Herbig Be star and the second a FU Orionis object. 

A Herbig Ae/Be stars are young enough to still be embedded in their contracting natal dust and gas envelopes and to not yet be burning hydrogen. FU Orionis objects show high variation in magnitude and spectral type potentially linked to episodic accretion from their accretion disks onto the star.

Both objects in Z CMa drive outflowing jets which can be well traced by molecular line emission observable by ALMA. In this tutorial, we will use ALMA observations of Z CMa to image the gas/dust continuum emission and the 13CS and CO spectral line emission. The latter gives some insight into the outflowing material structure.

The ALMA observations we are using were conducted under project code 2018.1.01131.S on 14-March-2019 in ALMA Band 6 (using frequencies between ~218.2-232.5 GHz). The full project consists of both 12M and 7M array data, but for this tutorial, we will focus only on the 12M data.

Getting started:

As discussed earlier in this workshop we will be using CASA to image these data. You should have CASA installed and check that it starts successfully before the start of this session. You will also need the imaging script scriptForImagingZCMa.py and a CASA measurement set (MS) containing the data.

Whilst these data are fairly typical in terms of file size for an ALMA observation, we know people have differing computing resources. As such we provide two options of starting data which can be used as follows.

1) uid___A002_Xd98580_X354.ms.calibrated (untarred file size 33 GB). This is the end product of running scriptForPI.py  as discussed earlier in this workshop. Here you can work from Step 0 to the end of the tutorial. The final data products will have a size of approximately 50 GB, so make sure you have enough room for this on your computer.

2) Z_CMa_TM2.split.cal (untarred size 6.2 GB). This is the science-only data and is created in the first half of Step 1. If you start with this MS you can follow the hands-on part of this tutorial from the  symbol. If you select this product please be sure to still read Step 0 and the start of Step 1. The final data products starting with these data will have a size of approximately 13 GB.

Once you have downloaded and untarred your data, please start CASA. 

About this tutorial:

This tutorial follows the steps in the scriptForImagingZCMa.py we have provided. There is a 'switch' for each step on lines 51 to 58 of the script which needs to be set equal to True if you wish to run that step and False if not. Typically, you should run one step at a time and in the order they are listed.

Also, to allow you to get some hands-on experience a number of parameters used throughout this script are left blank and need to be set by you. These all appear under the heading 'Initial Parameters' on lines 35 to 49 of the code. Don't worry, we will guide you as to how to calculate or find the required values.


STEP 0: Obtaining and inspecting the data

If you have successfully run the scriptForPI.py on the downloaded data you will have in the /calibrated/ directory a measurement set (MS) named uid___A002_Xd98580_X354.ms. If scriptForPI.py failed to run (or you didn't have enough disk space) the full MS can be found here (untarred file size 33 GB). If you lack disk space for this MS please read the remainder of this step and Step 1, you can proceed with the hands-on elements of this tutorial partway through Step 1 at the  symbol

Our first step is to inspect what this MS contains. The CASA task listobs can be run to do this. You can run this directly in the CASA terminal as follows:

listobs('uid___A002_Xd98580_X354.ms')

This will print a lot of useful information about the observation into the CASA logger. Another way of running listobs is used in Step 0 of the tutorial script. Before running this step we need to set the following values in the 'Initial Parameters' section of the script.

myMS = 'uid___A002_Xd98580_X354.ms'
target = 'Z_CMa'

These values are used throughout the script. Next, we need to set the parameter step0 = True in the 'Logic switches' section of the code. With this set, let's have a look at the slightly different listobs command used in Step 0. 

listobs(myMs,listfile=myMS+'.listobs')

Here we use the myMS variable we set above, rather than writing out its full name each time, and we set the listobs parameter listfile to be the MS name plus the extension .listobs. This will produce a text file named uid___A002_Xd98580_X354.ms.listobs which contains the same information as was printed to the terminal in the previous listobs command.

To execute this step of the script type:

execfile('scriptForImagingZCMa.py')

into CASA. Once it has finished you can open the file uid___A002_Xd98580_X354.ms.listobs in your favorite text editor. We'll go through the file now.


STEP 1: Split out science target and science SPWs

Inspecting the .listobs made in step 0 you will see it contains a lot of data we won't need when imaging the source. For example, myMS contains scans on the calibrators used during observation and spectral windows (SPWs) which are used only for calibration steps. To give us a smaller MS to work with we will `split' out from myMS just the data relating to the target source and the science SPWs.

We can find the `science-only' spectral windows 'by-hand' using the information in the .listobs file created in Step 0. Or we can use a handy Python function named getTargetSciSPWwhich is included in the tutorial script, scriptForImagingZCMa.py. The first part of Step 1 finds the science spectral windows with this function and then splits out just those SPWs and data related to the target (the name of which we set in the previous step).

scispws = getTargetSciSPW(myMS)

split(vis=myMS,
intent = 'OBSERVE_TARGET#ON_SOURCE',
datacolumn='corrected',
field=target,
spw = scispws,
outputvis=target+'_TM2.split.cal')

Using the above commands we generate a new MS called Z_CMa_TM2.split.cal.

We can now generate a new .listobs file for this MS with the command:

listobs(target+'_TM2.split.cal',listfile=target+'_TM2.split.cal.listobs')

You will notice when inspecting the new .listobs file that the SPWs numbers from scispw have now been shifted down to 0-10.

To run this step from the script set step1 = True and all other steps to False and type:

execfile('scriptForImagingZCMa.py')

If you did not have enough disk space for the whole MS in Step 0 a copy of Z_CMa_TM2.split.cal can be downloaded here (file size 6.2 GB untarred), you can proceed with the hands-on tutorial from here.

For those starting here, make sure you have set the myMS and target variables in your script as described in Step 0 and then you can enter this command into your CASA terminal to generate your copy of the new .listobs file. 

listobs('Z_CMa_TM2.split.cal',listfile='Z_CMa_TM2.split.cal.listobs')

From this point on, everyone can follow exactly the same steps.


STEP 2: Inspect science SPWs for spectral line emission and subtract the continuum

To perform accurate science with any detected spectral lines we need to work with continuum subtracted data; conversely, to do reliable science with the continuum emission we need to use only spectral line-free channels. As such, we next inspect each SPW to see if there is a spectral line detection and to look for any obvious continuum emission within the data. 

To inspect the data we need to use the CASA task plotms. This is CASA's general-purpose plotter for the visibility data, to open plotms you just need to type plotms in the CASA terminal. Now we need to tell plotms what to plot.

First, on the "Data" tab (the first which opens when you start plotms) click the Browse button and select your newly created .split.cal MS. In the "Averaging" panel below select time and enter in the text box a LARGE value (something greater than the total on-source time for the observations, e.g 1e8 [which is about 3 years 2 months in seconds]). Also, select 'All Baselines'. These two values will average your visibilities in both time and across all array baselines giving plotms far fewer points to plot. Always be careful with plotms, it will try and plot whatever you tell it, but without averaging, the number of data points can be in the several millions to billions!

Next, we select the axes we want, in the 'Axes' tab down the left-hand side of the GUI. Set x-axis as Channel and y-axis as Amp, meaning we will plot the visibility amplitudes as a function of channel.

Finally, select the 'Page' tab down the left-hand side and set the 'Axis' value in the 'Iteration' panel to 'Spw', this will tell plotms to plot one SPW at a time, for easier inspection. Finally, click on the 'Plot' button at the bottom.

Let plotms do its thing for a minute or two (we still have a lot of data points to plot (approx 12e6)). When it loads you can page through each SPW using the green arrows along the bottom of plotms.

Selection of line-free channels

Whilst plotms is loading, Figure 1 gives an example of the 'line free channels' which we wish to make note of for both imaging the continuum (Step 3) and continuum subtraction and line imagine (Step 4).

Fig1: Synthetic spectrum plotted as Channel vs Flux Density. The line-free channel ranges are noted by the black lines and channel values.

For this simple case, the line-free channels are 0 to 90, 150 to 280, 317 to 370, and 430 to 510. In CASA syntax this would be written as SPW No.:0~90;150~280;317~370;430~510 (where 'SPW No.' would be replaced by the SPW number and remember these are not the values to use with our data).

Once plotms has loaded all the plots for our data set, you should see something like these example plots (one for each SPW if you use the green arrows to move back and forth between them):

 

It seems that there is a good level of continuum in this source (all SPWs have visibility amplitudes offset from 0), and there are two clear line detections in SPWs 8 and 9.

SPW 8 is CO(2-1) at 230.538 GHz (shifted to the source velocity of ~14 km/s)
SPW 9 is 13CS at ~231.2206 GHz

In terms of channels, the CO line appears in SPW 8 from channel shortly after channel 0 to around 250. As such, and to add a little buffer we can say the line-free channels in SPW are from 300 to 959 (there are 960 channels in this SPW and CASA starts counting at 0). In CASA syntax this channel range can be written as  8:300~959.

Similarly, for the much narrower line in SPW 9, we see the emission between channels 90 and 110. This gives us line-free channels (again with a bit of a buffer) of 9:0~85;115~959. Notice here we have defined two line-free channel ranges on either side of the line and separated them by a semi-colon.

All the other SPWs are empty of spectral line emission, so we can use all available channels when continuum imaging. Given this, we can write a single string that defines all the line-free channels across all the SPWs in the data set. In CASA syntax you separate SPWs in such a string with a comma. If you want to include all channels from a given SPW you just need to provide its SPW number. So for Z_CMa, we can define a variable called contSPW which lists all the channels, in all SPWs, which are spectral line free as:

contSPW = '0,1,2,3,4,5,6,7,8:300~959,9:0~85;115~959,10'

Remember to fill this in the 'Initial Parameters' section of your copy of ScriptForImagingZCMa.py.

Subtracting the continuum

In Step 4 we will image the lines we find in spw 8 and 9. Doing it accurately requires us to remove any continuum emission from our data. This is done using the CASA task uvcontsub, the specific command is given in Step 2 of ScriptForImagingZCMa.py and looks like this:

uvcontsub(vis=target+'_TM2.split.cal',
               fitspw = contSPW)

This task will fit a first-order polynomial to visibility data in each SPW using only the channels we defined in our contSPW variable. It then subtracts from the visibilities the generated fit to remove the continuum emission and creates a new MS called target+'_TM2.split.cal.contsub'. This is the MS that we will use for spectral line imaging later (Step 4).

To run this step from the script set step2 = True and all other steps to False and type:

execfile('scriptForImagingZCMa.py')

STEP 3: Image continuum

The first step in any imaging is to calculate the required imaging parameters cellimsize and threshold. The first two define the size of the pixels in the image we are going to make and the total size in pixels of that image, respectively. The third helps to constrain the tclean algorithm so that we don't overclean our data. To calculate these values we need to know the following properties of our data: The field of view (FoV), the expected resolution, the number of antennas used during the observation, the total time on-source, the frequency bandwidth, and the system temperature during our observations.

Field of View and Resolution

The observed FoV is approximately the same as the primary beam size of a single antenna in our array at our observing frequency. For this, we need to know the diameter of our ALMA antennas. For these data, we have 12m ALMA dishes and we can use the average frequency value of the 11 SPWs we are going to image simultaneously (225.261 GHz or converted to wavelength 1.33 mm). Thus, our FoV ~ wavelength/12.0 ~ 23.0 arcsec.

Similarly, we need the expected resolution, for which we need to know the maximum baseline of our observation. To do this we can use a task from the Analysis Utilities package called getBaselineLengths() as follows:  

    blines = aU.getBaselineLengths(target+'_TM2.split.cal')

This function returns the maximum baseline length, b_max, of 360.56 m. Our resolution (res) is then , res ~ wavelength/b_max ~ 0.76 arcsec. To ensure we sample our data well over the beam we want to have more than 3 pixels across the beam. We set the pixel size using the cell parameter in tclean. Typically, ALMA tries to use between 5 and 6 pixels across a beam. Set the cell value in your copy of the ScriptForImagingZCMa.py script.

To get our imsize in pixels we divide or FoV by cell size = 23.0/cell, we can then set imsize to approx twice this. Note that CASA has some 'preferred' image sizes (even values and factorizable by 2, 3, 5 only), which make tclean work more quickly. Please, select the closest value to your calculated imsize.

We can now fill in the myCell and myImsize values in the 'Initial Parameters' section of the script, to be used by the tclean command at step 3.

Estimating a cleaning threshold

Interferometric imaging works best when the tclean task is given some sensible constraints. Primarily these take the form of tclean mask and a threshold on the amount of cleaning which is undertaken. We will set the former interactively during the tclean, but the latter can be set beforehand.

Two parameters can limit the amount of cleaning which gets done, they are niter and threshold. niter sets the number of iterations of clean which can be done, so making this a small number means only a small amount of cleaning is done. Whilst this method has its uses, it is hard to know a priori how many clean cycles are enough. The second limiter on the amount of cleaning undertaken is the parameter threshold. This sets an absolute noise value which when reached within your clean mask the algorithm will stop. Combining threshold with a reasonable niter value (several thousand) means tclean will stop either when the noise limit is reached or some (but not too many) iterations of clean have occurred.

So, how do we calculate a sensible noise threshold? A reasonable first estimate is a few times the theoretical RMS noise of your observation. We can calculate this from the available data in our listobs file (plus one other value). From the listobs we need the number of antennas used (Nants), the time on source (Δt), and the total bandwidth (Δν).

Nants  can be read straight from the listobs file itself. The 'timerange' of the on-source scan is also given in listobs, so we can trivially convert that to a time on-source (in seconds). Finally, to calculate our total bandwidth we need to take the number of channels we are using in each SPW (the contSPW selection in Step 2) and multiply this by the channel width (ChanWid in listobs) in that SPW. The total bandwidth is then the sum of these values for each SPW. We need this total bandwidth in units Hz.

The final value required to calculate a theoretical RMS is an approximate value of the system temperature (Tsys). This is not as easy to find, and the easiest approach is to look in the weblog (discussed in the tutorial Examining Downloaded Data) which normally be downloaded with your data when you take it from the ALMA Archive. In the case of these data, we want to look at the plots in Step 6 of the weblog and estimate the average Tsys across all antenna and SPW.

Given all of these values the theoretical RMS noise can be calculated as:

Equation 1: Theoretical sensitivity of an interferometric image.

where, in addition to the variable discussed above,  is the Boltzmann constant and Aeff the effective collecting area of a single dish in your array (area of the dish times ~0.7 to account for various efficiency parameters). N is the number of antennas used Nants.

We can then set the threshold to 3 or 5 (or more) times this value as the parameter myThreshold in scriptForImagingZCMa.py.

With  myCellmyImsize and myThreshold set we can start running tclean on our data. To do this set step3 = True (all others to False) in the script and type the following command into your CASA terminal:

execfile('scriptForImagingZCMa.py')

tclean will then start running interactively and you can proceed through the cleaning. Whilst this step is running please take a look at the tclean command used to generate the continuum image, the tutor will discuss the choice of parameters that have been set for you.


STEP 4: Spectral line imaging

We will then proceed to image both the CO and the 13CS data cubes. We refer to the products as cubes as they have three dimensions, Right Ascension, Declination and Velocity/Frequency.

Before running this step there are some parameters you need to fill in within the script. First, we need to redefine the threshold for cleaning, one for each spectral line giving us myThreshCO and myThreshCS. Now you will note from Equation 1. that the only thing which changes between our previous calculation of a noise threshold and these is parameters is the bandwidth. For spectral cubes we want Δν to be the width of a single channel. Given this, we can simply scale our previous threshold by sqrt(Δνcont/Δνch), where Δνcont is the continuum bandwidth and Δνch is the channel width. The channel width for the CO and 13CS SPWs can be found in the .listobs file.

The last thing to set are the start channel and number of channels (parameters start and nchan in tclean), we define these in the 'Initial Parameters' section of the script as startCO, nChanCO, startCS, and nChanCS, for CO and CS respectively. The value startCO tells tclean  which channel to start making the CO cube from and nChanCO how many channels after the start channel to image (and the same for the 13CS values). We can return to plotms as we did in Step 2 to confirm the channel ranges for the two lines.

With the thresholds, starts, and nchans set for both species you can then run Step 4 by setting step4 = True (all others to False) and typing the execfile command in the CASA terminal. This time we are using CASA's automasking capabilities to draw the clean boxes for us. The tutor will run through how this works and the differences in the tclean command for spectral cube imaging as your cubes are being made.


STEP 5: Advanced imaging steps

The final step in this tutorial is to create some "Advanced imaging" products. Specifically, zeroth, first, and second order moment maps. The definitions for each of these moments (and several more) can be found here, but simply they translate to the integrated intensity of your line (zeroth), the velocity gradient (first), and the velocity dispersion (second). 

In this step, we need to select just the channels that have line emission in the CO cube (including in the moment calculation channels with only continuum emission will make the images noisier). To do this we can open up the CO cube we made in the previous step and inspect both the cube and the spectrum of the line. You can use either CASA's viewer or CARTA to do this.

The parameter you need to set for this tutorial is myLineChans. Using the spectrum of CO you can define the channel numbers to make the image moments over. Here we are talking about channels in the cube, not the channels from the CO SPW.  Then we use the immoments task in Step 5 to create these moment maps. The chans parameter of immoments uses the same syntax as the spw parameter we have seen before to define channel ranges. So to take the moment(s) for channel X to Y then we need to write 'X~Y'.

When you have set  myLineChans you can run Step 4 by setting step5 = True (all others to False) and typing the execfile command in the CASA terminal.