Monday, February 28, 2011

Diagnostic plots, re-plotted with vertical lines:

Here are the zoomed plots after subtracting the FeII template and continuum. The dotted red vertical lines are the approximate boundaries we use to calculate the sigma. Unfortunately, I didn't record the information about these boundaries: I kept it for the CIV technique but not for MgII. The lines shown are approximately based on what I assumed in the measurements.

============ Notation:

panel top-left: cyan is the raw spectrum, black is the reconstructed spectrum; both are pseudo-continuum subtracted

panel top-right: black is the spectrum after PCA, red is the FeII+continuum, green is the continuum.

panel top (the big one): spectrum after PCA

panel low-right: same as top-right panel but before PCA

panel low (the big one): spectrum before PCA

=========================

vari=0.79, MBH_ab=1.01E9

image

vari=0.72, MBH_ab=1.47E9

image

In this case, if the line doesn't cross zero, we eventually stop where the profile is approximately horizontal on average.

vari=0.72, MBH_ab=1.7E9

image

===========

This is a case where the masses after and before PCA are very close.

vari=0.06, MBH_ab=9.21E8

image

Saturday, February 26, 2011

Comparing spectra and fitting before and after PCA applied:

notation in all plots:

single plot: black is the raw spectrum, red is the PCA-reconstructed spectrum, M_ab is the BH mass after PCA, and vari is the relative mass difference = (M[ab]  -  M[bb]) / M[ab], where M[bb] is the mass before PCA. These single panels show how well the PCA-reconstructed spectra fit the raw spectra.

double plot: the upper panel is after PCA, the lower panel is before PCA. Red is the pseudo-continuum, green is the continuum and black is the spectrum. These plots show how good the mass estimates are in both cases, before and after PCA; however, in calculating the parameter "vari" I assume that the PCA mass is accurate.
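The relation between "vari" and the raw-to-PCA mass ratio quoted in the examples can be checked with a short Python sketch. The function names and the two masses below are hypothetical, chosen only to illustrate the arithmetic; they are not taken from the measurement programs.

```python
def relative_mass_difference(m_ab, m_bb):
    """vari = (M_ab - M_bb) / M_ab, taking the after-PCA mass M_ab as the reference."""
    if m_ab <= 0:
        raise ValueError("M_ab must be positive")
    return (m_ab - m_bb) / m_ab


def raw_to_pca_ratio(vari):
    """M_bb / M_ab implied by a given vari, i.e. 1 - vari."""
    return 1.0 - vari


# Hypothetical masses in solar masses (illustration only, not measured values).
m_ab = 1.0e9   # after PCA
m_bb = 2.1e8   # before PCA
vari = relative_mass_difference(m_ab, m_bb)
ratio = raw_to_pca_ratio(vari)
print(f"vari = {vari:.2f}, raw/PCA mass ratio = {ratio:.2f}")
```

So a vari of 0.79 corresponds to a raw mass that is about 21% of the PCA mass, which is how the percentages in the examples are obtained.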

1)  vari = 0.79 means raw mass is about 21% of the PCA-mass

image

image

below is the same as above but zoomed in around 2800 Angstrom.

image

2) vari = 0.72 means the raw mass is about 28% of the PCA-mass.

image

image

same as above but zoomed in

image 

3) vari = 0.78 means the raw mass is 22% of the PCA-mass

image

image

same as above but zoomed in

image  

=============================

4) a good example of an object for which PCA doesn't improve the mass estimate. vari = 0.06 means the raw mass is about 94% of the PCA-mass

image

image

same as above but zoomed in

image

Thursday, February 24, 2011

Spiky spectra vs PCA reconstructed:

Here are some examples of spiky spectra with very low BH mass estimates; after we reconstruct the spectra, the BH masses fall in the normal range. In these examples the relative difference between the BH mass after and before PCA is more than 70%.

eta=(M_BH(ab)-M_BH(bb))/M_BH(ab) > 0.7  assuming that M_BH(ab) is correct.

there are cases where eta is larger than 0.85 or 0.9, but there the raw data have significant problems (missing data around 2800 Angstrom, or pure noise)

black curve is the raw data

red curve is the reconstructed spectrum

the variable "vari" in the left panel is the relative difference between the masses.

image

image

  image

 

and these are examples of the case where there is very little difference between before and after PCA: the relative differences between the BH masses are smaller than 10%

image

image

image

Thursday, February 17, 2011

Catalogue paper step 2:

I wanted to understand a little more about differences between Shen et al. 2008 and Shen et al. 2010 results. Here is a plot showing mass-mass contour plots for redshift bins of 0.5 for all 3 emission lines.

image

well, there is something more to consider before interpreting these plots, and that is the distribution of objects in redshift.

image

This means there are more objects in 0.5 < z < 2, but that should not change the distributions of the mass differences. So the mass-mass plots may be interpreted in two ways:

1)  Assume that the new FWHM (2010) is more accurate. Since the new and old CIV measurements are very close, that means CIV is less sensitive to the number of Gaussians in the fit, so the multiple-Gaussian fit only makes a difference for some of the lines.

2)  Without that assumption, the high luminosity in the high-redshift bins dominates the mass estimate. Thus the luminosity calibration factor, not the FWHM measurement, determines the mass, which means they need to check the calibration factor. As you mentioned somewhere, whether 0.6 or 0.5 is used then matters here and is indeed very significant.

these can be added to Catalogue paper section 8.1.

using only MgII range for redshifts then we have:

image

It looks like a semi-random distribution around a mean value for the BH mass. This mean value increases from 8.5 in the first redshift bin to 9.5 in the last redshift bin. It's as if the mean of their mass measurements has not changed with the change of technique, but the error has changed!

TODO: I need to plot the FWHM(2010) vs FWHM(2008) to see differences too.

Wednesday, February 16, 2011

Catalogue paper Step 1:

126 duplicated objects are out now.

the catalogue list has 27602 lines now, with a flag for completeness at the end (created by Pat).

The caption for Fig 4 is rewritten. it is more clear now.

I have compared the Shen et al. 2010 with Shen et al. 2008. The question is, should I add this into the Catalogue paper or not?

for example, here are the FWHMs:

image

The FWHM plot is very scattered. The good news is that some of the objects that had no mass estimates in 2008 do have estimates in 2010. However, some high-redshift objects in the CIV plot had estimates in 2008 but have zero FWHM in 2010, which is surprising.

In general, though, the 2010 FWHMs are lower than the 2008 estimates. That explains why the mass-mass plot shows a trend towards lower BH mass estimates. The Lbol vs Lbol plot shows some scatter but it is moderately symmetrical, possibly due to random noise.

so the question is, is it worth showing this mass-mass plot?

image

Tuesday, February 8, 2011

MgII mass estimates; and instrumental sigma_line

Pat has provided me 4 files:

  • tab0tab1tab2pjh1perspec.idmpf  ---> (2 columns) has the SDSSJ name and MJD-Plate-Fiber (118451 lines); some SDSSJ names appear more than once.
  • tab0tab1tab2pjh1perspec.res_canis  ---->  has the instrumental line dispersion for Hbeta, MgII and CIV as well as the complete path of the fits files (118451 lines)
  • tab0tab1tab2pjh.idmpfs  ---> (up to 10 columns) has the SDSSJ name and MJD-Plate-Fiber (107194 lines); sometimes there is more than one set of MJD-Plate-Fiber for an SDSSJ name.
  • tab0tab1uniq.basics ---> these are the unique objects (1409 lines)

data format:

    #1  2  3    4   5      6    7     8     9   10  11
    #id ra dec zHW zHWerr mjd plate fiber FIRST EBV Mi

    so sigma_intrinsic [in units of km/s] = sqrt(sigma_line**2 - (1+z)**2 * sigma_instrumental**2) * speed_light / 2798

    sigma_line and sigma_instrumental are in units of Angstrom.
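A minimal Python sketch of the correction above, written exactly as the formula is stated in this note. The function name, the constant names and the choice to return NaN when the instrumental term exceeds the measured width are my assumptions, not part of the existing SM programs.

```python
import math

SPEED_OF_LIGHT_KMS = 299792.458  # km/s
MGII_WAVELENGTH = 2798.0         # Angstrom, the value used in the formula above


def sigma_intrinsic_kms(sigma_line, sigma_instrumental, z):
    """Quadrature-subtract the (1+z)-scaled instrumental width, then convert to km/s.

    sigma_line and sigma_instrumental are in Angstrom, z is the redshift.
    """
    variance = sigma_line**2 - (1.0 + z)**2 * sigma_instrumental**2
    if variance < 0:
        # Assumption: flag lines narrower than the instrumental term
        # instead of raising an error.
        return float("nan")
    return math.sqrt(variance) * SPEED_OF_LIGHT_KMS / MGII_WAVELENGTH


# Hypothetical input values, for illustration only.
print(sigma_intrinsic_kms(sigma_line=20.0, sigma_instrumental=1.0, z=1.0))
```

Returning NaN keeps a batch run over the whole catalogue going, and the unresolved lines can be counted afterwards.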

     

Monday, February 7, 2011

New added spectra to DR7

Pat has provided me with some new objects and spectra to reconstruct.

There are 3354 new objects in the file "l_febEV_rpt.dr7qzHW" with the same format as DR7 catalogue. They are additional repeat spectra of objects in the DR7 quasar catalog. I have started applying PCA on these objects using "min3_qev_2.sm".

There are 1514 spectra of 1409 objects in file "tab0tab1.basics" with new format

#1  2  3    4   5      6    7     8     9   10  11
#id ra dec zHW zHWerr mjd plate fiber FIRST EBV Mi

So I have prepared "min3_qev_3.sm"; I will run it when the previous list is done. (DONE)

I have also prepared the "min3_qev_4.sm" which uses "dup_dr7qANDhw10.dr7qzHWy.txt" including 7987 duplicated spectra with the same format as DR7 original catalogue. I will reconstruct that too later. (DONE)

all of the reconstructed spectra (118640 spectra) are copied into this directory at Canis:

"/data/arafiee/DR7_ReCon_dered"

the list-files are:

"qso_report_1514PHW_tab0_dup.txt" -->  "tab0tab1.basics"

"qso_report_3354PHW.txt" --> "l_febEV_rpt.dr7qzHW"

"qso_report_7987P_dup.txt" ---> "dup_dr7qANDhw10.dr7qzHWy.txt"

Thursday, February 3, 2011

MgII Mass estimator:

The sigma_line estimator and the L_3000 estimator are partially ready. Some tests must be done to be sure that everything is clear and well defined.

located at canis: /home/arafiee/work/DR7/SMBH_MgII.sm

uses the MgII_fit.sm and fitspectrum.sm

However, there is a problem here. I am trying to estimate the sigma_instrumental at the MgII wavelength, but I can't extract the wavelength dispersion d from the fits file. I should ask Pat about this. I need to know which KEY I should use.

When I use "specplot" in IRAF/noao/onedspec, I get a plot with flux, error and mask.

 

 

 

NOTES: my source for information on SDSS HDUs: Princeton/MIT SDSS Spectroscopy Home Page

http://spectro.princeton.edu/#dm_spplate

Wednesday, February 2, 2011

126 missing objects problem, resolved!

These objects are duplicated spectra (same RAOBJ and DECOBJ) so they can be removed from our MgII mass catalogue.
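A check of this kind can be sketched in Python by grouping catalogue entries on their coordinates. This is only an illustration: the record layout, the ids and the rounding to 5 decimal places are assumptions, and the real check compared the RAOBJ and DECOBJ values from the spectra headers.

```python
from collections import defaultdict


def find_duplicate_positions(records, ndigits=5):
    """Group records by rounded (RA, Dec); return only groups with more than one entry."""
    groups = defaultdict(list)
    for rec in records:
        key = (round(rec["ra"], ndigits), round(rec["dec"], ndigits))
        groups[key].append(rec["id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical records; the ids and coordinates are made up for illustration.
records = [
    {"id": "spec_a", "ra": 158.72313, "dec": 24.22578},
    {"id": "spec_b", "ra": 158.72313, "dec": 24.22578},
    {"id": "spec_c", "ra": 10.00000, "dec": -1.50000},
]
duplicates = find_duplicate_positions(records)
for position, ids in duplicates.items():
    print(position, ids)
```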

Tuesday, February 1, 2011

126 objects in Catalogue paper!

There were 126 objects in our catalogue whose names were identical. I have tracked the problem back to where I cross-listed my catalogue with DR3 to record their SDSS names.

I have cross-listed them again, my list against the DR3 list.

There is something that I don't understand. There are some objects in my list that are not in the DR3 list! Maybe the DR3 list that I use is an old version. It means the qso files exist in the SPECTRA directory on ara but they are not listed in the DR3 catalogue, and I don't know why yet.

As I thought, these 126 objects (in my list but not in the DR3 list) are exactly the same objects that Pat found to be problematic. I checked a few of them. They are fine, but for some reason they are not listed in the DR3 catalogue list that I have.

here are a few examples:

image

image

location:  Canis@/data/home/arafiee/work/Catalogue_new_data_match

Friday, January 28, 2011

Fig 2 of SEB paper for two cases Shen08 and Shen10:

here is the Fig 2 using Shen 2008 (Just redoing the Fig 2 in SEB paper for reference)

image

and then using Shen 2010 and exactly the same format as Fig 2:

image

Well, the SEB is NOT completely gone (the orientation of the distribution has not changed), but since the Shen 2010 Lbol is scattered with respect to Shen 2008, the scatter in the M-L plane is larger this time.

There are also a few more objects in this sample, since they are not yet cross-listed with the R&H sample, but that doesn't change anything significantly.

Using sigma_line creates a counter-clockwise rotation with respect to Fig 2 (new/old), which looks different in nature.

Thursday, January 27, 2011

Shen et al. 2010 DR7 mass estimates vs 2008 DR3

at last I could read the fits table in Matlab. It seems that it works well.

The next step is to read the 2008 data and cross-list them with 2010. Here is a plot of log BH masses from Shen 2008 (76990 objects) cross-listed with Shen 2010

image

image

I was looking for a possible clockwise rotation in the above plot. A clockwise rotation could be interpreted as the new mass estimates, which use two Gaussians to fit the emission lines, having resolved the difference between sigma_line and FWHM.

There is no such rotation in this plot, which means that using one Gaussian or two Gaussians doesn't change anything. The main source of the difference is the non-Gaussianity of the lines.

Just to be sure that I am comparing the same objects from both data samples, I have plotted some other quantities: redshift, Lbol and finally MJD, Plate, Fiber. So there is no problem with the cross-listing.

image 

image

Surprisingly their measured Lbol is different now.

 

image

 

 

TO DO NEXT:

For the catalogue paper, I need an object with a low S/N ratio to show the effect of the PCA, that is, a case where we estimate a larger sigma before PCA and a smaller sigma after PCA.

Tuesday, January 25, 2011

SEB Paper: referee's point:

Peterson has asked "There is, however, one important point where I remain confused (suggesting that other readers might be similarly confused) and perhaps the authors can clear this up. It is not perfectly clear to me where the breakdown in the Shen et al masses actually occurs: It is not clear whether the major contributing factor is (1) use of FWHM rather than line dispersion, (2) the assumption that the line profiles are reasonably characterized by a Gaussian fit, or (3) the combination of the two."

Okay, I guess I know how to investigate this. We have Shen et al. 2004 using a one-Gaussian fit and Shen et al. 2010 using a fit with two or more Gaussians. Comparing these two will clarify the situation.

1) if they are mostly the same, then the use of FWHM is the main source of the difference

2) if they are not the same, and the new estimates are indeed closer to what the line dispersion gives, then the main source is the number of Gaussians used in the approximation.

Now I am going to investigate this.

Sunday, January 23, 2011

218 spectra:

Dereddening has been applied to the 218 objects.

The program is now adjusted to use the dereddened spectra and is working well so far.

Dereddened DR7 spectra:

Pat has applied the dereddening on DR7 except the 218 objects added by me.

I have changed the permissions of the objects in plates:

2352, 0733,0734,0736,0737,0739

Here is what Pat has done. I will follow him:

" the best way to proceed is to rerun the reconstruction
on dereddened spectra.  Now, earlier I mentioned that the extinction curve
that's probably best is that of Fitzpatrick, E. L., 1999, PASP 111, 63.
But Yip et al used O'Donnell and CCM; I don't know the O'Donnell reference
but CCM is Cardelli, Clayton, and Mathis, ApJ 345:245, 1989.

The difference between extinction laws is small in the optical, and the
extinction corrections are small at the high Galactic latitudes of the SDSS.
Small differences to small corrections are not a big deal, so I think the
best thing to do about dereddening is the quickest thing.  That's to use
the IRAF task "deredden" which has the CCM law built-in.  I've run it on
all the spectra on canis that I can, so there are now dereddened versions
of almost all the spectra ready to go, including the duplicate spectra.
(It turns out that all the duplicate spectra of DR7 quasars were already
on canis.)

The only objects which I wasn't able to deredden are the 218 objects whose
spectra you added to canis, because only you have permission to read those
spectra.  To deredden those objects yourself, start IRAF, type "noao" and
"onedspec" to load packages, cd /data/phall/WILD/SKYSUB/ and type:
        cl < finishDR7dered.cl
...and then change the permissions on the original and dereddened spectra
so that everyone has read and execute permission on them.

The dereddened spectra are in the same directories, with "DR" appended
to their names.  That is, you should reconstruct
2152/spSpec-53874-2152-162_skysubDR.fit
instead of
2152/spSpec-53874-2152-162_skysub.fit"

Friday, January 21, 2011

NATS1740 marks correction day.

There were around 20 students with some sort of problem in their marks. I sat in my office to correct their marks or re-evaluate the assignments. It was a busy day. I have Jan 24th for some more students to come, and this course will be over after that.

In between, I had time to look at some of the reconstructed spectra. I can definitely tell that the eigenspectra have already been corrected for reddening.

I will take care of the duplicated spectra soon. Maybe I'll wait until I know what to do about dereddening. It will take 10 days to repeat the reconstruction for all objects (we have all the objects, and the program doesn't need me to check on it), so if I have to deredden the spectra before reconstructing them, that is okay; I will repeat the reconstruction now.

-----

This is what Yip et al. have applied:

assume standard Rv=3.1

default: O'Donnell's extinction curve in range [3030,9091] angstrom
          outside this range, use CCM (1989)

The foreground dust reddening was removed from the QSO spectra using
Schlegel et al. Milky Way dust map. We did not save the dereddened
spectra. The intrinsic reddening was not treated, which in turn would
show up in the PCA mode(s).

---

DR7 quasar spectra reconstructed

 

at last all 105,785 spectra are reconstructed.

Thursday, January 20, 2011

Missing more objects:

Plates 0736 and 0737 are empty too. Now I am wondering if there is something important behind these missing objects!

Okay, I have also added 40 objects to the Canis 0737 directory.

Plate 0739 is empty too. There are 43 objects in the list. These are now added to Canis as well.

I hope I am not destroying Pat's data on Canis by adding these data to them. 

Wednesday, January 19, 2011

New problems with the sample:

apparently there are some more objects missing from plate 0733

So far I have added a few more objects to Canis, but the program keeps stopping now.

The question is, why are there so many more objects in the online directory for plate 0733 than in the Canis directory 0733? Were they excluded from the directory for some reason, or were they just left out?

Okay, I have added all 50 objects that were missing from the list to the Canis directory.

there are 44 objects missing from plate 0734!

okay. they are added to Canis now.

Tuesday, January 18, 2011

PCA is working now:

Started the program on Friday night.

It stopped for an unknown reason on Monday night on object 39578, with the following SDSS_ID and plate/mjd/fiber (in catalogue list  pbh_dr7ANDhw10_dr7qzHW)

103453.55+241332.8     2352/53770/217

doing a little investigation: apparently there is no such file

spSpec-53770-2352-217_skysub.fit

in the /plate directory, or anywhere else in the SKYSUB directory!

continuing after that object (eliminated from the list for now!)

going back to DR3 mass-catalogue for now
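To avoid the program stopping on a missing file again, the list could be checked before the run starts. Below is a Python sketch, not part of the SM programs: the function names are hypothetical, and the directory layout (one zero-padded plate directory per plate under a SKYSUB root) and the file name pattern are assumptions inferred from the file names quoted in these notes.

```python
from pathlib import Path


def skysub_path(root, plate, mjd, fiber, dereddened=False):
    """Build <root>/<plate>/spSpec-<mjd>-<plate>-<fiber>_skysub[DR].fit (assumed layout)."""
    suffix = "_skysubDR.fit" if dereddened else "_skysub.fit"
    name = f"spSpec-{mjd:05d}-{plate:04d}-{fiber:03d}{suffix}"
    return Path(root) / f"{plate:04d}" / name


def missing_spectra(root, entries):
    """Return the (plate, mjd, fiber) entries whose spectrum file does not exist."""
    return [entry for entry in entries if not skysub_path(root, *entry).is_file()]


# The object the run stopped on, in plate/mjd/fiber order as listed above.
entries = [(2352, 53770, 217)]
# "SKYSUB" is a placeholder root; replace it with the real directory on Canis.
for plate, mjd, fiber in missing_spectra("SKYSUB", entries):
    print("missing:", skysub_path("SKYSUB", plate, mjd, fiber))
```

Running this over the full catalogue list would report every missing spectrum at once, instead of one per crash.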

Thursday, January 13, 2011

PCA program is ready to run:

 

I use "pbh_dr7qANDhw10_dr7qzHW" catalogue. I read M_i and redshift (HW) and plate mjd and fiber and SDSS_J2000_ID from this file.

the program min3_qev.sm is adjusted to record  "Mibin" and "zedbin" as well as all 50 eigenspectra in order from 0 to 40 in vector "qsoEigCo" and the reduced-chi2.

I also record the reconstructed file. I will use this to calculate the MgII-mass later.

Wednesday, January 12, 2011

Problems I need to think about:

1) there are about 134,824 objects in the DR7 directory on Canis; however, there are fewer than 100,000 in the catalogue list "x.dat090616". I need the M_i for the rest of them.

so either I have to use the small sample, or, if I have to use the big sample, I need to calculate the M_i the same way it was calculated for the DR7 catalogue.

Pat says: leave the duplicated spectra for now. So we use the pbh_dr7ANDhw10_dr7qzHW file from http://ara.phys.yorku.ca

Tuesday, January 11, 2011

Work on Catalogue Paper:

 

The referee suggested changing figure 3, so I have changed the color of fig3.a (top) from cyan to black and the middle panel to blue, and removed the zero reference line from fig3.c (bottom)

(I use program in line  434 of "ref_fit_calibration.m"  located at  C:\Users\alireza\Work\Work In Progress\Research\P18 Re_fit_calibration. I used "AftbefPCA_sp52355_0574_016.eps" as fig 3)

image

I have answered one of the referee's points as:

referee: >Similarly, since another component of this work is the advocacy of
>using the line dispersion over the FWHM, it would also be interesting
>to see if sigma_line(MgII) tracks sigma_line(Hbeta) better than
>FWHM(MgII) tracks FWHM(Hb) for the RM sample.

(Pat) TODO: make figure for inclusion and/or description in paper (these plots didn't help our argument)

(Ali): I should add this to the answer to the referee:
We have submitted another paper to MNRAS to address this point (the advocacy of using the line dispersion over the FWHM) and to distinguish potential biases caused by using FWHM(MgII). Since the subject is broader than just one paragraph or a section, we prefer not to address it here.

------------------

these are reminders for tomorrow:

TODO:
Figure out how many of 126 partial-duplication quasars should be
added back to the catalog and add them back with correct DR3c flags.
No need to mention the issue to the referee.

 

 

TODO:
[As for demonstrating, you could plot a few objects where a noise spike
was fit in the bb case but the full line was recovered in the ab case.]

(Ali) I can add this in addition to what I have in the top of this page.

image

 

------------------

Monday, January 10, 2011

Adjusting the PCA output:

 

I use min_qev.sm, which in turn uses qev.sm, which in turn uses qfiter with N=50 eigenspectra.

inside "min_qev.sm"

1) the path_file address should be adjusted to the new address for eigenspectra source in Canis. (done)

2) the redshift and the i-band absolute magnitude are required here as input to the qfiter program (I got the DR7 list, which includes the redshift and iMag; done)

3) out_name is created here too (done)

inside "qev.sm"

1) if input is fits format then we use qfiter(I prefer this)

2) if input is ascii file then we use qfiterT (it is in qev2.sm)

---------

what I have done:

1) I have a copy of the ES (the eigenspectra files) in canis now

/data/arafiee/ES

2) I have adjusted the qev.sm and min1_qev.sm on canis at /data/home/arafiee/work/DR7EVR

to record the coefficients (qsoEigCo) as well as the variance of the coefficients (qsoEigVar) and the reduced chi-squared (outchinu2) in a file named qsoEigCo_MJD_Plate_Fiber

the reconstructed spectrum is recorded in a file   ######qsoEigRecon_MJD_Plate_Fiber

3) the ES directory contains only up to 50 eigenspectra. I need up to 100 (since we want to post the coefficients, it is better if we go to a higher number)

---------

Friday, December 24, 2010

DR7

  1. We need redshift, magnitude, absolute magnitude, and the confidence on redshift
  2. We need to cross-correlate the template to recalculate the redshift (Maybe, unnecessary).
  3. I need DR7 fits file on Canis to test the program output and to estimate the runtime.

Postdoctoral Project Proposal

Summary:

Step 1:

1) Using the SDSS DR7 quasar spectra

    • waiting for Pat to give me the address to DR7 fits file on Canis

2) Using PCA technique

    • I have copied the 50 eigenspectra files from ara to canis (I took the "ES" directory from ara/work/phall) (I hope you don't mind, Pat, that I went into your directory! I got the information from "qev.sm")

3) Reconstruct the entire DR7 with PCA

    • I am in the middle of adjusting my program to record the weight factor of each eigenspectrum as well as the reduced_chi2
    • there is only one problem: the ES directory includes only 50 eigenspectra. I was thinking of using 100, and the reasons are (1) 100 eigenspectra are recommended by Yip et al. (2) we want to create a ... catalogue.

4) Create a publicly available catalogue of reconstructed spectra

    • we need to decide where is the best place to post the catalogue

5) Publish the results

Step 2:

1) Using reconstructed spectra

    • I need to keep the reconstructed spectra in canis

2) Using line dispersion of MgII black hole mass technique to estimate the BH mass of DR7 quasars

    • I need to adjust my program to work with the entire sample at once

3) Publish the results

Discussion:

Rafiee & Hall 2010a, 2010b have shown that using principal component analysis (PCA) to reconstruct SDSS quasar spectra, in order to increase their signal-to-noise ratio, improves the black hole mass measurements. We use the 100 most significant eigenspectra generated by Yip et al. (2004) to reconstruct the quasar spectra. The weight factor and the corresponding eigenspectrum index number will be recorded in a catalogue. The catalogue will be publicly available along with the eigenspectra, so that researchers can later reproduce the quasar spectra with any number and order of eigenspectra.
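How a catalogue user could rebuild a spectrum from the recorded weight factors can be sketched in Python as below. This is a toy illustration, not the qev.sm/qfiter procedure: it assumes the eigenspectra are orthonormal and sampled on the same wavelength grid as the spectrum, and it ignores masked pixels and flux errors. The two 4-pixel basis vectors and the flux values are made up.

```python
def project(flux, eigenspectra):
    """Weight of each eigenspectrum = its dot product with the spectrum (orthonormal basis assumed)."""
    return [sum(f * e for f, e in zip(flux, es)) for es in eigenspectra]


def reconstruct(weights, eigenspectra, n_modes=None):
    """Sum the first n_modes eigenspectra, each scaled by its weight."""
    if n_modes is None:
        n_modes = len(weights)
    recon = [0.0] * len(eigenspectra[0])
    for weight, es in zip(weights[:n_modes], eigenspectra[:n_modes]):
        for i, value in enumerate(es):
            recon[i] += weight * value
    return recon


# Toy orthonormal basis on 4 pixels (hypothetical, for illustration only).
eigenspectra = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
]
flux = [3.0, 1.0, 3.0, 1.0]

weights = project(flux, eigenspectra)
full = reconstruct(weights, eigenspectra)
first_mode_only = reconstruct(weights, eigenspectra, n_modes=1)
print(weights, full, first_mode_only)
```

Because only the weights and the eigenspectra are needed, the n_modes argument lets a user truncate the reconstruction at any number of modes, which is the reason for publishing the weights rather than only the reconstructed spectra.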

References:

Rafiee, A., & Hall, P. 2010a, ApJS, submitted

Rafiee, A., & Hall, P. 2010b, MNRAS, submitted