Algorithm Descriptions

Adaptive moments

Adaptive moments are the second moments of the object intensity, measured using a particular scheme designed to have near-optimal signal-to-noise ratio. Moments are measured using a radial weight function iteratively adapted to the shape (ellipticity) and size of the object. This elliptical weight function has a signal-to-noise advantage over axially symmetric weight functions. In principle there is an optimal (in terms of signal-to-noise) radial shape for the weight function, which is related to the light profile of the object itself. In practice a Gaussian with size matched to that of the object is used, and is nearly optimal. Details can be found in Bernstein & Jarvis (2002).

The outputs included in the SDSS data release are the following:

  1. The sum of the second moments in the CCD row and column direction:
    mrr_cc = <col^2> + <row^2>
    and its error mrr_cc_err.
    The second moments are defined in the following way:
    <col^2> = sum[I(col,row) w(col,row) col^2] / sum[I*w]
    where I is the intensity of the object and w is the weight function.
  2. The object radius, called size, which is just the square root of mrr_cc.
  3. The ellipticity (polarization) components:
    me1 = (<col^2> - <row^2>)/mrr_cc
    me2 = 2.*<col*row>/mrr_cc

    and square root of the components of the covariance matrix:
    me1e1err = sqrt( Var(e1) )
    me1e2err = sign(Covar(e1,e2))*sqrt( abs( Covar(e1,e2) ) )
    me2e2err = sqrt( Var(e2) )

  4. A fourth-order moment
    mcr4 = <r^4>/sigma^4
    where r^2 = col^2 + row^2, and sigma is the size of the Gaussian weight. No error is quoted on this quantity.
  5. These quantities are also measured for the PSF, reconstructed at the position of the object. The names are the same with an appended _psf. No errors are quoted for PSF quantities. These PSF moments can be used to correct the object shapes for smearing due to seeing and PSF anisotropy. See Bernstein & Jarvis (2002) and Hirata & Seljak (2003) for details.
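
As a minimal sketch of how the quantities in items 1 to 3 relate to one another, the following Python function derives mrr_cc, the size, and the two ellipticity components from the three weighted second moments. The function and argument names are illustrative and are not part of any SDSS software; the weighted moments themselves are assumed to have been measured already.

```python
import math

def adaptive_shape(col2, row2, colrow):
    """Derive mrr_cc, size, me1 and me2 from weighted second moments.

    col2, row2 and colrow stand for <col^2>, <row^2> and <col*row>.
    All names here are illustrative, not pipeline identifiers.
    """
    mrr_cc = col2 + row2
    if mrr_cc <= 0:
        raise ValueError("sum of second moments must be positive")
    size = math.sqrt(mrr_cc)
    me1 = (col2 - row2) / mrr_cc
    me2 = 2.0 * colrow / mrr_cc
    return mrr_cc, size, me1, me2

# A round object: equal moments along both axes and no cross term.
print(adaptive_shape(2.0, 2.0, 0.0))  # (4.0, 2.0, 0.0, 0.0)
```

An object elongated along the column direction gives me1 > 0, and one elongated along the row direction gives me1 < 0.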

The asinh magnitude

Magnitudes within the SDSS are expressed as inverse hyperbolic sine (or "asinh") magnitudes, described in detail by Lupton, Gunn, & Szalay (1999). They are sometimes referred to informally as luptitudes. The transformation from linear flux measurements to asinh magnitudes is designed to be virtually identical to the standard astronomical magnitude at high signal-to-noise ratio, but to behave reasonably at low signal-to-noise ratio and even at negative values of flux, where the logarithm in the Pogson magnitude fails. This allows us to measure a flux even in the absence of a formal detection; we quote no upper limits in our photometry.
The asinh magnitudes are characterized by a softening parameter b, the typical 1-sigma noise of the sky in a PSF aperture in 1" seeing. The relation between detected flux f and asinh magnitude m is:

m = -(2.5/ln 10) * [asinh((f/f0)/(2b)) + ln(b)].

Here, f0 is given by the classical zero point of the magnitude scale, i.e., f0 is the flux of an object with conventional magnitude of zero. The quantity b is measured relative to f0, and thus is dimensionless; it is given in the table of asinh softening parameters (Table 21 in the EDR paper), along with the asinh magnitude associated with a zero flux object. The table also lists the flux corresponding to 10f0, above which the asinh magnitude and the traditional logarithmic magnitude differ by less than 1% in flux.
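
The relation above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the softening value B_EXAMPLE is an assumed, illustrative number; real per-band values should be taken from the table of asinh softening parameters.

```python
import math

def asinh_mag(flux_ratio, b):
    """Asinh magnitude for flux_ratio = f/f0 and softening parameter b."""
    return -(2.5 / math.log(10)) * (math.asinh(flux_ratio / (2.0 * b)) + math.log(b))

def pogson_mag(flux_ratio):
    """Conventional logarithmic magnitude; undefined for flux_ratio <= 0."""
    return -2.5 * math.log10(flux_ratio)

# Illustrative softening value only (an assumption for this sketch).
B_EXAMPLE = 1.2e-10

# Bright, faint, zero and negative fluxes all yield a finite asinh magnitude.
for ratio in (1e-6, 1e-9, 0.0, -1e-10):
    print(ratio, round(asinh_mag(ratio, B_EXAMPLE), 3))
```

At zero flux the expression reduces to -2.5 log10(b), and for fluxes much larger than b it approaches the conventional magnitude -2.5 log10(f/f0).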

Astrometry

A detailed description of the astrometric calibration is given in Pier et al. (2003) (AJ, or astro-ph/0211375). Portions of that discussion are summarized here, and on the astrometry quality overview page.

The r photometric CCDs serve as the astrometric reference CCDs for the SDSS. That is, the positions for SDSS objects are based on the r centroids and calibrations. The r CCDs are calibrated by matching up bright stars detected by SDSS with existing astrometric reference catalogs. One of two reduction strategies is employed, depending on the coverage of the astrometric catalogs:

  1. Whenever possible, stars detected on the r CCDs are matched directly with stars in the United States Naval Observatory CCD Astrograph Catalog (UCAC, Zacharias et al. 2000), an (eventually) all-sky astrometric catalog with a precision of 70 mas at its catalog limit of R = 16, and systematic errors of less than 30 mas. There are approximately 2 - 3 magnitudes of overlap between UCAC and unsaturated stars on the r CCDs. The astrometric CCDs are not used. For DR1, stripes 9-12, 82, and 86 used UCAC.
  2. If a scan is not covered by the current version of UCAC, then it is reduced against Tycho-2 (Hog et al. 2000), an all-sky astrometric catalog with a median precision of 70 mas at its catalog limit of VT = 11.5, and systematic errors of less than 1 mas. All Tycho-2 stars are saturated on the r CCDs; however there are about 3.5 magnitudes of overlap between bright unsaturated stars on the astrometric CCDs and the faint end of Tycho-2 (8 < r < 11.5), and about 3 magnitudes of overlap between bright unsaturated stars on the r CCDs and faint stars on the astrometric CCDs (14 < r < 17). The overlap stars in common to the astrometric and r CCDs are used to map detections of Tycho-2 stars on the astrometric CCDs onto the r CCDs. For DR1, stripes 34-37, 42-44, and 76 used Tycho-2.

The r CCDs are therefore calibrated directly against the primary astrometric reference catalog. Frames uses the astrometric calibrations to match up detections of the same object observed in the other four filters. The accuracy of the relative astrometry between filters can thus significantly impact Frames, in particular the deblending of overlapping objects, photometry based on the same aperture in different filters, and detection of moving objects. To minimize the errors in the relative astrometry between filters, the u, g, i, and z CCDs are calibrated against the r CCDs.

Each drift scan is processed separately. All six camera columns are processed in a single reduction. In brief, stars detected on the r CCDs if calibrating against UCAC, or stars detected on the astrometric CCDs transformed to r coordinates if calibrating against Tycho-2, are matched to catalog stars. Transformations from r pixel coordinates to catalog mean place (CMP) celestial coordinates are derived using a running-means least-squares fit to a focal plane model, using all six r CCDs together to solve for both the telescope tracking and the r CCDs' focal plane offsets, rotations, and scales, combined with smoothing spline fits to the intermediate residuals. These transformations, comprising the calibrations for the r CCDs, are then applied to the stars detected on the r CCDs, converting them to CMP coordinates and creating a catalog of secondary astrometric standards. Stars detected on the u, g, i, and z CCDs are then matched to this secondary catalog, and a similar fitting procedure (each CCD is fitted separately) is used to derive transformations from the pixel coordinates for the other photometric CCDs to CMP celestial coordinates, comprising the calibrations for the u, g, i, and z CCDs.

Note: At the edges of pixels, the quantities objc_rowc and objc_colc take integer values.

Image Classification

This page provides detailed descriptions of various morphological outputs of the photometry pipelines. We also provide discussion of some methodology; for details of the Photo pipeline processing please visit the Photo pipeline page. Other photometric outputs, specifically the various magnitudes, are described on the photometry page.

The frames pipeline also provides several characterizations of the shape and morphology of an object.

Star/Galaxy Classification
The frames pipeline provides a simple star/galaxy separator in its type parameters (provided separately for each band) and its objc_type parameters (one value per object); these are set to:
Class          Name       Code
Unknown        UNK        0
Cosmic Ray     CR         1
Defect         DEFECT     2
Galaxy         GALAXY     3
Ghost          GHOST      4
Known object   KNOWNOBJ   5
Star           STAR       6
Star trail     TRAIL      7
Sky            SKY        8

In particular, Lupton et al. (2001a) show that the following simple cut works at the 95% confidence level for our data to r=21 and even somewhat fainter:

psfMag - ((deV_L > exp_L) ? deVMag : expMag) > 0.145

If satisfied, type is set to GALAXY for that band; otherwise, type is set to STAR. The global type objc_type is set according to the same criterion, applied to the summed fluxes from all bands in which the object is detected.
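
The per-band cut can be written out as a short Python sketch. The function and argument names are illustrative, the numeric codes are the GALAXY and STAR codes from the class table above, and the sketch covers only the single-band type, not the summed-flux objc_type.

```python
GALAXY, STAR = 3, 6  # type codes from the class table above

def classify_band(psf_mag, dev_mag, exp_mag, dev_l, exp_l, threshold=0.145):
    """Apply the star/galaxy cut in one band.

    The model magnitude is taken from whichever of the de Vaucouleurs
    and exponential fits has the higher likelihood.
    """
    model_mag = dev_mag if dev_l > exp_l else exp_mag
    return GALAXY if (psf_mag - model_mag) > threshold else STAR

# An extended object: the model magnitude is much brighter than the PSF magnitude.
print(classify_band(20.0, 19.5, 19.7, dev_l=0.8, exp_l=0.1))  # 3
```

Because magnitudes decrease with increasing flux, a positive difference means the model fit captures more light than the PSF fit, which is the signature of an extended object.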

Experimentation has shown that simple variants on this scheme, such as defining galaxies as those objects classified as such in any two of the three high signal-to-noise ratio bands (namely, g, r, and i), work better in some circumstances. This scheme occasionally fails to distinguish pairs of stars with separation small enough (<2") that the deblender does not split them; it also occasionally classifies Seyfert galaxies with particularly bright nuclei as stars.

Further information may be used to refine the star-galaxy separation, depending on the scientific application. For example, Scranton et al. (2001) advocate applying a Bayesian prior to the above difference between the PSF and exponential magnitudes, depending on seeing and using prior knowledge about the counts of galaxies and stars with magnitude.

Radial Profiles
The frames pipeline extracts an azimuthally-averaged radial surface brightness profile. In the catalogs, it is given as the average surface brightness in a series of annuli. This quantity is in units of "maggies" per square arcsec, where a maggie is a linear measure of flux; one maggie has an AB magnitude of 0 (thus a surface brightness of 20 mag/square arcsec corresponds to 10^-8 maggies per square arcsec). The number of annuli for which there is a measurable signal is listed as nprof, the mean surface brightness is listed as profMean, and the error is listed as profErr. This error includes both photon noise and the small-scale "bumpiness" in the counts as a function of azimuthal angle.
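
Since one maggie corresponds to magnitude 0, the conversion between maggies per square arcsec and magnitudes per square arcsec is a plain logarithm. The helper names in this minimal Python sketch are illustrative.

```python
import math

def maggies_to_mag(maggies_per_arcsec2):
    """Surface brightness in mag per square arcsec; input must be positive."""
    return -2.5 * math.log10(maggies_per_arcsec2)

def mag_to_maggies(mag_per_arcsec2):
    """Inverse conversion, back to maggies per square arcsec."""
    return 10.0 ** (-0.4 * mag_per_arcsec2)

print(maggies_to_mag(1e-8))  # 20.0
```

The logarithm is undefined for zero or negative profMean values, which can occur in noisy outer annuli, so those entries need separate handling.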

When converting the profMean values to a local surface brightness, it is not the best approach to assign the mean surface brightness to some radius within the annulus and then linearly interpolate between radial bins. Do not use smoothing splines, as they will not go through the points in the cumulative profile and thus (obviously) will not conserve flux. What frames does, e.g., in determining the Petrosian ratio, is to fit a taut spline to the cumulative profile and then differentiate that spline fit, after transforming both the radii and cumulative profiles with asinh functions. We recommend doing the same here.
The annuli used are:
Aperture  Radius (pixels)  Radius (arcsec)  Area (pixels)
1         0.56             0.23             1
2         1.69             0.68             9
3         2.58             1.03             21
4         4.41             1.76             61
5         7.51             3.00             177
6         11.58            4.63             421
7         18.58            7.43             1085
8         28.55            11.42            2561
9         45.50            18.20            6505
10        70.15            28.20            15619
11        110.50           44.21            38381
12        172.50           69.00            93475
13        269.50           107.81           228207
14        420.50           168.20           555525
15        657.50           263.00           1358149
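
The recipe described above starts from the cumulative profile. A minimal Python sketch of that first step follows, using the outer radii in arcsec from the table. It assumes profMean is in maggies per square arcsec and idealises each annulus area as pi * (r_out^2 - r_in^2); the function name is illustrative, and the subsequent asinh transform and taut-spline fit are not shown.

```python
import math

# Outer radii of the 15 annuli in arcsec, copied from the table above.
RADII_ARCSEC = [0.23, 0.68, 1.03, 1.76, 3.00, 4.63, 7.43, 11.42,
                18.20, 28.20, 44.21, 69.00, 107.81, 168.20, 263.00]

def cumulative_profile(prof_mean, nprof):
    """Return (outer radius, enclosed flux) pairs for the first nprof annuli.

    prof_mean holds mean surface brightnesses in maggies per square arcsec.
    The innermost "annulus" is a full disk (inner radius zero).
    """
    points = []
    total = 0.0
    r_in = 0.0
    for r_out, sb in zip(RADII_ARCSEC[:nprof], prof_mean[:nprof]):
        area = math.pi * (r_out ** 2 - r_in ** 2)
        total += sb * area
        points.append((r_out, total))
        r_in = r_out
    return points

# A flat profile of 1 maggie per square arcsec over three annuli.
for radius, flux in cumulative_profile([1.0, 1.0, 1.0], 3):
    print(radius, flux)
```

Any interpolating curve that passes through these points conserves the flux in each annulus, which is the property that smoothing splines lack.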

Surface Brightness & Concentration Index
The frames pipeline also reports the radii containing 50% and 90% of the Petrosian flux for each band, petroR50 and petroR90 respectively. The usual characterization of surface-brightness in the target selection pipeline of the SDSS is the mean surface brightness within petroR50.

It turns out that the ratio of petroR50 to petroR90, the so-called "inverse concentration index", is correlated with morphology (Shimasaku et al. 2001, Strateva et al. 2001). Galaxies with a de Vaucouleurs profile have an inverse concentration index of around 0.3; exponential galaxies have an inverse concentration index of around 0.43. Thus, this parameter can be used as a simple morphological classifier.
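
As a rough sketch of how this parameter could serve as a classifier, the Python below computes the inverse concentration index and reports which of the two reference values quoted above it lies closer to. The nearest-reference rule and all names are illustrative assumptions for this example; they are not a cut taken from the cited papers.

```python
DEV_REFERENCE = 0.3    # typical value for a de Vaucouleurs profile
EXP_REFERENCE = 0.43   # typical value for an exponential profile

def inverse_concentration(petro_r50, petro_r90):
    """Ratio petroR50 / petroR90 for one band."""
    if petro_r90 <= 0:
        raise ValueError("petroR90 must be positive")
    return petro_r50 / petro_r90

def closer_profile(petro_r50, petro_r90):
    """Name the reference profile whose typical index is nearer (illustrative rule)."""
    ic = inverse_concentration(petro_r50, petro_r90)
    if abs(ic - DEV_REFERENCE) < abs(ic - EXP_REFERENCE):
        return "deVaucouleurs"
    return "exponential"

print(closer_profile(3.0, 10.0))  # deVaucouleurs
print(closer_profile(4.3, 10.0))  # exponential
```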

An important caveat when using these quantities is that they are not corrected for seeing. This causes the surface brightness to be underestimated, and the inverse concentration index to be overestimated, for objects of size comparable to the PSF. The amplitudes of these effects, however, are not yet well characterized.

Model Fit Likelihoods and Parameters
In addition to the model and PSF magnitudes, the likelihoods deV_L, exp_L, and star_L are also calculated by frames. These are the probabilities of achieving the measured chi-squared for the deVaucouleurs, exponential, and PSF fits, respectively. For instance, star_L is the probability that an object would have at least the measured value of chi-squared if it is really well represented by a PSF. If one wishes to make use of a trinary scheme to classify objects, calculation of the fractional likelihoods is recommended:

f(deV_L)=deV_L/[deV_L+exp_L+star_L]

and similarly for f(exp_L) and f(star_L). A fractional likelihood greater than 0.5 for any of these three profiles is generally a good threshold for object classification. This works well in the range 18<r<21.5; at the bright end, the likelihoods have a tendency to underflow to zero, which makes them less useful. In particular, star_L is often zero for bright stars. For future data releases we will incorporate improvements to the model fits to give more meaningful results at the bright end.
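
The trinary scheme can be sketched in Python as below. The function names are illustrative, and returning None when all three likelihoods have underflowed to zero is a choice made for this example, not pipeline behaviour.

```python
def fractional_likelihoods(dev_l, exp_l, star_l):
    """Return the three fractional likelihoods, or None if all are zero."""
    total = dev_l + exp_l + star_l
    if total <= 0.0:
        return None  # likelihoods underflowed, as can happen for bright objects
    return {"deV": dev_l / total, "exp": exp_l / total, "star": star_l / total}

def trinary_class(dev_l, exp_l, star_l, threshold=0.5):
    """Name the profile whose fractional likelihood exceeds the threshold."""
    fracs = fractional_likelihoods(dev_l, exp_l, star_l)
    if fracs is None:
        return None
    name, value = max(fracs.items(), key=lambda item: item[1])
    return name if value > threshold else None

print(trinary_class(0.6, 0.1, 0.1))  # deV
print(trinary_class(0.0, 0.0, 0.0))  # None
```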

Ellipticities
The model fits yield an estimate of the axis ratio and position angle of each object, but it is useful to have model-independent measures of ellipticity. In the data released here, frames provides two further measures of ellipticity, one based on second moments, the other based on the ellipticity of a particular isophote. The model fits do correctly account for the effect of the seeing, while the methods presented here do not.

The first method measures flux-weighted second moments, defined as:
Mxx = <x^2/r^2>
Myy = <y^2/r^2>
Mxy = <xy/r^2>

In the case that the object's isophotes are self-similar ellipses, one can show:
Q = Mxx - Myy = [(a-b)/(a+b)] cos(2φ)
U = Mxy = [(a-b)/(a+b)] sin(2φ)

where a and b are the semi-major and semi-minor axes, and φ is the position angle. Q and U are stored as Q and U in PhotoObj and are referred to as "Stokes parameters." They can be used to reconstruct the axis ratio and position angle, measured relative to row and column of the CCDs. This is equivalent to the normal definition of position angle (East of North), for the scans on the Equator. The performance of the Stokes parameters is not ideal at low S/N. For future data releases, frames will also output variants of the adaptive shape measures used in the weak lensing analysis of Fischer et al. (2000), which are closer to optimal measures of shape for small objects.
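
Inverting the two relations for Q and U gives the axis ratio and position angle directly: sqrt(Q^2 + U^2) equals (a-b)/(a+b), and the angle follows from the ratio U/Q. The Python sketch below assumes the self-similar ellipse case; the function name is illustrative, and the returned angle is in the same CCD-based frame as Q and U, with 2φ determined only modulo 360 degrees.

```python
import math

def shape_from_stokes(q, u):
    """Return (axis ratio b/a, position angle in degrees) from Stokes Q and U."""
    e = math.hypot(q, u)              # equals (a - b) / (a + b)
    if e >= 1.0:
        raise ValueError("Q and U are not consistent with an ellipse")
    axis_ratio = (1.0 - e) / (1.0 + e)
    phi = 0.5 * math.atan2(u, q)      # radians, in the range (-pi/2, pi/2]
    return axis_ratio, math.degrees(phi)

# Ellipse with a = 2, b = 1 at a position angle of 30 degrees.
e = (2.0 - 1.0) / (2.0 + 1.0)
q = e * math.cos(math.radians(60.0))
u = e * math.sin(math.radians(60.0))
print(shape_from_stokes(q, u))
```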

Isophotal Quantities
A second measure of ellipticity is given by measuring the ellipticity of the 25 magnitudes per square arcsecond isophote (in all bands). In detail, frames measures the radius of a particular isophote as a function of angle and Fourier expands this function. It then extracts from the coefficients the centroid (isoRowC,isoColC), major and minor axis (isoA,isoB), position angle (isoPhi), and average radius of the isophote in question (Profile). Placeholders exist in the database for the errors on each of these quantities, but they are not currently calculated. It also reports the derivative of each of these quantities with respect to isophote level, necessary to recompute these quantities if the photometric calibration changes.

Deblending Overlapping Objects

One of the jobs of the frames pipeline is to decide if an initial single detection is in fact a blend of multiple overlapping objects, and, if so, to separate, or deblend them. The deblending process is performed self-consistently across the bands (thus, all children have measurements in all bands). After deblending, the pipeline again measures the properties of these individual children.

Bright objects are measured at least twice: once with a global sky and no deblending run (this detection is flagged BRIGHT) and a second time with a local sky. They may also be measured more times if they are BLENDED and a CHILD.

Once objects are detected, they are deblended by identifying individual peaks within each object, merging the list of peaks across bands, and adaptively determining the profile of images associated with each peak, which sum to form the original image in each band. The originally detected object is referred to as the "parent" object and has the flag BLENDED set if multiple peaks are detected; the final set of subimages of which the parent consists are referred to as the "children" and have the flag CHILD set. Note that all quantities in the photometric catalogs (currently in the tsObj files) are measured for both parent and child. For each child object, the quantity parent gives the object id (object) of the parent (for parents themselves or isolated objects, this is set to the object id of the BRIGHT counterpart if that exists; otherwise it is set to -1); for each parent, nchild gives the number of children an object has. Children are assigned the id numbers immediately after the id of the parent. Thus, if an object with id 23 is set as BLENDED and has nchild equal to 2, objects 24 and 25 will be set as CHILD and have parent equal to 23.
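
The numbering rule in the example above can be expressed as a one-line Python helper. The function name is illustrative, and the sketch assumes the simple case described in the text, where the children follow the parent consecutively.

```python
def child_ids(parent_id, nchild):
    """Object ids of the children of a BLENDED parent, per the rule above."""
    return list(range(parent_id + 1, parent_id + 1 + nchild))

print(child_ids(23, 2))  # [24, 25]
```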

The list of peaks in the parent is trimmed to combine peaks (from different bands) that are too close to each other (if this happens, the flag PEAKS_TOO_CLOSE is set in the parent). If there are more than 25 peaks, only the most significant are kept, and the flag DEBLEND_TOO_MANY_PEAKS is set in the parent.

In a number of situations, the deblender decides not to process a BLENDED object; in this case the object is flagged as NODEBLEND. Most objects with EDGE set are not deblended. The exceptions are when the object is large enough (larger than roughly an arcminute) that it will most likely not be completely included in the adjacent scan line either; in this case, DEBLENDED_AT_EDGE is set, and the deblender gives it its best shot. When an object is larger than half a frame, the deblender also gives up, and the object is flagged as TOO_LARGE. Other intricacies of the deblending results are recorded in flags described in the Object Flags section of the Flags page.

On average, about 15% - 20% of all detected objects are blended, and many of these are superpositions of galaxies that the deblender successfully treats by separating the images of the nearby objects. Thus, it is almost always the childless (nChild=0, or !BLENDED || (BLENDED && NODEBLEND)) objects that are of most interest for science applications. Occasionally, very large galaxies may be treated somewhat improperly, but this is quite rare.

The behavior of the deblender of overlapping images has been further improved since DR1; these changes are most important for bright galaxies of large angular extent (> 1 arcmin). In the EDR, and to a lesser extent in DR1, bright galaxies were occasionally "shredded" by the deblender, i.e., interpreted as two or more objects and taken apart. With improvements in the code that finds the center of large galaxies in the presence of superposed stars, and the deblending of stars superposed on galaxies, this shredding now rarely happens. Indeed, inspection of several hundred NGC galaxies shows that the deblend is correct in 95% of the cases; most of the exceptions are irregular galaxies of various sorts.

Reddening and Extinction Corrections

Reddening corrections in magnitudes at the position of each object, extinction, are computed following Schlegel, Finkbeiner & Davis (1998). These corrections are not applied to the magnitudes ugriz in the databases. If you want corrected magnitudes, you should use dered_[ugriz]; these are the extinction-corrected model magnitudes. All other magnitudes must have the correction applied by hand or as part of your SQL query. Conversions from E(B-V) to total extinction A_lambda, assuming a z=0 elliptical galaxy spectral energy distribution, are tabulated in Table 22 of the EDR Paper.
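
Applying the correction by hand amounts to subtracting the per-band extinction from the observed magnitude, since extinction makes objects appear fainter. The Python sketch below uses made-up numbers and an illustrative function name; only the subtraction itself reflects the text.

```python
def deredden(mags, extinction):
    """Subtract per-band extinction (in magnitudes) from observed magnitudes."""
    return {band: mags[band] - extinction[band] for band in mags}

# Made-up example values for a single object.
observed = {"g": 18.50, "r": 17.90}
extinction = {"g": 0.15, "r": 0.11}
print(deredden(observed, extinction))
```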

Image processing flags

For objects in the calibrated object lists, the photometric pipeline sets a number of flags that indicate the status of each object, warn of possible problems with the image itself, and warn of possible problems in the measurement of various quantities associated with the object. For yet more details, refer to Robert Lupton's flags document.

Possible problems associated with individual pixels in the reduced images ("corrected frames") are traced separately from the object flags described here. Objects in the catalog have two major sets of flags:

  • The status flags, called status in the PhotoObjAll table, with information needed to discount duplicate detections of the same object in the catalog.
  • The object flags, called flags in the PhotoObjAll table, with information about the success of measuring the object's location, flux, or morphology.

The "status" of an object

The catalogs contain multiple detections of objects from overlapping CCD frames. For most applications, remove duplicate detections of the same objects by considering only those which have the "primary" flag set in the status entry of the PhotoObjAll table and its Views.

A description of status is provided on the details page. The details of determining primary status and of the remaining flags stored in status are found on the algorithms page describing the resolution of overlaps (resolve).

Object "flags"

The photometric pipeline's flags describe how certain measurements were performed for each object, and which measurements are considered unreliable or have failed altogether. You must interpret the flags correctly to obtain meaningful results.

For each object, there are 59 flags stored as bit fields in a single 64-bit table column called flags in the PhotoObjAll table (and its Views). There are two versions of the flag variable for each object:

  • Individual flags for each filter u, g, r, i, z. These are called flags_u, etc.
  • A single combination of the per-filter flags appropriate for the whole object, called flags.

Note: This differs from the tsObj files in the DAS, where the individual filter flags are stored as vectors in two separate 32-bit columns called flags and flags2, and the overall flags are stored in a scalar called objc_flags.

Here we describe which flags should be checked for which measurements, including whether you need to look at the flag in each filter, or at the general flags.

Recommendations

Clean sample of point sources

In a given band, first select objects with PRIMARY status and apply the SDSS star-galaxy separation. Then, define the following meta-flags:

DEBLEND_PROBLEMS = PEAKCENTER || NOTCHECKED || (DEBLEND_NOPEAK && psfErr>0.2)
INTERP_PROBLEMS = PSF_FLUX_INTERP || BAD_COUNTS_ERROR || (INTERP_CENTER && CR)
Then include only objects that satisfy the following in the band in question:

BINNED1 && !BRIGHT && !SATURATED && !EDGE && (!BLENDED || NODEBLEND) && !NOPROFILE && !INTERP_PROBLEMS && !DEBLEND_PROBLEMS

If you are very picky, you will probably want to exclude the NODEBLEND objects. Note that selecting PRIMARY objects implies !BRIGHT && (!BLENDED || NODEBLEND || nchild == 0).

These criteria are used in the SDSS quasar target selection code, which is quite sensitive to outliers in the stellar locus. If you want to select very rare outliers in color space, especially single-band detections, add cuts on MAYBE_CR and MAYBE_EGHOST to the above list.
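
The selection above can be sketched in Python by testing individual bits of the per-band flags value. The bit numbers are those listed in the flag tables on this page; the function names and the argument psf_err are illustrative, and the sketch assumes that PRIMARY status and the star-galaxy separation have already been applied.

```python
# Bit numbers as listed in the flag tables on this page.
BITS = {
    "BRIGHT": 1, "EDGE": 2, "BLENDED": 3, "PEAKCENTER": 5, "NODEBLEND": 6,
    "NOPROFILE": 7, "CR": 12, "SATURATED": 18, "NOTCHECKED": 19,
    "BINNED1": 28, "BAD_COUNTS_ERROR": 40, "INTERP_CENTER": 44,
    "DEBLEND_NOPEAK": 46, "PSF_FLUX_INTERP": 47,
}

def has(flags, name):
    """True if the named flag bit is set in the integer flags value."""
    return bool((flags >> BITS[name]) & 1)

def is_clean_point_source(flags, psf_err):
    """Evaluate the recommended point-source criteria for one band."""
    deblend_problems = (has(flags, "PEAKCENTER") or has(flags, "NOTCHECKED")
                        or (has(flags, "DEBLEND_NOPEAK") and psf_err > 0.2))
    interp_problems = (has(flags, "PSF_FLUX_INTERP") or has(flags, "BAD_COUNTS_ERROR")
                       or (has(flags, "INTERP_CENTER") and has(flags, "CR")))
    return (has(flags, "BINNED1") and not has(flags, "BRIGHT")
            and not has(flags, "SATURATED") and not has(flags, "EDGE")
            and (not has(flags, "BLENDED") or has(flags, "NODEBLEND"))
            and not has(flags, "NOPROFILE")
            and not interp_problems and not deblend_problems)

# Detected in the original frame, with no other flags set.
print(is_clean_point_source(1 << 28, psf_err=0.05))  # True
```

In a database query the same tests would normally be written as bitwise comparisons against the values in the PhotoFlags table rather than evaluated row by row in client code.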

Clean sample of galaxies

As for point sources, but don't cut on EDGE (large galaxies often run into the edge). Also, you may not need to worry about the INTERP problems. The BRIGHTEST_GALAXY_CHILD may be useful if you are looking at bright galaxies; it needs further testing.

If you want to select (or reject against) moving objects (asteroids), cut on the DEBLENDED_AS_MOVING flag, and then cut on the motion itself. See the SDSS Moving Objects Catalog for more details. An interesting experiment is to remove the restriction on the DEBLENDED_AS_MOVING flag to find objects with very small proper motion (i.e., those beyond Saturn).

Descriptions of all flags

Flags that affect the object's status

These flags must be considered to reject duplicate catalog entries of the same object. By using only objects with PRIMARY status (see above), you automatically account for the most common cases: those objects which are BRIGHT, or which have been deblended (decomposed) into one or more child objects which are listed individually.

In the tables, Flag names link to detailed descriptions. The "In Obj Flags?" column indicates that this flag will be set in the general (per object) "flags" column if this flag is set in any of the filters. "Bit" is the number of the bit.

To find the hexadecimal values used for testing if a flag is set, please see the PhotoFlags table.
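
Assuming each hexadecimal value is simply a mask with the single listed bit set (2 raised to the bit number), the value for any flag can also be derived from the "Bit" column. The helper below is an illustrative Python sketch, not part of the SkyServer tools.

```python
def flag_value(bit):
    """Integer mask with only the given bit set."""
    return 1 << bit

def is_set(flags, bit):
    """True if the given bit is set in the integer flags value."""
    return (flags & flag_value(bit)) != 0

print(hex(flag_value(28)))  # 0x10000000
```

For example, is_set(flags, 3) tests a flags value for BLENDED, whose bit number is 3.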

Flag Bit In Obj Flags? Description
BINNED1 28   detected at >=5 sigma in original imaging frame
BINNED2 29   detected in 2x2 binned frame; often outskirts of bright galaxies, scattered light, low surface brightness galaxies
BINNED4 30   detected in 4x4 binned frame; few are genuine astrophysical objects. To check if an object is detected at all, use the flag combination (BINNED1 | BINNED2 | BINNED4)
BRIGHT 1 X duplicate detection of > 200 sigma objects, discard.
BLENDED 3 X Object has more than one peak, there was an attempt to deblend it into several CHILD objects. Discard unless NODEBLEND is set.
NODEBLEND 6 X Object is a blend, but was not deblended because it is:
  • too close to an edge (EDGE already set),
  • too large (TOO_LARGE), or
  • a child overlaps an edge (EDGE will be set).
CHILD 4 X Object is part of a BLENDED "parent" object. May be BLENDED itself.

Flags that indicate problems with the raw data

These flags are mainly informational and important only for some objects and science applications.

Flag Bit In Flags? Description
SATURATED 18 X contains saturated pixels; affects star-galaxy separation
SATURATED_CENTER 43   as SATURATED, affected pixels close to the center
EDGE 2   object was too close to edge of frame to be measured; should not affect point sources
LOCAL_EDGE 39   like EDGE, but for rare cases when one-half of a CCD failed
DEBLENDED_AT_EDGE 45   object is near EDGE, but so large that it was deblended anyway. Otherwise, it might have been missed.
INTERP 17   object contains interpolated-over pixels (bad columns, cosmic rays, bleed trails); should not affect photometry for single bad column or cosmic ray
INTERP_CENTER 44   interpolated pixel(s) within 3 pix of the center. Photometry may be affected.
PSF_FLUX_INTERP 47   more than 20% of PSF flux is interpolated over. May cause outliers in color-color plots, e.g.
BAD_COUNTS_ERROR 40   interpolation affected many pixels; PSF flux error is inaccurate and likely underestimated.
COSMIC_RAY (CR) 12   object contains cosmic rays which have been interpolated over; should not affect photometry
MAYBE_CR 56   object may be a cosmic ray; not interpolated over. Useful in searches for single-filter detections.
MAYBE_EGHOST 57   object may be an electronics ghost of a bright star. Be suspicious about faint single-filter detections.

Flags that indicate problems with the image

These flags may be hints that an object may not be real or that a measurement on the object failed.

Flag Bit In Flags? Description
CANONICAL_CENTER 0   could not determine a centroid in this band; used centroid in CANONICAL_BAND instead
PEAKCENTER 5   used brightest pixel as centroid; hint that an object may not be real
DEBLEND_NOPEAK 46   object is a CHILD of a DEBLEND but has no peak; hint that an object may not be real
NOPROFILE 7   only 0 or 1 entries for the radial flux profile; photometric quantities derived from profile are suspect
NOTCHECKED 19   object contains pixels which were not checked for peaks by deblender; deblending may be unreliable
NOTCHECKED_CENTER 58   as NOTCHECKED, but affected pixels are near object's center
TOO_LARGE 24   object is larger than outermost radial profile bin (r > 4arcmin), or a CHILD in a deblend is > 1/2 frame. Very large object, poorly determined sky, or bad deblend. Photometry questionable.
BADSKY 22   local sky measurement failed, object photometry is meaningless

Problems associated with specific quantities

Some flags simply indicate that the quantity in question could not be measured. Others indicate more subtle aspects of the measurements, particularly for Petrosian quantities.

Flag Bit In Flags? Description
NOSTOKES 21   Stokes Q and U (isophotal shape parameters) undetermined
ELLIPFAINT 27   no isophotal fits performed
PETROFAINT 23   Petrosian radius measured at very low surface brightness. Petrosian magnitude still usable.
NOPETRO 8   no Petrosian radius could be determined. Petrosian magnitude still usable.
NOPETRO_BIG 10   Petrosian radius larger than extracted radial profile. Happens for noisy sky or low S/N objects.
MANYPETRO 9   more than 1 value was found for the Petrosian radius.
MANY_R50 / MANY_R90 13/14   object's radial profile dips below 0 and more than one radius was found enclosing 50%/90% of the light. Rare.
INCOMPLETE_PROFILE 16   Petrosian radius hits edge of frame. Petrosian quantities should still be reasonable.
DEBLENDED_AS_MOVING 32   object recognised to be moving between different filters. For most purposes, consider only this flag to find moving objects.
MOVED 31   candidate for moving object. Does not mean it did move - consider DEBLENDED_AS_MOVING instead! Not useful.
NODEBLEND_MOVING 33 X candidate moving object (MOVED) but was not deblended as moving
TOO_FEW_DETECTIONS 34   object detected in too few bands for motion determination
TOO_FEW_GOOD_DETECTIONS 48   even though detected, no good centroid found in enough bands for motion determination
STATIONARY 36   A "moving" object's velocity is consistent with zero.
BAD_MOVING_FIT 35   motion inconsistent with straight line, not deblended as moving
BAD_MOVING_FIT_CHILD 41   in a complicated blend, child's motion was inconsistent with straight line and parent was not deblended as moving
CENTER_OFF_AIMAGE 49   nominal motion moves object off atlas image in this band
AMOMENT_UNWEIGHTED 53   'adaptive' moments are actually unweighted for this object. NB: to find out if a moment measurement failed entirely, check the error field.
AMOMENT_SHIFT 54   centroid shifted too far during calculation of moments, moment calculation failed and M_e1,M_e2 give the value of the shift
AMOMENT_MAXITER 55   moment calculation did not converge
AMOMENT_UNWEIGHTED_PSF 59   PSF moments are unweighted.

All flags so far indicate some problem or failure of a measurement. The following flags provide information about the processing, but do not indicate a severe problem or failure.

Informational flags related to deblending

Flag Bit In Flags? Description
DEBLEND_TOO_MANY_PEAKS 11   object has more than 25 peaks; only first 25 were deblended and contain all of the parent's flux
DEBLEND_UNASSIGNED_FLUX 42 X more than 5% of the parent's Petrosian flux was initially not assigned to children; all this flux has been redistributed among children
DEBLEND_PRUNED 26   parent containing peaks which were not deblended
PEAKS_TOO_CLOSE 37   some peaks were too close to be deblended
DEBLEND_DEGENERATE 50   some peaks had degenerate templates
BRIGHTEST_GALAXY_CHILD 51   brightest child among one parent's children
DEBLENDED_AS_PSF 25   child is unresolved

Further informational flags

Flag Bit In Flags? Description
BAD_RADIAL 15   last bin in radial profile < 0; usually can be ignored
CANONICAL_BAND 52   object is undetected in r-band; this band was used to determine Petrosian and Model radii
SUBTRACTED 20   object is part of extended wing of a bright star
BINNED_CENTER 38   object was extended and centroid was determined on 2x2 binned frame. Avoid for astrometric work, e.g.

The fiber magnitude

The flux contained within the aperture of a spectroscopic fiber (3" in diameter) is calculated in each band and stored in fiberMag.

Notes:
-For children of deblended galaxies, some of the pixels within a 1.5" radius may belong to other children; we now measure the flux of the parent at the position of the child; this properly reflects the amount of light which the spectrograph will see. This was not true in the EDR.
-Images are now convolved to 2" seeing before fiberMags are measured. This also makes the fiber magnitudes closer to what is seen by the spectrograph. This was not true in the EDR.

The model magnitude

Important Note for EDR and DR1 data ONLY: Comparing the model (i.e., exponential and de Vaucouleurs fits) and Petrosian magnitudes of bright galaxies in EDR and DR1 data shows a systematic offset of about 0.2 magnitudes (in the sense that the model magnitudes are brighter). This turns out to be due to a bug in the way the PSF was convolved with the models (this bug affected the model magnitudes even when they were fit only to the central 4.4" radius of each object). This caused problems for very small objects (i.e., close to being unresolved). The code forces model and PSF magnitudes of unresolved objects to be the same in the mean by application of an aperture correction, which then gets applied to all objects. The net result is that the model magnitudes are fine for unresolved objects, but systematically offset for galaxies brighter than at least 20th mag. Therefore, model magnitudes should NOT be used in EDR and DR1 data. This problem has been corrected as of DR2.

Just as the PSF magnitudes are optimal measures of the fluxes of stars, the optimal measure of the flux of a galaxy would use a matched galaxy model. With this in mind, the code fits two models to the two-dimensional image of each object in each band:

1. a pure deVaucouleurs profile:
I(r) = I_0 exp{-7.67 [(r/r_e)^{1/4}]}
(truncated beyond 7 r_e to go smoothly to zero at 8 r_e, and with some softening within r = r_e/50);

2. a pure exponential profile:
I(r) = I_0 exp(-1.68 r/r_e)
(truncated beyond 3 r_e to go smoothly to zero at 4 r_e).
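
As a quick numerical illustration of the two profiles, the sketch below evaluates them without the outer truncation or the inner softening (both are omitted for simplicity, so this is not the pipeline's exact model). The function names are hypothetical. The last function uses the closed-form integral of the untruncated exponential to show that, with the coefficient 1.68, r_e encloses about half of the total flux.

```python
import math

DEV_COEFF = 7.67  # de Vaucouleurs coefficient quoted in the text
EXP_COEFF = 1.68  # exponential coefficient quoted in the text


def dev_profile(r, r_e, i0=1.0):
    """Untruncated, unsoftened de Vaucouleurs surface brightness at radius r."""
    return i0 * math.exp(-DEV_COEFF * (r / r_e) ** 0.25)


def exp_profile(r, r_e, i0=1.0):
    """Untruncated exponential surface brightness at radius r."""
    return i0 * math.exp(-EXP_COEFF * r / r_e)


def exp_enclosed_fraction(r, r_e):
    """Fraction of the total flux of the untruncated exponential inside r."""
    x = EXP_COEFF * r / r_e
    return 1.0 - (1.0 + x) * math.exp(-x)


# With r = r_e the enclosed fraction comes out close to one half.
half_light = exp_enclosed_fraction(1.0, 1.0)
```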

Each model has an arbitrary axis ratio and position angle. Although for large objects it is possible and even desirable to fit more complicated models (e.g., bulge plus disk), the computational expense to compute them is not justified for the majority of the detected objects. The models are convolved with a double-Gaussian fit to the PSF, which is provided by psp. Residuals between the double-Gaussian and the full KL PSF model are added on for just the central PSF component of the image.

These fitting procedures yield the quantities

  • r_deV and r_exp, the effective radii of the models;
  • ab_deV and ab_exp, the axis ratio of the best fit models;
  • phi_deV and phi_exp, the position angles of the ellipticity (in degrees East of North);
  • deV_L and exp_L, the likelihoods associated with each model from the chi-squared fit;
  • deVMag and expMag, the total magnitudes associated with each fit.

Note that these quantities correctly model the effects of the PSF. Errors for each of the last two quantities (which are based only on photon statistics) are also reported. We apply aperture corrections to make these model magnitudes equal the PSF magnitudes in the case of an unresolved object.

In order to measure unbiased colors of galaxies, we measure their flux through equivalent apertures in all bands. We choose the model (exponential or deVaucouleurs) of higher likelihood in the r filter, and apply that model (i.e., allowing only the amplitude to vary) in the other bands after convolving with the appropriate PSF in each band. The resulting magnitudes are termed modelMag. The resulting estimate of galaxy color will be unbiased in the absence of color gradients. Systematic differences from Petrosian colors are in fact often seen due to color gradients, in which case the concept of a global galaxy color is somewhat ambiguous. For faint galaxies, the model colors have appreciably higher signal-to-noise ratio than do the Petrosian colors.

Due to the way in which model fits are carried out, there is some weak discretization of model parameters, especially r_exp and r_deV. This is yet to be fixed. Two other issues (negative axis ratios, and bad model mags for bright objects) have been fixed since the EDR.

Caveat: At bright magnitudes (r <~ 18), model magnitudes may not be a robust means to select objects by flux. For example, model magnitudes in target and best imaging may often differ significantly because a different type of profile (deVaucouleurs or exponential) was deemed the better fit in target vs. best. Instead, to select samples by flux, one should typically use Petrosian magnitudes for galaxies and psf magnitudes for stars and distant quasars. However, model colors are in general robust and may be used to select galaxy samples by color. Please also refer to the SDSS target selection algorithms for examples.

The Petrosian magnitude

Stored as petroMag. For galaxy photometry, measuring flux is more difficult than for stars, because galaxies do not all have the same radial surface brightness profile, and have no sharp edges. In order to avoid biases, we wish to measure a constant fraction of the total light, independent of the position and distance of the object. To satisfy these requirements, the SDSS has adopted a modified form of the Petrosian (1976) system, measuring galaxy fluxes within a circular aperture whose radius is defined by the shape of the azimuthally averaged light profile.

We define the "Petrosian ratio" R_P at a radius r from the center of an object to be the ratio of the local surface brightness in an annulus at r to the mean surface brightness within r, as described by Blanton et al. 2001a, Yasuda et al. 2001:

R_P(r) = { \int_{0.8r}^{1.25r} dr' 2\pi r' I(r') / [\pi (1.25^2 - 0.8^2) r^2] } / { \int_0^r dr' 2\pi r' I(r') / (\pi r^2) }

where I(r) is the azimuthally averaged surface brightness profile.

The Petrosian radius r_P is defined as the radius at which R_P(r_P) equals some specified value R_P,lim, set to 0.2 in our case. The Petrosian flux in any band is then defined as the flux within a certain number N_P (equal to 2.0 in our case) of Petrosian radii:

F_P = \int_0^{N_P r_P} 2\pi r' dr' I(r')

In the SDSS five-band photometry, the aperture in all bands is set by the profile of the galaxy in the r band alone. This procedure ensures that the color measured by comparing the Petrosian flux F_P in different bands is measured through a consistent aperture.

The aperture 2 r_P is large enough to contain nearly all of the flux for typical galaxy profiles, but small enough that the sky noise in F_P is small. Thus, even substantial errors in r_P cause only small errors in the Petrosian flux (typical statistical errors near the spectroscopic flux limit of r ~17.7 are < 5%), although these errors are correlated.

The Petrosian radius in each band is the parameter petroRad, and the Petrosian magnitude in each band (calculated, remember, using only petroRad for the r band) is the parameter petroMag.

In practice, there are a number of complications associated with this definition, because noise, substructure, and the finite size of objects can cause objects to have no Petrosian radius, or more than one. Those with more than one are flagged as MANYPETRO; the largest one is used. Those with none have NOPETRO set. Most commonly, these objects are faint (r > 20.5 or so); the Petrosian ratio becomes unmeasurable before dropping to the limiting value of 0.2; these have PETROFAINT set and have their "Petrosian radii" set to the default value of the larger of 3" or the outermost measured point in the radial profile. Finally, a galaxy with a bright stellar nucleus, such as a Seyfert galaxy, can have a Petrosian radius set by the nucleus alone; in this case, the Petrosian flux misses most of the extended light of the object. This happens quite rarely, but one dramatic example in the EDR data is the Seyfert galaxy NGC 7603 = Arp 092, at RA(2000) = 23:18:56.6, Dec(2000) = +00:14:38.

How well does the Petrosian magnitude perform as a reliable and complete measure of galaxy flux? Theoretically, the Petrosian magnitudes defined here should recover essentially all of the flux of an exponential galaxy profile and about 80% of the flux for a de Vaucouleurs profile. As shown by Blanton et al. (2001a), this fraction is fairly constant with axis ratio, while as galaxies become smaller (due to worse seeing or greater distance) the fraction of light recovered becomes closer to that fraction measured for a typical PSF, about 95% in the case of the SDSS. This implies that the fraction of flux measured for exponential profiles decreases while the fraction of flux measured for deVaucouleurs profiles increases as a function of distance. However, for galaxies in the spectroscopic sample (r<17.7), these effects are small; the Petrosian radius measured by frames is extraordinarily constant in physical size as a function of redshift.
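
The statement that nearly all of the flux of an exponential profile is recovered can be checked with a short calculation. The sketch below is an idealized illustration only: it assumes a noise-free, circular, untruncated exponential profile with no seeing, and it assumes the annulus used for the local surface brightness runs from 0.8r to 1.25r. The function names are hypothetical. It solves R_P(r_P) = 0.2 by bisection and then evaluates the fraction of the total flux inside N_P r_P = 2 r_P.

```python
import math

R_P_LIM = 0.2  # limiting Petrosian ratio from the text
N_P = 2.0      # aperture size in Petrosian radii from the text


def exp_flux_within(r, h=1.0):
    """Flux of I(r) = exp(-r/h) inside radius r, in units of the total flux."""
    x = r / h
    return 1.0 - (1.0 + x) * math.exp(-x)


def petrosian_ratio(r, flux_within, inner=0.8, outer=1.25):
    """Annulus surface brightness at r divided by mean surface brightness within r."""
    annulus_flux = flux_within(outer * r) - flux_within(inner * r)
    annulus_area = math.pi * (outer ** 2 - inner ** 2) * r ** 2
    mean_inside = flux_within(r) / (math.pi * r ** 2)
    return (annulus_flux / annulus_area) / mean_inside


def petrosian_radius(flux_within, lo=0.1, hi=20.0, tol=1e-10):
    """Bisection for R_P(r) = R_P_LIM; assumes the ratio falls steadily with r."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if petrosian_ratio(mid, flux_within) > R_P_LIM:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


r_p = petrosian_radius(exp_flux_within)     # in units of the scale length h
fraction = exp_flux_within(N_P * r_p)       # fraction of total flux inside 2 r_P
```

For real objects the profile is measured in discrete radial bins and is affected by noise and seeing, which is what gives rise to the MANYPETRO, NOPETRO and PETROFAINT cases described above.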

The PSF magnitude

Stored as psfMag. For isolated stars, which are well-described by the point spread function (PSF), the optimal measure of the total flux is determined by fitting a PSF model to the object. In practice, we do this by sinc-shifting the image of a star so that it is exactly centered on a pixel, and then fitting a Gaussian model of the PSF to it. This fit is carried out on the local PSF KL model at each position as well; the difference between the two is then a local aperture correction, which gives a corrected PSF magnitude. Finally, we use bright stars to determine a further aperture correction to a radius of 7.4" as a function of seeing, and apply this to each frame based on its seeing. This involved procedure is necessary to take into account the full variation of the PSF across the field, including the low signal-to-noise ratio wings. Empirically, this reduces the seeing-dependence of the photometry to below 0.02 mag for seeing as poor as 2". The resulting magnitude is stored in the quantity psfMag. The flag PSF_FLUX_INTERP warns that the PSF photometry might be suspect. The flag BAD_COUNTS_ERROR warns that because of interpolated pixels, the error may be under-estimated.

Match and MatchHead Tables

Computing the Match table

Jim Gray, Alex Szalay, Robert Lupton, Jeff Munn, Ani Thakar
Aug 20, 2003

The SDSS data can be used for temporal studies of objects that are re-observed at different times. The SDSS survey observes about 10% of the Northern survey area 2 or more times, and observes the Southern stripe more than a dozen times.

The match table is intended to make temporal queries easy by providing a precomputed list of all objects that were observed multiple times. More formally,

Match = { (ObjID1, ObjID2) | ObjID1 and ObjID2 are from different runs (== observations)
    and they are within 1 arcsecond of one another
    and are both good (star or galaxy or unknown)
    and are both fully deblended (no children)
    and they are primary or secondary (not family or outside) }

The following counts from the DR1 dataset give the number of objects in each mode, in total and with no children (nChild=0):
Mode       Total        nChild=0
primary    52,525,576   52,525,576
secondary  14,596,931   14,596,931
family     17,074,000    6,153,714
outside       126,819      126,819

And here are the flag counts for DR1:
DR1 Count   Flag         Description
72,926,906  SET          Object's status has been set in reference to its own run
72,926,906  GOOD         Object is good as determined by its object flags. Absence implies bad.
10,186,591  DUPLICATE    Object has one or more duplicate detections in an adjacent field of the same Frames Pipeline Run.
67,029,849  OK_RUN       Object is usable, it is located within the primary range of rows for this field.
66,894,914  RESOLVED     Object has been resolved against other runs.
66,839,376  PSEGMENT     Object belongs to a PRIMARY segment. This does not imply that this is a primary object.
   387,964  FIRST_FIELD  Object belongs to the first field in its segment. Used to distinguish objects in fields shared by two segments.
62,728,244  OK_SCANLINE  Object lies within valid nu range for its scanline.
53,603,453  OK_STRIPE    Object lies within valid eta range for its stripe.

Computing the Match table

The Match table is computed from the Neighbors table and has a very similar schema (the Neighbors table only stores mode (1,2) (aka primary/secondary) and type (3,5,6) (aka galaxy, unknown, star) objects):
Create table Match (objID 			bigint not null,
			matchObjID 		bigint not null, 
			distance 		float not null,
			type			tinyint not null,
			matchType 		tinyint not null,
			Mode			tinyint not null,
			matchMode 		tinyint not null,
			primary key (objID, matchObjID)
) ON [Neighbors]
-- now populate the table
insert Match
select N.*
  from  (Neighbors N join PhotoObj P1 on N.objID = P1.objID) 
                    join PhotoObj P2 on N.NeighborObjID = P2.objID
  where ((N.objID ^ N.neighborObjID) & 0x0000FFFF00000000) != 0 -- dif  runs
  and distance < 1.0/60.0 	             -- within 1 arcsecond of one another

One arcsecond is a large error in Sloan Positioning - the vast majority (95%) are within 0.5 arcsecond. But a particular cluster may not form a complete graph (all members connected to all others). To make the graph fully transitive, we repeatedly execute the query to add the "curved" arcs in the figure below.

 
-- compute triples
create table ##Trip(objid bigint, matchObjID bigint,  distance float,  
	type tinyint, neighborType tinyint, 
        mode tinyInt, matchMode tinyInt,
	primary key (objID, matchObjID))
again: truncate table ##trip
-- compute triples
insert ##trip
select distinct a.objID, b.matchObjID, 0,
       a.type, b.matchType, a.mode, b.matchMode 
from Match a join Match b on a.matchObjID = b.objID 
where a.objID != b.matchObjID
and (a.objid     & 0x0000FFFF00000000)!=
    (b.matchObjID& 0x0000FFFF00000000) -- Different runs
-- now delete the pairs we already have in Match
delete ##trip  
where 0 != (
	select count(*) 
	from Match p 
	where p.objID = ##trip.objID and p.matchObjID = ##trip.matchObjID
	)
-- compute the distance between the remaining triples
select 'adding ' + cast(count(*) as varchar(20)) + ' triples.' from ##trip
-- correlated subquery: each pair gets its own separation from Neighbors
update ##trip 
set distance = 
	(select min(N.distance) 
	from Neighbors N
	where N.objID = ##trip.objID and N.NeighborObjID = ##trip.matchObjID)
-- now add these into Match and repeat till no more rows.
insert Match   select * from ##trip
if @@rowcount > 0 goto again
drop table ##trip
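
The iteration above can be mirrored outside the database. The following is a minimal sketch in Python rather than the authors' T-SQL, assuming the Match pairs are held in memory as a set of (objID, matchObjID) tuples; the function name and the example IDs are hypothetical. It repeatedly adds the pair (a, c) whenever (a, b) and (b, c) are present, a and c differ, and their run bits (mask 0x0000FFFF00000000) differ, and it stops when a pass adds nothing.

```python
RUN_MASK = 0x0000FFFF00000000  # run bits of the 64-bit objID


def close_matches(pairs):
    """Return the pair set extended until no new triples can be added."""
    match = set(pairs)
    while True:
        # Index the current pairs by their first member for the self-join.
        partners = {}
        for first, second in match:
            partners.setdefault(first, set()).add(second)
        new_pairs = set()
        for first, middle in match:
            for last in partners.get(middle, ()):
                if first == last:
                    continue  # skip the trivial (x, x) pair
                if (first & RUN_MASK) == (last & RUN_MASK):
                    continue  # both ends must come from different runs
                if (first, last) not in match:
                    new_pairs.add((first, last))
        if not new_pairs:
            return match
        match |= new_pairs


# Three hypothetical detections of one object in runs 1, 2 and 3.
a, b, c = (1 << 32) | 1, (2 << 32) | 1, (3 << 32) | 1
closed = close_matches({(a, b), (b, a), (b, c), (c, b)})
```

In this example the first pass adds (a, c) and (c, a), and the second pass adds nothing, which corresponds to the `@@rowcount` test in the SQL loop. The distance lookup in Neighbors is not reproduced here.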

Computing the MatchHead table

Now each cluster of objects in the Match table is fully connected. We can name the clusters in the Match table by the minimum objID in the cluster. We can compute the MatchHead table that describes the global properties of the cluster: its name, its average RA and DEC and the variance in RA, DEC.
-- build a table of cluster IDs (minimum object ID of each cluster).
Create table MatchHead (
objID 		bigint not null primary key,
		averageRa	float not null default 0,
		averageDec	float not null default 0,
		varRa		float not null default 0, 	-- variance in RA	
		varDec		float not null default 0,	-- variance in DEC
		matchCount	tinyInt not null default 0,	-- number in cluster
		missCount	tinyInt not null default 0	-- runs missing from cluster
		) ON [Neighbors]	
-- compute the minimum object IDs.
Create table ##MinID (objID bigint primary key)
Insert ##MinID
select distinct objID 
from Match MinId
where 0 = (	select count(*)
	 	from Match m
		where MinId.objID = m.objID  
    		and MinId.objID > m.matchObjID)
-- compute all pairs of objIDs in a cluster (including x,x for the headID)
create table ##pairs (objID 			bigint not null,
			matchObjID 		bigint not null,
			primary key(objID, matchObjID)) 
insert ##pairs 
select h.objID, m.matchObjID 
from ##MinID h join Match m on h.objID = m.objID
insert ##pairs select objID, objID from ##MinID
-- now populate the MatchHead table with minObjID and statistics 
Insert MatchHead  
Select MinID.objID, avg(ra), avg(dec), 
 	coalesce(stdev(ra),0), coalesce(stdev(dec),0), 
	count(m.objid & 0x0000FFFF00000000), -- count runs
	0	-- count misses later
from  	##MinID as MinID,
		##pairs	as m,
		PhotoObj as o
where  MinID.objID = m.objID 
	   and   m.matchObjID = o.objID
group by MinID.objID
order by MinID.objID
-- cleanup	 
Drop table ##MinID
Drop table ##pairs
The number missing from the cluster is computed in the next section.
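
For readers who prefer to see the cluster naming step outside SQL, here is a hedged Python sketch. It assumes the Match pairs are already fully connected and held as a set of tuples, and that positions are available in a dictionary `coords` mapping objID to (ra, dec); both names, the function name, and the sample values are hypothetical. Like the SQL above, it uses the sample standard deviation for varRa and varDec, and it averages RA naively (no handling of the 0/360 degree wrap).

```python
import statistics


def match_heads(match, coords):
    """Name each cluster by its minimum objID and compute summary statistics."""
    partners = {}
    for obj_id, match_id in match:
        partners.setdefault(obj_id, set()).add(match_id)
    heads = {}
    for obj_id, others in partners.items():
        if min(others) < obj_id:
            continue  # some partner has a smaller ID, so this is not the head
        members = sorted({obj_id} | others)  # include (x, x) for the head itself
        ras = [coords[m][0] for m in members]
        decs = [coords[m][1] for m in members]
        heads[obj_id] = {
            "averageRa": statistics.mean(ras),
            "averageDec": statistics.mean(decs),
            "varRa": statistics.stdev(ras),    # mirrors stdev(ra) in the SQL
            "varDec": statistics.stdev(decs),  # mirrors stdev(dec) in the SQL
            "matchCount": len(members),
        }
    return heads


# One hypothetical cluster of three detections with IDs 1, 2 and 3.
pairs = {(1, 2), (2, 1), (1, 3), (3, 1), (2, 3), (3, 2)}
positions = {1: (10.0, 0.0), 2: (10.0002, 0.0), 3: (9.9998, 0.0)}
heads = match_heads(pairs, positions)
```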

Computing the MatchMiss table

It is also of interest to have a list of objects that are in areas that were observed multiple times but that were only observed once. To do this we need:
  1. a description of each multiple-observation region;
  2. a count of how many times it was observed;
  3. an efficient way to test if a point is in a region.
Alex will provide 1 and 2, Jim will provide 3 (right?).

We will create a table of "dropouts", places where a match cluster should have an object but does not.

Create  table MatchMiss (objID  	bigint not null,  	--- the unique ID of the cluster	
Run 	int not null,	-- the run that is missing a member of this cluster.
Primary key (objID, Run)
)
Logic:
	From Match find all pairs of runs that overlap
	Form the domain that is the union of the intersection of these pairs.
	Now build T, a list of all primary/secondary objects of type (3,5,6) that are in this domain.
	Subtract from T all objects that appear in Match 
	Add these objects and the missing run number(s) to MatchMiss
	For each object in MatchHead, count the number of overlaps it is a member of. (MatchHead, runs)
	If this is equals the number of runs the match list then  

Performance

Building Match and MatchHead takes about an hour on SdssDr1 with the Best database of 85M objects. The cardinalities of each step are:
Match 12,294,016
add from triples 19,040
add from triples 322
add from triples 16
add from triples 2
add from triples 0
MinID 5,545,446
Mirror Pairs 5,849,459
Pairs from Match 5,545,446
MatchHead 5,545,446

SDSS ObjID Encoding

The bit encoding for the long (64-bit) IDs that are used as unique keys in the SDSS catalog tables is described here.

PhotoObjID

The encoding of the photometric object long ID (objID in the photo tables) is described in the table below. This scheme applies to the fieldID and objID (objid bits are 0 for fieldID).

Bits   Length (# of bits)  Mask                Assignment  Description
0      1                   0x8000000000000000  empty       unassigned
1-4    4                   0x7800000000000000  skyVersion  resolved sky version (0=TARGET, 1=BEST, 2-15=RUNS)
5-15   11                  0x07FF000000000000  rerun       number of pipeline rerun
16-31  16                  0x0000FFFF00000000  run         run number
32-34  3                   0x00000000E0000000  camcol      camera column (1-6)
35     1                   0x0000000010000000  firstField  is this the first field in segment?
36-47  12                  0x000000000FFF0000  field       field number within run
48-63  16                  0x000000000000FFFF  object      object number within field

SpecObjID

The encoding of the long ID for spectroscopic objects is described below. This applies to plateID, specObjID, specLineID, specLineIndexID, elRedshiftID and xcRedshiftID.

Bits   Length (# of bits)  Mask                Assignment           Description
0-15   16                  0xFFFF000000000000  plate                number of spectroscopic plate
16-31  16                  0x0000FFFF00000000  MJD                  MJD (date) plate was observed
32-41  10                  0x00000000FFC00000  fiberID              number of spectroscopic fiber on plate (1-640)
42-47  6                   0x00000000003F0000  type                 type of targeted object
48-63  16                  0x000000000000FFFF  line/redshift/index  0 for SpecObj, else number of spectroscopic line (SpecLine) or index (SpecLineIndex) or redshift (ELRedshift or XCRedshift)
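
Both encodings can be unpacked directly from the masks in the two tables above. The sketch below is illustrative: the helper names are hypothetical, the dictionary keys follow the Assignment column (with "line" standing for the line/redshift/index field), and the sample field values are made up for the example. The shift for each field is derived from the position of the lowest set bit of its mask.

```python
PHOTO_MASKS = {
    "skyVersion": 0x7800000000000000,
    "rerun":      0x07FF000000000000,
    "run":        0x0000FFFF00000000,
    "camcol":     0x00000000E0000000,
    "firstField": 0x0000000010000000,
    "field":      0x000000000FFF0000,
    "object":     0x000000000000FFFF,
}

SPEC_MASKS = {
    "plate":   0xFFFF000000000000,
    "mjd":     0x0000FFFF00000000,
    "fiberID": 0x00000000FFC00000,
    "type":    0x00000000003F0000,
    "line":    0x000000000000FFFF,
}


def _shift(mask):
    """Number of zero bits below the lowest set bit of the mask."""
    return (mask & -mask).bit_length() - 1


def decode_id(long_id, masks):
    """Split a 64-bit ID into its named fields."""
    return {name: (long_id & mask) >> _shift(mask) for name, mask in masks.items()}


def encode_id(fields, masks):
    """Pack named fields into a 64-bit ID, rejecting values that overflow."""
    long_id = 0
    for name, value in fields.items():
        mask = masks[name]
        shifted = value << _shift(mask)
        if shifted & ~mask:
            raise ValueError(f"{name}={value} does not fit in its bit field")
        long_id |= shifted
    return long_id


# Made-up sample values, used only to show the round trip.
photo_fields = {"skyVersion": 1, "rerun": 40, "run": 752, "camcol": 3,
                "firstField": 0, "field": 100, "object": 42}
photo_id = encode_id(photo_fields, PHOTO_MASKS)

spec_fields = {"plate": 266, "mjd": 51630, "fiberID": 1, "type": 0, "line": 0}
spec_id = encode_id(spec_fields, SPEC_MASKS)
```

Setting the object bits to zero in a PhotoObjID gives the corresponding fieldID, as noted above.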

Photometric Flux Calibration

The objective of the photometric calibration process is to tie the SDSS imaging data to an AB magnitude system, and specifically to the "natural system" of the 2.5m telescope defined by the photon-weighted effective wavelengths of each combination of SDSS filter, CCD response, telescope transmission, and atmospheric transmission at a reference airmass of 1.3 as measured at APO.

The calibration process ultimately involves combining data from three telescopes: the USNO 40-in on which our primary standards were first measured, the SDSS Photometric Telescope (or PT), and the SDSS 2.5m telescope. At the beginning of the survey it was expected that there would be a single u'g'r'i'z' system. However, in the course of processing the SDSS data, the unpleasant discovery was made that the filters in the 2.5m telescope have significantly different effective wavelengths from the filters in the PT and at the USNO. These differences have been traced to the fact that the short-pass interference films on the 2.5-meter camera live in the same vacuum as the detectors, and the resulting dehydration of the films decreases their effective refractive index. This results in blueward shifts of the red edges of the filters by about 2.5 percent of the cutoff wavelength, and consequent shifts of the effective wavelengths of order half that. The USNO filters are in ambient air, and the hydration of the films exhibits small temperature shifts; the PT filters are kept in stable very dry air and are in a condition about halfway between ambient and the very stable vacuum state. The rather subtle differences between these systems are describable by simple linear transformations with small color terms for stars of not-too-extreme color, but of course cannot be so transformed for very cool objects or objects with complex spectra. Since standardization is done with stars, this is not a fundamental problem, once the transformations are well understood.

It is these subtle issues that gave rise to our somewhat awkward nomenclature for the different magnitude systems:

  • magnitudes in the USNO 40-in system are primed (u'g'r'i'z')
  • magnitudes in the SDSS 2.5m system are unprimed (ugriz)
  • magnitudes in the PT system only exist internally within the Monitor Telescope Pipeline (mtpipe) and have no official designation.

Previous reductions of the data, including that used in the EDR, were based on inconsistent photometric equations; this is why we referred to the 2.5m photometry with asterisks: u*g*r*i*z*. With the DR1, the photometric equations are properly self-consistent, and we can now remove the stars, and refer to u g r i z photometry with the 2.5m.

Overview of the Photometric Calibration in SDSS

The photometric calibration of the SDSS imaging data is a multi-step process, due to the fact that the images from the 2.5m telescope saturate at approximately r = 14, fainter than typical spectrophotometric standards, combined with the fact that observing efficiency would be greatly impacted if the 2.5m needed to interrupt its routine scanning in order to observe separate calibration fields.

The first step involved setting up a primary standard star network of 158 stars distributed around the Northern sky. These stars were selected from a variety of sources and span a range in color, airmass, and right ascension. They were observed repeatedly over a period of two years using the US Naval Observatory 40-in telescope located in Flagstaff, Arizona. These observations are tied to an absolute flux system by the single F0 subdwarf star BD+17 4708, whose absolute fluxes in SDSS filters are taken from Fukugita et al. 1996. As noted above, the photometric system defined by these stars is called the u'g'r'i'z' system. You can look at the table containing the calibrated magnitudes for these standard stars.

Most of these primary standards have brightnesses in the range r = 8 - 13, and would saturate the 2.5-meter telescope's imaging camera in normal operations. Therefore, a set of 1520 41.5 x 41.5 arcmin^2 transfer fields, called secondary patches, have been positioned throughout the survey area. These secondary patches are observed with the PT; their size is set by the field of view of the PT camera. These secondary patches are grouped into sets of four. Each set spans the full set of 12 scan lines of a survey stripe along the width of the stripe, and the sets are spaced along the length of a stripe at roughly 15 degree intervals. The patches are observed by the PT in parallel with observations of the primary standards and processed using the Monitor Telescope Pipeline (mtpipe). The patches are first calibrated to the USNO 40-in u'g'r'i'z' system and then transformed to the 2.5m ugriz system; both initial calibration to the u'g'r'i'z' system and the transformation to the ugriz system occur within mtpipe. The ugriz-calibrated patches are then used to calibrate the 2.5-meter's imaging data via the Final Calibrations Pipeline (nfcalib).

Monitor Telescope Pipeline

The PT has two main functions: it measures the atmospheric extinction on each clear night based on observations of primary standards at a variety of airmasses, and it calibrates secondary patches in order to determine the photometric zeropoint of the 2.5m imaging scans. The extinction must be measured on each night the 2.5m is scanning, but the corresponding secondary patches can be observed on any photometric night, and need not be coincident with the image scans that they will calibrate.

The Monitor Telescope Pipeline (mtpipe), so called for historical reasons, processes the PT data. It performs three basic functions:

  1. it bias subtracts and flatfields the images, and performs aperture photometry;
  2. it identifies primary standards in the primary standard star fields and computes a transformation from the aperture photometry to the primary standard star u'g'r'i'z' system;
  3. it applies the photometric solution to the stars in the secondary patch fields, yielding u'g'r'i'z'-calibrated patch star magnitudes, and then transforms these u'g'r'i'z' magnitudes into the SDSS 2.5m ugriz system.

The Final Calibration Pipeline

The final calibration pipeline (nfcalib) works much like mtpipe, computing the transformation between psf photometry (or other photometry) as observed by the 2.5m telescope and the final SDSS photometric system. The pipeline matches stars between a camera column of 2.5m data and an overlapping secondary patch. Each camera column of 2.5m data is calibrated individually. There are of order 100 stars in each patch in the appropriate color and magnitude range in the overlap.

The transformation equations are a simplified form of those used by mtpipe. Since mtpipe delivers patch stars already calibrated to the 2.5m ugriz system, the nfcalib transformation equations have the following form:
m_filter_inst(2.5m) = m_filter(patch) + a_filter + k_filter X,
where, for a given filter, m_filter_inst(2.5m) is the instrumental magnitude of the star in the 2.5m data [-2.5 log10(counts/exptime)], m_filter(patch) is the magnitude of the same star in the PT secondary patch, a_filter is the photometric zeropoint, k_filter is the first-order extinction coefficient, and X is the airmass of the 2.5m observation. The extinction coefficient is taken from PT observations on the same night, linearly interpolated in time when multiple extinction determinations are available. (Generally, however, mtpipe calculates only a single k_filter per filter per night, so linear interpolation is usually unnecessary.) A single zeropoint a_filter is computed for each filter from stars on all patches that overlap a given CCD in a given run. Observations are weighted by their estimated errors, and sigma-clipping is used to reject outliers. At one time it was thought that a time dependent zero point might be needed to account for the fact that the 2.5m camera and corrector lenses rotate relative to the telescope mirrors and optical structure; however, it now appears that any variations in throughput are small compared to inherent fluctuations in the calibration of the patches themselves. The statistical error in the zeropoint is usually constrained to be less than 1.35 percent in u and z and 0.9 percent in gri.
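
To make the zeropoint step concrete, the sketch below solves the transformation equation for the zeropoint of one filter from a list of matched stars. It is a simplified illustration, not the nfcalib implementation: the text does not specify the clipping scheme, so a single 3-sigma rejection about the median followed by an inverse-variance weighted mean is assumed here. The function names and the sample numbers are hypothetical.

```python
import statistics


def fit_zeropoint(stars, k, airmass, clip=3.0):
    """Estimate the zeropoint a from (m_inst, m_patch, err) tuples.

    Each star gives one estimate a_i = m_inst - m_patch - k * X.
    """
    offsets = [(m_inst - m_patch - k * airmass, err)
               for m_inst, m_patch, err in stars]
    centre = statistics.median(off for off, _ in offsets)
    # Assumed clipping rule: drop stars further than clip * err from the median.
    kept = [(off, err) for off, err in offsets if abs(off - centre) <= clip * err]
    if not kept:
        raise ValueError("all stars were rejected by the clipping step")
    weights = [1.0 / err ** 2 for _, err in kept]
    return sum(w * off for w, (off, _) in zip(weights, kept)) / sum(weights)


def calibrate(m_inst, a, k, airmass):
    """Invert the transformation: calibrated magnitude from instrumental."""
    return m_inst - a - k * airmass


# Hypothetical data: five well-behaved stars and one outlier 1 mag off.
true_a, k_ext, x_air = -24.0, 0.1, 1.2
patch_mags = [15.0, 15.5, 16.0, 16.5, 17.0]
stars = [(m + true_a + k_ext * x_air, m, 0.02) for m in patch_mags]
stars.append((16.0 + true_a + k_ext * x_air + 1.0, 16.0, 0.02))

a_fit = fit_zeropoint(stars, k_ext, x_air)
```

Centring the rejection on the median rather than the mean keeps a single large outlier from dragging the reference value before the clip is applied.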

Assessment of Photometric Calibration

With Data Release 1 (DR1), we now routinely meet our requirements of photometric uniformity of 2% in r, g-r, and r-i and of 3% in u-g and i-z (rms).

This is a substantial improvement over the photometric uniformity achieved in the Early Data Release (EDR), where the corresponding values were approximately 5% in r, g-r, and r-i and 5% in u-g and i-z.

The improvements between the photometric calibration of the EDR and the DR1 can be traced primarily to the use of more robust and consistent photometric equations by mtpipe and nfcalib and to improvements to the PSF-fitting algorithm and flatfield methodology in the Photometric Pipeline (photo).

Note that this photometric uniformity is measured based upon relatively bright stars which are no redder than M0; hence, these measures do not include effects of the u band red leak (see caveats below) or the model magnitude bug.

How to go from Counts in the fpC file to Calibrated ugriz magnitudes?

Asinh and Pogson magnitudes

All calibrated magnitudes in the photometric catalogs are given not as conventional Pogson astronomical magnitudes, but as asinh magnitudes. We show how to obtain both kinds of magnitudes from observed count rates and vice versa. See further down for conversion of SDSS magnitudes to physical fluxes. For both kinds of magnitudes, there are two ways to obtain the zeropoint information for the conversion.

  1. A little slower, but gives the final calibration and works for all data releases

    Here you first need the following information from the tsField files:

    aa = zeropoint
    kk = extinction coefficient
    airmass

    To get a calibrated magnitude, you first need to determine the extinction-corrected ratio of the observed count rate to the zero-point count rate:

    • Convert the observed number of counts to a count rate using the exposure time exptime = 53.907456 sec,
    • correct counts for atmospheric extinction using the extinction coefficient kk and the airmass, and
    • divide by the zero-point count rate, which is given by f0 = 10^(-0.4*aa) both for asinh and conventional magnitudes.
    In a single step,
    f/f0 = counts/exptime * 10^(0.4*(aa + kk * airmass))

    Then, calculate either the conventional ("Pogson") or the SDSS asinh magnitude from f/f0:

    Pogson
    mag = -2.5 * log10(f/f0)
    asinh
    mag = -(2.5/ln(10)) * [asinh((f/f0)/(2b)) + ln(b)], where b is the softening parameter for the photometric band in question and is given in the table of b coefficients below.

    asinh Softening Parameters (b coefficients)
    Band  b             Zero-Flux Magnitude [m(f/f0 = 0)]  m(f/f0 = 10b)
    u     1.4 × 10^-10  24.63                              22.12
    g     0.9 × 10^-10  25.11                              22.60
    r     1.2 × 10^-10  24.80                              22.29
    i     1.8 × 10^-10  24.36                              21.85
    z     7.4 × 10^-10  22.83                              20.32

    Note: These values of the softening parameter b are set to be approximately 1-sigma of the sky noise; thus, only low signal-to-noise ratio measurements are affected by the difference between asinh and Pogson magnitudes. The final column gives the asinh magnitude associated with an object for which f/f0 = 10b; the difference between Pogson and asinh magnitudes is less than 1% for objects brighter than this.

    The calibrated asinh magnitudes are given in the tsObj files. To obtain counts from an asinh magnitude, you first need to work out f/f0 by inverting the asinh relation above. You can then determine the number of counts from f/f0 using the zero-point, extinction coefficient, airmass, and exposure time.

    The equations above are exact for DR1. Strictly speaking, for EDR photometry, the corrected counts should include a color term cc*(color-color0)*(X-X0) (cf. equation 15 in section 4.5 in the EDR paper), but it turns out that generally, cc*(color-color0)*(X-X0) < 0.01 mag and the color term can be neglected. Hence the calibration looks identical for EDR and DR1.

  2. Faster magnitudes via "flux20"

    The "flux20" keyword in the header of the corrected frames (fpC files) approximately gives the net number of counts for a 20th mag object. So instead of using the zeropoint and airmass correction term from the tsField file, you can determine the corrected zero-point flux as

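    f/f0 = counts/(10^8 * flux20), since an object of magnitude 20 has f/f0 = 10^(-0.4*20) = 10^-8 and flux20 is the number of counts such an object produces in the frame.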

    Then proceed with the calculation of a magnitude from f/f0 as above.

    The relation is only approximate because the final calibration information (provided by nfcalib) is not available at the time the corrected frames are generated. We expect the error here (compared to the final calibrated magnitude) to be of order 0.1 mag or so, as estimated from a couple of test cases we have tried out.

    Note the counts measured by photo for each object are given in the fpObjc files, as e.g., "psfcounts", "petrocounts", etc.
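The inversion of the asinh relation mentioned in item 1 can be sketched as follows. This is a minimal illustration, assuming the asinh relation has the standard form m = -(2.5/ln 10) * [asinh((f/f0)/(2b)) + ln(b)], with the softening parameters b taken from the table above. The function names are illustrative and are not part of any SDSS software.

```python
import math

# Softening parameter b for each band, copied from the table above.
SOFTENING_B = {"u": 1.4e-10, "g": 0.9e-10, "r": 1.2e-10, "i": 1.8e-10, "z": 7.4e-10}

# Pogson factor 2.5 / ln(10).
POGSON = 2.5 / math.log(10.0)


def asinh_mag(f_ratio, band):
    """Asinh magnitude for a flux ratio f/f0 (assumed standard form)."""
    b = SOFTENING_B[band]
    return -POGSON * (math.asinh(f_ratio / (2.0 * b)) + math.log(b))


def flux_ratio_from_asinh_mag(mag, band):
    """Invert the asinh relation to recover f/f0 from a magnitude."""
    b = SOFTENING_B[band]
    return 2.0 * b * math.sinh(-mag / POGSON - math.log(b))
```

Under this assumed form, asinh_mag(0.0, band) and asinh_mag(10 * b, band) should match the last two columns of the table to two decimal places, which gives a quick consistency check before using the inverse.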

On a related note, in DR1 one can also use relations similar to the above to estimate the sky level in magnitudes per sq. arcsec (1 pixel = 0.396 arcsec). Either use the header keyword "sky" in the fpC files, or remember to first subtract "softbias" (= 1000) from the raw background counts in the fpC files. Note the sky level is also given in the tsField files. This note only applies to the DR1 and later data releases. Note also that the calibrated sky brightnesses reported in the tsField values have been corrected for atmospheric extinction.

Computing errors on counts (converting counts to photo-electrons)

The fpC (corrected frames) and fpObjc (object tables with counts for each object instead of magnitudes) files report counts (or "data numbers", DN). However, it is the number of photo-electrons which is really counted by the CCD detectors and which therefore obeys Poisson statistics. The number of photo-electrons is related to the number of counts through the gain (which is really an inverse gain):
photo-electrons = counts * gain

The gain is reported in the headers of the tsField and fpAtlas files (and hence also in the field table in the CAS). The total noise contributed by dark current and read noise (in units of DN^2) is also reported in the tsField files in header keyword dark_variance (and correspondingly as darkVariance in the field table in the CAS), and also as dark_var in the fpAtlas header.

Thus, the error in DN is given by the following expression:

error(counts) = sqrt([counts+sky]/gain + Npix*dark_variance),

where counts is the number of object counts, sky is the number of sky counts summed over the same area as the object counts, Npix is the area covered by the object in pixels, and gain and dark_variance are the numbers from the corresponding tsField files.
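The expression above translates directly into code. The following is a small sketch; the function and argument names are illustrative, and the caller is assumed to have read gain and dark_variance from the tsField file for the relevant field and band.

```python
import math


def photo_electrons(counts, gain):
    """Convert counts (DN) to photo-electrons using the (inverse) gain."""
    return counts * gain


def counts_error(counts, sky, gain, n_pix, dark_variance):
    """Error in DN: Poisson term plus dark current and read noise.

    counts        -- object counts (DN)
    sky           -- sky counts summed over the same area (DN)
    gain          -- gain from tsField (photo-electrons per DN)
    n_pix         -- area covered by the object, in pixels
    dark_variance -- dark_variance from tsField (DN^2 per pixel)
    """
    variance = (counts + sky) / gain + n_pix * dark_variance
    return math.sqrt(variance)
```

For example, counts_error(300, 100, 4.0, 0, 1.0) evaluates sqrt(400/4) and gives 10 DN; the input numbers here are made up purely to keep the arithmetic easy to follow.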

Conversion from SDSS ugriz magnitudes to AB ugriz magnitudes

The SDSS photometry is intended to be on the AB system (Oke & Gunn 1983), by which a magnitude 0 object should have the same counts as a source of F_nu = 3631 Jy. However, this is known not to be exactly true, such that the photometric zeropoints are slightly off the AB standard. We continue to work to pin down these shifts. Our present estimate, based on comparison to the STIS standards of Bohlin, Dickinson, & Calzetti (2001) and confirmed by SDSS photometry and spectroscopy of fainter hot white dwarfs, is that the u band zeropoint is in error by 0.04 mag, u_AB = u_SDSS - 0.04 mag, and that g, r, and i are close to AB. These statements are certainly not precise to better than 0.01 mag; in addition, they depend critically on the system response of the SDSS 2.5-meter, which was measured by Doi et al. (2004, in preparation). The z band zeropoint is not as certain at this time, but there is mild evidence that it may be shifted by about 0.02 mag in the sense z_AB = z_SDSS + 0.02 mag. The large shift in the u band was expected because the adopted magnitude of the SDSS standard BD+17 in Fukugita et al. (1996) was computed at zero airmass, thereby making the assumed u response bluer than that of the USNO system response.

We intend to give a fuller report on the SDSS zeropoints, with uncertainties, in the near future. Note that our relative photometry is quite a bit better than these numbers would imply; repeat observations show that our calibrations are better than 2%.

Conversion from SDSS ugriz magnitudes to physical fluxes

As explained in the preceding section, the SDSS system is nearly an AB system. Assuming you know the correction from SDSS zeropoints to AB zeropoints (see above), you can turn the AB magnitudes into a flux density using the AB zeropoint flux density. The AB system is defined such that every filter has a zero-point flux density of 3631 Jy (1 Jy = 1 Jansky = 10^-26 W Hz^-1 m^-2 = 10^-23 erg s^-1 Hz^-1 cm^-2).

To obtain a flux density from SDSS data, you need to work out f/f0 (e.g. from the asinh magnitudes in the tsObj files by using the inverse of the relations given above). This number is then also the object's flux density, expressed as a fraction of the AB zeropoint flux density. Therefore, the conversion to flux density is
S = 3631 Jy * f/f0.

Then you need to apply the correction for the zeropoint offset between the SDSS system and the AB system. We do not know this correction yet, so the fluxes you obtain by assuming that SDSS = AB may be affected by a systematic shift of probably at most 10%.
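As a sketch of this conversion, the function below scales f/f0 by the AB zeropoint and optionally applies a zeropoint offset. The argument ab_offset_mag (defined here as m_AB - m_SDSS) is an illustrative parameter that the user must supply; the default of zero corresponds to assuming SDSS = AB.

```python
AB_ZEROPOINT_JY = 3631.0   # AB zero-point flux density in Jy
JY_TO_CGS = 1.0e-23        # erg s^-1 Hz^-1 cm^-2 per Jy


def flux_density_jy(f_ratio, ab_offset_mag=0.0):
    """Flux density in Jy from f/f0.

    ab_offset_mag is m_AB - m_SDSS (an assumption supplied by the caller);
    a negative offset means the object is brighter on the AB system.
    """
    return AB_ZEROPOINT_JY * f_ratio * 10.0 ** (-0.4 * ab_offset_mag)


def flux_density_cgs(f_ratio, ab_offset_mag=0.0):
    """Same quantity in erg s^-1 Hz^-1 cm^-2."""
    return flux_density_jy(f_ratio, ab_offset_mag) * JY_TO_CGS
```

With no offset, f/f0 = 10^-8 (roughly a 20th magnitude object) corresponds to about 3.6 × 10^-5 Jy.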

Spectroscopic Redshift and Type Determination


The spectro1d pipeline analyzes the combined, merged spectra output by spectro2d and determines object classifications (galaxy, quasar, star, or unknown) and redshifts; it also provides various line measurements and warning flags. The code attempts to measure an emission and absorption redshift independently for every targeted (nonsky) object. That is, to avoid biases, the absorption and emission codes operate independently, and they both operate independently of any target selection information.

The spectro1d pipeline performs a sequence of tasks for each object spectrum on a plate: The spectrum and error array are read in, along with the pixel mask. Pixels with mask bits set to FULLREJECT, NOSKY, NODATA, or BRIGHTSKY are given no weight in the spectro1d routines. The continuum is then fitted with a fifth-order polynomial, with iterative rejection of outliers (e.g., strong lines). The fitted continuum is subtracted from the spectrum. The continuum-subtracted spectra are used for cross-correlating with the stellar templates.
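The continuum step can be illustrated with a short NumPy sketch. This is not the spectro1d code: the clipping threshold, the iteration limit, and the use of the weights purely as a mask (zero weight means excluded) are assumptions made for the example.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def fit_continuum(wave, flux, weight, order=5, n_iter=5, clip_sigma=3.0):
    """Fifth-order polynomial continuum with iterative outlier rejection.

    Returns the continuum model and the boolean mask of pixels kept.
    clip_sigma and n_iter are illustrative choices, not pipeline values.
    """
    # Rescale wavelength to keep the polynomial fit well conditioned.
    x = (wave - wave.mean()) / (wave.max() - wave.min())
    usable = weight > 0          # masked pixels never enter the fit
    good = usable.copy()
    for _ in range(n_iter):
        coeffs = P.polyfit(x[good], flux[good], order)
        model = P.polyval(x, coeffs)
        resid = flux - model
        scatter = np.std(resid[good])
        new_good = usable & (np.abs(resid) <= clip_sigma * scatter)
        if np.array_equal(new_good, good):
            break                # rejection has converged
        good = new_good
    return model, good
```

The continuum-subtracted spectrum is then simply flux - model; strong emission lines end up outside the returned mask because they are rejected as outliers.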

Emission-Line Redshifts

Emission lines (peaks in the one-dimensional spectrum) are found by carrying out a wavelet transform of the continuum-subtracted spectrum f_c(λ):

w(a, b) = (1/sqrt(b)) ∫ f_c(λ) g*((λ - a)/b) dλ,

where g(x; a, b) is the wavelet (with complex conjugate g*) with translation and scale parameters a and b. We apply the à trous wavelet (Starck, Siebenmorgen, & Gredel 1997). For fixed wavelet scale b, the wavelet transform is computed at each pixel center a; the scale b is then increased in geometric steps and the process repeated. Once the full wavelet transform is computed, the code finds peaks above a threshold and eliminates multiple detections (at different b) of a given line by searching nearby pixels. The output of this routine is a set of positions of candidate emission lines.
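The peak search can be sketched with a generic à trous decomposition. The B3-spline smoothing kernel, the robust noise estimate, and the 5-sigma threshold below are common choices assumed for illustration; the source does not state which kernel or threshold spectro1d uses.

```python
import numpy as np

# B3-spline kernel, a common choice for the a trous algorithm (assumed here).
B3_KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0


def a_trous_planes(signal, n_scales=4):
    """Return the wavelet planes and the final smoothed signal.

    At scale j the kernel taps are spaced 2**j pixels apart (the "holes"),
    so the scale grows in geometric steps.
    """
    smooth = np.asarray(signal, dtype=float)
    planes = []
    for j in range(n_scales):
        step = 2 ** j
        kernel = np.zeros(4 * step + 1)
        kernel[::step] = B3_KERNEL
        padded = np.pad(smooth, 2 * step, mode="reflect")
        next_smooth = np.convolve(padded, kernel, mode="valid")
        planes.append(smooth - next_smooth)   # detail at this scale
        smooth = next_smooth
    return planes, smooth


def find_peaks(plane, n_sigma=5.0):
    """Indices of local maxima above n_sigma times a robust noise estimate."""
    noise = 1.4826 * np.median(np.abs(plane - np.median(plane)))
    inner = plane[1:-1]
    is_max = (inner > plane[:-2]) & (inner >= plane[2:])
    return np.flatnonzero(is_max & (inner > n_sigma * noise)) + 1
```

Summing all planes and the final smoothed signal reproduces the input exactly, which is a convenient sanity check. Merging detections of the same line at different scales, as the pipeline does, is left out of this sketch.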

This list of lines with nonzero weights is matched against a list of common galaxy and quasar emission lines, many of which were measured from the composite quasar spectrum of Vanden Berk et al. (2001; because of velocity shifts of different lines in quasars, the wavelengths listed do not necessarily match their rest-frame values). Each significant peak found by the wavelet routine is assigned a trial line identification from the common list (e.g., MgII) and an associated trial redshift. The peak is fitted with a Gaussian, and the line center, width, and height above the continuum are stored in HDU 2 of the spSpec*.fits files as parameters wave, sigma, and height, respectively. If the code detects close neighboring lines, it fits them with multiple Gaussians. Depending on the trial line identification, the line width it tries to fit is physically constrained. The code then searches for the other expected common emission lines at the appropriate wavelengths for that trial redshift and computes a confidence level (CL) by summing over the weights of the found lines and dividing by the summed weights of the expected lines. The CL is penalized if the different line centers do not quite match. Once all of the trial line identifications and redshifts have been explored, an emission-line redshift is chosen as the one with the highest CL and stored as z in the EmissionRedshift table and the spSpec*.fits emission line HDU. The exact expression for the emission-line CL has been tweaked to match our empirical success rate in assigning correct emission-line redshifts, based on manual inspection of a large number of spectra from the EDR.
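The trial-redshift scoring described above can be sketched as follows. The line list, its weights, and the matching tolerance are hypothetical values chosen for illustration (the wavelengths are only approximate), every listed line is treated as expected regardless of spectral coverage, and the penalty for mismatched line centers is omitted.

```python
# Hypothetical line list: (name, approximate rest wavelength in Angstroms, weight).
LINES = [
    ("[OII]", 3728.0, 1.0),
    ("Hbeta", 4863.0, 1.0),
    ("[OIII]", 5008.0, 2.0),
    ("Halpha", 6565.0, 3.0),
]


def trial_confidence(peaks, z, tol=5.0):
    """Summed weight of found lines divided by summed weight of expected lines."""
    found = 0.0
    expected = 0.0
    for _, rest, weight in LINES:
        observed = rest * (1.0 + z)
        expected += weight
        if any(abs(p - observed) <= tol for p in peaks):
            found += weight
    return found / expected


def best_emission_redshift(peaks):
    """Try every (peak, line) identification and keep the highest CL."""
    best_cl, best_z = 0.0, None
    for peak in peaks:
        for _, rest, _ in LINES:
            z = peak / rest - 1.0
            if z < -0.01:        # skip strongly blueshifted trials (assumed cut)
                continue
            cl = trial_confidence(peaks, z)
            if cl > best_cl:
                best_cl, best_z = cl, z
    return best_cl, best_z
```

With peaks at the redshifted positions of three of the four lines, the correct trial redshift collects the weight of those three lines, while misidentifications typically match only one line and score much lower.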

The SpecLine table also gives the errors, continuum, equivalent width, chi-squared, spectral index, and significance of each line. We caution that the emission-line measurement for Hα should only be used if chi-squared is less than 2.5. In the SpecLine table, the "found" lines in HDU1 denote only those lines used to measure the emission-line redshift, while "measured" lines in HDU2 are all lines in the emission-line list measured at the redshifted positions appropriate to the final redshift assigned to the object.

A separate routine searches for high-redshift (z > 2.3) quasars by identifying spectra that contain a Lyα forest signature: a broad emission line with more fluctuation on the blue side than on the red side of the line. The routine outputs the wavelength of the Lyα emission line; while this allows a determination of the redshift, it is not a high-precision estimate, because the Lyα line is intrinsically broad and affected by Lyα absorption. The spectro1d pipeline stores this as an additional emission-line redshift. This redshift information is stored in the EmissionRedshift table.

If the highest CL emission-line redshift uses lines only expected for quasars (e.g., Lyα, CIV, CIII]), then the object is provisionally classified as a quasar. These provisional classifications will hold up if the final redshift assigned to the object (see below) agrees with its emission redshift.

Cross-Correlation Redshifts

The spectra are cross-correlated with stellar, emission-line galaxy, and quasar template spectra to determine a cross-correlation redshift and error. The cross-correlation templates are obtained from SDSS commissioning spectra of high signal-to-noise ratio and comprise roughly one for each stellar spectral type from B to almost L, a nonmagnetic and a magnetic white dwarf, an emission-line galaxy, a composite LRG spectrum, and a composite quasar spectrum (from Vanden Berk et al. 2001). The composites are based on co-additions of ∼ 2000 spectra each. The template redshifts are determined by cross-correlation with a large number of stellar spectra from SDSS observations of the M67 star cluster, whose radial velocity is precisely known.

When an object spectrum is cross-correlated with the stellar templates, its found emission lines are masked out, i.e., the redshift is derived from the absorption features. The cross-correlation routine follows the technique of Tonry & Davis (1979): the continuum-subtracted spectrum is Fourier-transformed and convolved with the transform of each template. For each template, the three highest cross-correlation function (CCF) peaks are found, fitted with parabolas, and output with their associated confidence limits. The corresponding redshift errors are given by the widths of the CCF peaks. The cross-correlation CLs are empirically calibrated as a function of peak level based on manual inspection of a large number of spectra from the EDR. The final cross-correlation redshift is then chosen as the one with the highest CL from among all of the templates.
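The core of a Fourier cross-correlation with a parabolic peak fit can be sketched in a few lines of NumPy. This is a simplified illustration, not the spectro1d implementation: it returns only the single highest peak rather than the three highest, omits the filtering and normalization of Tonry & Davis (1979), and assumes both spectra are continuum-subtracted and sampled on the same logarithmic wavelength grid, with a pixel size of 10^-4 in log10(λ) taken as an assumption.

```python
import numpy as np


def ccf_peak_lag(spectrum, template):
    """Lag (in pixels, with sub-pixel refinement) of the highest CCF peak."""
    n = len(spectrum)
    # Circular cross-correlation via the FFT: ccf[k] = sum_m s[m + k] * t[m].
    ccf = np.fft.irfft(np.fft.rfft(spectrum) * np.conj(np.fft.rfft(template)), n)
    k = int(np.argmax(ccf))
    y_prev, y_0, y_next = ccf[k - 1], ccf[k], ccf[(k + 1) % n]
    # Vertex of the parabola through the peak and its two neighbours.
    denom = y_prev - 2.0 * y_0 + y_next
    offset = 0.5 * (y_prev - y_next) / denom if denom != 0 else 0.0
    lag = k + offset
    if lag > n / 2:              # map wrapped lags to negative shifts
        lag -= n
    return lag


def lag_to_redshift(lag, dloglam=1.0e-4):
    """Convert a pixel lag on a log10(wavelength) grid to a redshift."""
    return 10.0 ** (lag * dloglam) - 1.0
```

A template shifted by a whole number of pixels, for instance with np.roll(template, 7), should be recovered with a lag of 7 pixels; a positive lag corresponds to a shift toward longer wavelengths.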

If there are discrepant high-CL cross-correlation peaks, i.e., if the highest peak has CL < 0.99 and the next highest peak corresponds to a CL that is greater than 70% of the highest peak, then the code extends the cross-correlation analysis for the corresponding templates to lower wavenumber and includes the continuum in the analysis, i.e., it chooses the redshift based on which template provides a better match to the continuum shape of the object. These flagged spectra are then manually inspected (see below). The cross-correlation redshift is stored as z in the CrossCorrelationRedshift table.

Final Redshifts and Spectrum Classification

The spectro1d pipeline assigns a final redshift to each object spectrum by choosing the emission or cross-correlation redshift with the highest CL and stores this as z in the SpecObj table. A redshift status bit mask zStatus and a redshift warning bit mask zWarning are stored. The CL is stored in zConf. Objects with redshifts determined manually (see below) have CL set to 0.95 (MANUAL_HIC set in zStatus), or 0.4 or 0.65 (MANUAL_LOC set in zStatus). Rarely, objects have the entire red or blue half of the spectrum missing; such objects have their CLs reduced by a factor of 2, so they are automatically flagged as having low confidence, and the mask bit Z_WARNING_NO_BLUE or Z_WARNING_NO_RED is set in zWarning as appropriate.
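The selection logic can be summarized in a short sketch. The function names are illustrative; the tie-breaking rule (cross-correlation wins when the CLs are equal) and the reading of "70% of the highest peak" as 70% of the highest CL are assumptions, since the text does not specify them.

```python
def needs_continuum_check(cl_highest, cl_next):
    """Discrepant-peak test from the cross-correlation section."""
    return cl_highest < 0.99 and cl_next > 0.7 * cl_highest


def final_redshift(z_emission, cl_emission, z_xcorr, cl_xcorr, half_missing=False):
    """Pick the redshift with the higher CL; halve the CL if the red or
    blue half of the spectrum is missing."""
    if cl_emission > cl_xcorr:
        z, cl = z_emission, cl_emission
    else:
        z, cl = z_xcorr, cl_xcorr
    if half_missing:
        cl *= 0.5
    return z, cl
```

For instance, an emission-line redshift with CL 0.9 is preferred over a cross-correlation redshift with CL 0.6, and its CL drops to 0.45 if half of the spectrum is missing.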

All objects are classified in specClass as either a quasar, high-redshift quasar, galaxy, star, late-type star, or unknown. If the object has been identified as a quasar by the emission-line routine, and if the emission-line redshift is chosen as the final redshift, then the object retains its quasar classification. Also, if the quasar cross-correlation template provides the final redshift for the object, then the object is classified as a quasar. If the object has a final redshift z > 2.3 (so that Lyα is or should be present in the spectrum), and if at least two out of three redshift estimators agree on this (the three estimators being the emission-line, Lyα, and cross-correlation redshifts), then it is classified as a high-z quasar. If the object has a redshift cz < 450 km s^-1, then it is classified as a star. If the final redshift is obtained from one of the late-type stellar cross-correlation templates, it is classified as a late-type star. If the object has a cross-correlation CL < 0.25, it is classified as unknown.

There exist among the spectra a small number of composite objects. Most common are bright stars on top of galaxies, but there are also galaxy-galaxy pairs at distinct redshifts, and at least one galaxy-quasar pair, and one galaxy-star pair. Most of these have the zWarning flag set, indicating that more than one redshift was found.

The zWarning bit mask mentioned above records problems that the spectro1d pipeline found with each spectrum. It provides compact information about the spectra for end users, and it is also used to trigger manual inspection of a subset of spectra on every plate. Users should particularly heed warnings about parts of the spectrum missing, low signal-to-noise ratio in the spectrum, significant discrepancies between the various measures of the redshift, and especially low confidence in the redshift determination. In addition, redshifts for objects with zStatus = FAILED should not be used.

Spectral Classification Using Eigenspectra

In addition to spectral classification based on measured lines, galaxies are classified by a Principal Component Analysis (PCA), using cross-correlation with eigentemplates constructed from SDSS spectroscopic data. The 5 eigencoefficients and a classification number are stored in eCoeff and eClass, respectively, in the SpecObj table and the spSpec files. eClass, a single-parameter classifier based on the expansion coefficients (eCoeff1-5), ranges from about -0.35 to 0.5 for early- to late-type galaxies.

A number of changes to eClass have occurred since the EDR. The galaxy spectral classification eigentemplates for DR1 are created from a much larger sample of spectra than were used in the Stoughton et al. EDR paper, and now number approximately 200,000. The eigenspectra used in DR1 are an early version of those created by Yip et al. (in prep). The sign of the second eigenspectrum has been reversed with respect to that of EDR; therefore we recommend using the expression
atan(-eCoeff2/eCoeff1)
rather than eClass as the single-parameter classifier.
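A minimal sketch of the recommended classifier, taking the first two expansion coefficients as plain numbers; the function name is illustrative, and the case eCoeff1 = 0 is not handled.

```python
import math


def eclass_angle(ecoeff1, ecoeff2):
    """Single-parameter classifier atan(-eCoeff2/eCoeff1), in radians."""
    return math.atan(-ecoeff2 / ecoeff1)
```

Because this uses atan rather than a two-argument arctangent, the result always lies between -π/2 and π/2.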

Manual Inspection of Spectra

A small percentage of spectra on every plate are inspected manually, and if necessary, the redshift, classification, zStatus, and CL are corrected. We inspect those spectra that have zWarning or zStatus indicating that there were multiple high-confidence cross-correlation redshifts, that the redshift was high (z > 3.2 for a quasar or z > 0.5 for a galaxy), that the confidence was low, that signal-to-noise ratio was low in r, or that the spectrum was not measured. All objects with zStatus = EMLINE_HIC or EMLINE_LOC, i.e., for which the redshift was determined only by emission lines, are also examined. If, however, the object has a final CL > 0.98 and zStatus of either XCORR_EMLINE or EMLINE_XCORR, then despite the above, it is not manually checked. All objects with either specClass = SPEC_UNKNOWN or zStatus = FAILED are manually inspected.

Roughly 8% of the spectra in the EDR were thus inspected, of which about one-eighth, or 1% overall, had the classification, redshift, zStatus, or CL manually corrected. Such objects are flagged with zStatus changed to MANUAL_HIC or MANUAL_LOC, depending on whether we had high or low confidence in the classification and redshift from the manual inspection. Tests on the validation plates, described in the next section, indicate that this selection of spectra for manual inspection successfully finds over 95% of the spectra for which the automated pipeline assigns an incorrect redshift.

Resolving Multiple Detections and Defining Samples

In addition to reading this section, we recommend that users familiarize themselves with the , which indicate what happened to each object during the Resolve procedure.

SDSS scans overlap, leading to duplicate detections of objects in the overlap regions. A variety of unique (i.e., containing no duplicate detections of any objects) well-defined (i.e., areas with explicit boundaries) samples may be derived from the SDSS database. This section describes how to define those samples. The resolve figure is a useful visual aid for the discussion presented below.

Consider a single drift scan along a stripe, called a run. The camera has six columns of CCDs, which scan six swaths across the sky. A given camera column is referred to throughout with the abbreviation camCol. The unit for data processing is the data from a single camCol for a single run. The same data may be processed more than once; repeat processing of the same run/camCol is assigned a unique rerun number. Thus, the fundamental unit of data processing is identified by run/rerun/camCol.

While the data from a single run/rerun/camCol is a scan line of data 2048 columns wide by a variable number of rows (approximately 133000 rows per hour of scanning), for purposes of data processing the data is split up into frames