SkyServer: Algorithms
Algorithm Descriptions

Photometric Redshifts

There are no photometric redshifts available for data releases 2 through 4 (DR2-DR4). Starting with DR5, the SDSS databases contain two versions of photometric redshifts, stored in the Photoz and Photoz2 tables respectively. The algorithms that generate them are described below.

Photoz Table

There are two basic methods to create a photometric redshift (photo-z, hereafter) catalog. The first technique compares the observed colors of galaxies to a reference set that has both colors and spectroscopic redshifts observed. The other one uses synthetic colors calculated from spectral energy distribution templates instead of the empirical reference set.

The advantage of the first method is its better estimation accuracy, but it cannot extrapolate, so the completeness of the reference set is crucial. In theory, the second method can cover broader redshift ranges for all types of galaxies and can give additional information such as spectral type, K-correction and absolute magnitudes, but its accuracy is severely limited by the lack of perfect SED models.

Building on the experience of previous releases and utilizing the large accumulated spectroscopic reference set, in this data release we use a hybrid method that combines the advantages of both approaches. For photo-z estimation, an ideal reference set would completely and densely cover the whole color space spanned by the objects for which we want to estimate redshifts. The spectroscopic sample of SDSS contains over 700,000 objects that are categorized as galaxies based on their spectral features.

Although the target selection of SDSS was not designed to produce a complete reference set for photo-z, the main sample and the LRG sample, together with some special surveys (such as the photo-z plate survey of higher redshift non-LRG galaxies and the low redshift plates), have turned out to cover the whole color region well. This justifies our choice to use the DR7 spectroscopic set as the reference set for redshift estimation without any additional data from synthetic spectra. The estimation method first searches the ubercal u-g, g-r, r-i, i-z color space for the k nearest neighbors of every object in the estimation set (i.e. the galaxies for which we want to estimate redshift) and then estimates the redshift by fitting a local low order polynomial to these points.

The accuracy of the results shows a weak dependence on the number of neighbors; we have found k = 100 and a linear polynomial (hyperplane) to be optimal in terms of estimation accuracy and robustness. Robustness is further increased by excluding outliers from the set of neighbors, i.e. those whose redshift lies too far from the fitted hyperplane. Since the reference set contains over 700 thousand galaxies and we had to estimate redshifts for more than 260 million objects, we used a k-d tree index for fast nearest neighbor search.
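The neighbor search and local fit described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline code: the function names, the synthetic reference set, and the clipping rule (drop neighbors more than `clip` times the RMS residual from the first fit, then refit once) are all hypothetical, since the text does not say how "too far" is defined. A brute-force sort also stands in for the k-d tree index, which only matters for speed at catalog scale.

```python
import math
import random


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_hyperplane(colors, redshifts):
    """Least-squares fit of z = c0 + c1*x1 + ... + c4*x4 (normal equations)."""
    rows = [[1.0] + list(c) for c in colors]
    dim = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(dim)]
           for i in range(dim)]
    atz = [sum(r[i] * z for r, z in zip(rows, redshifts)) for i in range(dim)]
    return solve(ata, atz)


def predict(coeffs, color):
    return coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], color))


def estimate_redshift(target, ref_colors, ref_z, k=100, clip=3.0):
    """Return (estimated redshift, number of neighbors used in the final fit)."""
    # Brute-force k nearest neighbors in color space (assumption: a k-d tree
    # would replace this step for a large reference set).
    order = sorted(range(len(ref_colors)),
                   key=lambda i: math.dist(target, ref_colors[i]))[:k]
    nbr_c = [ref_colors[i] for i in order]
    nbr_z = [ref_z[i] for i in order]
    coeffs = fit_hyperplane(nbr_c, nbr_z)
    # Hypothetical outlier rule: clip at `clip` times the RMS residual.
    resid = [z - predict(coeffs, c) for c, z in zip(nbr_c, nbr_z)]
    rms = math.sqrt(sum(r * r for r in resid) / len(resid))
    keep = [i for i, r in enumerate(resid) if abs(r) <= clip * rms]
    if len(target) + 1 < len(keep) < len(resid):
        coeffs = fit_hyperplane([nbr_c[i] for i in keep],
                                [nbr_z[i] for i in keep])
    return predict(coeffs, target), len(keep)


def true_z(c):
    # Made-up linear color-redshift relation used only to build test data.
    return 0.1 + 0.2 * c[0] + 0.05 * c[1] - 0.03 * c[2] + 0.01 * c[3]


random.seed(0)
ref_colors = [tuple(random.random() for _ in range(4)) for _ in range(2000)]
ref_z = [true_z(c) + random.gauss(0.0, 0.01) for c in ref_colors]
target = (0.5, 0.5, 0.5, 0.5)
# Add one catastrophic reference redshift right at the target position.
ref_colors.append(target)
ref_z.append(5.0)

z_est, n_used = estimate_redshift(target, ref_colors, ref_z)
print(round(z_est, 3), n_used)
```

In this toy setup the injected bad reference redshift is among the 100 nearest neighbors, so the example shows why the outlier rejection step matters: without it, a single wrong spectroscopic redshift would pull the local hyperplane away from the underlying relation.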

Beyond the redshift estimated by the linear fit (z in the Photoz table in CAS), we give the objID (nnObjID) and spectroscopic redshift (nnSpecz) of the first nearest neighbor, and also calculate the average redshift (nnAvgZ) of the neighbors. nnVol gives the 4-dimensional volume of the rectangular bounding box of the nnCount nearest neighbors that remain after excluding the outliers, and a flag value (nnIsInside) shows whether the object to be estimated is inside or outside this box. The latter case, which occurs for less than 5% of the objects, is a strong indication that the estimated redshift is the result of an extrapolation and should not be trusted.

To calculate the K-correction (kcorr_{u, g, r, i, z}), distance modulus (dmod), absolute magnitudes (absMag_{u, g, r, i, z}) and rest frame colors (rest_{ug, gr, ri, iz}), we combine the above method with template fitting. We search for the best match between the measured colors and the synthetic colors calculated from repaired empirical template spectra at the redshift given by the local nearest neighbor fit. This process also gives an estimate of the spectral type of the object (pzType), as a scalar value in the range [0, 1] running from early type ellipticals to late type spirals.
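The bounding-box quantities behind nnVol and nnIsInside can be illustrated with a short Python sketch. The helper names and the three made-up neighbor colors are hypothetical, and treating points exactly on the box boundary as "inside" is an assumption the text does not settle.

```python
from math import prod


def bounding_box(points):
    """Per-axis minimum and maximum of a set of points in color space."""
    lows = [min(axis) for axis in zip(*points)]
    highs = [max(axis) for axis in zip(*points)]
    return lows, highs


def box_volume(lows, highs):
    """Volume of the axis-aligned box (the product of its side lengths)."""
    return prod(h - l for l, h in zip(lows, highs))


def is_inside(point, lows, highs):
    """True if the point lies within the box on every axis (boundary counts)."""
    return all(l <= x <= h for x, l, h in zip(point, lows, highs))


# Made-up (u-g, g-r, r-i, i-z) colors of the neighbors kept after clipping.
neighbors = [
    (1.2, 0.6, 0.3, 0.2),
    (1.4, 0.8, 0.4, 0.3),
    (1.3, 0.7, 0.5, 0.1),
]
lows, highs = bounding_box(neighbors)
vol = box_volume(lows, highs)

print(vol)
print(is_inside((1.3, 0.7, 0.4, 0.2), lows, highs))  # interpolation case
print(is_inside((1.5, 0.7, 0.4, 0.2), lows, highs))  # extrapolation case
```

An object outside the box on even one color axis is being estimated by extrapolating the local fit, which is why the flag is a useful warning sign.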

We have found that error propagation from the magnitude errors does not give a reliable estimate of the redshift errors. Instead, when fitting the linear polynomial, we calculate the mean deviation of the redshifts of the reference objects. Comparing this mean deviation with the differences between spectroscopic and estimated redshifts for the reference set shows that the mean deviation value (zErr) is a good error estimator. Together with the nnIsInside and nnVol values mentioned above, it can be used to select objects with reliable photometric redshift values. As an additional cross-check, the fitted redshift (z) can be compared with the more robust but less exact average redshift (nnAvgZ).

Following the practice used in other tables in CAS, we have used 9999 to mark missing values (e.g. where the fit gave values beyond the reasonable [0, 1] redshift range). Note that we estimate redshifts only for objects marked as galaxies by the photometric pipeline. On one hand, this means that redshifts are not estimated for quasars (we plan to do this in a separate value added catalog); on the other hand, objects that were erroneously classified as galaxies have meaningless estimated values. Further details of the method and statistics on the quality of the estimations will be covered in a separate paper.
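As a rough illustration of how the quality indicators described in this section might be combined after downloading rows from the Photoz table, the following Python sketch filters a handful of made-up rows. The thresholds `max_zerr` and `max_avg_diff` are hypothetical and should be tuned for the science case, and the encoding of nnIsInside as 1 (inside) or 0 (outside) is an assumption.

```python
MISSING = 9999  # marker for missing values, as stated in the text


def is_reliable(row, max_zerr=0.05, max_avg_diff=0.1):
    """Apply the quality checks from the text with hypothetical thresholds."""
    if row["z"] == MISSING or row["zErr"] == MISSING:
        return False  # fit failed or fell outside the allowed range
    if not row["nnIsInside"]:
        return False  # estimate is an extrapolation
    if row["zErr"] > max_zerr:
        return False  # error estimate too large
    # Cross-check: fitted redshift against the neighbor average.
    return abs(row["z"] - row["nnAvgZ"]) <= max_avg_diff


# Made-up example rows, not real catalog entries.
rows = [
    {"objID": 1, "z": 0.21, "zErr": 0.02, "nnIsInside": 1, "nnAvgZ": 0.20},
    {"objID": 2, "z": 0.35, "zErr": 0.02, "nnIsInside": 0, "nnAvgZ": 0.33},
    {"objID": 3, "z": 9999, "zErr": 9999, "nnIsInside": 1, "nnAvgZ": 9999},
    {"objID": 4, "z": 0.40, "zErr": 0.12, "nnIsInside": 1, "nnAvgZ": 0.38},
]
reliable_ids = [row["objID"] for row in rows if is_reliable(row)]
print(reliable_ids)  # prints [1]
```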

Photoz2 Table

(This table was not updated for DR7)
The photometric redshifts from the U. Chicago/Fermilab/NYU group (H. Oyaizu, M. Lima, C. Cunha, H. Lin, J. Frieman, and E. Sheldon) are calculated using a Neural Network method that is similar in implementation to that of Collister and Lahav (2004, PASP, 116, 345). The photo-z training and validation sets consist of over 551,000 unique spectroscopic redshifts matched to nearly 640,000 SDSS photometric measurements. These spectroscopic redshifts come from the SDSS as well as the deeper galaxy surveys 2SLAQ, CFRS, CNOC2, TKRS, and DEEP+DEEP2.

We provide photo-z estimates for a sample of over 77.4 million DR6 primary objects, classified as galaxies by the SDSS PHOTO pipeline (TYPE = 3), with dereddened model magnitude r < 22, and which do not have any of the flags BRIGHT, SATURATED, or SATUR_CENTER set. Note that this is a significant change in the input galaxy sample selection compared to the DR5 version of Photoz2.

Our data model is

Name          Type    Description
objid         bigint  unique ID pointing to PhotoObjAll table
photozcc2     real    CC2 photo-z
photozerrcc2  real    CC2 photo-z error
photozd1      real    D1 photo-z
photozerrd1   real    D1 photo-z error
flag          int     0 for objects with r <= 20; 2 for objects with r > 20

Both the "CC2" and "D1" photo-z's are neural network based estimators. "D1" uses the galaxy magnitudes in the photo-z fit, while "CC2" uses only galaxy colors (i.e., only magnitude differences). Both methods also employ concentration indices (the ratio of PetroR50 and PetroR90). The "D1" estimator provides smaller photo-z errors than the "CC2" estimator, and is recommended for bright galaxies r < 20 to minimize the overall photo-z scatter and bias. However, for faint galaxies r > 20, we recommend "CC2" as it provides more accurate photo-z redshift distributions. If a single photo-z method is desired for simplicity, we also recommend "CC2" as the better overall photo-z estimator.
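The recommendation above can be expressed as a small selection rule. The Python sketch below is only an illustration: the function names and example rows are made up, and it relies on the flag column of the data model (0 for r <= 20, 2 for r > 20) to decide between the two estimators, so an object at exactly r = 20 is treated as bright.

```python
def flag_for_magnitude(r_mag):
    """Flag value as defined in the data model: 0 if r <= 20, else 2."""
    return 0 if r_mag <= 20 else 2


def recommended_photoz(row):
    """Return (photo-z, error, estimator name) following the recommendation."""
    if row["flag"] == 0:
        # Bright galaxy: D1 gives the smaller scatter and bias.
        return row["photozd1"], row["photozerrd1"], "D1"
    # Faint galaxy: CC2 gives better redshift distributions.
    return row["photozcc2"], row["photozerrcc2"], "CC2"


# Made-up example rows using the column names of the data model.
bright = {"objid": 1, "photozcc2": 0.12, "photozerrcc2": 0.03,
          "photozd1": 0.11, "photozerrd1": 0.02,
          "flag": flag_for_magnitude(18.5)}
faint = {"objid": 2, "photozcc2": 0.45, "photozerrcc2": 0.09,
         "photozd1": 0.48, "photozerrd1": 0.07,
         "flag": flag_for_magnitude(21.3)}

print(recommended_photoz(bright))  # prints (0.11, 0.02, 'D1')
print(recommended_photoz(faint))   # prints (0.45, 0.09, 'CC2')
```

If a single estimator is preferred for simplicity, the rule reduces to always reading the CC2 columns.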

Please see this link for a detailed comparison of the two methods, including performance metrics (photo-z errors and biases), quality plots, and photometric redshift vs. spectroscopic redshift distributions in different magnitude bins.

The photo-z errors (1σ, or 68% confidence) are computed using an empirical "Nearest Neighbor Error" (NNE) method. NNE is a training set based method that associates similar errors to objects with similar magnitudes, and is found to accurately predict the photo-z error when the training set is representative.
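One way to picture the NNE idea is the Python sketch below. It is an illustration under assumptions rather than the published procedure: it takes the error of an object to be the 68th percentile of |photo-z - spectroscopic z| among its nearest neighbors in magnitude space within a set of objects with known redshifts. The neighbor count, the percentile convention, the function name, and the toy data are all hypothetical.

```python
import math


def nne_error(target_mags, val_mags, val_zphot, val_zspec, n_neighbors=100):
    """68th percentile of |zphot - zspec| over the nearest magnitude neighbors."""
    order = sorted(range(len(val_mags)),
                   key=lambda i: math.dist(target_mags, val_mags[i]))
    nearest = order[:n_neighbors]
    devs = sorted(abs(val_zphot[i] - val_zspec[i]) for i in nearest)
    idx = max(0, math.ceil(0.68 * len(devs)) - 1)
    return devs[idx]


# Toy validation set: bright objects have small photo-z deviations,
# faint objects have deviations ten times larger.
val_mags, val_zphot, val_zspec = [], [], []
for i in range(50):
    val_mags.append((18.0 + 0.01 * i,) * 5)       # bright, five bands
    val_zspec.append(0.1)
    val_zphot.append(0.1 + 0.002 * (i % 5 + 1))
for i in range(50):
    val_mags.append((21.0 + 0.01 * i,) * 5)       # faint, five bands
    val_zspec.append(0.4)
    val_zphot.append(0.4 + 0.02 * (i % 5 + 1))

bright_err = nne_error((18.0,) * 5, val_mags, val_zphot, val_zspec,
                       n_neighbors=10)
faint_err = nne_error((21.0,) * 5, val_mags, val_zphot, val_zspec,
                      n_neighbors=10)
print(bright_err, faint_err)
```

Because the error is read off from neighbors with similar magnitudes, the estimate is only as good as the coverage of the set with known redshifts, which matches the caveat about representativeness in the text.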

The photo-z "flag" value is set to 2 for fainter objects with r > 20, whose photo-z's have larger uncertainties and biases.

Full details about the Photoz2 photometric redshifts are available here and in Oyaizu et al. (2007), ApJ, submitted, arXiv:0708.0030 [astro-ph].