Sample Queries

The best way to learn how to write queries is to build on the work others have done.


Modify these queries and submit them to the SQL Search tool to see what they do!

More sample queries are available on the SQL Search , in the panel next to the query window.

Basic SQL

Basic SELECT-FROM-WHERE

SELECT TOP 100 objID, ra, dec
-- Get the unique object ID and coordinates
FROM
PhotoPrimary
--From the table containing photometric data for unique objects
WHERE
ra > 185 and ra < 185.1
AND dec > 15 and dec < 15.1
-- that matches our criteria
Load Query

Basic position search

-- Find galaxies within 1' of a given point (ra=185.0, dec=-0.5).
-- This is a slightly more complex query, but it can be easily adapted to search
--around any point in the sky.

--To see how to limit the search only to objects with clean photometry, see the
-- Clean imaging query.

SELECT TOP 100 G.objID, GN.distance
FROM Galaxy as G
JOIN dbo.fGetNearbyObjEq(185., -0.5, 1) AS GN
ON G.objID = GN.objID
ORDER BY distance
Load Query

Using the PhotoTag table

-- This query introduces the PhotoTag table, which contains the most frequently used columns
-- of PhotoObj. Queries to PhotoTag will run more quickly than those to photoObj.

-- This sample query finds data for all objects in fields with desired PSF width.

SELECT TOP 100 r.run, r.rerun, f.camCol, f.field, p.objID,
	p.ra, p.dec, p.modelMag_r, f.psfWidth_r
FROM
PhotoTag AS p
JOIN Field AS f ON f.fieldid = p.fieldid
JOIN Run AS r ON f.run = r.run
WHERE mode = 1-- select primary objects only
and f.psfWidth_r > 1.2
and p.modelMag_r < 21
and r.stripe = 21
Load Query

Search for a Range of Values: BETWEEN

--The BETWEEN statement can be used to set constraints on a range of values.

-- This sample query finds galaxies with g magnitudes between 18 and 19.

SELECT TOP 10 objID, cModelMag_g
FROM Galaxy
WHERE
cModelMag_g between 18 and 19		-- 18 < cModelMag_g < 19
Load Query

Rectangular position search

-- There are several built-in functions available to CAS users that make spatial
--queries, i.e., those with coordinate cuts, much more efficient than simply
--including the coordinate constraints in the WHERE clause.This example
--illustrates the use of the dbo.fGetObjFromRectEq function that invokes
--the Hierarchical Triangular Mesh(HTM) functionality.

-- a) Rectangular search using straight coordinate constraints:
SELECT objID, ra, dec
FROM PhotoObj
WHERE(ra between 179.5 and 182.3) and(dec between - 1.0 and 1.8)

-- b) This query can be rewritten as follows to use the HTM function that returns a
--rectangular search area(The "Run this Query" button above will run this query):

	--SELECT p.objID, p.ra, p.dec
	--FROM PhotoObj p
	--JOIN dbo.fGetObjFromRectEq(179.5, -1.0, 182.3, 1.8) r ON p.objID = r.objID
Load Query

Searching more than one table: JOIN...ON

-- When you need to search for data in two or more tables, you must "join" the tables
-- together for the purposes of your query by using a JOIN...ON statement.
--JOIN...ON allows you to search for data and/ or constraints in both tables.

-- The syntax of the statement is:
--JOIN[the second table] ON[a variable the tables have in common].
-- The variable the tables have in common is called the "key" (think of it
-- as the key that unlocks your ability to search two tables).
-- The key variable(s) in each table are shown on the About the Database page.
-- Find the two tables you want to join and look for a key variable they have in common.

-- The sample query looks for spectra of quasars and shows the date and time at which
-- each spectrum was taken.

SELECT TOP 100 sp.objID, sp.ra, sp.dec,
sp.mjd, px.taiBegin, px.taiEnd, sp.fiberID, sp.z
FROM specPhoto AS sp
JOIN plateX AS px
	ON sp.plateID = px.plateID
WHERE (sp.class='QSO') AND sp.z > 3
Load Query

Photometry and Spectroscopy: SpecPhoto

-- This query introduces the SpecPhoto view, which is a pre-computed join
-- of the most commonly-searched fields in both the SpecObj view(which
-- contains spectroscopy data) and the PhotoObj view(which
-- contains photometry data). This means that to combine
-- photometry and spectroscopy tables yourself using JOIN...ON,
-- the join already has been "precomputed" and done for you.

-- You should use specPhoto whenever your variables of interest
-- can be found there; using specPhoto means that your queries will
-- return results much faster than using JOIN...ON.
-- that your queries will return results much faster than using JOIN...ON.

-- This sample query retrieves both magnitudes(from photometry) and
-- redshifts(from spectroscopy) of 100 quasars.Note that this query
-- also introduces the use of mathematical operators(in this case subtraction)
-- in queries.

SELECT top 100
objid, ra, dec, psfmag_i - extinction_i AS mag_i,
psfmag_r - extinction_r AS mag_r, z	-- In SpecPhoto, "z" is the redshift
FROM SpecPhoto
WHERE
(class = 'QSO')
Load Query

Counting objects by type or category

--This simple query introduces two important SQL structures while showing how to count the number of
-- spectra of each spectral classification(galaxy, quasar, star) in DR8.

-- When included in the SELECT block, the count(*) statement returns the number of objects that
-- meet your search criteria.
-- The GROUP BY statement sorts results into groups(categories) based on the value of a data column.

-- In this query, the grouping occurs on the 'class' column of the SpecObj view, which contains
-- the spectral classification of the object.
-- The query can be easily modified to find object counts sorted by other columns.

-- For another example, see the Counts by Type and Program sample query.

SELECT class, count(*)
FROM SpecObj
GROUP BY class
Load Query

Using flags

-- Another useful query is to select stars that are not saturated.
-- This query introduces bitwise logic for flags, and uses the 'as' syntax to
-- make the query more readable. Note that if a flag is not set, the value
-- will be zero. If you want to ensure multiple flags are not set, you can
-- either check that each individually is zero, or their sum is zero.
-- (From Gordon Richards)

-- Important caveat: The 'flags' column is the union of the flag values in
-- each band,and using it may produce unexpected results, for example if a
-- bad flag bit is set in only one band (e.g. in the 'flags_r' value), that bit
-- will be bad in the overall 'flags' value as well. For more information,
-- please see the 'Photometric flags detail' entry in Algorithms.

-- NOTE: This query takes a long time to run without the "TOP 1000".

SELECT TOP 1000
objID,
ra,
dec,
psfMag_u, psfMag_g, psfMag_r, psfMag_i, psfMag_z,
dbo.fPhotoFlagsN(flags)
FROM Galaxy
WHERE
(flags & (dbo.fPhotoFlags('SATURATED'))) != 0
Load Query

SQL Jujitsu

Data subsample

-- Give me the colors of a random 1% sample of objects from all fields that
-- are "survey quality" so that I could plot up color-color diagrams and play
-- around with more sophisticated cuts. Query requested by Karl Glazebrook.
-- Uses the HTM spatial index ID as a random number generator. The htmID is
-- multiplied by a prime number (37) to remove bias and generate a random number
-- from the lowermost 16 bits. This number is then constrained to be between 650
-- and 65000 to generate a random sample between 1 and 100% of the data
-- respectively. So replacing the "1" by a different number between 1 and 99
-- will generate a sample of the required percentage of objects.
-- Note that this is much faster than the standard "ORDER BY RANDOM()" SQL method.

SELECT TOP 100 u, g, r, i, z FROM Galaxy
WHERE
	(htmid*37 & 0x000000000000FFFF)
		 < (650 * 1)
-- Replace the "1" with a number < 99 to get a higher percentage data subsample
Load Query

Objects in close pairs using neighbors

-- This query introduces the Neighbors table, which contains each
-- object's neighbors within 30 arcseconds.
-- Find all objects within 30 arcseconds of one another
-- that have very similar colors: that is where the color ratios
-- u-g, g-r, r-I are less than 0.05m.

SELECT TOP 10 P.ObjID		-- distinct cases
FROM PhotoPrimary AS P		-- P is the primary object
	JOIN Neighbors AS N ON P.ObjID = N.ObjID		-- N is the neighbor link
JOIN PhotoPrimary AS L ON L.ObjID = N.NeighborObjID
-- L is the lens candidate of P
WHERE      
	P.ObjID < L. ObjID		-- avoid duplicates
	and abs((P.u-P.g)-(L.u-L.g))<0.05		-- L and P have similar spectra.
	and abs((P.g-P.r)-(L.g-L.r))<0.05
	and abs((P.r-P.i)-(L.r-L.i))<0.05
	and abs((P.i-P.z)-(L.i-L.z))<0.05
Load Query

Selected neighbors in run

-- A complex query from Robert Lupton that uses the Neighbors table to find
-- selected neighbors in a given run+camcol. Contains a nested query with 
-- join, joined with the query results to select only neighbors that meet
-- criteria. Nested queries are required because the Neighbors table does
-- not contain all parameters for neighbor objects
SELECT TOP 10
	obj.run, obj.camCol, str(obj.field, 3) as field,
	str(obj.rowc, 6, 1) as rowc, str(obj.colc, 6, 1) as colc,
	str(dbo.fObj(obj.objId), 4) as id,
	str(obj.psfMag_g - 0*obj.extinction_g, 6, 3) as g,
	str(obj.psfMag_r - 0*obj.extinction_r, 6, 3) as r,
	str(obj.psfMag_i - 0*obj.extinction_i, 6, 3) as i,
	str(obj.psfMag_z - 0*obj.extinction_z, 6, 3) as z,
	str(60*distance, 3, 1) as D,
	dbo.fField(neighborObjId) as nfield,
	str(dbo.fObj(neighborObjId), 4) as nid
FROM
	(
		SELECT obj.objId,
			run, camCol, field, rowc, colc,
			psfMag_u, extinction_u,
			psfMag_g, extinction_g,
			psfMag_r, extinction_r,
			psfMag_i, extinction_i,
			psfMag_z, extinction_z,
			NN.neighborObjId, NN.distance
		FROM PhotoObj as obj
			JOIN neighbors as NN on obj.objId = NN.objId
		WHERE
			60*NN.distance between 0 and 15 and
			NN.mode = 1 and NN.neighborMode = 1 and
			run = 756 and camCol = 5 and
			obj.type = 6 and (obj.flags & 0x40006) = 0 and
			nchild = 0 and obj.psfMag_i < 20 and
			g - r between 0.3 and 1.1 and r - i between -0.1 and 0.6
	) as obj
JOIN PhotoObj as nobj on nobj.objId = obj.neighborObjId
WHERE
	nobj.run = obj.run and
	(abs(obj.psfMag_g - nobj.psfMag_g) < 0.5 or
	abs(obj.psfMag_r - nobj.psfMag_r) < 0.5 or
	abs(obj.psfMag_i - nobj.psfMag_i) < 0.5)
Open SQL Search tool and paste query by hand

Object counting and logic

-- Using object counting and logic in a query.
-- Find all objects similar to the colors of a quasar at 5.5
SELECT count(*) as 'total',
	sum( case when (Type=3) then 1 else 0 end) as 'Galaxies',
	sum( case when (Type=6) then 1 else 0 end) as 'Stars',
	sum( case when (Type not in (3,6)) then 1 else 0 end) as 'Other'
FROM PhotoPrimary -- for each object
WHERE (( u - g > 2.0) or (u > 22.3) ) -- apply the quasar color cut.
	and ( i between 0 and 19 )
	and ( g - r > 1.0 )
	and ( (r - i < 0.08 + 0.42 * (g - r - 0.96)) or (g - r > 2.26 ) )
	and ( i - z < 0.25 )
Load Query

Galaxy counts on HTM grid

-- Create a count of galaxies for each of the HTM triangles.
-- Galaxies should satisfy a certain color cut, like
-- 0.7u-0.5g-0.2i<1.25 && r<21.75, output it in a form
-- adequate for visualization.

SELECT (htmID / power(2,24)) as htm_8 ,
	-- group by 8-deep HTMID (rshift HTM by 12)
	avg(ra) as ra,
	avg(dec) as [dec],
	count(*) as pop -- return center point and count for display
FROM Galaxy -- only look at galaxies
WHERE (0.7*u - 0.5*g - 0.2*i) < 1.25 -- meeting this color cut
	and r < 21.75 -- brighter than 21.75 magnitude in red band.
group by (htmID /power(2,24)) -- group into 8-deep HTM buckets.
Load Query

Repeated high-z objects

-- Compare different redshift measurements of the same object for objects
-- with high redshift

SELECT prim.bestObjId, prim.mjd AS PrimMJD, prim.plate AS PrimPlate,
	other.mjd AS OtherMJD, other.plate AS OtherPlate,
	prim.z AS PrimZ, other.z AS OtherZ, plate.programname
FROM SpecObjAll prim
	JOIN SpecObjAll other ON prim.bestObjId = other.bestObjId
	JOIN platex AS plate ON other.plate = plate.plate AND other.mjd = plate.mjd
WHERE other.bestObjId > 0
	AND prim.sciencePrimary = 1
	AND other.sciencePrimary = 0
	AND prim.z > 2.5
ORDER BY prim.bestObjId
Load Query

Splitting 64-bit values into two 32-bit values

-- The flag fields in the SpecObjAll table are 64-bit, but some
-- analysis tools (and FITS format) only accept 32-bit integers.
-- Here is a way to split them up, using bitmasks to extract
-- the higher and lower 32 bits, and dividing by a power of
-- 2 to shift bits to the right (since there is no bit shift
-- operator in SQL.) The hexadecimal version can be used for debugging
-- to make sure you are splitting them up right.

SELECT TOP 10 objid,ra,dec,     
	flags,		-- output the whole bigint as a check
	flags & 0x00000000FFFFFFFF AS flags_lo,		-- get the lower 32 bits with a mask
	-- shift the bigint to the right 32 bits, then use the same mask to
	-- sget upper 32 bits
	(flags/power(cast(2 as bigint),32)) & 0x00000000FFFFFFFF AS flags_hi
FROM PhotoObj
 
-- Hexadecimal version of above query - use for debugging 
-- SELECT TOP 10 objid,ra,dec,
-- CAST(flags AS BINARY(8)) AS flags,
-- CAST(flags & 0x00000000FFFFFFFF AS BINARY(8)) AS flags_lo,
-- CAST((flags/POWER(CAST(2 AS BIGINT),32)) & 0x00000000FFFFFFFF AS BINARY(8)) AS flags_hi
-- FROM PhotoObj
Load Query

Using LEFT OUTER JOIN

-- This query demonstrates the use of a LEFT OUTER JOIN to include rows from the joined table
-- that do not satisfy the JOIN condition. The OUTER JOIN (LEFT or RIGHT)
-- is an inclusive JOIN, whereas the INNER (default) JOIN is an exclusive JOIN.
-- In this case the query returns galaxies with or without spectra that meet the criteria
-- in the WHERE clause. The ISNULL operator is used to replace null
-- values with zeros for galaxies without spectra.

SELECT TOP 10 p.objid, p.ra, p.dec, ISNULL(s.specobjid,0) AS specobjid, ISNULL(s.z, 0) AS z
FROM Galaxy AS p
	LEFT OUTER JOIN SpecObj s ON s.bestObjID = p.objID
WHERE
	p.u BETWEEN 0 AND 19.6 AND p.g BETWEEN 0 AND 20
Load Query

Using Nested Queries

-- This query demonstrates, among other tings, a nested query (a query
-- inside a query, where the outer query operates on the result of the inner query. The
-- query turns sky brighness into flux, which illustrates the 
-- the POWER() function, and uses CAST to get a floating
-- point. The First table contains matches between SDSS and FIRST survey objects.

SELECT TOP 10 fld.run, fld.avg_sky_muJy, fld.runarea AS area, ISNULL(fp.nfirstmatch,0)
FROM (
--first part: for each run, get total area and average sky brightness
SELECT run, sum(primaryArea) AS runarea,
3631e6*avg(power(cast(10. as float),-0.4*sky_r)) as avg_sky_muJy
FROM Field
GROUP BY run
) AS fld
LEFT OUTER JOIN (
-- second part: for each run, get total number of FIRST matches. To get the run number
-- for each FIRST match, join FIRST with PHOTOPRIMARY. Some runs may have
-- 0 FIRST matches, so these runs will not appear in the result set of this subquery.
-- But we want to keep all runs from the first query in the final result, so
-- we need a LEFT OUTER JOIN to return all rows from the first subquery and match
-- with corresponding rows from the second query - or else return NULL. The ISNULL() 
-- function converts this NULL into a 0 for readability.
SELECT p.run, count(*) AS nfirstmatch
FROM FIRST AS fm
INNER JOIN photoprimary as p
ON p.objid=fm.objid
GROUP BY p.run
) AS fp
ON fld.run=fp.run
ORDER BY fld.run
Load Query

Photometric redshifts

-- Here is a simple query for objects with reliable redshift estimation in the 0.4 < z < 0.5 range
-- The query uses the Photoz table that contains photometric redshift estimates for galaxies.

SELECT TOP 100 objId,z,zErr,absMagR
FROM Photoz
WHERE
	z BETWEEN 0.4 and 0.5
	and photoErrorClass=1	-- the estimated object has low photometric errors, and enough nearest
	and nnCount>95	-- neighbors in the reference set; note that zErr=-9999 indicates an
	and zErr BETWEEN 0 and 0.03	-- unreliable redshift estimate, hence the lower bound on zErr

-- A simple luminosity function with the same selection criteria as above
-- Returns a histogram of absolute magnitudes for the given redshift range.
-- (You will need to cut and paste this query if you want to run it in the SQL Search page,
-- the above buttons will only load/run the first query).
-- SELECT round(absMagR,1) as absMagR, COUNT(*) as cnt
-- FROM Photoz
-- WHERE
--	z BETWEEN 0.4 and 0.5
--	and photoErrorClass=1 and nnCount>95
--	and zErr BETWEEN 0 and 0.03
-- group by round(absMagR,1)
-- order by round(absMagR,1)
Load Query

Spectra in other programs: SEGUE/SDSS

-- This sample query find objects with spectra in the
-- regular program of the SEGUE-2 survey. It checks
-- both the "survey" tag (which tells you the overall
-- survey) and the "programname" tag (which distinguishes
-- subprograms within each survey). The PlateX table
-- also has the survey and programname tags.

SELECT TOP 50 ra, dec, specobjid, plate, mjd, fiberid
FROM specObj
WHERE survey = 'segue2' and programname = 'segue2'
Load Query

Spectra in other programs: All Programs

-- There are five possible survey names ("sdss", "segue1",
-- "segue2", "boss" and "apogee", though APOGEE is not
-- available in DR9 and earlier). Within each survey there
-- can be a number of programnames; the main programs
-- of each survey are:
-- survey = 'sdss' and programname = 'legacy'
-- survey = 'segue1' and programname = 'segue'
-- survey = 'segue2' and programname = 'segue2'
-- survey = 'boss' and programname = 'boss'
-- For the SEGUE-1 and SDSS surveys in particular there
-- are multiple programs. A full list of surveys and programs
-- can be obtained with:

SELECT DISTINCT survey, programname
FROM platex
ORDER BY survey
Load Query

Object counts by type and program

-- List the number of each type of object observed by each
-- special spectroscopic observation program.

SELECT plate.programname, class,
COUNT(specObjId) AS numObjs
FROM SpecObjAll
JOIN PlateX AS plate ON plate.plate = specObjAll.plate
GROUP BY plate.programname, class
ORDER BY plate.programname, class
Load Query

WISE cross-match

-- The WISE_allsky is a straightforward import of the WISE all-sky data release catalog.
-- It has hundreds of columns containing all manner of
-- WISE measurements. It has 563,921,584 rows. This catalog includes data
-- from the first part of the WISE mission, when all four of its mid-IR
-- channels (3.4, 4.6, 12, and 22 microns) were operational. It is
-- all-sky.
--
-- The WISE_xmatch is an astrometric cross-match between the WISE_allsky
-- and SDSS photoObj objects. It contains 495,003,196 matches. The
-- WISE_xmatch table itself just contains three columns: IDs of the WISE
-- and SDSS objects, and the distance between them. I used a matching
-- radius of 4 arcsec, and duplicate matches are allowed.
--
-- To use the WISE_xmatch table, join on one of the SDSS Photo* tables
-- (PhotoObjAll, PhotoPrimary, PhotoTag, Galaxy, Star) and the WISE_allsky
-- table, like so:

select top 10
	S.ra as sdss_ra, S.dec as sdss_dec, W.ra as wise_ra, W.dec as wise_dec,
	S.psfmag_r as r, W.w1mpro as w1
from wise_xmatch as X
	join wise_allsky as W on X.wise_cntr = W.cntr
	join photoTag as S on X.sdss_objid = S.objid
where S.nchild = 0
Load Query