*** DRAFT ***

SQL As Understood By SQLite

[Top]

expression

expr:

syntax diagram expr

literal-value:

syntax diagram literal-value

signed-number:

syntax diagram signed-number

raise-function:

syntax diagram raise-function

This section is different from the others. Most other sections of this document talks about a particular SQL command. This section does not talk about a standalone command but about "expressions" which are subcomponents of most other commands.

Operators

SQLite understands the following binary operators, in order from highest to lowest precedence:

||
*    /    %
+    -
<<   >>   &    |
<    <=   >    >=
=    ==   !=   <>   IS   IS NOT   IN   LIKE   GLOB   MATCH   REGEXP
AND   
OR

Supported unary prefix operators are these:

-    +    ~    NOT

The COLLATE operator is a unary postfix operator that assigns a collating sequence to an expression. The COLLATE operator has a higher precedence (binds more tightly) than any prefix unary operator or any binary operator. The collating sequence set by the COLLATE operator overrides the collating sequence determined by the COLLATE clause in a table column definition. See the detailed discussion on collating sequences in the Datatype In SQLite3 document for additional information.

The unary operator + is a no-op. It can be applied to strings, numbers, blobs or NULL and it always returns a result with the same value as the operand.

Note that there are two variations of the equals and not equals operators. Equals can be either = or ==. The non-equals operator can be either != or <>. The || operator is "concatenate" - it joins together the two strings of its operands. The operator % outputs the value of its left operand modulo its right operand.

The result of any binary operator is either a numeric value or NULL, except for the || concatenation operator which always evaluates to either NULL or a text value.

The IS and IS NOT operators work like = and != except when one or both of the operands are NULL. In this case, if both operands are NULL, then the IS operator evaluates to 1 (true) and the IS NOT operator evaluates to 0 (false). If one operand is NULL and the other is not, then the IS operator evaluates to 0 (false) and the IS NOT operator is 1 (true). It is not possible for an IS or IS NOT expression to evaluate to NULL. Operators IS and IS NOT have the same precedence as =.

Literal Values

A literal value is a constant of some kind. Literal values may be integers, floating point numbers, strings, BLOBs, or NULLs. Scientific notation is supported for floating point literal values. The "." character is always used as the decimal point even if the locale setting specifies "," for this role - the use of "," for the decimal point would result in syntactic ambiguity. A string constant is formed by enclosing the string in single quotes ('). A single quote within the string can be encoded by putting two single quotes in a row - as in Pascal. C-style escapes using the backslash character are not supported because they are not standard SQL. BLOB literals are string literals containing hexadecimal data and preceded by a single "x" or "X" character. For example:

X'53514C697465'

A literal value can also be the token "NULL".

Parameters

A "variable" or "parameter" token specifies a placeholder in the expression for a value that is filled in at runtime using the sqlite3_bind() family of C/C++ interfaces. Parameters can take several forms:

?NNN A question mark followed by a number NNN holds a spot for the NNN-th parameter. NNN must be between 1 and SQLITE_MAX_VARIABLE_NUMBER.
? A question mark that is not followed by a number creates a parameter with a number one greater than the largest parameter number already assigned. If this means the parameter number is greater than SQLITE_MAX_VARIABLE_NUMBER, it is an error.
:AAAA A colon followed by an identifier name holds a spot for a named parameter with the name :AAAA. Named parameters are also numbered. The number assigned is one greater than the largest parameter number already assigned. If this means the parameter would be assigned a number greater than SQLITE_MAX_VARIABLE_NUMBER, it is an error. To avoid confusion, it is best to avoid mixing named and numbered parameters.
@AAAA An "at" sign works exactly like a colon, except that the name of the parameter created is @AAAA.
$AAAA A dollar-sign followed by an identifier name also holds a spot for a named parameter with the name $AAAA. The identifier name in this case can include one or more occurrences of "::" and a suffix enclosed in "(...)" containing any text at all. This syntax is the form of a variable name in the Tcl programming language. The presence of this syntax results from the fact that SQLite is really a Tcl extension that has escaped into the wild.

Parameters that are not assigned values using sqlite3_bind() are treated as NULL.

The LIKE and GLOB operators

The LIKE operator does a pattern matching comparison. The operand to the right of the LIKE operator contains the pattern and the left hand operand contains the string to match against the pattern. A percent symbol ("%") in the LIKE pattern matches any sequence of zero or more characters in the string. An underscore ("_") in the LIKE pattern matches any single character in the string. Any other character matches itself or its lower/upper case equivalent (i.e. case-insensitive matching). (A bug: SQLite only understands upper/lower case for ASCII characters by default. The LIKE operator is case sensitive by default for unicode characters that are beyond the ASCII range. For example, the expression 'a' LIKE 'A' is TRUE but 'æ' LIKE 'Æ' is FALSE.)

If the optional ESCAPE clause is present, then the expression following the ESCAPE keyword must evaluate to a string consisting of a single character. This character may be used in the LIKE pattern to include literal percent or underscore characters. The escape character followed by a percent symbol (%), underscore (_), or a second instance of the escape character itself matches a literal percent symbol, underscore, or a single escape character, respectively.

The infix LIKE operator is implemented by calling the application-defined SQL functions like(Y,X) or like(Y,X,Z).

The LIKE operator can be made case sensitive using the case_sensitive_like pragma.

The GLOB operator is similar to LIKE but uses the Unix file globbing syntax for its wildcards. Also, GLOB is case sensitive, unlike LIKE. Both GLOB and LIKE may be preceded by the NOT keyword to invert the sense of the test. The infix GLOB operator is implemented by calling the function glob(Y,X) and can be modified by overriding that function.

The REGEXP operator is a special syntax for the regexp() user function. No regexp() user function is defined by default and so use of the REGEXP operator will normally result in an error message. If a application-defined SQL function named "regexp" is added at run-time, that function will be called in order to implement the REGEXP operator.

The MATCH operator is a special syntax for the match() application-defined function. The default match() function implementation raises an exception and is not really useful for anything. But extensions can override the match() function with more helpful logic.

The BETWEEN operator

The BETWEEN operator is logically equivalent to a pair of comparisons. "x BETWEEN y AND z" is equivalent to "x>=y AND x<=z" except that with BETWEEN, the x expression is only evaluated once. The precedence of the BETWEEN operator is the same as the precedence as operators == and != and LIKE and groups left to right.

The CASE expression

A CASE expression serves a role similar to IF-THEN-ELSE in other programming languages. WHEN expressions are evaluated from left to right until one is found that is true, at which point the corresponding THEN term becomes the result. If no WHEN expression is true then the ELSE clause determines the result or the result is NULL if there is no ELSE clause.

The optional expression that occurs in between the CASE keyword and the first WHEN keyword is the "base" expression. There are two basic forms of a CASE expression: those with and without a base expression. In a CASE without a base expression, each WHEN expression is evaluated as a boolean and the overall result is determined by first WHEN expression that is true. In a CASE with a base expression, the base expression is evaluated just once and the result is compared against each WHEN expression until a match is found. When comparing a base expression against a WHEN expression, the same collating sequence, affinity, and NULL-handling rules apply as if the base expression and WHEN expression are respectively the left- and right-hand operands of an = operator.

Assuming the subexpressions have no side-effects, the following two expressions are equivalent:

The only difference between the two CASE expressions shown above is that the x expression is evaluated exactly once in the first example but might be evaluated multiple times in the second.

A NULL result is considered false when evaluating WHEN terms. If the base expression is NULL then the result of the CASE is the result of the ELSE expression if it exists, or NULL if the ELSE clause is omitted.

The IN and NOT IN operators

The IN and NOT IN operators take a single scalar operand on the left and a vector operand on the right formed by an explicit list of zero or more scalars or by a single subquery. When the right operand of an IN or NOT IN operator is a subquery, the subquery must have a single result column. When the right operand is an empty set, the result of IN is false and the result of NOT IN is true, regardless of the left operand and even if the left operand is NULL. The result of an IN or NOT IN operator is determined by the following matrix:

Left operand
is NULL
Right operand
contains NULL
Right operand
is an empty set
Left operand found
within right operand
Result of
IN operator
Result of
NOT IN operator
no no no no false true
does not matter no yes no false true
no does not matter no yes true false
no yes no no NULL NULL
yes does not matter no does not matter NULL NULL

Note that SQLite allows the parenthesized list of scalar values on the right-hand side of an IN or NOT IN operator to be an empty list but most other SQL database database engines and the SQL92 standard require the list to contain at least one element.

Table Column Names

A column name can be any of the names defined in the CREATE TABLE statement or one of the following special identifiers: "ROWID", "OID", or "_ROWID_". These special identifiers all describe the unique integer key (the rowid) associated with every row of every table. The special identifiers only refer to the row key if the CREATE TABLE statement does not define a real column with the same name. The rowid can be used anywhere a regular column can be used.

Subqueries

SELECT statements can appear in expressions as either the right-hand operand of the IN or NOT IN operators, as a scalar quantity, or as the operand of an EXISTS operator. As a scalar quantity or the operand of an IN or NOT IN operator, the SELECT may have only a single column in its result. Compound SELECTs (connected with keywords like UNION or EXCEPT) are allowed in any subquery. With the EXISTS operator, the columns in the result set of the SELECT are ignored and the expression returns TRUE if one or more rows exist and FALSE if the result set is empty. If no terms in the SELECT expression refer to values in the containing query, then the expression is evaluated once prior to any other processing and the result is reused as necessary. If the SELECT expression does contain variables from the outer query, then the SELECT is reevaluated every time it is needed.

When a SELECT appears within an expression but is not the right operand of an IN or NOT IN operator, then the first row of the result of the SELECT becomes the value used in the expression. If the SELECT yields more than one result row, all rows after the first are ignored. If the SELECT yields no rows, then the value of the SELECT is NULL.

CAST expressions

A CAST expression changes the datatype of the into the type specified by <type>. <type> can be any non-empty type name that is valid for the type in a column definition of a CREATE TABLE statement. The <type> name is interpreted according to the rules for determining column affinity.

An explicit cast is stronger than affinity; with the CAST expression the datatype conversion is forced even if it is lossy and irrreversible.

Note that the result from casting any non-BLOB value into a BLOB and the result from casting any BLOB value into a non-BLOB value will be different depending on whether the database encoding is UTF-8, UTF-16be, or UTF-16le.

Functions

Both simple and aggregate functions are supported. (For presentation purposes, simple functions are further subdivided into core functions and date-time functions.) A simple function can be used in any expression. Simple functions return a result immediately based on their inputs. Aggregate functions may only be used in a SELECT statement. Aggregate functions compute their result across all rows of the result set.

*** DRAFT ***