SELECT Statements

A SELECT statement can consist of the following basic clauses:

SELECT
INTO
FROM
JOIN
WHERE
GROUP BY
HAVING
UNION
ORDER BY
LIMIT

SELECT Syntax

The following syntax diagram outlines the syntax supported by the SQL engine of the driver:

SELECT {
  [ TOP <numeric_literal> | DISTINCT ]
  { 
    * 
    | { 
        <expression> [ [ AS ] <column_reference> ] 
        | { <table_name> | <correlation_name> } .* 
      } [ , ... ] 
  }
  [ INTO csv:// [ filename= ] <file_path> [ ;delimiter=tab ] ]
  { 
    FROM <table_reference> [ [ AS ] <identifier> ] 
  } [ , ... ]
  [ [  
      INNER | { { LEFT | RIGHT | FULL } [ OUTER ] } 
    ] JOIN <table_reference> [ ON <search_condition> ] [ [ AS ] <identifier> ] 
  ] [ ... ] 
  [ WHERE <search_condition> ]
  [ GROUP BY <column_reference> [ , ... ] ]
  [ HAVING <search_condition> ]
  [ UNION [ ALL ] <select_statement> ]
  [ 
    ORDER BY 
    <column_reference> [ ASC | DESC ] [ NULLS FIRST | NULLS LAST ]
  ]
  [ 
    LIMIT <expression>
    [ 
      { OFFSET | , }
      <expression> 
    ]
  ] 
} | SCOPE_IDENTITY() 

<expression> ::=
  | <column_reference>
  | @ <parameter> 
  | ?
  | COUNT( * | { [ DISTINCT ] <expression> } )
  | { AVG | MAX | MIN | SUM | COUNT } ( <expression> ) 
  | NULLIF ( <expression> , <expression> ) 
  | COALESCE ( <expression> , ... ) 
  | CASE <expression>
      WHEN { <expression> | <search_condition> } THEN { <expression> | NULL } [ ... ]
    [ ELSE { <expression> | NULL } ]
    END 
  | <literal>
  | <sql_function> 

<search_condition> ::= 
  {
    <expression> { = | > | < | >= | <= | <> | != | LIKE | NOT LIKE | IN | NOT IN | IS NULL | IS NOT NULL | AND | OR | CONTAINS | BETWEEN } [ <expression> ]
  } [ { AND | OR } ... ]

Examples

Return all columns:
```
SELECT * FROM Customers
```

Rename a column:

```
SELECT [CompanyName] AS MY_CompanyName FROM Customers
```

Cast a column's data as a different data type:

```
SELECT CAST(Balance AS VARCHAR) AS Str_Balance FROM Customers
```
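
Other expression forms from the syntax diagram, such as COALESCE and CASE, can appear in the select list in the same way. A minimal sketch, assuming City may contain NULL values and that Country holds values like 'US'; the aliases CityOrUnknown and Region are illustrative only:

```sql
SELECT CompanyName,
       COALESCE(City, 'Unknown') AS CityOrUnknown,
       CASE Country WHEN 'US' THEN 'Domestic' ELSE 'International' END AS Region
FROM Customers
```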

Search data:

```
SELECT * FROM Customers WHERE Country = 'US'
```
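
The search condition grammar also allows several predicates to be combined with AND and OR. A sketch, assuming the Customers table has the Country, CompanyName, and Balance columns used in the other examples; the literal values are illustrative:

```sql
SELECT * FROM Customers
WHERE Country IN ('US', 'CA')
  AND CompanyName LIKE 'A%'
  AND Balance BETWEEN 100 AND 500
```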

Return the number of items matching the query criteria:
```
SELECT COUNT(*) AS MyCount FROM Customers 
```
Return the number of unique items matching the query criteria:
```
SELECT COUNT(DISTINCT CompanyName) FROM Customers 
```
Return the unique items matching the query criteria:
```
SELECT DISTINCT CompanyName FROM Customers 
```
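
The syntax diagram also allows UNION to combine the results of two SELECT statements. In standard SQL, UNION removes duplicate rows and UNION ALL keeps them; the sketch below assumes the driver follows that behavior and that both tables expose a CustomerId column, as in the multi-table example:

```sql
SELECT CustomerId FROM Customers
UNION
SELECT CustomerId FROM Orders
```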

Summarize data:

```
SELECT CompanyName, MAX(Balance) FROM Customers GROUP BY CompanyName
```

See Aggregate Functions for details.
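
Grouped results can be filtered with the HAVING clause shown in the syntax diagram. A sketch, assuming Balance is numeric; the threshold of 1000 and the alias MaxBalance are illustrative:

```sql
SELECT CompanyName, MAX(Balance) AS MaxBalance
FROM Customers
GROUP BY CompanyName
HAVING MAX(Balance) > 1000
```

The difference from WHERE is that HAVING is evaluated after grouping, so it can test aggregate values.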

Retrieve data from multiple tables:

```
SELECT Customers.ContactName, Orders.OrderDate FROM Customers, Orders WHERE Customers.CustomerId=Orders.CustomerId
```

See JOIN Queries for details.
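
The same relationship can be written with the explicit JOIN ... ON form from the syntax diagram. A sketch using a left outer join, which would also return customers that have no matching rows in Orders (with OrderDate as NULL), assuming the same Customers and Orders tables:

```sql
SELECT Customers.ContactName, Orders.OrderDate
FROM Customers
LEFT OUTER JOIN Orders ON Customers.CustomerId = Orders.CustomerId
```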

Sort a result set in ascending order:

```
SELECT City, CompanyName FROM Customers ORDER BY CompanyName ASC
```
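
The ORDER BY clause in the syntax diagram also accepts DESC and a NULLS FIRST or NULLS LAST modifier. A sketch, assuming City can contain NULL values, that sorts in descending order and places rows without a city at the end:

```sql
SELECT City, CompanyName FROM Customers ORDER BY City DESC NULLS LAST
```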

Restrict a result set to the specified number of rows:
```
SELECT City, CompanyName FROM Customers LIMIT 10 
```
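
Per the syntax diagram, LIMIT can be followed by OFFSET. A sketch of paging, assuming the conventional meaning in which OFFSET 20 skips the first 20 rows before 10 are returned; the ORDER BY is included so that page boundaries stay stable between calls:

```sql
SELECT City, CompanyName FROM Customers ORDER BY CompanyName LIMIT 10 OFFSET 20
```
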
Parameterize a query to pass in inputs at execution time. This enables you to create prepared statements and mitigate SQL injection attacks.
```
SELECT * FROM Customers WHERE Country = @param
```
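
The syntax diagram also lists ? as a parameter marker. A sketch of the positional form; in JDBC, values for such markers are typically bound by 1-based position through PreparedStatement setters such as setString(1, "US"). The Balance filter is illustrative:

```sql
SELECT * FROM Customers WHERE Country = ? AND Balance > ?
```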

See Explicitly Caching Data for information on using the SELECT statement in offline mode.

Pseudo Columns

Some input-only fields are available in SELECT statements. These fields, called pseudo columns, do not appear as regular columns in the results, but you can specify them in the WHERE clause. You can use pseudo columns to access additional features from Spark SQL.

```
SELECT * FROM Customers WHERE MyPseudocolumn = 'MyValue'
```

JDBC Driver for Spark SQL
