AmiScript: SQL¶
AmiSQL Introduction¶
AmiSQL is part of AMI's embedded scripting language AmiScript and provides SQL-like syntax for manipulating data from tables and other databases.
AmiSQL contains familiar syntax and instructions, providing a comprehensive instruction set for working with table(s) of data. Key features include:
- Retrieval - Accessing and Normalizing data from external sources
- Blending - Perform Joins and Unions on tables
- Modifying - Inserting, Updating, and Deleting rows
- Analyzing - Grouping, Sorting, and Filtering
- Schema Definition - Creating, Updating, and Dropping tables
Broadly speaking, the logic steps involved in data-visualization are:
-
Running a query, or set of queries, on external datasources and/or the AMI-Realtime database.
- Each datasource is predefined and uniquely named.
- The results of queries on datasources will result in temporary, in-memory tables.
- The queries can be constructed using user-input/user-selected/user-associated data.
- In more advanced cases, the queries can be sequential, meaning the results from one query are then fed into a subsequent query (allowing for reactionary data blending).
-
Blending the results from Step 1, resulting in new temporary in-memory tables 3. Building visualizations on the tables from Step 1 and/or Step 2
AmiSQL and Standard SQL Differences¶
For those familiar with standard SQL, there are a few key differences:
-
Equality is done with
==
- AmiSQL:
SELECT * FROM Users WHERE userid == 15;
- MYSQL:
SELECT * FROM Users WHERE userid = 15;
- AmiSQL:
-
Checking for null is done with equality, ex.
== null
- AmiSQL:
SELECT * FROM Users WHERE status == null;
- MYSQL:
SELECT * FROM Users WHERE status IS NULL;
- AmiSQL:
-
Strings are surrounded in double quotes
- AmiSQL:
SELECT * FROM Users WHERE name == "David";
- MYSQL:
SELECT * FROM Users WHERE name = 'David';
- AmiSQL:
-
"Matches" (
~~
) instead of LIKE- AmiSQL:
SELECT * FROM Users WHERE name ~~ "^Dav";
- MYSQL:
SELECT * FROM Users WHERE name LIKE 'Dav%';
(See Simplified Text Matching for pattern matching rules/expressions)
- AmiSQL:
-
Boolean expressions
&&
works same as AND.||
works same as OR- AmiSQL:
SELECT * FROM Users WHERE userid == 15 AND status == null && name == "David";
- MYSQL:
SELECT * FROM Users WHERE userid = 15 AND status IS NULL AND name = 'David';
- AmiSQL:
-
By default all tables are
TEMPORARY
, use thePUBLIC
key word to make non-temporary tables:- AmiSql:
CREATE TABLE MyTemporaryTable(id String);
- MYSQL:
CREATE PUBLIC TABLE MyPermanantTable(id String);
- AmiSql:
-
To return distinct values, equivalent to SQL
SELECT DISTINCT
, useGROUP BY
instead- AmiSQL:
SELECT name FROM User GROUP BY name
- MYSQL:
SELECT DISTINCT name FROM User
- AmiSQL:
-
To return the number of distinct column values, use the AmiScript method
countUnique()
- AmiSQL:
SELECT countUnique(name) AS unique_name_count FROM User
- MYSQL:
SELECT COUNT(DISTINCT name) AS unique_name_count FROM User
- AmiSQL:
Data Source Access Clauses¶
use...execute clause¶
Overview
This command enables you to "execute" code on an external datasource and process/store the results locally
USE
: While datamodels have default options, you can override them for a particular query. Valid options include- ds = "datasource_name"
- The name of the datasource to execute this script on
- timeout = timeout_in_millis
- The number of milliseconds before the execute command times out
- limit = row_limit_to_return
- The max number of records returned
- ds_url = "url"
- Overrides the url of the datasource
- When used in conjunction with ds="datasource_name", supplied datasource must contain URL in the PermittedOverrides field
- ds_username = "username"
- Overrides the username of the datasource
- When used in conjunction with ds="datasource_name", supplied datasource must contain USERNAME in the PermittedOverrides field
- ds_password = "plain_text_password"
- Overrides the password of the datasource using a plain text password
- When used in conjunction with ds="datasource_name", supplied datasource must contain PASSWORD in the PermittedOverrides field
- Mutually exclusive with the ds_password_enc option
- ds_password_enc = "aes_encrypted_password"
- Overrides the password of the datasource, using an encrypted password
- Use strEncrypt(...) to create an encrypted password
- When used in conjunction with ds="datasource_name", supplied datasource must contain PASSWORD in the PermittedOverrides field
- Mutually exclusive with the ds_password option
- ds_options = "some options"
- Override the options of the datasource
- When used in conjunction with ds="datasource_name", supplied datasource must contain OPTIONS in the PermittedOverrides field
- ds_relay = "relayid"
- Override the relay of the datasource
- Run SHOW RELAYS command and inspect RelayId for available relays
- When used in conjunction with ds="datasource_name", supplied datasource must contain RELAY in the PermittedOverrides field
- ds_adapter = "SOME_ADAPTER"
- Use an "anonymous" datasource of the specified adapter type
- Run SHOW DATASOURCE_TYPES command and inspect the I column for available adapters
- Mutually exclusive with ds="datasource_name" option
- See
ami.db.anonymous.datasources.enabled
option for permission details
- _custom=some_directive
- Prefixing with an underbar (_) passes the option through to the datasource's adapter
- ds = "datasource_name"
EXECUTE
The literal text to execute. The text will be sent directly to the datasource for execution. Therefore, variables cannot be referenced. Instead, substitution can be used. See String Templating.
Syntax
Example
Placeholder Parameters¶
For more complex queries, it is possible to declare parameters after the initial query using AmiScript ?
notation.
The general format of using placeholder parameters uses the following syntax:
Parameters can either be passed individually in the order of being called:
_p1=param1 _p2=param2 ... _pn=paramn
or as a list of parameters:
_paramList
use...show_tables clause¶
Overview
This command enables you to see what tables are available on a given datasource (similar to what the AMI gui wizard shows you when selecting a table).
USE
: While datamodels have default options, you can override them for a particular query. Valid options includeds = "datasource_name"
The name of the datasource to execute the script on
The resulting table will be called TABLE
and contain two String columns: TableName
and Schema
.
Syntax
Example
This example will return a table displaying all tables available in the mydb datasource. If the datasource supports schema names, that will be populated as well.
Returns:
use...insert clause¶
Overview
This command uploads data from AMI into a datasource.
USE
: While datamodels have default options, you can override them for a particular query. Valid options includeds = "datasource_name"
The name of the datasource to execute the script on
The resulting table will be called TABLE
and contain two String columns: TableName
and Schema
Syntax
datasource_name
- string name of the datasource containing the table to insert values into
datasource_table
- name of the table to insert values into
Examples
This example will insert two rows into the mydb datasource's users table:
This example copy all rows from AMI's my_users table into the "mydb" datasource's users table:
This example copy all rows from the old_users table in the otherds datasource to the users table in the mydb datasource.
concurrent_block (advanced)¶
Overview
When running analysis that needs to query many datasources, there can be large performance gains by executing these queries simultaneously. The CONCURRENT{...}
key word causes all EXECUTE
clauses within the block to be run in parallel (instead of sequentially).
In both of the examples below, we are going to execute three queries on three separate datasources (mydb1, mydb2, mydb3). The first example executes the queries sequentially and the second example executes the queries concurrently.
Assuming each query takes 2 seconds for a database to process, the first example will complete in about 6 seconds (2+2+2), while the second, concurrent, example will complete in only 2 seconds:
Example of Sequential Execution (Default)¶
Here is the sequence of events:
-
00:00:00 - EXECUTE clause is sent to mydb1
-
00:00:00 - AMI Script sleeps until there is a response from mydb1...
-
00:00:02 - mytable1 is created (from result of query to mydb1)
-
00:00:02 - EXECUTE clause is sent to mydb2
-
00:00:02 - AMI Script sleeps until there is a response from mydb2...
-
00:00:04 - mytable2 is created (from result of query to mydb2)
-
00:00:04 - EXECUTE clause is sent to mydb3
-
00:00:04 - AMI Script sleeps until there is a response from mydb3...
-
00:00:06 - mytable3 is created (from result of query to mydb3)
-
00:00:06 - Block is complete after 6 seconds (code after the block will be executed)
Example of Concurrent Execution¶
So now let's add the CONCURRENT keyword before the block. In this case, all 3 executes will be initiated at the same time.
Here is the sequence of events:
-
00:00:00 - EXECUTE clause is sent to mydb1
-
00:00:00 - EXECUTE clause is sent to mydb2
-
00:00:00 - EXECUTE clause is sent to mydb3
-
00:00:00 - AMI Script sleeps until there is a response from all 3 queries...
-
00:00:02 - mytable1 is created (from result of query to mydb1)
-
00:00:02 - mytable2 is created (from result of query to mydb2)
-
00:00:02 - mytable3 is created (from result of query to mydb3)
-
00:00:02 - Block is complete after 2 seconds (code after the block will be executed)
Query Clauses¶
select clause¶
Overview
This clause allows the combination of Filtering (WHERE
), Sorting (ORDER BY
), Aggregating (GROUP BY
), Projecting (SELECT ...
) and Cropping (LIMIT
) of individual tables. Next, the results may be combined via Unions (UNION
) and/or Joins (FROM ... JOIN ON
). Complex Strings containing delimited data can be split into distinct rows via the UNPACK
operation. Finally, see the create_table_as_clause for storing the final result of a select clause into another table.
- To aggregate data, use the
GROUP BY
clause to identify any number of columns or derived values that should be used to group rows. Provide an alias (AS
) for derived values to be used within the select clause. The select columns can only use aggregate functions (min, max, count, sum) on underlying rows and columns declared in the group-by clause - To filter data, provide a
WHERE
clause which evaluates to a Boolean. True indicates don't filter. When filtering in conjunction with aggregating (GROUP BY
), those records filtered out will not be considered for aggregation. To filter rows after aggregation, use theHAVING
clause. - For projecting, supply columns to include in the
SELECT
clause. Derived values may be provided. Provide an alias (AS
) for derived values or to rename a column. Use star (*) to include all columns or tblname.* to include only those columns from a particular table provided in theFROM
clause. - To crop out rows, use the
LIMIT
clause. You may provide a number of rows to skip (offset
) and a number of maximum rows to return (rowcount
). - To sort, use the
ORDER BY
clause. If multiple expressions are given the left-most takes priority. - To union (concatenate) the results from multiple select clauses, use the
UNION
clause. The column names and types from the first select clause will provide the result set schema definition. If BYNAME is specified, then target columns and source columns are mapped by column nume. If BYNAME is not specified all select clauses participating in a union must define the same number of columns. Note: this operates like a union all in many databases. - To join (blend) tables, supply all tables together in the
FROM
clause and a Cartesian product will be provided. For more advanced Joins, you can use a combination of the LEFT, RIGHT, OUTER, JOIN, ON and NEAREST clauses. See Joining Data section for details - To process String columns containing multi-value delimited data, use the
UNPACK col ON delimeter
expression. This will split the string such that each entry will result in a new row.
Syntax
offset
- integer number of rows to skip
rowcount
- integer number of maximum rows that will be returned
Joining Data¶
Simple Cartesian Joins¶
Any number of tables can be joined by simply listing them after the FROM
clause which will produce a Cartesian product, such that a row will be returned for each possible combination of rows from the source table. For example, the Cartesian product of a table with 5 rows and another table with 3 rows will produce 15 rows (5x3=15). Of course, the WHERE clause can be used to limit results to only the interesting combinations.
Handling Rows that don't match rows in other tables¶
For more granular control when joining two tables, use the JOIN
syntax. There are 7 types of statements, let's assume two tables a and b:
- JOIN:
FROM a JOIN b ON expr
- This is the same as FROM a,b WHEREexpr
- LEFT JOIN:
FROM a LEFT JOIN b ON expr
- A JOIN plus all records from table a that would otherwise be omitted (due to the expr). - LEFT ONLY JOIN:
FROM a LEFT ONLY JOIN b ON expr
- Includes only records from table a that would otherwise be omitted (due to the expr). - RIGHT JOIN:
FROM a RIGHT JOIN b ON expr
- A JOIN plus all records from table b that would otherwise be omitted (due to the expr). (similar to LEFT JOIN, swapping a and b) - RIGHT ONLY JOIN:
FROM a RIGHT ONLY JOIN b ON expr
- Includes only records from table b that would otherwise be omitted (due to the expr). (similar to LEFT ONLY JOIN, swapping a and b) - OUTER JOIN: A JOIN plus all records from table
a
and from tableb
that would otherwise be omitted (due to the expr) - OUTER ONLY JOIN: Includes only records from table a and from table b that would otherwise be omitted (due to the expr).
Joining to a nearest match¶
Use the NEAREST
clause in conjunction with one of the above 7 JOIN clauses to match two records where there may not be an exact match, such as doing an "as-of" join with timestamps. After the ON
clause, include a NEAREST fuzzy_clause
.
Consider: a JOIN b ON on_expr NEAREST fuzzy_clause
, where the fuzzy clause is one of the 5 syntaxes:
a.cola == b.colb
- For each row in tablea
, join with the row in tableb
that satisfieson_expr
and has the smallest delta betweencola
andcolb
.a.cola <= b.colb
- For each row in tablea
, join with the row in tableb
that satisfieson_expr
andcola
is less than or equal tocolb
and has the smallest delta betweencola
andcolb
.a.cola >= b.colb
- For each row in tablea
, join with the row in tableb
that satisfieson_expr
andcola
is greater than or equal tocolb
and has the smallest delta betweencola
andcolb
.a.cola < b.colb
- For each row in tablea
, join with the row in tableb
that satisfieson_expr
andcola
is less than colb and has the smallest delta betweencola
andcolb
.a.cola > b.colb
- For each row in tablea
, join with the row in tableb
that satisfieson_expr
andcola
is greater than colb and has the smallest delta betweencola
andcolb
.
Note the ordering of table a
vs b
in the fuzzy_clause
does not matter. For example b.colb >= a.cola
is the same as a.cola <= b.colb
General Examples¶
Nearest Join Examples¶
Assume two existing tables: Orders and MarketData
A JOIN will only include orders that have a satisfactory market data event. Note, we're also calculating the price delta from the order vs the market
A LEFT JOIN will include all orders, event those without a market data event (using null as a place holder)
Join Examples¶
Let's assume we have two tables A and B, each with 1 column:
We can see that the value 1 only exists in the A table, the value 3 exists only in the B table and the value 2 existing in both tables. When comparing the data between them there are, exhaustively speaking, 7 different ways you could want to see what matches and what doesn't.
JOIN Syntax - Return only those rows with values that coexist in both tables. (Note, this is the same as the natural join select * from a,b where a.aid==b.bid)
OUTER JOIN Syntax - Return all rows from both tables.
OUTER ONLY JOIN Syntax - Return rows in the left table that don't have matching rows in the right table and rows in the right table that don't have matching rows in the left table.
LEFT JOIN Syntax - Return all rows from the left table and only the rows in the right table matching rows in the left table.
LEFT ONLY JOIN Syntax - Return only rows in the left table that don't have matching rows in the right table.
RIGHT JOIN Syntax - Return all rows from the right table and only the rows in the left table matching rows in the right table. (Note, this is equivalent to ... FROM b LEFT JOIN a ...)
RIGHT ONLY JOIN Syntax - Return only rows in the right table that don't have matching rows in the left table. (Note, this is equivalent to ... FROM b LEFT ONLY JOIN a ...)
MULTIPLE TABLES INNER JOIN Syntax - Return rows that are joined based on columns from multiple tables. (Note, we only support inner join for mutliple table joins)
TABLE UNPACKING Syntax - Return the result unpacked by a specific delimiter based on a specific column
analyze clause¶
Overview
This command runs analysis on a table. Analysis is performed on rows contained in a defined window, similar to a self-join. Each window can define an order-by to allow for finer control over the order in which an aggregate is applied. Multiple windows may be defined. See window_expr for details.
When adding columns, the new cells will default to null. The column will be added to the end (right-most) of the table.
tblname
- string name of the table to alter the schema of
Example
prepare clause¶
Overview
This command allows the user to "prepare" additional columns whose values are calculated as aggregates across rows. For example, a running sum needs to consider not just the current row, but all rows before the current row. This can be accomplished using the STACK()
method inside a PREPARE
clause (see example below).
ORDER BY
: In many cases, the ordering of rows in the input table is important, especially for running sums. TheORDER BY
clause is useful to guarantee the correct ordering.PARTITION BY
: Used to perform many aggregates on a single table, each aggregate considering only a subset of the table's rows based on some grouping criteria.
tblname
- string name of the table to alter the schema of
Example This example will produce a running sum of the qty column order by extime on a per-account basis
Data Modification¶
insert_values clause¶
Overview
This clause enables the insertion of raw values into an existing table. Many rows may be inserted in a single call. Each values_expr represents a single row of data to be inserted. To insert multiple rows append additional values_expr
In order to specify columns in which to insert values, provide an ordered list of column names after the table name. The same number of values must be supplied in the values_expr and in the same order. All other columns will default to null.
If a list of column names is not provided, all columns will be populated. Each values_expr should contain one value for each column in the target table. Values should be in the same natural order as the columns in the target table.
The forloop_expr allows for the insertion of values_expr to be repeated for each cycle of the for loop. The values can reference the for loop's declared variable. Multiple for loops can be nested inside one another.
Syntax
tblname
- string name of the table to insert into
colN
- string name of the column to contain the Nth value from values_expr
Examples
insert_from clause¶
Overview
This clause causes the insertion of the result of a projecting-clause (select_clause, analyze_clause, prepare_clause, execute_clause) into an existing table. Each row from the result set of the select_clause will result in the insertion of a new record into the target table. Please note, multiple tables may be inserted in a single call using the UNION
clause.
In order to specify columns in which to insert values, provide an ordered list of column names after the table name. The result set from the preceding select_clause must have the same number of columns. The columns should be in the same order. All other columns will default to null.
If a list of column names is not provided, all columns will be populated. Each values_expr should contain one value for each column in the target table. Values should be in the same natural order as the columns in the target table.
If the BYNAME keyword is included, then source and target columns are aligned based on column name. Column Names in the target table that do not existing in the query clause will be populated with null
Syntax
tblname
- string name of the table to insert into
colN
- string name of the column to contain the Nth value from results of select clause*
Example
sync clause¶
Overview
This clause causes the target table to reflect the result of a projecting-clause (select_clause, analyze_clause, prepare_clause, execute_clause), meaning that rows will be inserted, deleted and updated accordingly on the target table such that the target table will properly mirror the result of the projecting-clause.
In otherwords, the two examples will result in the same final state for the my_target table, but from a trigger stand point they are significantly different:
In the first example, the truncate instruction will cause a delete trigger to be fired for every row originally in the my_target table and then the subsequent insert instruction will cause an insert trigger to be fired for every row inserted into the my_target table.
The following rules apply for how the target table is mutated:
- Records in the projecting-clause that do not have a matching record in the target table are inserted into the target table
- Records in the target table that do not have a matching record in the projecting-clause are deleted from the target table
- Records in the target table that have a "perfect" matching record in the projecting-clause remain untouched. (no triggers fired
- Records in the target table that have a "non-perfect" matching record in the projecting-clause are updated.
An ON
clause can be used to tell the sync tool how to determine if a source and target row are considered equivalent. When an equivalent row between the target table and projecting-clause is identified, this will result in an update. Rows that don't have a match will result in a delete and/or insert.
Matching Rules:
- If all cells between two rows are equivalent, that is considered a "Perfect match"
- If an
ON
clause is specified: Two row are considered a "non-perfect" match if all of the cells specified in the ON clause are a match, but other cells are not. (If anON
clause is not specified then there can not be non-perfect matches)
In order to specify columns in which to sync values, provide an ordered list of column names after the table name (note that only these columns will be considered for the perfect-matching criteria). The result set from the preceding select_clause must have the same number of columns. The columns should be in the same order. All other columns will default to null for inserts and will be left unaltered for updates.
If a list of column names is not provided, all columns will be populated. Each values_expr should contain one value for each column in the target table. Values should be in the same natural order as the columns in the target table.
Syntax
tblname
- string name of the table to insert into
colN
- string name of the column to contain the Nth value from results of select clause
onColN - string name of the column to be used in determining if two records are equivalent
Examples
delete_from clause¶
Overview
This clause enables the deletion of specific rows or all rows from a table and also calls the onDelete triggers if applicable.
When it comes to deleting all rows from the table, this is similar to using truncate clause but consumes more memory.
To delete only specific rows, provide a WHERE
clause which must evaluate to a Boolean value where true indicates that the row should be deleted. Otherwise, to delete all rows, do not provide a WHERE
clause.
You can specify the maximum number of rows to delete using the LIMIT
clause (maxdelete). Additionally, you may provide a number of rows to skip when deleting (skipdelete). Note the limit is applied after the where clause, meaning that the skipdelete indicates how many records to skip deleting that would otherwise be deleted based on the where clause.
The JOIN .. ON
clause allows you to use the values from another table to determine which rows to delete from the target table. The JOIN
operates similar to the SELECT
's JOIN
clause. Note that LEFT JOIN
and LEFT ONLY
are also supported. Keep in mind that you can only delete rows from the left table.
Syntax
tblname
- string name of the table to delete from
Examples
truncate clause¶
Overview
This command deletes all rows from a table. You can specify multiple tables, separated by commas.
Syntax
Examples
update clause¶
Overview
This clause can be used to update specific rows or all rows in a table. Following the SET
statement, provide an assignment (target=expr
) clause for each column to be updated, where target is the name of an existing column.
To update only specific rows, provide a WHERE
clause which evaluates to a Boolean value, where true indicates that the row should be updated. Otherwise, to update all rows, do not provide a WHERE
clause.
You can specify the maximum number of rows to update using the LIMIT
clause (maxupdate
). Additionally, you may provide a number of rows to skip when updating (skipupdate
). Note the limit is applied after the where clause, meaning that the skipupdate indicates how many records to skip updating that would otherwise be updated based on the where clause.
The JOIN .. ON
clause allows you to use the values from another table to update values in the target table. The JOIN
operates similar to the SELECT
's JOIN
clause. Note that LEFT JOIN
and LEFT ONLY
are also supported. Keep in mind that you can only update values on the left table.
Syntax
tblname
- string name of the table to update
colN
- string name of the column to assign value from expression
Schema Modification¶
create_table clause¶
Overview
This command is used to create a new, empty (no rows) table with a specified set of columns (names and types). If another table already exists with the tblname
, the existing table will be implicitly dropped and replaced with the newly created table.
At least one column must be specified, and each column must have a unique name.
Syntax
tblname
- string name of the table to create
colN
- string name of the Nth column to create
Example
create_table_as clause¶
Overview
This command is used to create a new table which has the same schema definition as the result set of a select_clause. If another table already exists with the tblname
, the existing table will be implicitly dropped and replaced with the newly created table.
All rows from the select_clause's result set will be inserted into the table.
Syntax
tblname
- string name of the table to create
Example
alter_table clause¶
Overview
This command alters the schema (column names and types) of an existing table. You can add columns (ADD
), rename columns (RENAME ... TO
), delete columns (DROP
), and change the type of columns (MODIFY
). Note, when multiple alterations are supplied, they are evaluated left to right. When changing types, a best effort cast will be applied.
When adding columns, the new cells will default to null. The column will be added to the end (right-most) of the table.
Syntax
tblname
- string name of the table to alter
Examples
rename_table clause¶
Overview
This command renames an existing table. If another table already exists with the newname, the existing table will be implicitly dropped and replaced with the renamed table.
Syntax
tblname
- string name of the table to rename
newname
- string name of the new name for the table
Example
drop_table clause¶
Overview
This command drops an existing table, along with its schema and all rows. You can specify multiple tables, separated by commas.
Syntax
tblname
- string name of the table to drop
Example
AMIDB Scope Options¶
TEMPORARY
A table visible only to the user. It exists as long as the user session is active.
VARIABLE
A table created as an AMI Script Table object. It has the same scope and lifetime as a TEMPORARY table.
PUBLIC
A table which is visible to all users. Its definition (but not the content) persists. For persistent content see PersistEngine options.
If the scope is not specified, AMI will first check TEMPORARY
tables, then VARIABLE
tables, and lastly PUBLIC
tables. Tables with different scopes but the same name can exist. However, if a PUBLIC table already exists, then it cannot be created again.
You may look at the DefinedBy column to see if a column is a variable or temporary.
Note: Argument TEMP
can also be used instead of TEMPORARY
with the same effect
Display¶
shows scoped tables
shows columns of scoped tables
shows column details of a scoped table
provides the command which can be used to create the scoped table
Schema Modification¶
CREATE¶
creates a scoped table
Examples
ALTER¶
modifies a scoped table
Examples
DROP¶
deletes a scoped table
Examples
RENAME¶
renames a scoped table
Examples
(All schema modification syntax options apply as in Schema Modification)
Data Modification¶
INSERT¶
inserts row into scoped table
Examples
PREPARE¶
prepares additional columns on scoped table
ANALYZE¶
runs analysis on scoped table
Examples
SYNC¶
syncs scoped table to ProjectingClause result
Example
TRUNCATE¶
deletes all rows from scoped table
(All data modification syntax options apply as in Data Modification)
Query Clauses¶
SELECT¶
queries scoped table
Examples
UPDATE¶
updates scoped table
Examples
DELETE FROM¶
deletes data from scoped table
Examples
(All query clauses syntax options apply as in Query Clauses)
Assigning query results to local variables¶
AMI Script supports the ability to execute queries and store the results into local variables. There are various ways the query results can be stored, beyond just Table objects, see type Coercion rules below.
This is the general form:
type var_name = query;
type
: the type of local variable to store the query result.
var_name
: Name of the local
query
: a valid sql query. Either select_clause, analyze_clause, or prepare_clause
Valid Type Coercion rules are listed below. Please note that an attempt to coerce the query result to an invalid type, will result in null.
List: Query results that have a single column can be stored to a List. Each value in the query result will be inserted into the list in order.
Set: Query results that have a single column can be stored to a Set. Each value in the query result will be inserted into the set in order, ignoring duplicate values (as per set's contract).
Map: Query results that have a single row can be stored to a map. The map's keys will be the columns' names, and the map's values will be the values of the first (and only) row.
Primitives: Queries that result in a single value (one row and one column) can be stored to a primitive variable. The value of the single cell is what will be stored.
Nested Expressions¶
alter_col_expr¶
colname
String name of the column to alter
newname
String name of the column to create
beforecol
String name of the existing column to add the new column before
column_expr¶
colname
name of the column that will contain results of expression
*
expands to available columns for all tables or specified table
expr¶
See Standard Operators section.
forloop_expr¶
varname
variable that can be referenced by expresions within the for loop
start_const
initial value of varname, inclusive
end_const
last number to loop over, inclusive unless the step would cause it to be skipped.
step
positive number dictating size to increment or decrement varname by in each loop cycle.
A negative or zero step will cause an error. If start_const \<= end_const then step is used to increment, otherwise decrement.
in_expr¶
When only a single column will participate in the in clause:
When two or more columns will participate in the in clause:
The IN clause evaluates to true for a given row if the result of expr from the left hand expression exists in the right hand expression. If using multiple columns, the combination of the expr must match the combination of the values_expr or select_clause for a given row.
lambda_expr¶
The lambda expression allows for contained AMI Script to be executed once per row returned by the containing select statement. colname is the name of the column that the return value of this lambda will be placed in.
nested_select_expr¶
select_clause
inner query to execute
aliasname
name of alias for use in column_expr and expr
table_expr¶
*tblname
name of table to include in select
*aliasname
name of alias for use in column_expr and expr
type_expr¶
values_expr¶
where_expr¶
Expressions with the added IN clause (in_expr) funtionality
window_expr¶
window_name
The name of this window which is used in the preceding col_expr (in the analyze clause). For example, if the name of a window is "dates" then max(dates.open) would give you the max value of the open column for that window.
windowing_expr
The boolean expression that defines (for each row) the other rows that belong to the same window.
order_expr
Optional expression which determines the natural order of rows for the window. This is important for order dependent aggregates such as first(...) and last(...)
partitioning_expr
Optional expression which determines how the table is broken up before windowing is considered. In other words, two rows that evaluate to a different partitioning_expr will never belong to the same window.
prep_expr¶
stack(expr) |
running sum of expr from first to row n, inclusive |
count(expr) |
running count where expr!=null to the current row, inclusive |
norm(expr) |
similar to stack, but scaled so that the max value is one. Short hand for: stack(expr)/sum(expr). See dnorm(...) |
dnorm(expr) |
similar to stack, but scaled so that the min value is zero. Short hand for: (stack(expr) - expr)/sum(expr). This, in conjunction with norm(...) can used for preparing area charts that should fill an area, such as a pie chart. |
rank(expr) |
will evaluate expr for all rows and the lowest value will get 1, 2^nd^ will get 2, etc. If multiple rows expr evaluate to the same value, they will get the same ranking. |
urank(expr) |
will evaluate expr for all rows and the lowest value will get 1, 2^nd^ will get 2, etc. If multiple rows expr evaluate to the same value, the first occurrence gets the lower value. |
unique(expr) |
each unique evaluation of expr will return a unique integer, starting at 1. Repeat occurrences of expr will return the same integer value. |
offset(expr,offset) |
Will grab the value offset from the current row. If offset==0 use current row, negative number is nth prior row, positive number is nth future row. Evaluates to null if offset is out of bounds for current table. Ex: offset(price,-1) for first row returns null, for second row returns first row's price and for nth row returns (n-1)ths row's price |
kmeansCluster(expr1,expr2,...,nClusters,seed) |
Clusters data using the k-means clustering algorithm. Takes one or more expressions, followed by int nClusters specifying the number of clusters and long seed specifying the random seed for initializing the model parameters. The seed argument may be set to null if no specific seed is desired. Returns a column of integers labeling each record according to its learned cluster. Input expressions must evaluate to numbers. |
emGmmCluster(expr1,expr2,...,nClusters,seed) |
Same functionality as kmeansCluster(), but instead uses the expectation-maximization algorithm to train a Gaussian mixture model. |
viGmmCluster(expr1,expr2,...,nClusters,seed) |
Same functionality as kmeansCluster(), but instead uses the variational inference algorithm to train a Gaussian mixture model. |
agg_expr¶
sum(expr) |
Summation of all values, skips nulls |
min(expr) |
Minimum value, skips nulls |
max(expr) |
Maximum value, skips nulls |
count(expr) |
Count of non-null values |
countUnique(expr) |
Count of unique non-null values |
cat(expr, delim, limit) |
Concatenates the first limit non-null values to a string with delim. Ex: cat(a,'/',2) selects the first 2 non-null values from column a and appends '/' at the end of each value. |
avg(expr) |
Average of all non-null values |
var(expr) |
Population Variance of all non-null values |
varS(expr) |
Sample Variance of all non-null values |
stdev(expr) |
Population Standard Deviation of all non-null values |
stdevS(expr) |
Sample Standard Deviation of all non-null values |
first(expr) |
First value, may be null if first value is null |
last(expr) |
Last value, may be null if last value is null |
covar(expr1, expr2) |
Population Covariance of 1st expr and 2nd expr where both aren't null. |
covarS(expr1, expr2) |
Sample Covariance of 1st expr and 2nd expr where both aren't null. |
cor(expr1, expr2) |
Correlation of 1st expr and 2nd expr where both aren't null. Returns 0 - 1 |
beta(expr1, expr2) |
Beta of 1st expr and 2nd expr where both are not null. The beta function is defined as beta(x, y) = covar(x, y) / var(y). If stdev(y) = 0, the expression returns NaN. |
avgExp(expr, decay, desc) |
The exponentially weighted average of non-null values with a specified decay rate. Desc of true/false means first/last row are highest weighted. |
avgGauss(expr, variance) |
The Gaussian weighted average of non-null values, such that the middle value has highest weight. |
median(expr) |
The median of all non-null values. |
percentileCont(expr) |
eturns the value at the percentile provided. Is interpolated and might not be equal to any value within the supplied dataset |
percentileDisc(expr) |
Returns the value nearest to the percentile provided |
AMI-SQL Escaping Rules¶
If you want to use a reserve keyword as your table name, such as order,limit, you need to use backtick ` to escape it.