Reference¶
BQL Statements¶
CREATE SINK¶
Synopsis¶
CREATE SINK name TYPE type_name [WITH parameter_name = parameter_value [, ...]]
Description¶
CREATE SINK
creates a new sink in a topology.
Parameters¶
- name
- The name of the sink to be created.
- type_name
- The type name of the sink.
- parameter_name
- The name of a sink-specific parameter.
- parameter_value
- The value for a sink-specific parameter.
Sink Parameters¶
The optional WITH
clause specifies parameters specific to the sink.
See each sink’s documentation to find out parameters it provides.
Examples¶
To create a sink having the name “snk” with no sink-specific parameter:
CREATE SINK snk TYPE fluentd;
To create a sink with sink-specific parameters:
CREATE SINK fluentd TYPE fluentd WITH tag_field = "fluentd_tag";
As you can see, the name of a sink can be same as the type name of a sink.
CREATE SOURCE¶
Synopsis¶
CREATE [PAUSED] SOURCE name TYPE type_name [WITH parameter_name = parameter_value [, ...]]
Description¶
CREATE SOURCE
creates a new source in a topology.
Parameters¶
PAUSED
- The source is paused when it’s created if this option is used.
- name
- The name of the source to be created.
- type_name
- The type name of the source.
- parameter_name
- The name of a source-specific parameter.
- parameter_value
- The value for a source-specific parameter.
Source Parameters¶
The optional WITH
clause specifies parameters specific to the source.
See each source’s documentation to find out parameters it provides.
Notes¶
Some sources stop after emitting all tuples. They can stop even before any
subsequent statement is executed. For such sources, specify PAUSED
parameter
and run RESUME SOURCE after completely setting up a topology so
that no tuple emitted from the source will be lost.
Examples¶
To create a source having the name “src” with no source-specific parameter:
CREATE SOURCE src TYPE dropped_tuples;
To create a source with source-specific parameters:
CREATE SOURCE fluentd TYPE fluentd WITH bind = "0.0.0.0:12345",
tag_field = "my_tag";
As you can see, the name of a source can be same as the type name of a source.
CREATE STATE¶
Synopsis¶
CREATE STATE name TYPE type_name [WITH parameter_name = parameter_value [, ...]]
Description¶
CREATE STATE
creates a new UDS (User Defined State) in a topology.
Parameters¶
- name
- The name of the UDS to be created.
- type_name
- The type name of the UDS.
- parameter_name
- The name of a UDS-specific parameter.
- parameter_value
- The value for a UDS-specific parameter.
UDS Parameters¶
The optional WITH
clause specifies parameters specific to the UDS.
See each UDS’s documentation to find out parameters it provides.
Examples¶
To create a UDS named “my_uds” with no UDS-specific parameter:
CREATE STATE my_uds TYPE my_uds_type;
To create a UDS with UDS-specific parameters:
CREATE STATE my_ids TYPE snowflake_id WITH machine_id = 1;
CREATE STREAM¶
Synopsis¶
CREATE STREAM name AS select
Description¶
CREATE STREAM
creates a new stream (a.k.a a continuous view) in a topology.
Parameters¶
- name
- The name of the stream to be created.
- select
- The
SELECT
statement to generate a stream. select can be anySELECT
statement including a statement usingUNION ALL
.
Examples¶
To create a stream named “strm”:
CREATE STREAM strm AS SELECT RSTREAM * FROM src [RANGE 1 TUPLES];
To create a stream which merges all tuples from multiple streams:
CREATE STREAM strm AS
SELECT RSTREAM * FROM src1 [RANGE 1 TUPLES]
UNION ALL
SELECT RSTREAM * FROM src2 [RANGE 1 TUPLES];
DROP SINK¶
Synopsis¶
DROP SINK name
Description¶
DROP SINK
drops a sink that is already created in a topology. The sink can
no longer be used after executing the statement.
Parameters¶
- name
- The name of the sink to be dropped.
DROP SOURCE¶
Synopsis¶
DROP SOURCE name
Description¶
DROP SOURCE
drops a source that is already created in a topology. The
source is stopped and removed from a topology. After executing the statement,
the source cannot be used.
Parameters¶
- name
- The name of the source to be dropped.
DROP STATE¶
Synopsis¶
DROP STATE name
Description¶
DROP STATE
drops a UDS that is already created in a topology. The UDS can
no longer be used after executing the statement.
Note
Even if a uds
sink exist for the UDS, the sink will not be dropped when
dropping the UDS. The uds
sink must be dropped manually.
Parameters¶
- name
- The name of the UDS to be dropped.
DROP STREAM¶
Synopsis¶
DROP STREAM name
Description¶
DROP STREAM
drops a stream that is already created in a topology. The
stream can no longer be used after executing the statement.
Parameters¶
- name
- The name of the stream to be dropped.
INSERT INTO¶
Synopsis¶
INSERT INTO sink FROM stream
Description¶
INSERT INTO
inserts tuples from a stream or a source to a sink.
Parameters¶
- sink
- The name of the sink to which tuples are inserted.
- stream
- The name of a stream or a source.
Examples¶
To insert tuples into a sink from a source having the name “src”:
INSERT INTO snk FROM src;
LOAD STATE¶
Synopsis¶
LOAD STATE name TYPE type_name [TAG tag]
[SET set_parameter_name = set_parameter_key]
[create_if_not_saved]
where create_if_not_saved is:
OR CREATE IF NOT SAVED
[WITH create_parameter_name = create_parameter_value]
Description¶
LOAD STATE
loads a UDS that is previously saved by
SAVE STATE.
LOAD STATE
fails if the UDS hasn’t been saved yet. When
OR CREATE IF NOT SAVED
is specified, LOAD STATE
creates a new UDS with
the given optional parameters if the UDS hasn’t been saved yet.
LOAD STATE
, even with OR CREATE IF NOT SAVED
, fails if the UDS doesn’t
support the statement.
Parameters¶
- name
- The name of the UDS to be loaded.
- type_name
- The type name of the UDS.
- tag
- The name of the user defined tag for versioning of the saved UDS data. When tag is omitted, “default” is used as the default tag name.
- set_parameter_name
- The name of a UDS-specific parameter defied for
LOAD STATE
. - set_parameter_value
- The value for a UDS-specific parameter defined for
LOAD STATE
. - create_parameter_name
- The name of a UDS-specific parameter defined for
CREATE STATE
. - create_parameter_value
- The value for a UDS-specific parameter defined for
CREATE STATE
.
LOAD STATE
can have two sets of parameters: set_parameters and
create_parameters. set_parameters are used when there’s a saved UDS
data having the given tag. On the other hand, create_parameters are used
when the UDS hasn’t been saved yet. create_parameters are exactly same as
parameters that the UDS defines for CREATE STATE. However,
set_parameters are often completely different from create_parameters.
Because create_parameters are often saved as a part of the UDS’s
information by SAVE STATE
, set_parameters doesn’t have to have the same
set of parameters defined in create_parameters.
There’re some use-cases that a UDS uses set_parameters:
Customize loading behavior
- When a UDS doesn’t provide a proper versioning of saved data,
LOAD STATE
may fail to load it due to the format incompatibility. In such a case, it’s difficult to modify saved binary data to have a format version number. Thus, providing a set_parameter specifying the format version number could be the only solution.
- When a UDS doesn’t provide a proper versioning of saved data,
Overwrite some saved values of create_parameters
Like create_parameters, set_parameters are specific to each UDS. See the documentation of each UDS to find out parameters it provides.
Examples¶
To load a UDS named “my_uds” and having the type “my_uds_type”:
LOAD STATE my_uds TYPE my_uds_type;
Note that “my_uds” needs to be saved before executing this statement. Otherwise, it fails.
To load a UDS named “my_uds” and having the type “my_uds_type” and assigned a tag “trained”:
LOAD STATE my_uds TYPE my_uds_type TAG trained;
To load a UDS with set_parameters:
LOAD STATE my_uds TYPE my_uds_type TAG trained
SET force_format_version = "v1";
To load a UDS that hasn’t been saved yet with OR CREATE IF NOT SAVED
:
LOAD STATE my_uds TYPE my_uds_type OR CREATE IF NOT SAVED;
To load a UDS that hasn’t been saved yet with OR CREATE IF NOT SAVED
with
create_parameters:
LOAD STATE my_uds TYPE my_uds_type OR CREATE IF NOT SAVED
WITH id = 1;
When the UDS hasn’t been saved previously, the statement above falls back into
CREATE STATE
as follows:
CREATE STATE my_uds TYPE my_uds_type WITH id = 1;
OR CREATE IF NOT SAVED
can be used with a tag and set_parameters:
LOAD STATE my_uds TYPE my_uds_type TAG trained SET force_format_version = "v1"
OR CREATE IF NOT SAVED WITH id = 1;
PAUSE SOURCE¶
Synopsis¶
PAUSE SOURCE name
Description¶
PAUSE SOURCE
pauses a running source so that the source stops emitting
tuples until executing RESUME SOURCE on it again. Executing
PAUSE SOURCE
on a paused source doesn’t affect anything.
PAUSE SOURCE
fails if the source doesn’t support the statement.
Parameters¶
- name
- The name of the source to be paused.
RESUME SOURCE¶
Synopsis¶
RESUME SOURCE name
Description¶
RESUME SOURCE
resumes a paused source so that the source can start to emit
tuples again. Executing RESUME SOURCE
on a running source doesn’t affect anything.
RESUME SOURCE
fails if the source doesn’t support the statement.
Parameters¶
- name
- The name of the source to be resumed.
Examples¶
A common use case of RESUME SOURCE
to resume a source which is created by
CREATE PAUSED SOURCE
.
CREATE PAUSED SOURCE src TYPE some_source_type WITH ...parameters...;
-- ... construct a topology connected to src ...
RESUME SOURCE src;
By doing this, no tuple emitted from src will be lost.
REWIND SOURCE¶
Synopsis¶
REWIND SOURCE name
Description¶
REWIND SOURCE
rewinds a source so that the source emits tuples from the
beginning again.
REWIND SOURCE
fails if the source doesn’t support the statement.
Parameters¶
- name
- The name of the source to be rewound.
SAVE STATE¶
Synopsis¶
SAVE STATE name [TAG tag]
Description¶
SAVE STATE
saves a UDS to SensorBee’s storage. The location or the format
of the saved UDS depends on a storage that SensorBee uses and is not
controllable from BQL.
SAVE STATE
fails if the UDS doesn’t support the statement.
Parameters¶
- name
- The name of the UDS to be saved.
- tag
- The name of the user defined tag for versioning of the saved UDS data. When tag is omitted, “default” is used as the default tag name.
Examples¶
To save a UDS named “my_uds” without a tag:
SAVE STATE my_uds;
To save a UDS with a tag “trained”:
SAVE STATE my_uds TAG trained;
SELECT¶
Synopsis¶
SELECT emitter {* | expression [AS output_name]} [, ...]
FROM from_item stream_to_relation_operator
[AS stream_alias] [, ...]
[WHERE condition [, ...]]
[GROUP BY expression [, ...]]
[HAVING condition [, ...]]
[UNION ALL select]
where emitter is:
{RSTREAM | ISTREAM | DSTREAM}
where stream_to_relation_operator is:
'[' RANGE range_number {TUPLES | SECONDS | MILLISECONDS}
[, BUFFER SIZE buffer_size]
[, drop_mode IF FULL]
']'
Description¶
SELECT
retrieves tuples from one or more streams. The general processing of
SELECT
is as follows:
- Each from_item is converted into a relation (i.e. a window) from a
stream. Then, each tuple emitted from a from_item is computed within
the window. If more than one element is specified in the
FROM
clause, they are cross-joined. - When the
WHERE
clause is specified, condition is evaluated for each set of tuples cross-joined in theFROM
clause. Tuples which do not satisfy the condition are eliminated from the output. - If the
GROUP BY
clause is specified, tuples satisfied the condition in theWHERE
clause are combined into groups based on the result of expressions. Then, aggregate functions are performed on each group. When theHAVING
clause is given, it eliminates groups that do not satisfy the given condition. - The output tuples are computed using the
SELECT
output expressions for each tuple or group. - Computed output tuples are converted into a stream using
Relation-to-Stream Operators and emitted from the
SELECT
. UNION ALL
, if present, combines outputs from multipleSELECT
. It simply emits all tuples from allSELECT
without considering duplicates.
Parameters¶
Emitter¶
emitter controls how a SELECT
emits resulting tuples.
RSTREAM
- When
RSTREAM
is specified, all tuples in a relation as a result of processing a newly coming tuple are output. See Relation-to-Stream Operators for more details. ISTREAM
- When
ISTREAM
is specified, tuples contained in the current relation but not in the previously computed relation are emitted. In other words, tuples that are newly inserted or updated since the previous computation are output. See Relation-to-Stream Operators for more details. DSTREAM
- When
DSTREAM
is specified, tuples contained in the previously computed relation but not in the current relation are emitted. In other words, tuples in the previous relation that are deleted or updated in the current relation are output. Note that output tuples are from the previous relation so that they have old values. See Relation-to-Stream Operators for more details.
FROM
Clause¶
The FROM
clause specifies one or more source streams for the SELECT
and converts those streams into relations using stream to relation operators.
The FROM
clause contains following parameters:
- from_item
- from_item is a source stream which can be either a source, a stream, or a UDSF.
- range_number
range_number is a numeric value which specifies how many tuples are in the window. range_number is followed by one of interval types:
TUPLES
,SECONDS
, orMILLISECONDS
.When
TUPLES
is given, range_number must be a positive integer and the window can contain at most range_number tuples. If a new tuple is inserted into the window having range_number tuples, the oldest tuple is removed. “The oldest tuple” is the tuple that was inserted into the window before any other tuples, not the tuple having the oldest timestamp. The maximum range_number is 1048575 withTUPLES
keywords.When
SECONDS
orMILLISECONDS
is specified, range_number can be a positive number and the difference of the minimum and maximum timestamps of tuples in the window can be at most range_number seconds or milliseconds. If a new tuple is inserted into the window, tuples whose timestamp is range_number seconds or milliseconds earlier than the new tuple’s timestamp are removed. The maximum range_number is 86400 withSECONDS
and 86400000 withMILLISECONDS
.- buffer_size
- buffer_size specifies the size of buffer, or a queue, located between
from_item and the
SELECT
. buffer_size must be an integer and greater than 0. The meximum buffer_size is 131071. - drop_mode
drop_mode controls how a new tuple is inserted into the buffer located between from_item and the
SELECT
when the buffer is full. drop_mode can be one of the followings:WAIT
- A new tuple emitted from from_item is blocked until the
SELECT
consumes at least one tuple.
- A new tuple emitted from from_item is blocked until the
DROP OLDEST
- The oldest tuple in the buffer is removed and a new tuple is inserted into the buffer. “The oldest tuple” is the tuple that was inserted into the buffer before any other tuples, not the tuple having the oldest timestamp.
DROP NEWEST
- The oldest tuple in the buffer is removed and a new tuple is inserted into the buffer. “The newest tuple” is the tuple that was inserted into the buffer after any other tuples, not the tuple having the newest timestamp.
Note
A buffer is different from a window. A buffer is placed in front of a window. A window gets a new tuple from a buffer and computes a new relation. A buffer is used not to block emitting tuples so that multiple
SELECT
statements can work concurrently without waiting for their receivers to consume tuples.- stream_alias
- stream_alias provides an alias of from_item and it can be referred
by the alias in other parts of the
SELECT
. If the alias is given, the original name is hidden and cannot be used to refer from_item.
Fields of tuples can be referred by <field_name>
or
<stream>:<field_name>
in other clauses and the SELECT
list. For
example, when the SELECT
has FROM strm [RANGE 1 TUPLES]
and strm
emits {"a":<some value>}
, the field a
can be referred by a
or
strm:a
. These two forms cannot be mixed in a SELECT
statement.
The form <stream>:<field_name>
is required when the FROM
clause has
multiple input streams.
WHERE
Clause¶
The SELECT
can optionally have a WHERE
clause. The WHERE
clause
have a condition. The condition can be any expression that evaluates to a
result of type bool
. Any tuple that does not satisfy the condition
(i.e. the result of the expression is false
) will be eliminated from the
output.
Operators describes operators that can be used in the condition.
GROUP BY
Clause¶
The GROUP BY
clause is an optional clause and condenses into a single
tuple all selected tuples whose expressions specified in GROUP BY
clause
result in the same value.
expression can be any expression using fields of an input tuple. When there’re multiple expressions in the clause, tuples having the same set of values computed from those expressions are grouped into a single tuple.
When the GROUP BY
clause is present, any ungrouped field cannot be used
as an output field without aggregate functions. For example, when tuples have
4 fields a
, b
, c
, and d
, and the GROUP BY
clause has
following expressions:
GROUP BY a, b + c
a
can only be used as an output field:
SELECT a FROM stream [RANGE 1 TUPLES]
GROUP BY a, b + c;
Other fields need to be specified in aggregate functions:
SELECT a, max(b), min(b + c), avg(c * d) FROM stream [RANGE 1 TUPLES]
GROUP BY a, b + c;
Aggregate functions are evaluated for each group using all tuples in the group.
Note
The GROUP BY
clause performs grouping within a window:
SELECT a FROM stream [RANGE 10 TUPLES]
GROUP BY a;
This SELECT
computes at most 10 groups of tuples because there’re only
10 tuples in the window.
HAVING
Clause¶
The HAVING
clause is an optional clause and placed after the GROUP BY
clause. The HAVING
clause has a condition and evaluate it for each group,
instead of each tuple. When ungrouped fields are used in the condition, they
need to be in aggregate functions:
SELECT a, max(b), min(b + c), avg(c * d) FROM stream [RANGE 1 TUPLES]
GROUP BY a, b + c HAVING min(b + c) > 1 AND avg(c * d) < 10;
In this example, b
, c
, and d
are ungrouped fields and cannot
directly specified in the condition.
SELECT
List¶
The SELECT
list, placed between the emitter and the FROM
clause,
defines the form of the output tuples emitted from the SELECT
statement.
Each item in the list can be any expression. Each item (i.e. output field)
will have a name. When an expression only consists of a field name, the output
name of the expression will be the field name. For example, the output name
of strm:price
in SELECT RSTREAM strm:price FROM ...
will be price
,
not strm:price
. When the expression is a UDF call, the name of the UDF
will be used as the name of the output field. For example, the result of
count(*)
is named as count
. If an expression is other than a field
name or a UDF call, the output name will be col_n
where n
is replaced
with the number corresponding to n-th expression (counting from 0). The output
field name can manually be specified by AS output_name
.
When the expression is *
, all fields which have not been specified in the
SELECT
list yet will be included. Output names of those fields will be
identical to the original field names.
If an expression results in a map, its output name can be AS *
. In such
case, all fields of the map is extended to the top level fields. For example,
in SELECT RSTREAM a, b AS *, c FROM strm ...
, when strm emits tuples
having
{
"a": v1,
"b": {
"d": v3,
"e": v4
},
"c": v2,
}
to the SELECT
, its output will be
{
"a": v1,
"c": v2,
"d": v3,
"e": v4
}
When some fields have the same name, only one of them will be included in the result. It is undefined which field will be chosen as a result.
Notes¶
An Emitter and Its Performance¶
There’re some use case specific optimizations of the evaluation of the
SELECT
and this subsection describes each optimization and its limitation.
Simple Transformation and Filtering¶
Performing a simple per-tuple transformation or filtering over an input stream is a very common task. Therefore, BQL optimizes statements having the following form:
SELECT RSTREAM projection FROM input [RANGE 1 TUPLES] WHERE condition;
Limitations of this optimization are:
- There can only be one input stream and its range is
[RANGE 1 TUPLES]
. - The emitter must be
RSTREAM
.
Evaluation in WHERE
Clause¶
Each set of tuples cross-joined in the FROM
clause is evaluated exactly once
in the WHERE
clause. Therefore, all functions in the WHERE
clause are
only called once for each set:
SELECT RSTREAM * FROM stream1 [RANGE 100 TUPLES], stream2 [RANGE 100 TUPLES]
WHERE random() < 0.2;
In this example, 80% of sets of cross-joined tuples are filtered out and only 20% of sets (around 20 tuples for each input from either stream) are emitted.
Commands¶
build_sensorbee¶
Because SensorBee is written in Go, all its dependencies including plugins
generally need to be statically linked, or at least cannot be dynamically
loaded at runtime. The build_sensorbee
command is provided to support
building a custom sensorbee
command.
Basic Usage¶
Prepare build.yaml
configuration file and run build_sensorbee
in the
same directory as build.yaml
is located at:
$ ls
build.yaml
$ build_sensorbee
sensorbee_main.go
$ ls
build.yaml
sensorbee
sensorbee_main.go
sensorbee
is the result executable file and sensorbee_main.go
is a
Go file generated by build_sensorbee
and passed for go build
command
to build sensorbee
.
Configuration¶
build_sensorbee
requires a configuration file named build.yaml
. The file
is written in YAML and has following optional sections:
plugins
commands
plugins
¶
The plugins
section is optional and may have a list of plugins as follows:
plugins:
- github.com/sensorbee/twitter/plugin
- github.com/sensorbee/fluentd/plugin
- github.com/sensorbee/nlp/plugin
- github.com/sensorbee/tutorial/ml/plugin
- github.com/sensorbee/jubatus/classifier/plugin
A plugin must be provided as a valid import path of Go. A path depends on each plugin.
commands
¶
The commands
section is optional and is used to customize subcommands that
the sensorbee
command will have. By default, or when the section is empty,
subcommands include all standard commands:
run
runfile
shell
topology
commands
section is a map whose key is the name of subcommand. Standard
subcommands like run
can be added by providing empty entries:
commands:
run:
shell:
With this configuration, the sensorbee
command will only have run
and
shell
commands.
To add a custom command, an entry must have path
parameter that is a Go
import path of the command:
commands:
run:
shell:
mytest:
path: "path/to/sb-mytest"
With this configuration, the sensorbee
command will also have the mytest
subcommand. The subcommand is implemented at path/to/sb-mytest
.
Names of commands must be unique and cannot be any of:
cli
os
version
time
Prohibited names might be added in the future version.
Custom Subcommand Development¶
A custom subcommand for the sensorbee
command can be developed as a Go
package. Only thing the package has to do is to provide a function
func SetUp() cli.Command
. cli
is gopkg.in/urfave/cli.v1
.
A minimum example is provided in
the SensorBee tutorial repository:
package hello
import (
"fmt"
cli "gopkg.in/urfave/cli.v1"
)
func SetUp() cli.Command {
return cli.Command{
Name: "hello",
Usage: "say hello",
Action: func(c *cli.Context) error {
fmt.Println("hello")
return nil
},
}
}
This command prints “hello” when sensorbee hello is executed. See
https://github.com/urfave/cli to learn how to create a command using the
cli
library.
A Complete Example¶
plugins:
- github.com/sensorbee/twitter/plugin
- github.com/sensorbee/fluentd/plugin
- github.com/sensorbee/nlp/plugin
- github.com/sensorbee/tutorial/ml/plugin
- github.com/sensorbee/jubatus/classifier/plugin
commands:
run:
runfile:
shell:
topology:
hello:
path: "github.com/sensorbee/tutorial/custom_command/hello"
Flags and Options¶
--config path
or -c path
This option specifies the path to the configuration file to be used. Its default value is
build.yaml
. With this option, a configuration file in another directory can be used as follows:$ build_sensorbee -c /path/to/dir/special_build.yaml
--download-plugins={true|false}
This option have to be
true
orfalse
. When the value istrue
,build_sensorbee
downloads (i.e.go get
) all plugins listed inbuild.yaml
. When it’sfalse
,build_sensorbee
doesn’t download plugins and tries to used plugins as installed in the environment. The default value istrue
.Specifying
false
is useful when the customsensorbee
command needs to depend on a plugin that is in a special git branch or locally modified.
--help
or -h
When this flag is given, the command shows the usage of itself and exits without doing anything.
--only-generate-source
When this flag is given,
build_sensorbee
doesn’t build asensorbee
command but only generate a source code that can be built bygo build
command. For example:$ build_sensorbee --only-generate-source sensorbee_main.go $ go build -o sensorbee sensorbee_main.go
--out executable_name
or -o executable_name
This option customizes the name of the output executable file. The default is
sensorbee.exe
in Windows andsensorbee
in all other environment. The following command generates an executable namedmy_sensorbee
instead ofsensorbee
:$ build_sensorbee -o my_sensorbee
--source-filename filename
The filename of the Go source code file automatically generated by
build_sensorbee
can be specified by this option. The default value issensorbee_main.go
.$ build_sensorbee --source-filename custom_main.goBy executing this command,
custom_main.go
is generated instead ofsensorbee_main.go
.
--version
or -v
When this flag is given, the command prints the version of thebuild_sensorbee
command.
sensorbee¶
sensorbee
is the main command to manipulate SensorBee. sensorbee
consists of a set of following subcommands:
sensorbee
command can needs to be created by build_sensorbee command and
all the example commands are written as ./sensorbee
to emphasize that
there’s no default sensorbee
command.
See each command’s reference for details.
Flags and Options¶
--help
or -h
When this flag is given, the command prints the usage of itself.
--version
or -v
The command prints the version of SensorBee.
sensorbee run¶
sensorbee run
runs the SensorBee server that manages multiple topologies
that can dynamically modified at runtime.
Basic Usage¶
$ ./sensorbee run -c sensorbee.yaml
Configuration¶
A configuration file can optionally be provided for sensorbee run
command.
The file is written in YAML and has following optional
sections:
logging
network
storage
topologies
logging
¶
The logging
section customizes behavior of the logger. It has following
optional parameters:
target
The
target
parameter changes the destination of log messages. Its value can be of followings:
stdout
: write log messages to stdoutstderr
: write log messages to stderr- file path: a path to the log file
When the value is neither
stdout
norstderr
, it’s considered to be a file path. The default value of this parameter isstderr
.
min_log_level
This option specifies the minimum level (severity) of log messages to be written. Valid values are one ofdebug
,info
,warn
,warning
,error
, orfatal
.warn
can also bewarning
. Whendebug
is given, all levels of messages will be written into the log. When the value iserror
, only log messages witherror
orfatal
level will be written. The default value of this parameter isinfo
.
log_dropped_tuples
The SensorBee server can prints a log message and contents of tuples when they’re dropped from a topology. When this option istrue
, the server writes log messages reporting dropped tuples. When it’sfalse
, the server doesn’t. The default value of this option isfalse
.
log_destinationless_tuples
A destinationless tuple is one kind of dropped tuples that is caused when a source or a stream doesn’t have any destination and it drops a tuple. By settingtrue
to this option, the server reports all destinationless tuples. The default value of this option isfalse
. Note that, to log replication tuples, thelog_dropped_tuples
option also needs to betrue
.
summarize_dropped_tuples
This option turns on or off summarization of dropped tuple logging. Valid values for this option is
true
orfalse
. When its value is thetrue
, dropped tuples are summarized in log messages.Note
At the current version, only
blob
fields are summarized to"(blob)"
. Other configuration parameters will be supported in the future version such as the maximum number of fields, the maximum depths of maps, the maximum length of arrays, and so on.When
false
is specified, each log message shows a complete JSON that are compatible to the original tuple. Although this is useful for debugging, tuples containing large binary data like images may result in disk.The default value of this option is
false
.
Example:
logging:
target: /path/to/sensorbee.log
min_log_level: info
log_dropped_tuples: true
summarize_dropped_tuples: true
network
¶
The network
section has parameters related to server’s network
configuration. It has following optional parameters:
listen_on
This parameter controls how the server expose its listening port. The syntax of the value is likehost:port
.host
can be IP addresses such as0.0.0.0
or127.0.0.1
. Whenhost
is given, the server only listens on the interface with the given host address. If thehost
is omitted, the server listens on all available interfaces, that is, the server accepts connections from any host. The default value of this parameter is:15601
.
Example:
network:
listen_on: ":15601"
storage
¶
The storage
section contains the configuration of storages used for saving
UDSs or other information. It has following optional subsections:
uds
uds
¶
The uds
subsection configures the storage for saving and loading UDSs. It
provides following optional parameters:
type
The type name of the storage.in_memory
is used as the default value.
params
params
has subparameter specific to the giventype
.
Currently, following types are available:
in_memory
fs
Descriptions of types and parameters are provided below:
in_memory
in_memory
saves UDSs in memory. It loses all saved data when the server restarts. This type doesn’t have any parameter.Example:
storage: uds: type: in_memory
fs
fs
saves UDSs in the local file system. It has following required parameters:
dir
dir
has the path to the directory that saved data will be stored.
fs
also has following optional parameters:
temp_dir
temp_dir
has the path to the temporary directory that is used when the UDS writes data. After the UDS has written all the data, the file is move to the directory specified bydir
parameter. The same value asdir
is used by default.The file name of each saved UDS is formatted as
<topology>-<name>-<tag>.state
.Example:
storage: uds: type: fs params: dir: /path/to/uds_dir temp_dir: /tmp
topologies
¶
The topologies
section contains the configuration of topologies in the
following format:
topologies:
name_of_topology1:
... configuration for name_of_topology1 ...
name_of_topology2:
... configuration for name_of_topology2 ...
name_of_topology3:
... configuration for name_of_topology3 ...
... other topologies ...
Topologies listed in this section will be created at the startup of the server based on the sub-configuration of each topology. Following optional configuration parameters are provided for each topology:
bql_file
This parameter has the path to the file containing BQL statements for the topology. All statements are executed before the server gets ready. If the execution fails, the server would exit with an error.
Example:
$ ls
my_topology.bql
sensorbee.yaml
$ cat my_topology.bql
CREATE SOURCE fluentd TYPE fluentd;
CREATE STREAM users AS
SELECT RSTREAM users FROM fluentd [RANGE 1 TUPLES];
CREATE SINK user_file TYPE file WITH path = "users.jsonl";
$ cat sensorbee.yaml
topologies:
my_topology:
bql_file: my_topology.bql
$ ./sensorbee run -c sensorbee.yaml
As a result of these commands above, the server started with sensorbee.yaml
has a topology named my_topology
. The topology has three nodes: fluentd
,
users
, and user_file
.
Note
This is the only way to persist the configuration of topologies at the
moment. Any updates applied at runtime will not be reflected into the bql file.
For example, if the server restarts after creating a new stream in
my_topology
, the new stream will be lost unless it’s explicitly added
to my_topology.bql
manually.
The configuration of a topology can be empty:
topologies:
my_empty_topology:
In this case, an empty topology my_empty_topology
will be created so that
the sensorbee topology create
command doesn’t have to be executed every
time the server restarts.
A Complete Example¶
logging:
target: /path/to/sensorbee.log
min_log_level: info
log_dropped_tuples: true
summarize_dropped_tuples: true
network:
listen_on: ":15601"
storage:
uds:
type: fs
params:
dir: /path/to/uds_dir
temp_dir: /tmp
topologies:
empty_topology:
my_topology:
bql_file: /path/to/my_topology.bql
Flags and Options¶
--config path
or -c path
This option receives the path of the configuration file. By default, the value is empty and no configuration file is used. This value can also be passed throughSENSORBEE_CONFIG
environment variable.
--help
or -h
When this flag is given, the command prints the usage of itself.
sensorbee runfile¶
sensorbee runfile
runs a single BQL file. This command is mainly designed
for offline data processing but can be used as a standalone SensorBee process
that doesn’t expose any interface to manipulate the topology.
sensorbee runfile
stops after all the nodes created by the given BQL file
stops. The command doesn’t stop if it contains a source that generates infinite
tuples or is rewindable. Other non-rewindable sources such as file
stopping
when it emits all tuples written in a file can work well with the command.
Sources generally need to be created with PAUSED
keyword in the
CREATE SOURCE statement. Without PAUSED
, a source can start
emitting tuples before all nodes in a topology can correctly be set up.
Therefore, a BQL file passed to the command should look like:
CREATE PAUSED SOURCE source_1 TYPE ...;
CREATE PAUSED SOURCE source_2 TYPE ...;
...
CREATE PAUSED SOURCE source_n TYPE ...;
... CREATE STREAM, CREATE SINK, or other statements
RESUME SOURCE source_1;
RESUME SOURCE source_2;
...
RESUME SOURCE source_n;
With the --save-uds
option described later, it saves UDSs at the end of its
execution.
Basic Usage¶
$ ./sensorbee runfile my_topology.bql
With options:
$ ./sensorbee runfile -c sensorbee.yaml -s '' my_topology.bql
Configuration¶
sensorbee runfile
accepts the configuration file for sensorbee run
. It
only uses logging
and storage
sections. The configuration file may
contain other sections as well and the same file for sensorbee run
can also
be used for sensorbee runfile
. See
its configuration for details.
Flags and Options¶
--config path
or -c path
This option receives the path of the configuration file. By default, the value is empty and no configuration file is used. This value can also be passed throughSENSORBEE_CONFIG
environment variable.
--help
or -h
When this flag is given, the command prints the usage of itself.
--save-uds udss
or -s udss
This option receives a list of names of UDSs separated by commas. UDSs listed in it will be saved at the end of execution. For example, when the option is
-s "a,b,c"
, UDSs nameda
,b
, andc
will be saved. To save all UDSs in a topology, pass an empty string:-s ""
.By default, all UDSs will not be saved at the end of execution.
--topology name
or -t name
This option changes the name of the topology to be run with the given BQL file. The default name is taken from the file name of the BQL file. The name specified to this option will be used in log messages or saved UDS data. Especially, names of files containing saved UDS data has contains the name of the topology. Therefore, providing the same name as the topology that will be run bysensorbee run
later on allows users to prepare UDSs including pre-trained machine learning models in advance.
sensorbee shell or bql¶
sensorbee shell
or bql
starts a new shell to manipulate the SensorBee
server. The shell can be terminated by writing exit
or typing C-d
.
Both sensorbee shell
and bql
have the same interface, but bql
is
installed by default while the sensorbee
command needs to be built manually
to run sensorbee shell
.
Basic Usage¶
To run sensorbee shell
,
$ ./sensorbee shell -t my_topology
my_topology>
To run bql
,
$ bql -t my_topology
my_topology>
Flags and options¶
--api-version version
This option changes the API version of the SensorBee server. The default value of this option isv1
.
--help
or -h
When this flag is given, the command prints the usage of itself.
--topology name
or -t name
The name of a topology to be manipulated can be specified through this option so thatUSE topology_name
doesn’t have to be used in the shell. The default value is an empty name, that is, no topology is specified.
--uri
This option is used when the SensorBee server is running at non-localhost or using non-default port number (15601). The value should have a format likehttp://host:port/
. The default value of this option ishttp://localhost:15601/
.
sensorbee topology¶
sensorbee topology
, or sensorbee t
, is used to manipulate topologies on
the SensorBee server.
Note
This command is provided because the syntax of BQL statements that controls topologies has not been discussed enough yet.
The command consists of following subcommands:
sensorbee topology create <name>
or sensorbee t c <name>
This command creates a new topology on the SensorBee server. The<name>
argument is the name of the topology to be created.$?
will be 0 if the command is successful. Otherwise, it’ll be non-zero. The command fails if the topology already exists on the server.
sensorbee topology drop <name>
or sensorbee t drop <name>
This command drops an existing topology on the SensorBee server. The<name>
argument is the name of the topology to be dropped.$?
will be 0 if the command is successful. Otherwise, it’ll be non-zero. The command doesn’t fail even if the topology doesn’t exist on the server.
sensorbee topology list
or sensorbee t l
This commands prints names of all topologies that the SensorBee server has, one name per line.
All commands share the same flags and options. Flags and options need to be given after the subcommand name:
$ ./sensorbee topology create --flag --option value my_topology
In this example, a flag --flag
and an option --option value
are
provided. The argument of the command, i.e. the name of topology, is
my_topology
.
Flags and Options¶
--api-version version
This option changes the API version of the SensorBee server. The default value of this option isv1
.
--help
or -h
When this flag is given, the command prints the usage of itself.
--uri
This option is used when the SensorBee server is running at non-localhost or using non-default port number (15601). The value should have a format likehttp://host:port/
. The default value of this option ishttp://localhost:15601/
.
Function Reference¶
Common Mathematical Functions¶
For the functions below, if a given parameter is outside the mathematically valid range for that function (e.g., sqrt(-2)
, log(0)
, div(2.0, 0.0)
) and the return type is float
, then NaN
is returned.
However, if the return type is int
(e.g., div(2, 0)
), there is no NaN
option and an error will occur instead.
abs
¶
abs(x)
Description¶
abs
computes the absolute value of a number.
Parameter Types¶
x
int
orfloat
Return Type¶
same as input
Examples¶
Function Call | Result |
---|---|
abs(-17.4) |
17.4 |
cbrt
¶
cbrt(x)
Description¶
cbrt
computes the cube root of a number.
Parameter Types¶
x
int
orfloat
Return Type¶
float
Examples¶
Function Call | Result |
---|---|
cbrt(27.0) |
3.0 |
cbrt(-3) |
-1.4422495703074083 |
ceil
¶
ceil(x)
Description¶
ceil
computes the smallest integer not less than its argument.
Parameter Types¶
x
int
orfloat
Return Type¶
same as input
The return type is float
for float
input in order to avoid problems with input values that are too large for the int
data type.
Examples¶
Function Call | Result |
---|---|
ceil(1.3) |
2.0 |
ceil(-1.7) |
-1.0 |
degrees
¶
degrees(x)
Description¶
degrees
converts radians to degrees.
Parameter Types¶
x
int
orfloat
Return Type¶
float
Examples¶
Function Call | Result |
---|---|
degrees(3.141592653589793) |
180.0 |
div
¶
div(y, x)
Description¶
div
computes the integer quotient y
/x
of two numbers y
and x
.
If x
is 0.0
(float) then NaN
will be returned; if it is 0
(integer) then a runtime error will occur.
Parameter Types¶
y
int
orfloat
x
- same as
y
Return Type¶
same as input
Examples¶
Function Call | Result |
---|---|
div(9, 4) |
2 |
div(9.3, 4.5) |
2.0 |
exp
¶
exp(x)
Description¶
exp
computes the exponential of a number.
Parameter Types¶
x
int
orfloat
Return Type¶
float
Examples¶
Function Call | Result |
---|---|
exp(1.0) |
2.718281828459045 |
floor
¶
floor(x)
Description¶
floor
computes the largest integer not greater than its argument.
Parameter Types¶
x
int
orfloat
Return Type¶
same as input
The return type is float
for float
input in order to avoid problems with input values that are too large for the int
data type.
Examples¶
Function Call | Result |
---|---|
floor(1.3) |
1.0 |
floor(-1.7) |
-2.0 |
ln
¶
ln(x)
Description¶
ln
computes the natural logarithm of a number.
If the parameter is not strictly positive, NaN
is returned.
Parameter Types¶
x
int
orfloat
Return Type¶
float
Examples¶
Function Call | Result |
---|---|
ln(2) |
0.6931471805599453 |
log
¶
log(x)
log(b, x)
Description¶
log
computes the logarithm of a number x
to base b
(default: 10).
Parameter Types¶
x
int
orfloat
b
(optional)- same as
x
Return Type¶
float
Examples¶
Function Call | Result |
---|---|
log(100) |
2.0 |
log(2.5, 6.25) |
2.0 |
log(2, 8) |
3.0 |
mod
¶
mod(y, x)
Description¶
mod
computes the remainder of integer division y
/x
of two numbers y
and x
.
If x
is 0.0
(float) then NaN
will be returned; if it is 0
(integer) then a runtime error will occur.
Parameter Types¶
y
int
orfloat
x
- same as
y
Return Type¶
same as input
Examples¶
Function Call | Result |
---|---|
mod(9, 4) |
1 |
mod(9.3, 4.5) |
0.3 |
pi
¶
pi()
Description¶
pi
returns the π constant (more or less 3.14).
Return Type¶
float
Examples¶
Function Call | Result |
---|---|
pi() |
3.141592653589793 |
power
¶
power(a, b)
Description¶
power
computes a
raised to the power of b
.
Parameter Types¶
a
int
orfloat
b
- same as
a
Return Type¶
float
The return type is float
even for integer input in order to have a uniform behavior for cases such as power(2, -2)
.
Examples¶
Function Call | Result |
---|---|
power(9.0, 3.0) |
729.0 |
power(2, -1) |
0.5 |
radians
¶
radians(x)
Description¶
radians
converts degrees to radians.
Parameter Types¶
x
int
orfloat
Return Type¶
float
Examples¶
Function Call | Result |
---|---|
radians(180) |
3.141592653589793 |
round
¶
round(x)
Description¶
round
computes the nearest integer of a number.
Parameter Types¶
x
int
orfloat
Return Type¶
same as input
The return type is float
for float
input in order to avoid problems with input values that are too large for the int
data type.
Examples¶
Function Call | Result |
---|---|
round(1.3) |
1.0 |
round(0.5) |
1.0 |
round(-1.7) |
-2.0 |
sign
¶
sign(x)
Description¶
sign
returns the sign of a number: 1 for positive numbers, -1 for negative numbers and 0 for zero.
Parameter Types¶
x
int
orfloat
Return Type¶
int
Examples¶
Function Call | Result |
---|---|
sign(2) |
1 |
sqrt
¶
sqrt(x)
Description¶
sqrt
computes the square root of a number.
If the parameter is negative, NaN
is returned.
Parameter Types¶
x
int
orfloat
Return Type¶
float
Examples¶
Function Call | Result |
---|---|
sqrt(2) |
1.4142135623730951 |
trunc
¶
trunc(x)
Description¶
trunc
computes the truncated integer (towards zero) of a number.
Parameter Types¶
x
int
orfloat
Return Type¶
same as input
The return type is float
for float
input in order to avoid problems with input values that are too large for the int
data type.
Examples¶
Function Call | Result |
---|---|
trunc(1.3) |
1.0 |
trunc(-1.7) |
-1.0 |
width_bucket
¶
width_bucket(x, left, right, count)
Description¶
width_bucket
computes the bucket to which x
would be assigned in an equidepth histogram with count
buckets in the range \([\text{left},\text{right}[\).
Points on a bucket border belong to the right bucket.
Points outside of the \([\text{left},\text{right}[\) range have bucket number \(0\) and \(\text{count}+1\), respectively.
Parameter Types¶
x
int
orfloat
left
int
orfloat
right
int
orfloat
count
int
Return Type¶
int
Examples¶
Function Call | Result |
---|---|
width_bucket(5, 0, 10, 5) |
3 |
Pseudo-Random Functions¶
The characteristics of the functions below are equal to those from the Go rand module. They are not suitable for cryptographic applications.
random
¶
random()
Description¶
random
returns a pseudo-random number in the range \(0.0 <= x < 1.0\).
This function is not safe for use in cryptographic applications. See the Go math/rand package for details.
Return Type¶
float
Examples¶
Function Call | Result |
---|---|
random() |
0.6046602879796196 |
setseed
¶
setseed(x)
Description¶
setseed
initializes the seed for subsequent random()
calls.
The parameter must be in the range \(-1.0 <= x <= 1.0\).
This function is not safe for use in cryptographic applications. See the Go math/rand package for details.
Parameter Types¶
x
float
Trigonometric Functions¶
All trigonometric functions take arguments and return values of type float
.
Trigonometric functions arguments are expressed in radians.
Inverse functions return values are expressed in radians.
String Functions¶
bit_length
¶
bit_length(s)
Description¶
bit_length
computes the number of bits in a string s
.
Note that due to UTF-8 encoding, this is equal to octet_length(s) * 8
,
not necessarily char_length(s) * 8
.
Parameter Types¶
s
string
Return Type¶
int
Examples¶
Function Call | Result |
---|---|
bit_length("über") |
40 |
btrim
¶
btrim(s)
btrim(s, chars)
Description¶
btrim
removes the longest string consisting only of characters in chars
(default: whitespace) from the start and end of s
.
Parameter Types¶
s
string
chars
(optional)string
Return Type¶
string
Examples¶
Function Call | Result |
---|---|
btrim(" trim ") |
"trim" |
btrim("xyxtrimyyx", "xy") |
"trim" |
char_length
¶
char_length(s)
Description¶
char_length
computes the number of characters in a string.
Parameter Types¶
s
string
Return Type¶
int
Examples¶
Function Call | Result |
---|---|
char_length("über") |
4 |
concat
¶
concat(s [, ...])
Description¶
concat
concatenates all strings given as input arguments.
NULL
values are ignored, i.e., treated like an empty string.
Parameter Types¶
s
and all subsequent parametersstring
Return Type¶
string
Examples¶
Function Call | Result |
---|---|
concat("abc", NULL, "22") |
"abc22" |
concat_ws
¶
concat_ws(sep, s [, ...])
Description¶
concat_ws
concatenates all strings given as input arguments s
using the separator sep
.
NULL
values are ignored.
Parameter Types¶
sep
string
s
and all subsequent parametersstring
Return Type¶
string
Examples¶
Function Call | Result |
---|---|
concat_ws(":", "abc", NULL, "22") |
"abc:22" |
format
¶
format(s, [x, ...])
Description¶
format
formats a variable number of arguments x
according to a format string s
.
See the Go package fmt for details of what formatting codes are allowed.
Parameter Types¶
s
string
x
and all subsequent parameters (optional)- any
Return Type¶
string
Examples¶
Function Call | Result |
---|---|
format("%s-%d", "abc", 22) |
"abc-22" |
lower
¶
lower(s)
Description¶
lower
converts a string s
to lower case.
Non-ASCII Unicode characters are mapped to their lower case, too.
Parameter Types¶
s
string
Return Type¶
string
Examples¶
Function Call | Result |
---|---|
lower("ÜBer") |
"über" |
ltrim
¶
ltrim(s)
ltrim(s, chars)
Description¶
ltrim
removes the longest string consisting only of characters in chars
(default: whitespace) from the start of s
.
Parameter Types¶
s
string
chars
(optional)string
Return Type¶
string
Examples¶
Function Call | Result |
---|---|
ltrim(" trim ") |
"trim " |
ltrim("xyxtrimyyx", "xy") |
"trimyyx" |
md5
¶
md5(s)
Description¶
md5
computes the MD5 checksum of a string s
and returns it in hexadecimal format.
Parameter Types¶
s
string
Return Type¶
string
Examples¶
Function Call | Result |
---|---|
md5("abc") |
"900150983cd24fb0d6963f7d28e17f72" |
octet_length
¶
octet_length(s)
Description¶
octet_length
computes the number of bytes in a string s
.
Note that due to UTF-8 encoding, this may differ from the number returned by char_length
.
Parameter Types¶
s
string
Return Type¶
int
Examples¶
Function Call | Result |
---|---|
octet_length("über") |
5 |
overlay
¶
overlay(s, repl, from)
overlay(s, repl, from, for)
Description¶
overlay
replaces for
characters in a string s
with the string repl
, starting at from
.
(Index counting starts at 0.)
If for
is not given, the length of repl
is used as a default.
Parameter Types¶
s
string
repl
string
from
int
for
(optional)int
Return Type¶
string
Examples¶
Function Call | Result |
---|---|
overlay("Txxxxas", "hom", 1) |
"Thomxas" |
overlay("Txxxxas", "hom", 1, 4) |
"Thomas" |
rtrim
¶
rtrim(s)
rtrim(s, chars)
Description¶
rtrim
removes the longest string consisting only of characters in chars
(default: whitespace) from the end of s
.
Parameter Types¶
s
string
chars
(optional)string
Return Type¶
string
Examples¶
Function Call | Result |
---|---|
rtrim(" trim ") |
" trim" |
rtrim("xyxtrimyyx", "xy") |
"xyxtrim" |
sha1
¶
sha1(s)
Description¶
sha1
computes the SHA1 checksum of a string s
and returns it in hexadecimal format.
Parameter Types¶
s
string
Return Type¶
string
Examples¶
Function Call | Result |
---|---|
sha1("abc") |
"a9993e364706816aba3e25717850c26c9cd0d89d" |
sha256
¶
sha256(s)
Description¶
sha256
computes the SHA256 checksum of a string s
and returns it in hexadecimal format.
Parameter Types¶
s
string
Return Type¶
string
Examples¶
Function Call | Result |
---|---|
sha256("abc") |
"ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad" |
strpos
¶
strpos(s, t)
Description¶
strpos
returns the index of the first occurrence of t
in s
(index counting starts at 0) or -1 if it is not found.
Parameter Types¶
s
string
t
string
Return Type¶
int
Examples¶
Function Call | Result |
---|---|
strpos("high", "ig") |
1 |
substring
¶
substring(s, r)
substring(s, from)
substring(s, from, for)
Description¶
substring(s, r)
extracts the substring matching regular expression r
from s
.
See the Go regexp package for details of matching.
substring(s, from, for)
returns the for
characters of str
starting from the from
index.
(Index counting starts at 0.)
If for
is not given, everything until the end of str
is returned.
Which of those behaviors is used depends on the type of the second parameter (int
or string
).
Parameter Types¶
s
string
r
string
from
int
for
(optional)int
Return Type¶
string
Examples¶
Function Call | Result |
---|---|
substring("Thomas", "...$") |
"mas" |
substring("Thomas", 1) |
"homas" |
substring("Thomas", 1, 3) |
"hom" |
Time Functions¶
distance_us
¶
distance_us(u, v)
Description¶
distance_us
computes the signed temporal distance from u
to v
in microseconds.
Parameter Types¶
u
timestamp
v
timestamp
Return Type¶
int
Examples¶
Function Call | Result |
---|---|
distance_us("2016-02-09T05:40:25.123Z"::timestamp, "2016-02-09T05:41:25.456Z"::timestamp) |
60333000 |
distance_us(clock_timestamp(), clock_timestamp()) |
2 |
clock_timestamp
¶
clock_timestamp()
Description¶
clock_timestamp
returns the current date and time in UTC.
Return Type¶
timestamp
now
¶
now()
Description¶
now
returns the date and time in UTC of the point in time when processing of the current tuple started.
In particular and as opposed to clock_timestamp
, the timestamp returned by now()
does not change during a processing run triggered by the arrival of a tuple.
For example, in
SELECT RSTREAM clock_timestamp() AS a, clock_timestamp() AS b,
now() AS c, now() AS d FROM ...
the values of a
and b
are most probably different by a very short timespan, but c
and d
are equal by definition of now()
.
now
cannot be used in an EVAL
statement outside of a stream processing context.
Return Type¶
timestamp
Array Functions¶
Other Scalar Functions¶
Aggregate Functions¶
Aggregate functions compute a single result from a set of input values.
It should be noted that except for count
, these functions return a NULL
value when no rows are selected.
In particular, sum
of no rows returns NULL
, not zero as one might expect, and array_agg
returns NULL
rather than an empty array when there are no input rows.
The coalesce
function can be used to substitute zero or an empty array for NULL
when necessary.
Also note that most aggregate functions ignore singular NULL
values in their input, i.e., processing is done as if this row had not been in the input.
(One notable exception is the array_agg
function that includes input NULL
values in its output.)
array_agg
¶
array_agg(x)
Description¶
array_agg
returns an array containing all input values, including NULL
values.
There is no guarantee on the order of items in the result.
Use the ORDER BY
clause to achieve a certain ordering.
Parameter Types¶
x
- any
Return Type¶
array
avg
¶
avg(x)
Description¶
avg
computes the average (arithmetic mean) of all input values.
Parameter Types¶
x
int
orfloat
(mixed types are allowed)
Return Type¶
float
bool_and
¶
bool_and(x)
Description¶
bool_and
returns true
if all input values are true, otherwise false
.
Parameter Types¶
x
bool
Return Type¶
bool
bool_or
¶
bool_or(x)
Description¶
bool_or
returns true
if at least one input value is true, otherwise false
.
Parameter Types¶
x
bool
Return Type¶
bool
count
¶
count(x)
count(*)
Description¶
count
returns the number of input rows for which x
is not NULL
, or the number of total rows if *
is passed.
Parameter Types¶
x
- any
Return Type¶
int
json_object_agg
¶
json_object_agg(k, v)
Description¶
json_object_agg
aggregates pairs of key k
and value v
as a map.
If both key and value are NULL
, the pair is ignored.
If only the value is NULL
, it is still added with the corresponding key.
It is an error if only the key is NULL
.
It is an error if a key appears multiple times.
A map does not have an ordering, therefore there is no guarantee on the result map ordering, whether or not ORDER BY
is used.
Parameter Types¶
k
string
v
- any
Return Type¶
map
max
¶
max(x)
Description¶
max
computes the maximum value of all input values.
Parameter Types¶
x
int
orfloat
(mixed types are allowed)
Return Type¶
same as largest input value
median
¶
median(x)
Description¶
median
computes the median of all input values.
Parameter Types¶
x
int
orfloat
(mixed types are allowed)
Return Type¶
float
min
¶
min(x)
Description¶
min
computes the minimum value of all input values.
Parameter Types¶
x
int
orfloat
(mixed types are allowed)
Return Type¶
same as smallest input value