JGrok operators


[top]

Assignment Operator

The assignment operator has the following syntax in JGrok:

 id = expression[;] 
The assignment syntax closely follows that of C or Java, but the semicolon is optional. Grok has a slightly different syntax:
 id := expression
The C-like = has been replaced with a := from Pascal; there can be no semicolon at the end of the statement.

The assignment operator evaluates expression and assigns the resulting value to id. It does not return a value, so

 var2 = ( var1 = "xyz") 
has no meaning. Assignment is always "by value", i.e. if A is being assigned to B, a copy of A is created and placed in B. Future changes to A do not affect the value of B.



[top]

Regular Expression Matching

This family of operators is unique to JGrok. They are not available in Grok.

Match operator

The syntax for the match operator is:

 expression  =~ pattern 
Both expression and pattern can be arbitrary expressions, but both must evaluate to a string; otherwise the JGrok interpreter will produce an error.

The match operator is relational: it returns either true or false, depending on whether the strings match or not. "Match" means that the expression conforms to the pattern defined by pattern.

The pattern can be any Java regular expression. JGrok differs from many other regular expression matching tools and operators in that it requires the pattern to match the complete string (rather than any substring) for the match to be considered successful. For example,

"hello" =~ "he.*" 
returns true, because "he.*" matches the entire word "hello", with the first two letters of the pattern matching the first two letters of the string, and the .* construct matching everything else. However,
"hello" =~ "he"
will return false, because the pattern "he" matches only part of the string, and not the whole string. To receive standard Unix regexp matching behaviour in JGrok, simply surround your patterns with ".*" on both sides. For example, to fix the previous example so that it works as in Unix, we could write:
"hello" =~ ".*he.*"
This expression returns true.
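
The whole-string rule can be tried out outside JGrok. The sketch below uses Python's re module purely as an illustration; Python's regular expression dialect is not identical to Java's, and the helper names jgrok_match and unix_style_match are hypothetical. Here re.fullmatch plays the role of =~, while re.search gives the substring behaviour of typical Unix tools.

```python
import re

def jgrok_match(text: str, pattern: str) -> bool:
    """Model of =~: true only if the pattern matches the entire string."""
    return re.fullmatch(pattern, text) is not None

def unix_style_match(text: str, pattern: str) -> bool:
    """Substring matching, as in grep-like tools."""
    return re.search(pattern, text) is not None

print(jgrok_match("hello", "he.*"))     # True: pattern covers the whole string
print(jgrok_match("hello", "he"))       # False: only a prefix matches
print(jgrok_match("hello", ".*he.*"))   # True: padded with .* on both sides
print(unix_style_match("hello", "he"))  # True: substring match is enough
```

Negating the result of jgrok_match gives the behaviour of a not-match test.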

Regular Expression Guide.

Sun Microsystems maintains a complete guide to Java regular expressions at this web address: http://java.sun.com/docs/books/tutorial/extra/regex/.

[top]

Not-match operator

The syntax for the not-match operator is:

 expression  !~ pattern 
The not-match operator works exactly like the match operator described above, but returns the negated result of the match test: true if the pattern does not match the expression, and false if it does. For more details, refer to the match operator description above.

[top]

Relational operators

JGrok has a fairly standard set of relational operators. The table below gives a complete list.

Operator   Meaning
==         equal
!=         not equal
<          strictly less than
>          strictly greater than
<=         less than or equal to
>=         greater than or equal to

Applying relational operators to scalars. When used to compare scalar types (int, float, string, boolean), the types of both operands must be compatible, i.e. both operands must be numbers (int, float), or strings, or booleans. A string cannot be compared to an integer, or a float to a boolean. As an additional restriction, only the "equal" and "not equal" operators can be applied to booleans. Operators >, <, >=, <= cannot be used to compare boolean values.

Applying relational operators to sets and relations. As with scalars, both operands compared by a relational operator must be of compatible types. Sets can be only compared to sets, esets to esets, and tsets to tsets.

The equality and inequality operators function exactly as you would expect them to: two sets or relations are found equal if they contain the same number of tuples, and for each tuple in one of them there exists an equal tuple in the other.

Operators >, <, >=, <= can be applied to sets and relations, but produce unexpected results, and the result of application of any of these operators to non-scalar operands is best treated as undefined. !!TODO!!

[top]

Arithmetic operators

Four standard arithmetic operators are supported: addition (+), subtraction (-), multiplication (*) and division (/). Arithmetic operators can only be applied to numeric scalars (integers and floats).

[top]

Logical operators

Three logical operators are available: AND (&&), OR (||) and NOT (!).

AND returns true only if both its operands are true.

OR returns true if at least one of its operands is true.

NOT is a prefix operator and returns true if its sole operand is not true.

Logical operators can be applied to Boolean operands only. Unlike many other languages, JGrok does not treat non-zero integer values or non-empty strings as being true.

[top]

Concatenation operator

JGrok provides a concatenation operator (+). Concatenation operates on two strings and returns a string that is produced by concatenating the two strings together.

Since the + symbol is overloaded and used for several other operators as well, JGrok uses the operand types to determine which operation to perform. If at least one of the operands of the + operator is a string, the operation is assumed to be concatenation. If the other operand is not a string, it is converted to its string representation before the concatenation. This approach yields predictable results for scalar types. If you try to concatenate a string with a set or a relation, however, the set or relation will be converted to a string in a peculiar way: it will be replaced by its internal name. The example below illustrates:

>> set = {1,2,3}
>> "foo " + set
foo ca.uwaterloo.cs.ql.fb.NodeSet@1f934ad
>>

Because of this behaviour, using concatenation on sets and relations is discouraged.

[top]

Union operator

The union operator (+) operates on sets or relations, and returns a set or relation composed of tuples that occur in either one of its operands. Duplicates are eliminated. The two operands must be compatible with each other; it is impossible to take a union of a set and a relation, or of two relations with different column counts.
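
The compatibility rule can be sketched in Python by modelling a relation as a set of tuples and a set as a set of 1-tuples. This representation, the function name union and the TypeError are assumptions made for the illustration; they say nothing about how JGrok stores its data or reports the error.

```python
def union(a, b):
    """Model of + on two sets or two relations (sets of equal-width tuples)."""
    widths = {len(t) for t in a} | {len(t) for t in b}
    if len(widths) > 1:
        # A set mixed with a relation, or relations of different widths.
        raise TypeError("operands are not compatible: column counts differ")
    return a | b  # Python sets drop duplicates automatically

first = {("paul", "allen"), ("bill", "gates")}
second = {("bill", "gates"), ("bill", "clinton")}
combined = union(first, second)
print(sorted(combined))
```

The tuple ("bill", "gates") occurs in both operands but only once in the result.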

[top]

Difference operator

The difference operator (-) operates on sets or relations, and returns the difference between its operands, i.e. a set (or relation) consisting of all tuples in the first operand that are not in the second operand. As with the union operator above, the two operands must be compatible with each other; it is impossible to take a difference of a set and a relation, or of two relations with different column counts.

The example below illustrates the use of the difference operator to compute, given sets of carnivores and herbivores, a list of animals which are pure carnivores (i.e. do not eat plants as well as meat):

>> carnivores = {"wolf", "lion", "bear"}
>> herbivores = {"cow", "deer", "bear", "elk"}
>> not_omnivores = carnivores - herbivores
>> not_omnivores
wolf
lion


[top]

Composition operator

The composition operator, denoted * or o, can be used to join two relations, or a set and a relation. The join is always performed on the rightmost column of the first operand and the leftmost column of the second operand. The result of such an operation is a relation made of tuples from the first operand "joined" to the tuples of the second operand that match them in the field the join is performed on. Tuples that do not have a match do not appear in the result. The column on which the join was performed is discarded and does not appear in the result. The examples below illustrate the use of the composition operator.

The first example creates a relation containing first and last names of several famous people.

>> people  = {"paul"} X {"allen"} + {"bill"} X {"gates", "clinton"} + {"bart", "homer", "marge"} X {"simpson"}
>> people
paul allen
bill gates
bill clinton
bart simpson
homer simpson
marge simpson
We can use the composition operator to find out which people in the relation are named "bill". To do that, we compose a set consisting of just the string "bill" with the relation "people".
>> {"bill"} * people
bill gates
bill clinton
In a similar fashion, we can find out who has the last name "simpson":
>> people o {"simpson"}
bart simpson
homer simpson
marge simpson
Naturally, we can also compose relations with sets that have several members in them. For example, here is a list of people named "paul" or "bill":
>> {"paul", "bill"} o people
paul allen
bill gates
bill clinton
We can also compose two relations together. Note that the column on which the composition is performed does not appear in the result set:
>> ( {"buffalo"} X {"bill"} ) * people
buffalo gates
buffalo clinton
>> ( {"buffalo", "susquahana"} X {"bill"} ) * people
buffalo gates
buffalo clinton
susquahana gates
susquahana clinton
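
The join can be modelled in a few lines of Python, with relations represented as sets of tuples. This is only a sketch of the behaviour shown in the transcripts above, not JGrok's implementation. The names compose and as_relation are hypothetical, and lifting a set to pairs (x, x) before composing is an assumption that happens to reproduce the set examples.

```python
def compose(r1, r2):
    """Join the last column of r1 with the first column of r2, dropping it."""
    return {t1[:-1] + t2[1:] for t1 in r1 for t2 in r2 if t1[-1] == t2[0]}

def as_relation(values):
    """Lift a set of plain values to pairs (x, x) so it can be composed."""
    return {(x, x) for x in values}

people = {
    ("paul", "allen"), ("bill", "gates"), ("bill", "clinton"),
    ("bart", "simpson"), ("homer", "simpson"), ("marge", "simpson"),
}

bills = compose(as_relation({"bill"}), people)       # like {"bill"} * people
simpsons = compose(people, as_relation({"simpson"})) # like people o {"simpson"}
buffalo = compose({("buffalo", "bill")}, people)     # relation * relation
print(sorted(bills), sorted(simpsons), sorted(buffalo))
```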


[top]

Intersection operator

The intersection operator (^) can be applied to two sets or relations to determine their intersection, i.e. their common components. The result is a set or relation composed of the tuples that exist in both operands. The two operands of the intersection operator must be compatible: you cannot take an intersection of a relation and a set, or of two relations with different column counts.
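
As an illustration, the animal sets from the difference operator example can be intersected in Python to find the animals that appear in both. This is a model of the described semantics only. Note that Python spells intersection as &; Python's own ^ operator is symmetric difference, which is a different operation.

```python
carnivores = {"wolf", "lion", "bear"}
herbivores = {"cow", "deer", "bear", "elk"}

# Models the JGrok expression: carnivores ^ herbivores
omnivores = carnivores & herbivores
print(omnivores)  # {'bear'}
```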

[top]

Projection operator

The projection operator (.) is used for projecting a set through a relation. The result of such an operation is a set (or a relation, if the original relation had more than two columns) built from those tuples of the original relation whose entries in the projection column are equal to one of the members of the set being projected. Only the first or the last column of a relation can be used as a projection column.

The examples below, which operate on a relation people, specified in the description of composition operator above, illustrate the use of the projection operator.

In the first example, the projection operator is used to find out the last names of people whose first names are "paul" or "bill". Note that this example is similar to the one given above for the composition operator, but in this case a set, rather than a relation, is returned. Here the first column of the "people" relation is used as the projection column.

>> {"paul", "bill"} . people
allen
gates
clinton
If the first operand of the projection operator is a relation and the second is a set, the last column of the relation is used as the projection column:
>> people . {"simpson", "allen"}
paul
bart
homer
marge
If the projection operator is applied to a relation with more than two columns, the result of the operation is a relation rather than a set. In the next example, we create a three-column relation and project a set through it, to generate a relation with one fewer column as a result:
>> more_people  = people ** inv people
>> more_people
paul allen paul
bill gates bill
bill clinton bill
bart simpson bart
bart simpson homer
bart simpson marge
homer simpson bart
homer simpson homer
homer simpson marge
marge simpson bart
marge simpson homer
marge simpson marge
>> more_people  . { "marge", "bart"}
bart simpson
homer simpson
marge simpson
>>
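
Both directions of projection can be sketched in Python, again modelling relations as sets of tuples. The function names project_left and project_right are hypothetical, and the three-column relation below is a shortened stand-in for more_people. Results over a binary relation come out as 1-tuples, which stand for the members of the resulting set.

```python
def project_left(values, rel):
    """values . rel: match on the first column, then drop that column."""
    return {t[1:] for t in rel if t[0] in values}

def project_right(rel, values):
    """rel . values: match on the last column, then drop that column."""
    return {t[:-1] for t in rel if t[-1] in values}

people = {
    ("paul", "allen"), ("bill", "gates"), ("bill", "clinton"),
    ("bart", "simpson"), ("homer", "simpson"), ("marge", "simpson"),
}
last_names = project_left({"paul", "bill"}, people)
first_names = project_right(people, {"simpson", "allen"})

# A three-column relation yields a two-column relation.
triples = {("bart", "simpson", "marge"), ("homer", "simpson", "bart"),
           ("paul", "allen", "paul")}
pairs = project_right(triples, {"marge", "bart"})
print(sorted(last_names), sorted(first_names), sorted(pairs))
```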


[top]

Transitive closure and reflexive transitive closure

The transitive closure operator (+) computes the transitive closure of a relation. The relation must be binary. The reflexive transitive closure operator (*) is applied in the same way as the transitive closure operator, but its results also include self-loops. The examples below illustrate the use of the transitive closure operators.

The example below shows the creation of a relation which compares relative speeds of different modes of transportation:

>> faster = {"rocket"} X {"car"} + {"car"} X {"bicycle"} + {"bicycle"} X {"pedestrian"}
>> faster
rocket car
car bicycle
bicycle pedestrian
Using transitive closure, from this relation we can obtain speed comparisons between all transportation modes listed:
>> faster+
rocket car
rocket bicycle
rocket pedestrian
car bicycle
car pedestrian
bicycle pedestrian
The resulting relation gives us, for a given transportation mode, all modes that are slower than it. If, on the other hand, we want a list of transportation modes that are not faster than a given one, we would use the reflexive transitive closure:
>> faster*
rocket rocket
rocket car
rocket bicycle
rocket pedestrian
car car
car bicycle
car pedestrian
bicycle bicycle
bicycle pedestrian
pedestrian pedestrian
You will notice that the result of the reflexive transitive closure operator is augmented with self-loops (e.g. "car car"), but is otherwise the same as the result of ordinary transitive closure.
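
For readers who want to see how such a closure can be computed, here is a minimal Python sketch using a simple fixed-point iteration over a set of pairs. It is one straightforward way to obtain the results shown above and is not a description of JGrok's internal algorithm; the function names are hypothetical.

```python
def transitive_closure(rel):
    """Repeatedly add (a, d) whenever (a, b) and (b, d) are present."""
    closure = set(rel)
    while True:
        new_pairs = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new_pairs <= closure:  # nothing new: fixed point reached
            return closure
        closure |= new_pairs

def reflexive_transitive_closure(rel):
    """Transitive closure plus a self-loop for every entity in the relation."""
    entities = {x for pair in rel for x in pair}
    return transitive_closure(rel) | {(x, x) for x in entities}

faster = {("rocket", "car"), ("car", "bicycle"), ("bicycle", "pedestrian")}
print(len(transitive_closure(faster)))            # 6 pairs, as for faster+
print(len(reflexive_transitive_closure(faster)))  # 10 pairs, as for faster*
```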

[top]

Domain operator

The domain operator (dom) computes the domain of its operand; the operand has to be a set or a relation. What this operator really returns is the set of all entities contained in the first column (column zero) of a relation; duplicates are eliminated. The domain operator can be applied to relations with any number of columns, but is most meaningful for two-column relations, which can be seen as mapping functions having a domain and a range; when applied to a set, the domain operation returns the set itself.

The example below illustrates how to use the domain operator.

>> map = {1} X {"one"} + {2} X {"two"} + {3} X {"three"}
>> map
1 one
2 two
3 three
>> dom map
1
2
3


[top]

Range operator

The range operator (rng) computes the range of its operand; the sole operand has to be a set or relation. What is truly returned is the set of all entities contained in the last column of a relation, with duplicates (if any) eliminated. The range operator can be applied to relations with any number of columns, but is most meaningful for two-column relations, which can be interpreted as mapping functions with a domain and a range; when applied to a set, the range operation returns the set itself.

The example below illustrates how to use the range operator.

>> map = {1} X {"one"} + {2} X {"two"} + {3} X {"three"}
>> map
1 one
2 two
3 three
>> rng map
one
two
three


[top]

Entities operator

The entities operator (ent) is a prefix operator (like dom and rng above) that returns a set containing all the entities in its operand, which can be a set or a relation; duplicates, if any, are eliminated. When applied to a set, the entities operation returns the set itself.
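
The relationship between dom, rng and ent can be summarised with a small Python model in which a relation is a set of tuples. The Python function names simply mirror the JGrok operators and the tuple representation is an assumption of the sketch.

```python
def dom(rel):
    """Entities in the first column."""
    return {t[0] for t in rel}

def rng(rel):
    """Entities in the last column."""
    return {t[-1] for t in rel}

def ent(rel):
    """Entities in any column."""
    return {x for t in rel for x in t}

mapping = {(1, "one"), (2, "two"), (3, "three")}
print(dom(mapping))  # the numbers 1, 2, 3
print(rng(mapping))  # the strings one, two, three
print(ent(mapping))  # all six entities
```

For a binary relation, ent is therefore the union of dom and rng.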

[top]

Inverse operator

The inverse operator (inv) is a prefix operator that reverses the order of columns in its operand, which can be a set or a relation. When applied to a set, this operator returns the set itself.
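
As a sketch of the effect, the following Python model reverses every tuple of a relation represented as a set of tuples. The relation used is a shortened version of the people relation from the composition operator section; the function name inv mirrors the operator but the code is only an illustration.

```python
def inv(rel):
    """Reverse the column order of every tuple."""
    return {t[::-1] for t in rel}

people = {("paul", "allen"), ("bill", "gates"), ("bill", "clinton")}
by_last_name = inv(people)
print(sorted(by_last_name))
# [('allen', 'paul'), ('clinton', 'bill'), ('gates', 'bill')]
```

Applying the model twice gives back the original relation.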

[top]

Identity operator

The identity operator (id) returns a binary relation in which the first column consists of all entities in the operand (which must be a set or a relation). The entries in the second column of each tuple of the result are equal to the entries in the first column. The example below illustrates:

>> map = {1} X {"one"} + {2} X {"two"} + {3} X {"three"}
>> map
1 one
2 two
3 three
>> id map
1 1
2 2
3 3
one one
two two
three three
>> numbers = {1,2,3}
>> id numbers
1 1
2 2
3 3


[top]

Cross product operator

The cross product operator (X) returns the cross-product of its two operands, which must be sets. The result is a binary relation. For example:

>> {"one", "two", "three"} X {1,2,3}
one 1
one 2
one 3
two 1
two 2
two 3
three 1
three 2
three 3


[top]

Cardinality operator

The cardinality operator (#) returns the number of tuples or entries in a set or relation. For example:

>> map = {1} X {"one"} + {2} X {"two"} + {3} X {"three"}
>> #map
3
>> strings = {"foo", "bar", "buzz", "lightyear"}
>> #strings
4


[top]

Cat Composition operator

The cat. composition operator (**) is very similar to the composition operator described above, with one difference: the column on which the join is performed is not omitted from the result. This means that you can use cat. composition to construct relations with more than two columns in them. The example below illustrates this by creating two binary relations and joining them to create a relation with three columns.

>> map = {1} X {"one"} + {2} X {"two"} + {3} X {"three"}
>> map1 = {"one"} X {"0x1"} + {"two"} X {"0x2"} + {"three"} X {"0x3"}
>> big_map = map ** map1
>> big_map
1 one 0x1
2 two 0x2
3 three 0x3
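
In terms of a Python model over sets of tuples, the only change from ordinary composition is that the join column is kept once in the output. The function name cat_compose is hypothetical and the code is a sketch of the behaviour shown above, not JGrok's implementation.

```python
def cat_compose(r1, r2):
    """Join last column of r1 with first column of r2, keeping the join column."""
    return {t1 + t2[1:] for t1 in r1 for t2 in r2 if t1[-1] == t2[0]}

mapping = {(1, "one"), (2, "two"), (3, "three")}
mapping1 = {("one", "0x1"), ("two", "0x2"), ("three", "0x3")}
big_map = cat_compose(mapping, mapping1)
print(sorted(big_map))
# [(1, 'one', '0x1'), (2, 'two', '0x2'), (3, 'three', '0x3')]
```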


[top]

Set construction operator

The set construction operator ({ ... }) can be used to construct a set. The set construction operator begins with an opening curly bracket ({), continues with a comma-separated list of values to be included in the set, and ends with the closing curly bracket (}). The values in the comma-separated list can be either scalars (integers, floats, booleans and strings) or sets or relations. Sets and relations, however, will be converted to their internal string name before being added to the set (see more details on the conversion in the description of the concatenation operator). For this reason, adding sets or relations to other sets as members is discouraged.

The operator returns the set with all values in the comma-separated list as its values. Duplicates, if any, are eliminated.

Examples of the set construction operator in use are scattered throughout this document.

[top]

Selection operator

The selection operator ([ ... ]) selects the tuples of a set or relation that satisfy a certain condition. The general format of this operator is

  operand [ expression ]
where operand is a set or a relation, and expression is any boolean expression. The boolean expression can refer to the entity in the first column of the operand as &0, to the entity in the second column as &1, and so on. The expression will be evaluated for every tuple in the operand and the operator will return a relation or set containing only those tuples for which the expression evaluated to true.

References to column content (e.g. &0, &1 and so on) are treated as strings inside the selection bracket; this might lead to expressions evaluating to unexpected values. For example, &0 + 1 is string concatenation and not addition even if &0 does contain a numeric value for some tuples.

The second form of the selection operator is:

  operand [ relname expression1 expression2 ... ]
In this form, operand is again a set or a relation, relname is the name of a relation, and expression1, expression2 and so on are expressions evaluating to scalars. In this case, the values of expression1, expression2 and so on will be evaluated for every tuple in the operand, and the set or relation returned by the operator will contain only those tuples for which there existed a tuple in relation relname with the values to which the expressions evaluated.

The following examples demonstrate the use of the first form of the selection operator:

>> map = {1} X {"one"} + {2} X {"two"} + {3} X {"three"}
>> map [&0 == 1]
1 one
>> map [&0 > 1]
2 two
3 three
>> map [&1 =~ ".*e"]
1 one
3 three
This example shows the second form of the selection operator.
>> big_map
1 one 0x1
2 two 0x2
3 three 0x3
>> key
1 one 0x1
>> big_map [key &0 &1 &2]
1 one 0x1
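
Both forms of selection amount to filtering tuples, which the following Python sketch models. The function names select and select_in are hypothetical, column references such as &0 become tuple indexes, and the model deliberately ignores JGrok's rule that column references are treated as strings inside the brackets.

```python
import re

def select(rel, predicate):
    """First form: keep the tuples for which the boolean expression holds."""
    return {t for t in rel if predicate(t)}

def select_in(rel, lookup, *columns):
    """Second form: keep tuples whose chosen columns form a tuple of lookup."""
    return {t for t in rel if tuple(t[i] for i in columns) in lookup}

mapping = {(1, "one"), (2, "two"), (3, "three")}
greater_than_one = select(mapping, lambda t: t[0] > 1)             # [&0 > 1]
ends_in_e = select(mapping, lambda t: re.fullmatch(".*e", t[1]) is not None)

big_map = {(1, "one", "0x1"), (2, "two", "0x2"), (3, "three", "0x3")}
key = {(1, "one", "0x1")}
matched = select_in(big_map, key, 0, 1, 2)                          # [key &0 &1 &2]
print(sorted(greater_than_one), sorted(ends_in_e), sorted(matched))
```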


[top]

Output redirection operators

The output redirection operators redirect new (>>) and redirect append (>>>) mimic I/O redirection abilities of operating system shells and allow the output of a command that would normally go to the screen to be redirected to a file, either overwriting the existing file (if any) in case of redirect new, or appending to the existing file in case of redirect append.

The syntax for redirection operators is as follows:

command >> string
In this case, command is any JGrok command, and string is a string specifying the output file. The string must specify the file either as an absolute pathname, or as a relative pathname from current directory; the string does not undergo tilde expansion or any other kind of expansion that operating system shells commonly perform.
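
The difference between the two operators corresponds to the familiar distinction between opening a file for writing and opening it for appending. The Python sketch below illustrates that distinction only; the function name redirect and the temporary file are assumptions of the example and are unrelated to how JGrok performs the redirection.

```python
import os
import tempfile

def redirect(output, path, append=False):
    """Write output to path: overwrite (like >>) or append (like >>>)."""
    mode = "a" if append else "w"
    with open(path, mode) as handle:
        handle.write(output)

target = os.path.join(tempfile.mkdtemp(), "out.txt")
redirect("first\n", target)                # redirect new: file is created
redirect("second\n", target, append=True)  # redirect append: file grows
with open(target) as handle:
    contents = handle.read()
print(contents)
```

Like the JGrok operators, Python's open takes the path literally and performs no tilde expansion, so a path such as "~/out.txt" would not refer to the home directory.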