Specifications
This page defines the specifications of a BlueIron descriptor with the following semantics:
- The attribute names suffixed with a star * are mandatory.
- The types between parentheses () are Options of the given type.
- The types between brackets [] denote a list of the given type.
Root steps - Global
This table lists the properties available on the root tag.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
root | ||||
blueiron | output | String | ID of the output step. | |
input | ||||
step |
Root steps - InputStep
The goal of this step is to describe the structure of a set of input rows.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
input | column | |||
id* | String | ID of the input. Must be unique across all the steps. | ||
alias | String | Alias of the input. Must be unique across all the steps. | ||
name | String | |||
description | String | |||
column | id* | String | ID of the column. | |
type* | Type | Type of the column’s content. | ||
label | String |
Root steps - Step
Generic step container that can contain multiple sub steps.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
step | id* | String | ||
source* | String | ID of the step to read from. | ||
name | String | |||
description | String | |||
alias | String | |||
enabled | (Boolean) | |||
project | ||||
cube | ||||
break | ||||
sort | ||||
filter | ||||
union | ||||
remove | ||||
update | ||||
skip | ||||
notify | ||||
step |
Normal steps - Base step
Simple container step that can be used to group sub-steps.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
step | id | String | ||
alias | String | |||
name | String | |||
description | String | |||
enabled | (Boolean) | |||
source | String | ID of the step to read from, otherwise uses the previous output data. | ||
project | ||||
cube | ||||
break | ||||
sort | ||||
filter | ||||
union | ||||
remove | ||||
update | ||||
skip | ||||
notify | ||||
step |
Normal steps - Projection
This step selects the columns to keep and/or creates new columns.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
project | id | String | ||
alias | String | |||
name | String | |||
description | String | |||
enabled | (Boolean) | |||
insert | String | before, after, position, exclude (by default: exclude). | ||
index | int | Only if insert is ‘position’ and the column attribute is not set. | ||
column | String | Only if insert is ‘position’ and the index attribute is not set. | ||
column | ||||
column | id* | String | ID of the column. | |
type* | Type | Type of the column’s content. | ||
label | String |
Normal steps - Update
This step modifies the values of columns.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
update | id | String | ||
alias | String | |||
name | String | |||
description | String | |||
enabled | (Boolean) | |||
buffer | int | Defines the number of input/output rows to hold in memory (accessible with inputRows and outputRows). | ||
column | ||||
column | id* | String | ID of the column. | |
type* | Type | Type of the column’s content. | ||
label | String |
Normal steps - Remove
This step removes columns.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
remove | id | String | ||
alias | String | |||
name | String | |||
description | String | |||
enabled | (Boolean) | |||
column | ||||
column | id* | String | ID of the column to remove. |
Normal steps - Notification
This step sends notifications to registered listeners. Listeners can be added with blueIron.addListener(listener).
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
notif | id | String | ||
alias | String | |||
name | String | |||
description | String | |||
enabled | (Boolean) | |||
code | String | Code of the notification event. | ||
input | String | ID of the column of the notification event. | ||
type | String | block,row (by default: row ) |
Normal steps - Union
This step makes a union of several input steps.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
union | id | String | ||
alias | String | |||
name | String | |||
description | String | |||
enabled | (Boolean) | |||
source | ||||
source | id* | String | ID of the step to read from. |
Normal steps - Skip
This step forwards the rows that match a filter to the sub-step; for the other rows, the sub-step processing is skipped. All the rows are output either way.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
skip | id | String | ||
alias | String | |||
name | String | |||
description | String | |||
enabled | (Boolean) | |||
filter | Filter to apply on the input rows. | |||
step | Step to apply when a row matches the filter. |
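The skip semantics can be pictured with a small standalone Java sketch. It is not BlueIron code: the class `SkipSketch`, the generic row type and the example filter are hypothetical, and it assumes that rows keep their input order, which the table above does not state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Standalone illustration of the skip semantics; not BlueIron code.
final class SkipSketch {

    // Applies subStep only to the rows matching filter; every row is emitted.
    // Assumption: the output keeps the input order.
    static <R> List<R> skip(List<R> rows, Predicate<R> filter, UnaryOperator<R> subStep) {
        List<R> out = new ArrayList<>();
        for (R row : rows) {
            out.add(filter.test(row) ? subStep.apply(row) : row);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("alice", "bob", "carol");
        // Hypothetical filter and sub-step: upper-case the names not starting with "b".
        System.out.println(skip(rows, r -> !r.startsWith("b"), String::toUpperCase));
        // prints [ALICE, bob, CAROL]
    }
}
```

The row that does not match ("bob") bypasses the sub-step but still appears in the output, which is the difference with a Filter step.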
Normal steps - Filter
This step will output only the rows matching the filter.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
filter | id | String | ||
alias | String | |||
name | String | |||
description | String | |||
enabled | (Boolean) | |||
matching | String | all, any (by default: all). Defines whether all the criteria must match (all) or at least one of them (any). | ||
column | Column processor criterion | |||
script | Script processor criterion | |||
expr | Expr processor criterion |
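As an illustration of the matching attribute, the sketch below combines several criteria with either "all" or "any". It is a standalone Java example, not BlueIron code: the class `FilterSketch` and the use of `Predicate` to stand for a criterion are assumptions made for the example.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Standalone illustration of the matching attribute; not BlueIron code.
final class FilterSketch {

    // Keeps the rows accepted by the criteria, combined according to matching.
    static <R> List<R> filter(List<R> rows, String matching, List<Predicate<R>> criteria) {
        boolean any = "any".equals(matching); // any other value is treated as "all"
        return rows.stream()
                .filter(row -> any
                        ? criteria.stream().anyMatch(c -> c.test(row))
                        : criteria.stream().allMatch(c -> c.test(row)))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> rows = List.of(1, 2, 3, 4, 5, 6);
        // Two hypothetical criteria: even numbers, and numbers greater than 3.
        List<Predicate<Integer>> criteria =
                List.<Predicate<Integer>>of(n -> n % 2 == 0, n -> n > 3);
        System.out.println(filter(rows, "all", criteria)); // [4, 6]
        System.out.println(filter(rows, "any", criteria)); // [2, 4, 5, 6]
    }
}
```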
Normal steps - Sort
Reads all the input rows and outputs them sorted.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
id | String | ||
name | String | ||
description | String | ||
enabled | (Boolean) | ||
column | |||
script | |||
value | |||
expr | |||
SortColumn | |||
id* | String | ||
direction | ([String]) | ||
nullValue | ([String]) | ||
emptyString | ([String]) | ||
SortProc | |||
type | Type | ||
direction | ([String]) | ||
nullValue | ([String]) | ||
emptyString | ([String]) | ||
Normal steps - Break
Generates title and total lines on breaks.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
id | String | ||
name | String | ||
description | String | ||
enabled | (Boolean) | ||
data | (Boolean) | ||
forceDataType | (Boolean) | ||
firstRowsToSkip | (Integer) | ||
hideIfNoData | (Boolean) | ||
inheritsComputations | (Boolean) | ||
resetComputations | [String] | ||
condition | |||
force | |||
restrict | |||
display | |||
titleDisplay | |||
title | |||
data | |||
totalDisplay | |||
total | |||
computations | |||
break | |||
Condition | |||
id | String | ||
type | [String] | ||
TitleTotal | |||
enabled | (Boolean) | ||
row | [String] | ||
type | [String] | ||
Normal steps - Cube
Creates a data cube given a pivot.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
id* | String | ||
name | String | ||
description | String | ||
enabled | (Boolean) | ||
key | |||
group | |||
values | |||
Key | |||
id | String | ||
value | |||
label | |||
ValueLabel | |||
type | [String] | ||
Group | |||
column | |||
Column | |||
id* | String | ||
Normal steps - Join
Joins the current step with another one.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
id | String | ||
name | String | ||
description | String | ||
enabled | (Boolean) | ||
with* | String | ||
type* | [String] | ||
leftIndex | |||
rightIndex | |||
constraint | |||
Condition | |||
id | String | ||
type | [String] | ||
Processors - Copy/Column
Data based on an existing column.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
copy | id* | String | ID of the column to put the value to. | |
source | String | ID of the column to copy the value from. It will be the same as id if not specified. | ||
type | Type | Type of the column, if different from the one specified by id. |
Note: the node copy can also be named column.
Processors - Value
Combines the values of other columns with static text. See Misc - Value content.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
value | id* | String | ID of the column to put the value to. | |
type | Type | Type of the column, if different from the one specified by id. |
Processors - Script
Scripted value using JavaScript.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
script | id* | String | ID of the column to put the value to. | |
type | Type | Type of the column, if different from the one specified by id. |
Processors - Expr
Scripted value using JexlScript.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
expr | id* | String | ID of the column to put the value to. | |
type | Type | Type of the column, if different from the one specified by id. |
Processors - Compute
Extended specification of Script.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
compute | id* | String | ID of the column to put the value to. | |
input | String | ID of the column to put in the inputValue variable. | ||
type | Type | Type of the column, if different from the one specified by id. | ||
filter | Filter expression. Any row that matches this expression will be skipped. | |||
pre | Expression to execute before the processing starts. | |||
each | Expression to execute on each row. | |||
post | Expression to execute after the processing has completed. |
Processors - Expression
Extended specification of Expr.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
expression | id* | String | ID of the column to put the value to. | |
input | String | ID of the column to put in the inputValue variable. | ||
type | Type | Type of the column, if different from the one specified by id. | ||
filter | Filter expression. Any row that matches this expression will be skipped. | |||
pre | Expression to execute before the processing starts. | |||
each | Expression to execute on each row. | |||
post | Expression to execute after the processing has completed. |
Processors - ForEach
Loops on the selected columns. Useful for processing the dynamic columns created by a Cube.
Node | Attribute | Type | Sub node name | Comments |
---|---|---|---|---|
foreach | id* | String | ID of the cube to loop on. |
Misc - Type
A type is a String attribute that can contain the following values:
- String
- Boolean
- Integer
- BigInteger
- Double
- BigDecimal
- Date
- Long
Misc - Option
Attributes whose type is written between parentheses are meant to contain an option. For instance, an attribute with the type (Boolean) can contain ‘true’, ‘false’ or an option content.
An option content always has the syntax #{...} and can contain (very) simple expressions that use external variables, for instance: #{myVariable == 'value'}. Note that the syntax must be strictly respected: spaces between the operator and the operands are mandatory, and only the single quote is supported to denote a string (numbers can be written as-is).
External variables can be injected programmatically in BlueIron with the following code:
BlueIron bir = ...
Map<String, Object> options = bir.getOptions();
options.put("myVariable", "value");
The supported operators are described in the table below. Since an option will be contained in an XML attribute, alternate syntax for each of them is allowed to keep readability and XML-compliance.
Operator | Alternate | Description |
---|---|---|
== | eq | Equal |
< | lt | Less than |
<= | le | Less than or equal to |
> | gt | Greater than |
>= | ge | Greater than or equal to |
!= | ne | Not equal |
Both syntaxes are equivalent: #{value < 4.55} and #{value lt 4.55}.
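To make the option syntax concrete, here is a minimal standalone Java sketch of an evaluator for expressions of the form `#{operand operator operand}`. It is not BlueIron's implementation: the class `OptionSketch` and its methods are hypothetical, and it assumes that string literals contain no spaces, that numeric operands are compared as numbers, and that everything else is compared as text.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

// Standalone illustration of the option syntax; not BlueIron's implementation.
final class OptionSketch {

    // Evaluates "#{left op right}"; tokens must be separated by spaces.
    static boolean evaluate(String option, Map<String, Object> options) {
        if (!option.startsWith("#{") || !option.endsWith("}")) {
            throw new IllegalArgumentException("Not an option content: " + option);
        }
        String[] tokens = option.substring(2, option.length() - 1).trim().split(" +");
        if (tokens.length != 3) {
            throw new IllegalArgumentException("Expected '<operand> <operator> <operand>': " + option);
        }
        int cmp = compare(resolve(tokens[0], options), resolve(tokens[2], options));
        switch (tokens[1]) {
            case "==": case "eq": return cmp == 0;
            case "!=": case "ne": return cmp != 0;
            case "<":  case "lt": return cmp < 0;
            case "<=": case "le": return cmp <= 0;
            case ">":  case "gt": return cmp > 0;
            case ">=": case "ge": return cmp >= 0;
            default: throw new IllegalArgumentException("Unknown operator: " + tokens[1]);
        }
    }

    // Quoted token: string literal. Numeric token: number. Anything else: variable lookup.
    private static Object resolve(String token, Map<String, Object> options) {
        if (token.length() >= 2 && token.startsWith("'") && token.endsWith("'")) {
            return token.substring(1, token.length() - 1);
        }
        try {
            return new BigDecimal(token);
        } catch (NumberFormatException e) {
            return options.get(token);
        }
    }

    // Assumption: numeric comparison when both sides parse as numbers, textual otherwise.
    private static int compare(Object a, Object b) {
        try {
            return new BigDecimal(String.valueOf(a)).compareTo(new BigDecimal(String.valueOf(b)));
        } catch (NumberFormatException e) {
            return String.valueOf(a).compareTo(String.valueOf(b));
        }
    }

    public static void main(String[] args) {
        Map<String, Object> options = new HashMap<>();
        options.put("myVariable", "value");
        options.put("value", 3);
        System.out.println(evaluate("#{myVariable == 'value'}", options)); // true
        System.out.println(evaluate("#{value lt 4.55}", options));         // true
    }
}
```

Splitting on spaces is what makes the spaces around the operator mandatory in this sketch; a real parser would also have to cope with spaces inside quoted strings.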
Misc - Value content
This type is a very simple and straightforward format meant to easily mix data from different columns together. Basically, it is a string containing references with the format ${refName}, where refName can be the id of a column or a computation/expression.
Assuming that the columns FIRSTNAME and LASTNAME exist, the following step will create a new column named FULLNAME with both values separated by a space:
<project>
  <column id="FULLNAME">${FIRSTNAME} ${LASTNAME}</column>
</project>
Since the reference can be of any supported type, it is possible to specify some arguments, separated by commas:
<project>
  <column id="DESC">${FIRSTNAME} ${LASTNAME} was born on ${BIRTHDATE,EEEE}, the ${BIRTHDATE,dd/MM/yyyy}</column>
</project>
The arguments are handled by a Transformer, which is implemented by the class DefaultTransformer in BlueIron. It provides the following features:
Type | Argument | Description |
---|---|---|
String | action | lower: makes the string lower case. upper: makes the string upper case. fupper: makes the first char upper case. substring: takes the substring of the string from a start index and an end index. substr: takes the substring of the string from a start index and a length. |
Date | pattern | Formats the date with the given pattern (SimpleDateFormat pattern). |
Number | pattern[,decimalSeparator,[groupSeparator]] | Formats the number with the specified pattern, decimal separator and group separator. |
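The reference substitution described above can be sketched in standalone Java as follows. This is not the DefaultTransformer: the class `ValueContentSketch` is hypothetical, only the lower, upper and fupper actions and the date pattern are covered, and rendering a missing value as an empty string and using an English locale are assumptions.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Standalone illustration of ${ref} and ${ref,argument}; not the DefaultTransformer.
final class ValueContentSketch {

    // Group 1 is the reference name, group 2 the optional argument after the comma.
    private static final Pattern REF = Pattern.compile("\\$\\{([^},]+)(?:,([^}]*))?\\}");

    static String render(String template, Map<String, Object> row) {
        Matcher m = REF.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Object value = row.get(m.group(1).trim());
            m.appendReplacement(out, Matcher.quoteReplacement(transform(value, m.group(2))));
        }
        m.appendTail(out);
        return out.toString();
    }

    private static String transform(Object value, String arg) {
        if (value == null) {
            return ""; // assumption: a missing value renders as an empty string
        }
        if (arg == null) {
            return String.valueOf(value);
        }
        if (value instanceof Date) {
            // Assumption: English locale for day and month names.
            return new SimpleDateFormat(arg, Locale.ENGLISH).format((Date) value);
        }
        String s = String.valueOf(value);
        switch (arg) {
            case "lower":  return s.toLowerCase(Locale.ROOT);
            case "upper":  return s.toUpperCase(Locale.ROOT);
            case "fupper": return s.isEmpty() ? s : Character.toUpperCase(s.charAt(0)) + s.substring(1);
            default:       return s; // other actions are left out of this sketch
        }
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("FIRSTNAME", "john");
        row.put("LASTNAME", "Doe");
        row.put("BIRTHDATE", new GregorianCalendar(2017, Calendar.DECEMBER, 31).getTime());
        System.out.println(render(
                "${FIRSTNAME,fupper} ${LASTNAME,upper} was born on ${BIRTHDATE,dd/MM/yyyy}", row));
        // prints John DOE was born on 31/12/2017
    }
}
```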
Misc - Javascript
BlueIron provides advanced script features by using the JDK JavaScript engine.
The input row is always described by the variable inputRow or by the most recent alias. There are two ways to access the input row's values: either by calling the column’s name as a function (JDK7) or by using the standard get function (JDK6 and later).
<root>
  <input id="employees" alias="e">
    <column id="ID" type="String" />
    <column id="FIRSTNAME" type="String" />
    <column id="LASTNAME" type="String" />
  </input>
  <step id="final" description="Simple projection" source="employees">
    <project insert="after">
      <script id="FULLNAME1" type="String"></script>
      <script id="FULLNAME2" type="String"></script>
    </project>
  </step>
</root>
Performance tip: prefer JexlScript, especially with JDK6. Even though the JavaScript scripts are compiled, BlueIron performs much better with Jexl when a lot of rows are involved.
Misc - JexlScript
Provides basic expressions parsing with boolean operators. Jexl is an open source Apache Project.
Compared with a JavaScript expression, Jexl is much more readable and performant, especially for simple computations. However, Jexl is limited to simple expressions, so if you need ‘real’ language features (loops for instance), you’ll need to fall back to JavaScript.
The script below shows the difference between a Jexl expression and a JavaScript computation:
<root>
  <input id="salaries" alias="e">
    <column id="ID" type="String" />
    <column id="SALARY" type="BigDecimal" />
    <column id="FEES" type="BigDecimal" />
  </input>
  <step id="final" description="Simple projection" source="salaries">
    <project insert="after">
      <expr id="SUM1">SALARY + FEES</expr> <!-- by default the type is BigDecimal -->
      <script id="SUM" type="BigDecimal">e.get("SALARY").add(e.get("FEES"))</script> <!-- by default the type is String -->
    </project>
  </step>
</root>
Misc - Jexl functions
All the functions below can be called directly in any Jexl script.
- unique(str), Integer
- Returns 1 if str has not been seen yet, else 0.
unique("H"); //1 (H was not seen)
unique("H"); //0 (H was seen)
unique("B"); //1 (B was not seen)
unique("C"); //1 (C was not seen)
unique("B"); //0 (B was seen)
- distinct(str), Integer
- Returns the number of distinct str that have been seen.
distinct("A") //1 (one value has been seen)
distinct("B") //2 (two values have been seen)
distinct("A") //2 (A was already seen)
distinct("C") //3 (three values have been seen)
distinct("B") //3 (B was already seen)
- count(str), Integer
- Returns the number of times str has been seen.
count("A"); //1 (A was not seen)
count("B"); //1 (B was not seen)
count("A"); //2 (A was already seen one time)
count("C"); //1 (C was not seen)
count("A"); //3 (A was already seen two times)
count("C"); //2 (C was already seen one time)
- list(str, [separator]), String
- Returns a list of all str that have been seen, separated by separator (comma by default).
list("A"); //A
list("B"); //A,B
list("A"); //A,B,A
list("C"); //A,B,A,C
list("A"); //A,B,A,C,A
list("C"); //A,B,A,C,A,C
- set(str, [separator]), String
- Returns a list of the distinct str that have been seen, separated by separator (comma by default).
set("A"); //A
set("B"); //A,B
set("A"); //A,B
set("C"); //A,B,C
set("A"); //A,B,C
set("C"); //A,B,C
- log([level], str, params…), void
- Logs the specified str, merged with params at the specified level (SEVERE by default).
- parse:asString(value, [pattern], [decimalSeparator], [groupSeparator]), String
- Transforms the specified value to a String. The parameters depend on the type of value.
parse:asString(aDate); //31.12.2017
parse:asString(aDate, "dd/MM/yyyy HH:mm:ss"); // 31/12/2017 23:59:59
parse:asString(11700.1234); //11700.12
parse:asString(11700.1234, "#,###.00"); //11'700.12
parse:asString(11700.1234, "#,###.00", ",", "."); //11.700,12
parse:asString(1234); //1234
parse:asString(1234, "#,###"); //1'234
parse:asString("some string"); //some string
parse:asString(true); //true
- parse:asInteger(value), int
- Parse the value as Integer.
parse:asInteger("1234"); //1234
- parse:asDouble(value), double
- Parse the value as Double.
parse:asDouble("1234.56"); //1234.56
- parse:asBigDecimal(value), BigDecimal
- Parse the value as BigDecimal.
parse:asBigDecimal("1234.56"); //1234.56
- parse:asBigInteger(value), BigInteger
- Parse the value as BigInteger.
parse:asBigInteger("1234"); //1234
- parse:asDateTime(value), Date
- Parse the value as DateTime. The format is ‘dd.MM.yyyy HH:mm:ss’.
parse:asDateTime("31.12.2017 23:59:59");
- parse:asDate(value, [format]), Date
- Parse the value as Date. The default format is ‘dd.MM.yyyy’.
parse:asDate("31.12.2017");
parse:asDate("31/12/2017 15:30", "dd/MM/yyyy HH:mm");
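The accumulating functions at the start of this list (unique, distinct, count, list, set) all remember the values they have been called with. The standalone Java sketch below reproduces the results shown in their examples. It is not BlueIron code: the class `SeenValuesSketch` is hypothetical, the separator is always passed explicitly, and how BlueIron scopes the remembered values (per expression, per step, per break) is not specified here, so the sketch simply keeps one memory per function.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Standalone illustration of the accumulating Jexl functions; not BlueIron code.
final class SeenValuesSketch {
    private final Set<String> uniqueSeen = new HashSet<>();
    private final Set<String> distinctSeen = new HashSet<>();
    private final Map<String, Integer> counts = new HashMap<>();
    private final List<String> listed = new ArrayList<>();
    private final Set<String> setSeen = new LinkedHashSet<>(); // keeps first-seen order

    // 1 the first time a value is seen, 0 afterwards.
    int unique(String str) {
        return uniqueSeen.add(str) ? 1 : 0;
    }

    // Number of distinct values seen so far, including this one.
    int distinct(String str) {
        distinctSeen.add(str);
        return distinctSeen.size();
    }

    // Number of times this value has been seen, including this call.
    int count(String str) {
        return counts.merge(str, 1, Integer::sum);
    }

    // All values seen so far, duplicates included.
    String list(String str, String separator) {
        listed.add(str);
        return String.join(separator, listed);
    }

    // Distinct values seen so far, in first-seen order.
    String set(String str, String separator) {
        setSeen.add(str);
        return String.join(separator, setSeen);
    }

    public static void main(String[] args) {
        SeenValuesSketch f = new SeenValuesSketch();
        for (String s : new String[] {"A", "B", "A", "C"}) {
            System.out.println(s + ": unique=" + f.unique(s) + " distinct=" + f.distinct(s)
                    + " count=" + f.count(s) + " list=" + f.list(s, ",") + " set=" + f.set(s, ","));
        }
    }
}
```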
Note that these functions are not built into BlueIron but are provided by the SAINet integration.
- date:addXXX(date), Date
- Add one unit of time (depending on the called function) to the specified date. The date is left untouched and a new Date instance will be returned.
newDate = date:addMillis(myDate);
newDate = date:addSecond(myDate);
newDate = date:addMinute(myDate);
newDate = date:addHour(myDate);
newDate = date:addDay(myDate);
newDate = date:addMonth(myDate);
newDate = date:addYear(myDate);
- date:addXXX(date, nb), Date
- Add nb unit of time (depending on the called function) to the specified date. The date is left untouched and a new Date instance will be returned.
newDate = date:addMillis(myDate, 500);
newDate = date:addSeconds(myDate, 10);
newDate = date:addMinutes(myDate, 5);
newDate = date:addHours(myDate, 2);
newDate = date:addDays(myDate, 1);
newDate = date:addMonths(myDate, 1);
newDate = date:addYears(myDate, 1);
- date:setTime(date, hour, minute, second, [millis]), Date
- Set the specified hour,minute,second,millis (or zero if unspecified) to date. The date is left untouched and a new Date instance will be returned.
newDate = date:setTime(myDate, 15, 30, 0);
- date:setDate(date, dayOfMonth, month, year), Date
- Set the specified day of month,month and year to date. The date is left untouched and a new Date instance will be returned.
newDate = date:setDate(myDate, 31, 12, 2017);
- date:getDate(year, month, dayOfMonth), Date
- Returns a new Date with the specified year,month and day of month.
myDate = date:getDate(2017, 12, 31);
myDate = date:date(2017, 12, 31);
- date:getDateTime(year, month, dayOfMonth, hour, minute, second), Date
- Returns a new Date with the specified year,month,day of month,hour,minute and second.
myDate = date:getDateTime(2017, 12, 31, 23, 59, 59);
myDate = date:date(2017, 12, 31, 23, 59, 59);
myDate = date:create(2017, 12, 31, 23, 59, 59);
- date:date(year, month, dayOfMonth, [hour], [minute], [second]), Date
- Simple alias for getDate and getDateTime.
- date:create(year, month, dayOfMonth, hour, minute, second), Date
- Simple alias for getDateTime.
- date:now(), Date
- Returns the current date.
currentDate = date:now();
- date:startOfDay(date), Date
- Returns a new Date instance that starts at 00:00 of the day.
midnight = date:startOfDay(myDate);
- date:startOfMonth(date), Date
- Returns a new Date instance that starts at 00:00 the first of the month.
firstOfMonth = date:startOfMonth(myDate);
- date:startOfYear(date), Date
- Returns a new Date instance that starts at 00:00 the first of the year.
firstOfYear = date:startOfYear(myDate);
- date:endOfDay(date), Date
- Returns a new Date instance that ends at 23:59:59’999 of the day.
endDay = date:endOfDay(myDate);
- date:endOfMonth(date), Date
- Returns a new Date instance that ends at 23:59:59’999 of the last day of the month.
endMonth = date:endOfMonth(myDate);
- date:endOfYear(date), Date
- Returns a new Date instance that ends at 23:59:59’999 of the last day of the year.
endYear = date:endOfYear(myDate);
- date:XXX(date), Date
- Returns the date value, according to the invoked function.
//consider myDate being Sunday, 31.12.2017 23:59:59'999
date:year(myDate); //2017
date:month(myDate); //12
date:date(myDate); //31
date:dayOfWeek(myDate); //7
date:weekOfYear(myDate); //52
date:hour(myDate); //23
date:minute(myDate); //59
date:second(myDate); //59
date:millis(myDate); //999
- date:daysInMonth(year, month), int
- Returns the number of days in the specified month of year.
nbDays = date:daysInMonth(2017, 2); //28
- date:daysInYear(year), int
- Returns the number of days in the specified year.
nbDays = date:daysInYear(2017); //365
- date:weeksInYear(year), int
- Returns the number of weeks in the specified year.
nbWeeks = date:weeksInYear(2017); //52
- date:XXXDiff(date1, date2), int
- Returns the difference between date1 and date2, in the unit given by the called function. The returned value is floored.
nbYears = date:yearsDiff(d1, d2); //2
nbMonths = date:monthsDiff(d1, d2);
nbDays = date:daysDiff(d1, d2);
nbHours = date:hoursDiff(d1, d2);
nbMinutes = date:minutesDiff(d1, d2);
nbSeconds = date:secondsDiff(d1, d2);
nbMillis = date:millisDiff(d1, d2);
- date:XXXBetween(date1, date2), double
- Returns the difference between date1 and date2 as a decimal value, in the unit given by the called function.
nbYears = date:yearsBetween(d1, d2); //2.34
nbMonths = date:monthsBetween(d1, d2);
nbDays = date:daysBetween(d1, d2);
nbHours = date:hoursBetween(d1, d2);
nbMinutes = date:minutesBetween(d1, d2);
nbSeconds = date:secondsBetween(d1, d2);
nbMillis = date:millisBetween(d1, d2); //same as date:millisDiff(d1,d2)
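The difference between the XXXDiff and XXXBetween families can be illustrated for hours with a standalone Java sketch. It is not the actual implementation: the class `DateDiffSketch` is hypothetical, and both the sign convention (date2 minus date1) and the flooring of negative values are assumptions.

```java
import java.util.Date;

// Standalone illustration of hoursBetween versus hoursDiff; not the actual implementation.
final class DateDiffSketch {
    private static final long MILLIS_PER_HOUR = 60L * 60L * 1000L;

    // Fractional number of hours from d1 to d2 (assumption: d2 minus d1).
    static double hoursBetween(Date d1, Date d2) {
        return (d2.getTime() - d1.getTime()) / (double) MILLIS_PER_HOUR;
    }

    // Same difference, floored to a whole number of hours.
    static int hoursDiff(Date d1, Date d2) {
        return (int) Math.floor(hoursBetween(d1, d2));
    }

    public static void main(String[] args) {
        Date d1 = new Date(0L);
        Date d2 = new Date(9_000_000L); // two and a half hours later
        System.out.println(hoursBetween(d1, d2)); // 2.5
        System.out.println(hoursDiff(d1, d2));    // 2
    }
}
```

Units of fixed length (hours, minutes, seconds, millis) can be derived from the millisecond difference like this; months and years have variable lengths, so their fractional values depend on a convention that this page does not describe.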
Misc - Magic variables
This section describes the variables that are automatically injected into the execution context of expressions, scripts and values.
Name | Step | Context | Description |
---|---|---|---|
computations | Break | computations | Used in |
nbDataRows | Break | display,displayTitle,data,displayTotal,total | Defines the number of data rows that have been read. |
nbInnerRows | Break | display,displayTitle,title,data,displayTotal,total | Defines the number of output rows before data processing. |
nbOutputRows | Break | display,displayTitle,title,displayTotal,total | Defines the number of output rows after data processing (all rows that will be shown in the break). |
nbOutputDataRows | Break | display,displayTitle,title,displayTotal,total | Defines the number of output rows after data processing (all data rows that will be shown in the break). |
subBreak | Break | display,displayTitle,title,displayTotal,total,computations | Access to the computations of the direct sub-break. This is an array containing all the (direct) sub-breaks computations. |
breakRow | Break | displayTitle,title,data,displayTotal,total | Defines the break row (for the next break). Might be null. |
rowTypeIndex | Break | displayTitle,displayTotal | Defines the index linked to the type of the row. |
titleRowDisplayed | Break | displayTotal | Defines if the title row is displayed. Only available if one <title> has been declared (syntactic sugar). |
titleRowDisplayed_n | Break | displayTitle,displayTotal | Defines if the title row with the rowTypeIndex n is displayed. The n is an index starting from 0. |
totalRowDisplayed_n | Break | displayTotal | Defines if the total row with the rowTypeIndex n is displayed. The n is an index starting from 0. |
currentKey | Break | condition,force,displayTitle,title,data,displayTotal,total | Defines the break value (defined by the condition). |
previousKey | Break | condition,force,displayTitle,title,data,displayTotal,total | Defines the previous break value. |
inputRows | Update | expr | Defines the input rows (only if a buffer value is set). |
outputRows | Update | expr | Defines the output rows (only if a buffer value is set). |
inputValue | | compute,expression | Represents the value of the column defined in the input attribute. |
previousValue | | script,expr,compute,expression | Contains the last returned value. |
nbRows | | compute,expression | Number of rows to process. |
nbMatchingRows | | compute,expression | Number of rows that matched the filter (available in |
allInputRows | | compute,expression | Array of the input rows. |
options | | script,expr,compute,expression | Defines the map of the options. |
inputRow | | script,compute | Represents the current input row (might be null). |
inputMetadata | | script,compute | Represents the current input metadata (might be null). |
outputRow | | script,compute | Represents the output row (might be null). |
outputMetadata | | script,compute | Represents the output metadata (might be null). |
currentGroup | Cube | values | List containing the group values. |
currentKey | Cube | values | Defines the key value. |
currentLabel | Cube | values | Defines the key label. |
context | | | Defines the context in which the script is running. This is a map that can be used to share variables across different runs. |