Specifications

Semantics

This page specifies the BlueIron descriptor format, using the following conventions:

  • Attribute names suffixed with a star * are mandatory.
  • Types in parentheses () are Options of the given type.
  • Types in brackets [] denote a list of the given type.


Root steps - Global

This table lists the properties allowed on the root tag.

| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| root, blueiron | output | String | | ID of the output step. |
| | | | input | |
| | | | step | |

Root steps - InputStep

The goal of this step is to describe the structure of a set of input rows.

| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| input | id* | String | | ID of the input. Must be unique across all the steps. |
| | alias | String | | Alias of the input. Must be unique across all the steps. |
| | name | String | | |
| | description | String | | |
| | | | column | |
| column | id* | String | | ID of the column. |
| | type* | Type | | Type of the column’s content. |
| | label | String | | |

Root steps - Step

Generic step container that can contain multiple sub steps.

Node Attribute Type Sub node name Comments
step id* String
source* String ID of the a step to read from.
name String
description String
alias String
enabled (Boolean)
project
cube
break
sort
filter
union
remove
update
skip
notify
step

Normal steps - Base step

Simple container step that can be used to group sub-steps.

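| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| step | id | String | | |
| | alias | String | | |
| | name | String | | |
| | description | String | | |
| | enabled | (Boolean) | | |
| | source | String | | ID of the step to read from. If omitted, the output of the previous step is used. |
| | | | project | |
| | | | cube | |
| | | | break | |
| | | | sort | |
| | | | filter | |
| | | | union | |
| | | | remove | |
| | | | update | |
| | | | skip | |
| | | | notify | |
| | | | step | |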

Normal steps - Projection

This step selects the columns to keep and/or creates new columns.

| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| project | id | String | | |
| | alias | String | | |
| | name | String | | |
| | description | String | | |
| | enabled | (Boolean) | | |
| | insert | String | | before, after, position, exclude (by default: exclude). |
| | index | int | | Only if insert is ‘position’ and the column attribute is not set. |
| | column | String | | Only if insert is ‘position’ and the index attribute is not set. |
| | | | column | |
| column | id* | String | | ID of the column. |
| | type* | Type | | Type of the column’s content. |
| | label | String | | |

Normal steps - Update

This step modifies the values of columns.

| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| update | id | String | | |
| | alias | String | | |
| | name | String | | |
| | description | String | | |
| | enabled | (Boolean) | | |
| | buffer | int | | Number of input/output rows to hold in memory (accessible with inputRows and outputRows). |
| | | | column | |
| column | id* | String | | ID of the column. |
| | type* | Type | | Type of the column’s content. |
| | label | String | | |

Normal steps - Remove

This step removes columns.

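| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| remove | id | String | | |
| | alias | String | | |
| | name | String | | |
| | description | String | | |
| | enabled | (Boolean) | | |
| | | | column | |
| column | id* | String | | ID of the column to remove. |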

Normal steps - Notification

This step will send notifications to registered listeners. Listeners can be added with blueIron.addListener(listener).

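| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| notif | id | String | | |
| | alias | String | | |
| | name | String | | |
| | description | String | | |
| | enabled | (Boolean) | | |
| | code | String | | Code of the notification event. |
| | input | String | | ID of the column of the notification event. |
| | type | String | | block, row (by default: row). |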

Normal steps - Union

This step outputs the union of several input steps.

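| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| union | id | String | | |
| | alias | String | | |
| | name | String | | |
| | description | String | | |
| | enabled | (Boolean) | | |
| | | | source | |
| source | id* | String | | ID of the step to read from. |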

Normal steps - Skip

This step forwards the rows that match a filter to the sub-step. For rows that do not match, the sub-step processing is skipped, but every row is output in either case.

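| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| skip | id | String | | |
| | alias | String | | |
| | name | String | | |
| | description | String | | |
| | enabled | (Boolean) | | |
| | | | filter | Filter to apply on the input rows. |
| | | | step | Step to apply when a row matches the filter. |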
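
The pass-through behaviour of Skip can be pictured with the following Java sketch. It is not BlueIron code: the class SkipSketch, its apply and row methods, and the use of a Map as a row are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Hypothetical illustration of the Skip semantics; not part of BlueIron.
final class SkipSketch {

    // Rows matching the filter go through the sub-step; all others are output unchanged.
    static List<Map<String, Object>> apply(List<Map<String, Object>> rows,
                                           Predicate<Map<String, Object>> filter,
                                           UnaryOperator<Map<String, Object>> subStep) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            out.add(filter.test(row) ? subStep.apply(row) : row);
        }
        return out;
    }

    // Small helper that builds a two-column row (ID, AMOUNT).
    static Map<String, Object> row(String id, int amount) {
        Map<String, Object> r = new HashMap<>();
        r.put("ID", id);
        r.put("AMOUNT", amount);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row("A", 10));
        rows.add(row("B", 200));
        // The sub-step flags a row; it only runs when AMOUNT > 100.
        List<Map<String, Object>> out = apply(rows,
                r -> (Integer) r.get("AMOUNT") > 100,
                r -> {
                    Map<String, Object> copy = new HashMap<>(r);
                    copy.put("FLAG", "high");
                    return copy;
                });
        System.out.println(out.size() + " rows, FLAG of B: " + out.get(1).get("FLAG"));
    }
}
```

Both rows are output, but only the second one carries the FLAG column added by the sub-step.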

Normal steps - Filter

This step will output only the rows matching the filter.

| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| filter | id | String | | |
| | alias | String | | |
| | name | String | | |
| | description | String | | |
| | enabled | (Boolean) | | |
| | matching | String | | all, any (by default: all). Defines whether all the criteria must match (all) or only one of them (any). |
| | | | column | Column processor criterion |
| | | | script | Script processor criterion |
| | | | expr | Expr processor criterion |
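
The difference between the two values of the matching attribute can be sketched in Java as follows. This is an illustration only: MatchingSketch and its matches method are hypothetical names, and BlueIron's own evaluation of the criteria may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration of matching="all" versus matching="any"; not BlueIron code.
final class MatchingSketch {

    static <T> boolean matches(T row, List<Predicate<T>> criteria, String matching) {
        if ("any".equals(matching)) {
            // One matching criterion is enough.
            return criteria.stream().anyMatch(c -> c.test(row));
        }
        // "all" is the documented default: every criterion must match.
        return criteria.stream().allMatch(c -> c.test(row));
    }

    public static void main(String[] args) {
        List<Predicate<Integer>> criteria = new ArrayList<>();
        criteria.add(v -> v > 10);
        criteria.add(v -> v % 2 == 0);
        System.out.println(matches(12, criteria, "all")); // true: both criteria hold
        System.out.println(matches(13, criteria, "all")); // false: 13 is odd
        System.out.println(matches(13, criteria, "any")); // true: 13 > 10
    }
}
```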

Normal steps - Sort

Reads all the input rows and outputs them sorted.

| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| sort | id | String | | |
| | name | String | | |
| | description | String | | |
| | enabled | (Boolean) | | |
| | | | column | |
| | | | script | |
| | | | value | |
| | | | expr | |
| SortColumn | id* | String | | |
| | direction | ([String]) | | |
| | nullValue | ([String]) | | |
| | emptyString | ([String]) | | |
| SortProc | type | Type | | |
| | direction | ([String]) | | |
| | nullValue | ([String]) | | |
| | emptyString | ([String]) | | |

Normal steps - Break

Generates title and total lines on breaks.

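| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| break | id | String | | |
| | name | String | | |
| | description | String | | |
| | enabled | (Boolean) | | |
| | data | (Boolean) | | |
| | forceDataType | (Boolean) | | |
| | firstRowsToSkip | (Integer) | | |
| | hideIfNoData | (Boolean) | | |
| | inheritsComputations | (Boolean) | | |
| | resetComputations | [String] | | |
| | | | condition | |
| | | | force | |
| | | | restrict | |
| | | | display | |
| | | | titleDisplay | |
| | | | title | |
| | | | data | |
| | | | totalDisplay | |
| | | | total | |
| | | | computations | |
| | | | break | |
| Condition | id | String | | |
| | type | [String] | | |
| TitleTotal | enabled | (Boolean) | | |
| | row | [String] | | |
| | type | [String] | | |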

Normal steps - Cube

Creates a data cube given a pivot.

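| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| cube | id* | String | | |
| | name | String | | |
| | description | String | | |
| | enabled | (Boolean) | | |
| | | | key | |
| | | | group | |
| | | | values | |
| Key | id | String | | |
| | | | value | |
| | | | label | |
| ValueLabel | type | [String] | | |
| Group | | | column | |
| Column | id* | String | | |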

Normal steps - Join

Joins the current step with another one.

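| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| | id | String | | |
| | name | String | | |
| | description | String | | |
| | enabled | (Boolean) | | |
| | with* | String | | |
| | type* | [String] | | |
| | | | leftIndex | |
| | | | rightIndex | |
| | | | constraint | |
| Condition | id | String | | |
| | type | [String] | | |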

Processors - Copy/Column

Data based on an existing column.

| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| copy | id* | String | | ID of the column to put the value to. |
| | source | String | | ID of the column to copy the value from. It will be the same as id if not specified. |
| | type | Type | | Type of the column, if different from the one specified by id. |

Note: the node copy can also be named column.


Processors - Value

Combines the values of other columns with static text. See Misc - Value content.

| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| value | id* | String | | ID of the column to put the value to. |
| | type | Type | | Type of the column, if different from the one specified by id. |

Processors - Script

Scripted value using JavaScript.

| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| script | id* | String | | ID of the column to put the value to. |
| | type | Type | | Type of the column, if different from the one specified by id. |

Processors - Expr

Scripted value using JexlScript.

| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| expr | id* | String | | ID of the column to put the value to. |
| | type | Type | | Type of the column, if different from the one specified by id. |

Processors - Compute

Extended specification of Script.

| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| compute | id* | String | | ID of the column to put the value to. |
| | input | String | | ID of the column to put in the inputValue variable. |
| | type | Type | | Type of the column, if different from the one specified by id. |
| | | | filter | Filter expression. Any row that matches this expression will be skipped. |
| | | | pre | Expression to execute before the processing starts. |
| | | | each | Expression to execute on each row. |
| | | | post | Expression to execute after the processing has completed. |

Processors - Expression

Extended specification of Expr.

| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| expression | id* | String | | ID of the column to put the value to. |
| | input | String | | ID of the column to put in the inputValue variable. |
| | type | Type | | Type of the column, if different from the one specified by id. |
| | | | filter | Filter expression. Any row that matches this expression will be skipped. |
| | | | pre | Expression to execute before the processing starts. |
| | | | each | Expression to execute on each row. |
| | | | post | Expression to execute after the processing has completed. |

Processors - ForEach

Loops over the selected columns. Useful for processing the dynamic columns created by a Cube.

| Node | Attribute | Type | Sub node name | Comments |
| --- | --- | --- | --- | --- |
| foreach | id* | String | | ID of the cube to loop on. |

Misc - Type

A type is a String attribute that can contain the following values:

  • String
  • Boolean
  • Integer
  • BigInteger
  • Double
  • BigDecimal
  • Date
  • Long

Misc - Option

Attributes whose type is written in parentheses are meant to contain an option. For instance, an attribute with the type (Boolean) can contain ‘true’, ‘false’ or an option content.

An option content always has the syntax #{...} and can contain (very) simple expressions that use external variables, for instance: #{myVariable == 'value'}. Note that the syntax must be strictly respected: spaces between the operator and its operands are mandatory, and only single quotes are supported to delimit a string (numbers can be written without quotes).

External variables can be injected programmatically in BlueIron with the following code:

BlueIron bir = ...
Map<String, Object> options = bir.getOptions();
options.put("myVariable", "value");

The supported operators are described in the table below. Since an option will be contained in an XML attribute, alternate syntax for each of them is allowed to keep readability and XML-compliance.

| Operator | Alternate | Description |
| --- | --- | --- |
| == | eq | Equal |
| < | lt | Less than |
| <= | le | Less than or equal to |
| > | gt | Greater than |
| >= | ge | Greater than or equal to |
| != | ne | Not equal |

Both syntaxes are equivalent: #{value < 4.55} and #{value lt 4.55}.
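
To make the option rules concrete, here is a minimal Java sketch of an evaluator for a single comparison. It is an assumption-laden illustration, not BlueIron's parser: the class OptionSketch and its methods are hypothetical, only the form "operand operator operand" is handled, and the left operand may not contain spaces.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

// Hypothetical mini-evaluator for the option syntax; BlueIron's real parser may differ.
final class OptionSketch {

    static boolean eval(String option, Map<String, Object> options) {
        if (!option.startsWith("#{") || !option.endsWith("}")) {
            throw new IllegalArgumentException("Not an option: " + option);
        }
        // Operator and operands must be separated by spaces, as the syntax requires.
        String[] parts = option.substring(2, option.length() - 1).trim().split(" +", 3);
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected '<operand> <operator> <operand>'");
        }
        int cmp = compare(resolve(parts[0], options), resolve(parts[2], options));
        switch (parts[1]) {
            case "==": case "eq": return cmp == 0;
            case "!=": case "ne": return cmp != 0;
            case "<":  case "lt": return cmp < 0;
            case "<=": case "le": return cmp <= 0;
            case ">":  case "gt": return cmp > 0;
            case ">=": case "ge": return cmp >= 0;
            default: throw new IllegalArgumentException("Unknown operator: " + parts[1]);
        }
    }

    // A quoted token is a string, a numeric token is a number, anything else is a variable.
    private static Object resolve(String token, Map<String, Object> options) {
        if (token.length() >= 2 && token.startsWith("'") && token.endsWith("'")) {
            return token.substring(1, token.length() - 1);
        }
        try {
            return new BigDecimal(token);
        } catch (NumberFormatException e) {
            return options.get(token); // external variable, null if absent
        }
    }

    // Numbers are compared numerically, everything else by string value.
    private static int compare(Object left, Object right) {
        if (left instanceof Number && right instanceof Number) {
            return new BigDecimal(left.toString()).compareTo(new BigDecimal(right.toString()));
        }
        return String.valueOf(left).compareTo(String.valueOf(right));
    }

    public static void main(String[] args) {
        Map<String, Object> options = new HashMap<>();
        options.put("myVariable", "value");
        options.put("value", 3.2);
        System.out.println(eval("#{myVariable == 'value'}", options)); // true
        System.out.println(eval("#{value < 4.55}", options));          // true
        System.out.println(eval("#{value lt 4.55}", options));         // true
    }
}
```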


Misc - Value content

This type is a very simple and straightforward format meant to easily mix data from different columns together. Basically, it is a string with references with the format ${refName} where refName could be the id of a column or a computation/expression.

Assuming that the columns FIRSTNAME and LASTNAME exist, the following step will create a new column named FULLNAME with both values separated by a space:

<project>
  <column id="FULLNAME">${FIRSTNAME} ${LASTNAME}</column>
</project>

Since the reference can be of any supported type, it is possible to specify some arguments, separated by commas:

<project>
  <column id="DESC">${FIRSTNAME} ${LASTNAME} was born on ${BIRTHDATE,EEEE}, the ${BIRTHDATE,dd/MM/yyyy}</column>
</project>

The arguments are handled by a Transformer, which is implemented by the class DefaultTransformer in BlueIron. It provides the following features:

| Type | Argument | Description |
| --- | --- | --- |
| String | action | lower: Makes the string lower case. |
| | | upper: Makes the string upper case. |
| | | fupper: Makes the first character upper case. |
| | | substring: Takes the substring of the string from a start index and an end index. |
| | | substr: Takes the substring of the string from a start index and a length. |
| Date | pattern | Formats the date with the given pattern (SimpleDateFormat pattern). |
| Number | pattern[,decimalSeparator[,groupSeparator]] | Formats the number with the specified pattern, decimal separator and group separator. |
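
The substitution of ${refName,argument} references can be sketched in Java as below. This does not reproduce DefaultTransformer: ValueContentSketch, render and transform are hypothetical names, rows are modelled as a Map, a missing value is rendered as an empty string by assumption, and only the lower, upper and fupper actions and date patterns are covered.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of ${ref[,args]} substitution; it does not reproduce DefaultTransformer.
final class ValueContentSketch {

    // Group 1 is the reference name, group 2 the optional argument string.
    private static final Pattern REF = Pattern.compile("\\$\\{([^,}]+)(?:,([^}]*))?\\}");

    static String render(String template, Map<String, Object> row) {
        Matcher m = REF.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String text = transform(row.get(m.group(1).trim()), m.group(2));
            m.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        m.appendTail(out);
        return out.toString();
    }

    private static String transform(Object value, String arg) {
        if (value == null) {
            return ""; // assumption: missing values render as empty text
        }
        if (arg == null) {
            return String.valueOf(value);
        }
        if (value instanceof Date) {
            // The argument is a SimpleDateFormat pattern; a fixed locale keeps output predictable.
            return new SimpleDateFormat(arg, Locale.ROOT).format((Date) value);
        }
        String s = String.valueOf(value);
        switch (arg) {
            case "lower":  return s.toLowerCase();
            case "upper":  return s.toUpperCase();
            case "fupper": return s.isEmpty() ? s : Character.toUpperCase(s.charAt(0)) + s.substring(1);
            default:       return s; // substring, substr and number patterns are omitted here
        }
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put("FIRSTNAME", "john");
        row.put("LASTNAME", "doe");
        row.put("BIRTHDATE", new GregorianCalendar(2017, Calendar.DECEMBER, 31).getTime());
        // Prints: John DOE was born on 31/12/2017
        System.out.println(render(
                "${FIRSTNAME,fupper} ${LASTNAME,upper} was born on ${BIRTHDATE,dd/MM/yyyy}", row));
    }
}
```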


Misc - JavaScript

BlueIron provides advanced scripting features by using the JDK JavaScript engine.

The input row is always exposed through the variable inputRow or the most recent alias. There are two ways to access the values of the input row: either by calling the column’s name as a function (JDK7) or by using the standard get function (JDK6 and later).

<root>
  <input id="employees" alias="e">
    <column id="ID" type="String" />
    <column id="FIRSTNAME" type="String" />
    <column id="LASTNAME" type="String" />
  </input>
  <step id="final" description="Simple projection" source="employees">
    <project insert="after">
      <script id="FULLNAME1" type="String"><![CDATA[e.FIRSTNAME()+" "+e.LASTNAME()]]></script>
      <script id="FULLNAME2" type="String"><![CDATA[e.get("FIRSTNAME")+" "+e.get("LASTNAME")]]></script>
    </project>
  </step>
</root>

Performance tip: prefer the usage of JexlScript, especially with the JDK6. Even if the scripts are compiled, BlueIron will perform much better when a lot of rows are involved.


Misc - JexlScript

Provides basic expressions parsing with boolean operators. Jexl is an open source Apache Project.

Compared with a JavaScript expression, Jexl is much more readable and faster, especially for simple computations. However, Jexl is limited to simple expressions, so if you need ‘real’ language features (loops, for instance), you will need to fall back on JavaScript.

The script below shows the difference between a Jexl expression and a JavaScript computation:

<root>
  <input id="salaries" alias="e">
    <column id="ID" type="String" />
    <column id="SALARY" type="BigDecimal" />
    <column id="FEES" type="BigDecimal" />
  </input>
  <step id="final" description="Simple projection" source="salaries">
    <project insert="after">
      <expr id="SUM1">SALARY + FEES</expr> <!-- by default the type is BigDecimal -->
      <script id="SUM" type="BigDecimal">e.get("SALARY").add(e.get("FEES"))</script> <!-- by default the type is String -->
    </project>
  </step>
</root>

Misc - Jexl functions

All the functions below are callable directly in any jexl script.

Global functions

unique(str), Integer
Returns 1 if str has not been seen yet, else 0.
 unique("H"); //1 (H was not seen)
 unique("H"); //0 (H was seen)
 unique("B"); //1 (B was not seen)
 unique("C"); //1 (C was not seen)
 unique("B"); //0 (B was seen)
distinct(str), Integer
Returns the number of distinct str that have been seen.
 distinct("A") //1 (one value has been seen)
 distinct("B") //2 (two values have been seen)
 distinct("A") //2 (A was already seen)
 distinct("C") //3 (three values have been seen)
 distinct("B") //3 (B was already seen)
count(str), Integer
Returns the number of times str has been seen.
 count("A"); //1 (A was not seen)
 count("B"); //1 (B was not seen)
 count("A"); //2 (A was already seen one time)
 count("C"); //1 (C was not seen)
 count("A"); //3 (A was already seen two times)
 count("C"); //2 (C was already seen one time)
list(str, [separator]), String
Returns a list of all str that have been seen, separated by separator (comma by default).
 list("A"); //A
 list("B"); //A,B
 list("A"); //A,B,A
 list("C"); //A,B,A,C
 list("A"); //A,B,A,C,A
 list("C"); //A,B,A,C,A,C
set(str, [separator]), String
Returns a list of all distinct str that have been seen, separated by separator (comma by default).
 set("A"); //A
 set("B"); //A,B
 set("A"); //A,B
 set("C"); //A,B,C
 set("A"); //A,B,C
 set("C"); //A,B,C
log([level], str, params…), void
Logs the specified str, merged with params at the specified level (SEVERE by default).
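
The stateful behaviour of unique, distinct, count, list and set can be mimicked with the Java sketch below. It only re-implements the semantics shown in the examples above: SeenSketch is a hypothetical class, each function keeps its own state by assumption, and the optional separator argument is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical re-implementation of the documented semantics; not BlueIron's code.
final class SeenSketch {
    private final Set<String> uniqueSeen = new HashSet<>();
    private final Set<String> distinctSeen = new HashSet<>();
    private final Map<String, Integer> counts = new HashMap<>();
    private final List<String> listed = new ArrayList<>();
    private final Set<String> setSeen = new LinkedHashSet<>(); // keeps first-seen order

    // 1 the first time a value is seen, 0 afterwards.
    int unique(String str) {
        return uniqueSeen.add(str) ? 1 : 0;
    }

    // Number of distinct values seen so far.
    int distinct(String str) {
        distinctSeen.add(str);
        return distinctSeen.size();
    }

    // Number of times this particular value has been seen.
    int count(String str) {
        return counts.merge(str, 1, Integer::sum);
    }

    // Every value seen so far, duplicates included.
    String list(String str) {
        listed.add(str);
        return String.join(",", listed);
    }

    // Every distinct value seen so far.
    String set(String str) {
        setSeen.add(str);
        return String.join(",", setSeen);
    }

    public static void main(String[] args) {
        SeenSketch f = new SeenSketch();
        for (String s : new String[] {"A", "B", "A", "C"}) {
            System.out.println(s + " -> unique=" + f.unique(s) + " distinct=" + f.distinct(s)
                    + " count=" + f.count(s) + " list=" + f.list(s) + " set=" + f.set(s));
        }
    }
}
```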

Parse functions (namespace: parse)

parse:asString(value, [pattern], [decimalSeparator], [groupSeparator]), String
Transforms the specified value to a String. The parameters depend on the type of value.
parse:asString(aDate); //31.12.2017
parse:asString(aDate, "dd/MM/yyyy HH:mm:ss"); // 31/12/2017 23:59:59
parse:asString(11700.1234); //11700.12
parse:asString(11700.1234, "#,###.00"); //11'700.12
parse:asString(11700.1234, "#,###.00", ",", "."); //11.700,12
parse:asString(1234); //1234
parse:asString(1234, "#,###"); //1'234
parse:asString("some string"); //some string
parse:asString(true); //true
parse:asInteger(value), int
Parses the value as Integer.
parse:asInteger("1234"); //1234
parse:asDouble(value), double
Parses the value as Double.
parse:asDouble("1234.56"); //1234.56
parse:asBigDecimal(value), BigDecimal
Parses the value as BigDecimal.
parse:asBigDecimal("1234.56"); //1234.56
parse:asBigInteger(value), BigInteger
Parses the value as BigInteger.
parse:asBigInteger("1234"); //1234
parse:asDateTime(value), Date
Parses the value as DateTime. The format is ‘dd.MM.yyyy HH:mm:ss’.
parse:asDateTime("31.12.2017 23:59:59");
parse:asDate(value, [format]), Date
Parses the value as Date. The default format is ‘dd.MM.yyyy’.
parse:asDate("31.12.2017");
parse:asDate("31/12/2017 15:30", "dd/MM/yyyy HH:mm");
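
For readers who want to see how the pattern and separator arguments interact, the following Java sketch produces the same kind of output with the standard java.text classes. Whether the parse namespace is built on these classes is an assumption; ParseSketch and its methods are hypothetical names.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical sketch built on java.text; the parse namespace may be implemented differently.
final class ParseSketch {

    // Formats a number with an explicit pattern, decimal separator and group separator.
    static String asString(Number value, String pattern, char decimalSeparator, char groupSeparator) {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.ROOT);
        symbols.setDecimalSeparator(decimalSeparator);
        symbols.setGroupingSeparator(groupSeparator);
        return new DecimalFormat(pattern, symbols).format(value);
    }

    // Parses a date with an explicit SimpleDateFormat pattern.
    static Date asDate(String value, String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
        format.setLenient(false); // reject impossible dates such as 31.02.2017
        try {
            return format.parse(value);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse '" + value + "' with " + pattern, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(asString(11700.1234, "#,###.00", ',', '.')); // 11.700,12
        System.out.println(asString(1234, "#,###", '.', '\''));         // 1'234
        Date date = asDate("31/12/2017 15:30", "dd/MM/yyyy HH:mm");
        System.out.println(new SimpleDateFormat("dd.MM.yyyy", Locale.ROOT).format(date)); // 31.12.2017
    }
}
```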

Date functions (namespace: date)

Note that these functions are not built into BlueIron but are provided by the SAINet integration.

date:addXXX(date), Date
Adds one unit of time (depending on the called function) to the specified date. The date is left untouched and a new Date instance is returned.
newDate = date:addMillis(myDate);
newDate = date:addSecond(myDate);
newDate = date:addMinute(myDate);
newDate = date:addHour(myDate);
newDate = date:addDay(myDate);
newDate = date:addMonth(myDate);
newDate = date:addYear(myDate);
date:addXXX(date, nb), Date
Adds nb units of time (depending on the called function) to the specified date. The date is left untouched and a new Date instance is returned.
newDate = date:addMillis(myDate, 500);
newDate = date:addSeconds(myDate, 10);
newDate = date:addMinutes(myDate, 5);
newDate = date:addHours(myDate, 2);
newDate = date:addDays(myDate, 1);
newDate = date:addMonths(myDate, 1);
newDate = date:addYears(myDate, 1);
date:setTime(date, hour, minute, second, [millis]), Date
Sets the specified hour, minute, second and millis (zero if unspecified) on date. The date is left untouched and a new Date instance is returned.
newDate = date:setTime(myDate, 15, 30, 0);
date:setDate(date, dayOfMonth, month, year), Date
Sets the specified day of month, month and year on date. The date is left untouched and a new Date instance is returned.
newDate = date:setDate(myDate, 31, 12, 2017);
date:getDate(year, month, dayOfMonth), Date
Returns a new Date with the specified year,month and day of month.
myDate = date:getDate(2017, 12, 31);
myDate = date:date(2017, 12, 31);
date:getDateTime(year, month, dayOfMonth, hour, minute, second), Date
Returns a new Date with the specified year,month,day of month,hour,minute and second.
myDate = date:getDateTime(2017, 12, 31, 23, 59, 59);
myDate = date:date(2017, 12, 31, 23, 59, 59);
myDate = date:create(2017, 12, 31, 23, 59, 59);
date:date(year, month, dayOfMonth, [hour], [minute], [second]), Date
Simple alias for getDate and getDateTime.
date:create(year, month, dayOfMonth, hour, minute, second), Date
Simple alias for getDateTime.
date:now(), Date
Returns the current date.
currentDate = date:now();
date:startOfDay(date), Date
Returns a new Date instance that starts at 00:00 of the day.
midnight = date:startOfDay(myDate);
date:startOfMonth(date), Date
Returns a new Date instance that starts at 00:00 the first of the month.
firstOfMonth = date:startOfMonth(myDate);
date:startOfYear(date), Date
Returns a new Date instance that starts at 00:00 the first of the year.
firstOfYear = date:startOfYear(myDate);
date:endOfDay(date), Date
Returns a new Date instance that ends at 23:59:59’999 of the day.
endDay = date:endOfDay(myDate);
date:endOfMonth(date), Date
Returns a new Date instance that ends at 23:59:59’999 of the last day of the month.
endMonth = date:endOfMonth(myDate);
date:endOfYear(date), Date
Returns a new Date instance that ends at 23:59:59’999 of the last day of the year.
endYear = date:endOfYear(myDate);
date:XXX(date), int
Returns the requested field of the date, according to the invoked function.
//consider myDate being Sunday, 31.12.2017 23:59:59'999
date:year(myDate); //2017
date:month(myDate); //12
date:date(myDate); //31
date:dayOfWeek(myDate); //7
date:weekOfYear(myDate); //52
date:hour(myDate); //23
date:minute(myDate); //59
date:second(myDate); //59
date:millis(myDate); //999
date:daysInMonth(year, month), int
Returns the number of days in the specified month of year.
nbDays = date:daysInMonth(2017, 2); //28
date:daysInYear(year), int
Returns the number of days in the specified year.
nbDays = date:daysInYear(2017); //365
date:weeksInYear(year), int
Returns the number of weeks in the specified year.
nbWeeks = date:weeksInYear(2017); //52
date:XXXDiff(date1, date2), int
Returns the difference between date1 and date2, in the unit of the called function. The returned value is floored.
nbYears = date:yearsDiff(d1, d2); //2
nbMonths = date:monthsDiff(d1, d2);
nbDays = date:daysDiff(d1, d2);
nbHours = date:hoursDiff(d1, d2);
nbMinutes = date:minutesDiff(d1, d2);
nbSeconds = date:secondsDiff(d1, d2);
nbMillis = date:millisDiff(d1, d2);
date:XXXBetween(date1, date2), double
Returns the difference between date1 and date2, in the unit of the called function, as a fractional value.
nbYears = date:yearsBetween(d1, d2); //2.34
nbMonths = date:monthsBetween(d1, d2);
nbDays = date:daysBetween(d1, d2);
nbHours = date:hoursBetween(d1, d2);
nbMinutes = date:minutesBetween(d1, d2);
nbSeconds = date:secondsBetween(d1, d2);
nbMillis = date:millisBetween(d1, d2); //same as date:millisDiff(d1,d2)
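
The distinction between the floored XXXDiff functions and the fractional XXXBetween functions, and the "new instance" contract of functions such as endOfDay, can be illustrated with the Java sketch below. It is not the SAINet implementation: DateSketch and its methods are hypothetical, and the sign convention (date2 minus date1) is an assumption.

```java
import java.util.Calendar;
import java.util.Date;

// Hypothetical sketch; the SAINet date functions may compute these values differently.
final class DateSketch {
    private static final double MILLIS_PER_HOUR = 60 * 60 * 1000.0;

    // Fractional difference, in the spirit of date:hoursBetween (assumed to be d2 - d1).
    static double hoursBetween(Date d1, Date d2) {
        return (d2.getTime() - d1.getTime()) / MILLIS_PER_HOUR;
    }

    // Floored difference, in the spirit of date:hoursDiff.
    static int hoursDiff(Date d1, Date d2) {
        return (int) Math.floor(hoursBetween(d1, d2));
    }

    // New instance set to 23:59:59.999 of the same day; the argument is left untouched.
    static Date endOfDay(Date date) {
        Calendar c = Calendar.getInstance();
        c.setTime(date);
        c.set(Calendar.HOUR_OF_DAY, 23);
        c.set(Calendar.MINUTE, 59);
        c.set(Calendar.SECOND, 59);
        c.set(Calendar.MILLISECOND, 999);
        return c.getTime();
    }

    public static void main(String[] args) {
        Date d1 = new Date(0L);
        Date d2 = new Date(90L * 60 * 1000); // 90 minutes later
        System.out.println(hoursBetween(d1, d2)); // 1.5
        System.out.println(hoursDiff(d1, d2));    // 1
        System.out.println(endOfDay(d2).after(d2));
    }
}
```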

Misc - Magic variables

This section describes the variables that are automatically injected into the execution context of expressions, scripts and values.

| Name | Step | Context | Description |
| --- | --- | --- | --- |
| computations | Break | computations | Used to access the data of other computations. |
| nbDataRows | Break | display, displayTitle, data, displayTotal, total | Defines the number of data rows that have been read. |
| nbInnerRows | Break | display, displayTitle, title, data, displayTotal, total | Defines the number of output rows before data processing. |
| nbOutputRows | Break | display, displayTitle, title, displayTotal, total | Defines the number of output rows after data processing (all rows that will be shown in the break). |
| nbOutputDataRows | Break | display, displayTitle, title, displayTotal, total | Defines the number of output rows after data processing (all data rows that will be shown in the break). |
| subBreak | Break | display, displayTitle, title, displayTotal, total, computations | Gives access to the computations of the direct sub-breaks. This is an array containing the computations of all the (direct) sub-breaks. |
| breakRow | Break | displayTitle, title, data, displayTotal, total | Defines the break row (for the next break). Might be null. |
| rowTypeIndex | Break | displayTitle, displayTotal | Defines the index linked to the type of the row. |
| titleRowDisplayed | Break | displayTotal | Defines if the title row is displayed. Only available if one <title> has been declared (syntactic sugar). |
| titleRowDisplayed_n | Break | displayTitle, displayTotal | Defines if the title row with the rowTypeIndex n is displayed. The n is an index starting from 0. |
| totalRowDisplayed_n | Break | displayTotal | Defines if the total row with the rowTypeIndex n is displayed. The n is an index starting from 0. |
| currentKey | Break | condition, force, displayTitle, title, data, displayTotal, total | Defines the break value (defined by the condition). |
| previousKey | Break | condition, force, displayTitle, title, data, displayTotal, total | Defines the previous break value. |
| inputRows | Update | expr | Defines the input rows (only if a buffer value is set). |
| outputRows | Update | expr | Defines the output rows (only if a buffer value is set). |
| inputValue | | compute, expression | Represents the value of the column defined in the input attribute. |
| previousValue | | script, expr, compute, expression | Contains the last returned value. |
| nbRows | | compute, expression | Number of rows to process. |
| nbMatchingRows | | compute, expression | Number of rows that matched the filter (available in ). |
| allInputRows | | compute, expression | Array of the input rows. |
| options | | script, expr, compute, expression | Defines the map of the options. |
| inputRow | | script, compute | Represents the current input row (might be null). |
| inputMetadata | | script, compute | Represents the current input metadata (might be null). |
| outputRow | | script, compute | Represents the output row (might be null). |
| outputMetadata | | script, compute | Represents the output metadata (might be null). |
| currentGroup | Cube | values | List containing the group values. |
| currentKey | Cube | values | Defines the key value. |
| currentLabel | Cube | values | Defines the key label. |
| context | | | Defines the context in which the script is running. This is a map that can be used to share variables across different runs. |