BlueIron
BlueIron is a library that takes a set of rows as input, applies transformations, and outputs a new set of rows. For instance, it can sort the input rows and create breaks with title and total rows.
Consider the following input rows:
ID | FIRSTNAME | LASTNAME | SEX | BIRTHDATE | DEPARTMENT | SALARY |
---|---|---|---|---|---|---|
01 | Peter | Bosshard | M | 05.02.1958 | DIR | 5800 |
02 | Maria | Skinov | F | 25.06.1964 | DIR | 5700 |
03 | Casey | Cole | M | 28.08.1955 | DIR | 5700 |
04 | Roger | Binner | M | 13.01.1959 | DIR | 5700 |
05 | Olive | Saltin | F | 31.03.1961 | DIR | 5700 |
06 | Bill | Amacker | M | 27.10.1968 | HEA | 5300 |
07 | Aby | Thornson | F | 26.11.1967 | HEA | 5300 |
08 | Anne | Pevler | F | 14.09.1967 | HEA | 5300 |
09 | Annita | Smith | F | 17.12.1967 | HEA | 5300 |
10 | Robert | Smith | M | 13.12.1970 | DEV | 5000 |
11 | Maggie | Frill | F | 29.02.1972 | DEV | 5000 |
12 | Daniel | Metzler | M | 01.01.1973 | DEV | 4900 |
13 | Frank | Witz | M | 05.01.1973 | DEV | 4900 |
14 | Franky | Bilen | M | 12.12.1972 | DEV | 4900 |
15 | Ed | Krack | M | 08.04.1969 | DEV | 5100 |
16 | Sean | Huskynd | M | 04.09.1971 | DEV | 4900 |
17 | Alice | Muller | F | 09.10.1976 | STG | 4500 |
With the following BlueIron descriptor:
<blueiron>
  <!-- input data structure -->
  <input id="employees">
    <column id="ID" type="String" />
    <column id="FIRSTNAME" type="String" />
    <column id="LASTNAME" type="String" />
    <column id="SEX" type="String" />
    <column id="BIRTHDATE" type="Date" />
    <column id="DEPARTMENT" type="String" />
    <column id="SALARY" type="BigDecimal" />
  </input>
  <!-- global step holder -->
  <step id="final" source="employees">
    <!-- a break per department with a total line. Inner data are thrown away. -->
    <break data="false">
      <condition type="column" id="DEPARTMENT" />
      <total>
        <copy id="DEPARTMENT" />
        <copy id="SALARY" source="SUM_SALARY" />
      </total>
      <computations>
        <expr id="SUM_SALARY">previousValue + SALARY</expr>
      </computations>
    </break>
    <!-- keep only the last 2 columns -->
    <project>
      <copy id="DEPARTMENT" />
      <copy id="SALARY" />
    </project>
  </step>
</blueiron>
BlueIron will produce the following output:
DEPARTMENT | SALARY |
---|---|
DIR | 28600.00 |
HEA | 21200.00 |
DEV | 34700.00 |
STG | 4500.00 |
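To make the arithmetic behind these totals explicit, here is a minimal sketch in plain Python. It does not use BlueIron or its API; it only mimics what the descriptor appears to do. It rests on two assumptions: that a break fires whenever DEPARTMENT changes between consecutive rows, and that `previousValue` starts at zero for each break. The function name `break_with_total` is hypothetical.

```python
from decimal import Decimal
from itertools import groupby

# (DEPARTMENT, SALARY) pairs copied from the input table, in input order.
rows = [
    ("DIR", "5800"), ("DIR", "5700"), ("DIR", "5700"), ("DIR", "5700"), ("DIR", "5700"),
    ("HEA", "5300"), ("HEA", "5300"), ("HEA", "5300"), ("HEA", "5300"),
    ("DEV", "5000"), ("DEV", "5000"), ("DEV", "4900"), ("DEV", "4900"),
    ("DEV", "4900"), ("DEV", "5100"), ("DEV", "4900"),
    ("STG", "4500"),
]


def break_with_total(rows):
    """Return one (DEPARTMENT, total SALARY) row per run of equal DEPARTMENT values.

    Hypothetical helper: the detail rows are dropped, as with data="false".
    """
    totals = []
    # groupby only merges consecutive equal keys, like a control break.
    for department, group in groupby(rows, key=lambda row: row[0]):
        running = Decimal("0.00")  # assumed starting value of previousValue
        for _, salary in group:
            running = running + Decimal(salary)  # previousValue + SALARY
        totals.append((department, running))
    return totals


totals = break_with_total(rows)
for department, total in totals:
    print(f"{department} | {total} |")
```

Because the grouping only looks at consecutive rows, unsorted input would yield several total rows for the same department under these assumptions, which is why sorting and breaking are mentioned together.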
To provide maximum flexibility in data transformation, BlueIron uses the concept of steps. Each step uses one or more processors.
A step denotes a unit of work: it reads the input rows, applies its transformation(s), and then outputs a (potentially) new set of rows. Multiple steps can then be piped together to produce the final output.
Generally, a step is needed in these cases:
- The structure of the output is changed
- New rows are created
- Rows are removed
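
The idea of piping steps can be sketched in plain Python as follows. This is an illustration of the concept only, not BlueIron's implementation or API: every name here (`keep_if`, `project`, `append_total`, `pipe`) is hypothetical. Each step takes a list of rows and returns a new list, and the three steps cover the three cases listed above.

```python
def keep_if(predicate):
    """Hypothetical step that removes rows failing the predicate."""
    def step(rows):
        return [row for row in rows if predicate(row)]
    return step


def project(*columns):
    """Hypothetical step that changes the output structure (keeps some columns)."""
    def step(rows):
        return [{column: row[column] for column in columns} for row in rows]
    return step


def append_total(label_column, value_column):
    """Hypothetical step that creates a new row holding a grand total."""
    def step(rows):
        total = sum(row[value_column] for row in rows)
        return rows + [{label_column: "TOTAL", value_column: total}]
    return step


def pipe(rows, *steps):
    """Feed the output rows of each step into the next one."""
    for step in steps:
        rows = step(rows)
    return rows


# Three rows taken from the sample input, reduced to the relevant columns.
employees = [
    {"ID": "15", "DEPARTMENT": "DEV", "SALARY": 5100},
    {"ID": "16", "DEPARTMENT": "DEV", "SALARY": 4900},
    {"ID": "17", "DEPARTMENT": "STG", "SALARY": 4500},
]

result = pipe(
    employees,
    keep_if(lambda row: row["DEPARTMENT"] == "DEV"),  # rows are removed
    project("DEPARTMENT", "SALARY"),                  # structure is changed
    append_total("DEPARTMENT", "SALARY"),             # a new row is created
)
print(result)
```

Keeping each step a pure function from rows to rows is what makes the chaining trivial in this sketch; whether BlueIron's steps work the same way internally is not stated here.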
BlueIron comes with a set of basic steps which