BlueIron

What is BlueIron?

BlueIron is a library that takes a set of rows as input, applies transformations to them, and outputs a new set of rows. For instance, it can sort the input rows and create breaks with title and total rows.

Example

Consider the following input rows:

ID  FIRSTNAME  LASTNAME  SEX  BIRTHDATE   DEPARTMENT  SALARY
01  Peter      Bosshard  M    05.02.1958  DIR         5800
02  Maria      Skinov    F    25.06.1964  DIR         5700
03  Casey      Cole      M    28.08.1955  DIR         5700
04  Roger      Binner    M    13.01.1959  DIR         5700
05  Olive      Saltin    F    31.03.1961  DIR         5700
06  Bill       Amacker   M    27.10.1968  HEA         5300
07  Aby        Thornson  F    26.11.1967  HEA         5300
08  Anne       Pevler    F    14.09.1967  HEA         5300
09  Annita     Smith     F    17.12.1967  HEA         5300
10  Robert     Smith     M    13.12.1970  DEV         5000
11  Maggie     Frill     F    29.02.1972  DEV         5000
12  Daniel     Metzler   M    01.01.1973  DEV         4900
13  Frank      Witz      M    05.01.1973  DEV         4900
14  Franky     Bilen     M    12.12.1972  DEV         4900
15  Ed         Krack     M    08.04.1969  DEV         5100
16  Sean       Huskynd   M    04.09.1971  DEV         4900
17  Alice      Muller    F    09.10.1976  STG         4500

With the following BlueIron descriptor:

<blueiron>
  <!-- input data structure -->
  <input id="employees">
    <column id="ID" type="String" />
    <column id="FIRSTNAME" type="String" />
    <column id="LASTNAME" type="String" />
    <column id="SEX" type="String" />
    <column id="BIRTHDATE" type="Date" />
    <column id="DEPARTMENT" type="String" />
    <column id="SALARY" type="BigDecimal" />
  </input>

  <!-- global step holder -->
  <step id="final" source="employees">
    <!-- one break per department with a total line; the inner data rows are discarded -->
    <break data="false">
      <condition type="column" id="DEPARTMENT" />
      <total>
        <copy id="DEPARTMENT" />
        <copy id="SALARY" source="SUM_SALARY" />
      </total>
      <computations>
        <expr id="SUM_SALARY">previousValue + SALARY</expr>
      </computations>
    </break>
    <!-- keep only the DEPARTMENT and SALARY columns -->
    <project>
      <copy id="DEPARTMENT" />
      <copy id="SALARY" />
    </project>
  </step>
</blueiron>

BlueIron produces the following output:

DEPARTMENT  SALARY
DIR         28600.00
HEA         21200.00
DEV         34700.00
STG         4500.00
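
To make the effect of the descriptor concrete, here is a minimal Python sketch of the same computation: sum SALARY per DEPARTMENT and keep only one total row per break. It is not BlueIron's API. The function name break_totals and the dict-based row format are assumptions made for illustration, as is the idea that a break fires whenever the DEPARTMENT value changes between consecutive rows.

```python
from decimal import Decimal
from itertools import groupby


def break_totals(rows, key, value):
    """Emit one total row per run of consecutive rows sharing `key`.

    Loosely mirrors <break data="false">: the inner rows are dropped and
    only the total row of each break is kept. (Hypothetical helper, not
    part of BlueIron.)
    """
    totals = []
    for group_key, group in groupby(rows, key=lambda r: r[key]):
        total = sum((r[value] for r in group), Decimal("0"))
        totals.append({key: group_key, value: total})
    return totals


# A subset of the input rows above, in an assumed in-memory representation.
employees = [
    {"ID": "15", "DEPARTMENT": "DEV", "SALARY": Decimal("5100")},
    {"ID": "16", "DEPARTMENT": "DEV", "SALARY": Decimal("4900")},
    {"ID": "17", "DEPARTMENT": "STG", "SALARY": Decimal("4500")},
]

for row in break_totals(employees, "DEPARTMENT", "SALARY"):
    print(row["DEPARTMENT"], f"{row['SALARY']:.2f}")
```

Because the sketch groups consecutive rows only, unsorted input would need to be sorted by DEPARTMENT first; whether BlueIron requires the same is not stated here.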

Basic concepts

To keep data transformations as flexible as possible, BlueIron is built around the concept of a step. Each step in turn uses one or more processors.

Step

A step denotes a unit of work: it reads the input rows, applies its transformation(s), and outputs a (potentially) new set of rows. Several steps are then piped together to produce the final output.
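
The piping idea can be sketched in a few lines of Python. This only illustrates the concept and is not how BlueIron is implemented: sort_by, keep_columns and pipe are hypothetical names, and a step is modeled as a plain function from a list of rows to a list of rows.

```python
from functools import reduce


def sort_by(column):
    """Build a step that returns the rows sorted by `column`."""
    def step(rows):
        return sorted(rows, key=lambda r: r[column])
    return step


def keep_columns(*columns):
    """Build a step that keeps only the listed columns."""
    def step(rows):
        return [{c: r[c] for c in columns} for r in rows]
    return step


def pipe(*steps):
    """Chain steps so that each one consumes the previous one's output."""
    def run(rows):
        return reduce(lambda current, step: step(current), steps, rows)
    return run


pipeline = pipe(sort_by("SALARY"), keep_columns("DEPARTMENT", "SALARY"))
rows = [
    {"ID": "10", "DEPARTMENT": "DEV", "SALARY": 5000},
    {"ID": "17", "DEPARTMENT": "STG", "SALARY": 4500},
]
result = pipeline(rows)
print(result)
```

Each step sees only the rows produced by the step before it, so steps can be reordered or replaced without touching the others.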

In general, a step is needed in the following cases:

  • The structure of the output is changed
  • New rows are created
  • Rows are removed
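
The three cases above can each be pictured as a small row transformation. The Python sketch below is illustrative only; project, append_total and drop_where are hypothetical helpers, not BlueIron steps.

```python
def project(rows, columns):
    """Change the output structure: keep only `columns`."""
    return [{c: r[c] for c in columns} for r in rows]


def append_total(rows, column):
    """Create a new row: append a total of `column` after the data rows."""
    return rows + [{column: sum(r[column] for r in rows)}]


def drop_where(rows, predicate):
    """Remove rows: discard every row for which `predicate` is true."""
    return [r for r in rows if not predicate(r)]


rows = [
    {"ID": "06", "DEPARTMENT": "HEA", "SALARY": 5300},
    {"ID": "17", "DEPARTMENT": "STG", "SALARY": 4500},
]

print(project(rows, ["DEPARTMENT", "SALARY"]))
print(append_total(project(rows, ["SALARY"]), "SALARY"))
print(drop_where(rows, lambda r: r["DEPARTMENT"] == "STG"))
```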

BlueIron comes with a set of basic steps which