Thursday, September 26, 2013

Controlling the Data Loading Process Using Kettle



Sometimes you have a process that loads data from a flat file into a database, or from a database into your data warehouse (DW).

But what if something goes wrong? How do you know whether your load process finished successfully? And if it did not, how do you reprocess the files without duplicating records?

This post presents one option for addressing those problems.

The main ideas of the control process are: (1) generate an ID for the process, (2) save a timestamp at the beginning, (3) save the ID in all the controlled tables, and (4) save a timestamp at the end.

So, if a problem occurs and aborts the job, the next run can identify all the data inserted by that job (its control record has a start timestamp but no end timestamp), delete it, and insert it again.


I’m assuming PostgreSQL as the database. Other databases might require some adjustments.

Sunday, September 22, 2013

Secretaria de Saúde do Estado de Goiás Uses Pentaho as Its BI Tool

For three years, the Secretaria de Saúde do Estado de Goiás (the Goiás State Health Department) has been developing Business Intelligence projects using the Pentaho tool suite.

Its projects include dashboards for tracking and monitoring dengue in the municipalities of Goiás, as well as dashboards for monitoring vaccination campaigns in the state.

These dashboards also provide epidemiological analysts and