Given a task that takes two hours to accomplish, I would usually spend an hour and a half figuring out how to get it done in half an hour.

I solve most problems by writing scripts: small programs that accomplish simple tasks. The key advantage is that a script, once written, can be run again whenever new data arrive.

I’ve always had problems with the XLS format. When writing scripts, I need data in a simple text format. When someone sends me an XLS file, I open it and see something like this:

@O^@G^@Ó^@L^@N^@A^@ ^@:^@^D^@^AI^@M^@I^@^X^
A^H^@^@NAZWISKO^C^@^@OKO^E^ @^@prawe^D^@^@lewe^R^@^

How can I process such data? I can’t.

Luckily, I’ve found Catdoc, a set of tools that can convert the binary DOC and XLS formats into plain text, which can then be processed with standard scripting languages. One of the tools is xls2csv, which reads a binary XLS file and writes comma-separated values (CSV).
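
To show how xls2csv can fit into a script, here is a minimal Python sketch. It rests on a few assumptions: xls2csv is installed and on the PATH, its output is UTF-8 (the actual charset may depend on your locale and options), and sheets are separated by a form feed character, which I believe is the default sheet break. The function names parse_sheets and read_xls are my own, not part of Catdoc.

```python
import csv
import io
import subprocess


def parse_sheets(text, sheet_break="\f"):
    """Split converter output into sheets and parse each one as CSV rows.

    Assumption: sheets are separated by `sheet_break` (form feed here).
    """
    sheets = []
    for chunk in text.split(sheet_break):
        if not chunk.strip():
            continue  # skip the empty piece left after the last sheet break
        reader = csv.reader(io.StringIO(chunk, newline=""))
        # Drop blank lines, which csv.reader reports as empty lists.
        sheets.append([row for row in reader if row])
    return sheets


def read_xls(path, encoding="utf-8"):
    """Run xls2csv on `path` and return its output as a list of sheets.

    Assumption: the output charset is `encoding`; adjust it if your
    setup produces something else.
    """
    result = subprocess.run(["xls2csv", path], capture_output=True, check=True)
    return parse_sheets(result.stdout.decode(encoding))
```

With this in place, something like read_xls("data.xls") (a hypothetical file name) would return a list of sheets, each a list of rows, ready for further processing. Parsing is kept separate from the call to xls2csv so that it can be tried on plain text without any XLS file at hand.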


You won't believe what a skeptic I am.

