Please protect this area from dust

For those who want to charm their Polish builders, there’s an informative article on how to do that.

Being a native Polish speaker myself, I spotted a subtle amendment, apparently made by whoever translated the original English sentences. Original phrase number seven is:

Please protect this area from dust.

However, the Polish translation has this one tiny yet important addition:

Please protect this area from dust and beer.

Now, that’s a thoughtful translator!

UPDATE: Michał suggested that it wasn’t beer, but a phonetic spelling of “dust”. He’s probably right. I’m keeping the entry anyway, to remember this funny misreading.

Scientists, share your source code

Here is a typical scenario: a paper is published, describing a new algorithm for data analysis. The mathematical background is described in the paper, roughly. A piece of software that implements the algorithm is written and made available for download from a website. You visit the website, download the program and run it. You get unexpected results. You wonder what’s happening. You go back to the site and look for the source code, and it’s not there.

I’ve recently tested two pieces of software that do basically the same thing: predicting missing genotypes. No source code is available for either of them, and fastPHASE additionally requires you to register and accept an academic license before you can use it, which introduces an annoying delay in obtaining the program.

By the way, why are all those scientific program names written in UPPERCASE? Because it creates an impression of IMPORTANCE? Just a side note.

Scientists work for the sake of humanity (I hope), striving to make our world a better place. Right? So why don’t they make the source code available?

Not releasing the source code of scientific software is a Bad Thing, because it harms research in the field and is antisocial. The losers are the closed-source project itself, other projects in the field and, subsequently, everyone who could have benefited from the research. The only one who can possibly benefit is the author, but I highly doubt that they ever do.

Keeping the source code secret is typical practice for corporations, which seek to profit from selling the binaries. I don’t know what business model can be built on restricted source code access in science, but I don’t think they’re ever going to make any money on that.

What other reasons could there be not to release the source code? Remaining the sole author, keeping all the credit? Keeping complete control? Hoping to sell licenses to business clients?

The main effect of making the source code unavailable is that the program internals cannot be inspected and analyzed. It’s only a binary that is available; people can obtain it and run it, without being able to modify it.

All the general arguments for open-source software apply to scientific software. Withholding the source code has several negative results.

  • Fewer people use the program.
  • None of the users can adapt or fix the program.
  • Other developers cannot learn from the program, or base new work on it.

I think that should be enough, but I would like to add two points that apply specifically to scientific software.

Loss of credibility

In scientific research, the key point is to prove and verify the results. With closed source, other scientists can only run the software and examine the output, without being able to check whether the program really does what the paper describes. Being unable to do that, the rest of the world has to believe the authors. Do they have something to hide?

I don’t think scientists would actually question a paper as a whole because the source code is unavailable, but it certainly raises some concerns about its quality.

It’s antisocial

Scientific research is usually funded from government grants, which in turn come from taxpayers. Scientists are not corporations that fund themselves. It’s society, it’s other people who effectively pay for the research (through various funding organizations), and I believe that scientists who share their research results have a moral obligation to share them fully.

By not releasing the source code, they only give the impression of publishing their work. They can get away with that, because many people will think that, if they can download the program and run it, it’s “available”. But it’s not!

Please, dear scientists, do what the guys from projects such as GNU Octave or the R project do: share your source code. Everybody will benefit from it, including your projects and yourselves.

White & Nerdy quiz

I suck at quizzes. I participated in one on Friday and I was next to useless. I didn’t know anything about anything except who defined the three laws of robotics. And I thought that the best-selling album of all time was Thriller, while the correct answer was… wait a minute! I was right!

It’s the guy who wrote the quiz that was wrong. Or have I misunderstood the question? That wouldn’t be unlikely, if only you could hear his accent…

There is a quiz card in Weird Al Yankovic’s White & Nerdy music video at about 1:10. If you weren’t watching carefully enough, you might not notice it at all. If you have noticed it, did you wonder what the questions are? In case you did, here’s a transcript.

  • In what city is the largest ball of twine built by one man?
  • What’s the deal with Lindsay Lohan? I mean, seriously?
  • F.D.R. – was he faking it?
  • On what page does Harry Potter die in the next book?
  • What is the melting point of a gorilla’s head?
  • How many Wicket Men are there on a 41-Man Squeamish trans?

If I ever get to write a quiz, I will have something to get started with.

Engineers’ salaries rising in Poland

Seems like I left Poland only to learn that engineers’ salaries have dramatically risen there. I’ve read something almost comical about Polish headhunters. Usually, headhunters work in opposition to each other, fighting for candidates. However, in Poland, headhunters cooperate.

(…) They don’t do that in opposition, but they exchange candidate contacts instead. “A few months after recruitment, my colleague [from another agency] gives a new job offer to an engineer that I recruited. I do the same with another engineer that was recruited by my colleague. The specialists will just swap places, each one will get a higher salary, and we [the recruiters] will have the orders of our clients’ nervous managers fulfilled.” – says Jacek.

The article (in Polish) says that the guy went from PLN 3k up to PLN 14k. Taking into account the cost of living in Poland, it’s like earning €14k per annum in Ireland, and the guy is fairly new to the market. Am I coming back?

No, I’m not. It’s not about the money. It’s about an interesting job.

Removing executable flag from text files

If you have ever shared a code repository with people who use Windows, you might have noticed text files with the executable flag set. Such files are a little annoying. When you list a directory, you suddenly see a lot of programs. When you look at those programs, it turns out they are merely text data files or C++ source code. Since Windows doesn’t distinguish between executable and non-executable files in any other way than the *.exe extension, Windows guys are likely to commit such files unwittingly.

Here’s a shell one-liner that will help you. Give it a file name extension (say, *.txt) and it will find all the executable files with that extension and clear the flag.

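# Clear the executable bits on every regular *.txt file that has any of them set
# (the -perm /mode form is GNU find syntax).
find . -type f -name '*.txt' -perm /u+x,g+x,o+x \
-exec chmod a-x {} \;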

You can alter the *.txt part, specifying any extension you want. You might want to look for all the *.cpp, *.hpp, *.java, etc., files.
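If you have several extensions to clean up, running the command once per extension gets tedious. Here is a minimal sketch of a wrapper, assuming GNU find as above; the function name strip_exec_bits is made up for this example and is not a standard tool.

```shell
# Hypothetical helper (not a standard command): clear the executable bits
# on regular files under DIR whose names end in one of the given extensions.
# Usage: strip_exec_bits DIR EXT [EXT...]
#   e.g. strip_exec_bits . cpp hpp java
strip_exec_bits() {
    dir=$1
    shift
    for ext in "$@"; do
        # Same test as the one-liner above: any of the u/g/o execute bits set.
        find "$dir" -type f -name "*.$ext" -perm /u+x,g+x,o+x \
            -exec chmod a-x {} +
    done
}
```

Files with other extensions are left alone, so real scripts keep their flag unless you name their extension explicitly.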

To find out if there are any other files that you might want to correct, try the following line. It will display all executable files:

find . -type f -perm /u+x,g+x,o+x
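In a big tree that listing can get long. A rough sketch of one way to summarize it is to count the executable files per extension, so you can see which extensions are worth fixing. The function name count_exec_by_ext is invented for this example, and the pipeline assumes GNU find and file names without newlines.

```shell
# Hypothetical helper: count executable regular files per file name
# extension under DIR (default: current directory), most frequent first.
# Usage: count_exec_by_ext [DIR]
count_exec_by_ext() {
    # -name '*.*' keeps only files whose name contains a dot, so the last
    # dot in each path is guaranteed to belong to the file name itself.
    find "${1:-.}" -type f -name '*.*' -perm /u+x,g+x,o+x |
        sed 's/.*\.//' |
        sort | uniq -c | sort -rn
}
```

A high count next to txt or cpp is a strong hint; a count next to sh or pl is probably legitimate.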

Subversion

If you use Subversion for your project, the above method will not work, because Subversion stores the executable flag as a property (svn:executable). Here’s the same command, adapted for a working copy checked out from Subversion:

# Delete the svn:executable property from every executable *.txt file
# (svn propdel can also be abbreviated to svn pd).
find . -type f -name '*.txt' -perm /u+x,g+x,o+x \
-exec svn propdel svn:executable {} \;
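Before touching a working copy, it can be handy to see what would happen first. The sketch below only prints the svn commands instead of running them; preview_svn_exec_fix is a made-up name for this example, and GNU find is assumed as before.

```shell
# Hypothetical helper: print, without executing, the svn command that
# would be run for each executable file with the given extension.
# Usage: preview_svn_exec_fix DIR EXT
preview_svn_exec_fix() {
    find "$1" -type f -name "*.$2" -perm /u+x,g+x,o+x \
        -exec echo svn propdel svn:executable {} \;
}
```

Once the list looks right, run the real command. The property change stays local to your working copy until you commit it.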