Olympic lecture from The Teaching Company

TTC, The Teaching Company, produces courses on various topics. I’ve purchased several courses, and all of them were excellent. I’m currently listening to the Philosophy of Science course, which begins with a simple question, “what makes science science?”, then turns to the definition of definition and later on pursues the meaning of meaning. If it sounds crazy, it’s because it is. But it turns out that you really need to understand this kind of stuff if you want to know why science works and why scientists get their grants.

I also watched a bit of the human anatomy course, but hearing about all the things that can go wrong in the heart basically makes me faint, so I’m progressing rather slowly. It’s nevertheless a fascinating course. It would have helped me a great deal if I had watched it before writing my thesis.

Here’s today’s perk. In celebration of the 2008 Olympic Games in Beijing, The Teaching Company is making a lecture about the history of the Olympic Games available for free download. The lecture is available until the 4th of September 2008.

Method of comparing hospitals in the EACTS Congenital Database

I have published my MSc thesis on-line. It’s available for (free) download in PDF format. It contains:

  1. An example of complicated data reshaped to a form which allows statistical analysis
  2. A method of comparing hospitals fairly

Read more and download the thesis.

Abuse of scoring systems

Apgar is a scoring system,

…a simple and repeatable method to quickly and summarily assess the health of newborn children immediately after childbirth. (…) The Apgar score is determined by evaluating the newborn baby on five simple criteria on a scale from zero to two and summing up the five values thus obtained. The resulting Apgar score ranges from zero to 10.

One of the criteria is skin color, which can be blue all over, blue at extremities or normal. This is an ordinal variable, which means that the variable does not have numeric values, but named and ordered levels. Blue at extremities is worse than normal, and blue all over is worse than blue at extremities. By transitivity, blue all over is worse than normal.

The Apgar score is meant to provide a single number as an outcome. To achieve that, five ordinal criteria need to be aggregated. Unfortunately, there is no way to directly aggregate skin color with pulse, for example. However, numbers are easy to aggregate by means of addition. Hence the idea of transforming levels to numbers and aggregating them.
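As a minimal sketch of that transformation, here is how the levels could be coded as 0, 1 and 2 and then summed in Python. Only the skin color levels come from the description above; the names `SkinColor` and `apgar_total` and the sample values are made up for illustration.

```python
from enum import IntEnum


class SkinColor(IntEnum):
    """Ordered levels of one criterion, coded worst (0) to best (2)."""
    BLUE_ALL_OVER = 0
    BLUE_AT_EXTREMITIES = 1
    NORMAL = 2


def apgar_total(levels):
    """Sum five criterion levels, each coded 0, 1 or 2."""
    levels = [int(level) for level in levels]
    if len(levels) != 5:
        raise ValueError("expected exactly five criteria")
    if any(level not in (0, 1, 2) for level in levels):
        raise ValueError("each criterion must be coded 0, 1 or 2")
    return sum(levels)


# Skin color plus four other criteria, already coded as numbers (made-up values).
print(apgar_total([SkinColor.BLUE_AT_EXTREMITIES, 2, 2, 1, 2]))
```

The coding keeps the order of the levels, but the distances between the codes are arbitrary: nothing says that the step from 0 to 1 means as much as the step from 1 to 2.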

This is a somewhat dangerous approach. The main purpose of Apgar is:

…to determine quickly whether a newborn needs immediate medical care.

However, once the Apgar outcome is expressed as a number, people may be quick to calculate its mean and standard deviation. Searching for “mean apgar” on scholar.google.com returns some 400 documents. That is not a majority: there are 71 thousand documents containing the word “apgar”, so those 400 are only about 0.6%.

Calculating the mean and standard deviation of Apgar values wasn’t something that the creator of Apgar had in mind. Its purpose was to quickly assess whether a newborn needs medical care.


Apgar score values are not numbers. They are summed identifiers of five ordinal variables. In order to calculate statistics, the original data (criteria values) should be used, as there are dedicated statistical methods for analyzing ordinal variables. These methods, as the reader may already have guessed, do not transform ordinal variable values into numbers in order to perform calculations on them.
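To illustrate, here is a small Python sketch. The first function counts how many different combinations of the five criteria collapse into the same total. The other two describe a single ordinal criterion by its level counts and its median level instead of a mean. The function names and the sample data are made up for illustration, and these summaries are only a simple stand-in for dedicated ordinal methods.

```python
from collections import Counter
from itertools import product

# Skin color levels, ordered from worst to best.
LEVELS = ("blue all over", "blue at extremities", "normal")


def profiles_with_total(total, criteria=5):
    """Count the combinations of 0/1/2 codes that add up to `total`."""
    return sum(1 for codes in product(range(3), repeat=criteria)
               if sum(codes) == total)


def level_counts(observations):
    """Frequency of each level, listed in the levels' natural order."""
    counts = Counter(observations)
    return {level: counts[level] for level in LEVELS}


def median_level(observations):
    """Lower median, found by rank order and reported as a level name."""
    ranks = sorted(LEVELS.index(obs) for obs in observations)
    return LEVELS[ranks[(len(ranks) - 1) // 2]]


# Made-up skin color observations for five newborns.
skin = ["normal", "normal", "blue at extremities", "blue all over", "normal"]
print(profiles_with_total(7))  # 30 different profiles share a total of 7
print(level_counts(skin))
print(median_level(skin))      # normal
```

The median only relies on the order of the levels, so it stays meaningful for an ordinal variable, whereas a mean would depend on the arbitrary 0, 1, 2 coding.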

When designing a survey for statistical analysis, the Apgar score must not be used. The five original criteria must be included in the survey instead.

Everything that applies to Apgar also applies to the Aristotle Score, which I have already criticized. Height and weight are numbers. Generally, things that are measured are numbers. Things that are assessed subjectively, like newborn skin color, are ordinal and do not have numeric values. Aristotle Score values look like numbers. However, it’s important to bear in mind that they are not! Therefore, one must not calculate the mean or standard deviation of the Aristotle Score.

Current Basic Score reports are based on mean Basic Score values, which is an abuse of a scoring system. I suggest finding another method of evaluating the quality of care.

minus 4 days to go

I have proofread a hard copy of my thesis, made final corrections and committed them to the repository.

Transmitting file data ……..
Committed revision 384.

I noticed that the error rate varied across chapters. I think the earliest parts were the worst: there was no page left without a change. The newest parts, however, were mostly OK, with just a few slight modifications.

Some of the corrections were about integrity and continuity. I had expected to write or do some things that, in the end, I didn’t write or do. For example, I planned to include an appendix, which turned out to be too big and was finally removed. I spotted and removed two references to the non-existent appendix today. At first, I considered this appendix an integral part of the thesis. However, it could distract readers from the main concept, i.e. the normalization. I wouldn’t like to discuss the details of the way I have normalized the International Nomenclature for CHD. It is a task for a medical expert; I just had to do it in order to be able to move forward with my analysis.

2×2×2 days to go

I removed a huge appendix from my thesis to make it thinner, but the thesis is still growing. The current chapter, the analysis, contains dumps of the models, where one model can take up a whole page.

There are 2³ days left until the submission. I’ve promised my supervisor a final candidate on Sunday, so I’ve got today and tomorrow to do it. It’s going to be a busy weekend.

I am somewhat disappointed with the predictive weakness of the models. There are lots of false negatives, even though the classification threshold is low (5%). Fortunately, the classification is not a key point in my thesis. The models can still be used for comparing the hospitals fairly.
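For reference, this is roughly what counting false negatives at a 5% threshold looks like. It is a Python sketch with made-up probabilities and outcomes, not the thesis models or data, and `confusion_counts` is a hypothetical helper.

```python
def confusion_counts(probabilities, outcomes, threshold=0.05):
    """Classify each case as positive when its predicted probability
    reaches the threshold, then tally the four possible results."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for probability, happened in zip(probabilities, outcomes):
        predicted = probability >= threshold
        if predicted and happened:
            counts["tp"] += 1
        elif predicted and not happened:
            counts["fp"] += 1
        elif not predicted and happened:
            counts["fn"] += 1  # the event occurred but was not flagged
        else:
            counts["tn"] += 1
    return counts


# Made-up predicted probabilities of death and actual outcomes (1 = died).
predicted_risk = [0.01, 0.03, 0.08, 0.20, 0.04]
died = [0, 1, 1, 0, 0]
print(confusion_counts(predicted_risk, died))
```

Lowering the threshold can only move cases from the negative to the positive side, so false negatives that persist at 5% are cases to which the model assigned a very low risk.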

Last but one step of the analysis

So far, everything I did was preparation. Now, there are 1010₂ days to go and I’m starting the actual data analysis, i.e. the final multiple logistic regression. Two mighty servers are currently processing my data. They have already calculated the simple additive models without interactions. I hope they will finish the models with interactions by tomorrow.

After many conversations with my expert consultant, I have achieved results that do make sense to him. No revelations, but I don’t expect them anymore. It’s good enough when the regression results match his expectations. The calculated coefficients are informative, as they represent the size of the effect.
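In a logistic regression, a coefficient is a change in log-odds, so exponentiating it gives an odds ratio, which is often easier to read as an effect size. A minimal Python sketch, with a made-up coefficient value rather than one from the thesis:

```python
import math


def odds_ratio(coefficient):
    """Convert a logistic regression coefficient (log-odds) to an odds ratio."""
    return math.exp(coefficient)


# Hypothetical coefficient of 0.7 for some risk factor.
print(round(odds_ratio(0.7), 2))  # 2.01, i.e. the odds roughly double
```

A coefficient of zero gives an odds ratio of 1 (no effect), and negative coefficients give ratios below 1.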

Once I have the regressions ready, I’ll be ready to perform the hospital comparisons, the very final phase of the analysis.

My method of comparing hospitals greenlighted

My supervisor has greenlighted my method of comparing the hospitals. The basic idea behind my method is that two hospitals are considered different when the difference in mortality cannot be sufficiently explained by the risk factors. This method uses a statistical tool: binomial regression. All the calculations are well-defined. I’m also working on an intuitive presentation of the results, which will use position, color and transparency.
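The underlying intuition can be sketched in a few lines of Python. Note that this is not the binomial regression method from the thesis. It is a simpler observed-versus-expected comparison, assuming each patient already has a predicted risk of death from some risk model. The function name and the data are made up for illustration.

```python
def observed_vs_expected(patients):
    """Sum actual deaths and risk-predicted deaths for each hospital.

    `patients` is an iterable of (hospital, predicted_risk, died) tuples,
    where predicted_risk is an assumed output of a risk model.
    """
    totals = {}
    for hospital, predicted_risk, died in patients:
        observed, expected = totals.get(hospital, (0, 0.0))
        totals[hospital] = (observed + int(died), expected + predicted_risk)
    return totals


# Made-up patients: hospital A treats riskier cases than hospital B.
patients = [
    ("A", 0.1, 0), ("A", 0.4, 1),
    ("B", 0.1, 1), ("B", 0.1, 1),
]
for hospital, (observed, expected) in sorted(observed_vs_expected(patients).items()):
    print(hospital, observed, round(expected, 2))
```

A hospital whose observed deaths are far from what its patients' risk factors predict is a candidate for being "different". Deciding whether the gap is larger than chance is where a proper statistical model comes in.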

This will be the finale of my thesis. 1110₂ days to go.