The basic idea of my thesis is to find groups of patients with constant treatment complexity. How to find those groups is still an open question. K-means was very fast at finding a given number of clusters. However, I have no idea how many clusters there should be. Creating groups manually according to combinations of procedures looks reasonable. However, the combination of procedures alone might not be enough to guarantee constant treatment complexity: binomial regressions show that a patient’s weight can have a significant influence on the result.
I need to keep the number of groups as low as possible. Grouping by combinations of procedures already gives a lot of groups. If I create even more groups, I’ll end up with so few observations in each group that any further statistical calculations will make no sense.
I need to rethink my analysis so that it takes into account additional facts about the patients:
- Minimum weight
- Minimum height (missing very often)
- BMI (kg/m²)
- Total number of operations
Why minimum weight instead of just weight? I need a parameter that describes a patient regardless of the operation. Since one patient can have more than one operation and the weight is recorded as “weight at the time of surgery”, I need to calculate a single weight value from (possibly) multiple operations. One of the simplest ways to do this is to take the minimum weight value.
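A minimal sketch of this per-patient aggregation, in plain Python. The field names (`patient_id`, `weight_kg`, `height_cm`), the function name `patient_features` and the sample records are all made up for illustration and are not taken from the real data set. It also assumes that BMI is computed from the minimum weight and the minimum height, and that it is left empty when no height was recorded; both are only one possible choice.

```python
from collections import defaultdict


def patient_features(operations):
    """Collapse per-operation records into one feature record per patient."""
    # Group the operation records by patient.
    grouped = defaultdict(list)
    for op in operations:
        grouped[op["patient_id"]].append(op)

    features = {}
    for patient_id, ops in grouped.items():
        # Ignore missing measurements (height is missing very often).
        weights = [op["weight_kg"] for op in ops if op.get("weight_kg") is not None]
        heights = [op["height_cm"] for op in ops if op.get("height_cm") is not None]

        min_weight = min(weights) if weights else None
        min_height = min(heights) if heights else None

        # Assumption: BMI = minimum weight / (minimum height in metres) squared.
        bmi = None
        if min_weight is not None and min_height is not None and min_height > 0:
            bmi = min_weight / (min_height / 100) ** 2

        features[patient_id] = {
            "min_weight_kg": min_weight,
            "min_height_cm": min_height,
            "bmi": bmi,
            "n_operations": len(ops),
        }
    return features


# Made-up example records: patient 1 has two operations, patient 2 has one.
operations = [
    {"patient_id": 1, "weight_kg": 3.4, "height_cm": 50.0},
    {"patient_id": 1, "weight_kg": 6.1, "height_cm": None},
    {"patient_id": 2, "weight_kg": 12.0, "height_cm": None},
]
features = patient_features(operations)
```

Each patient ends up with exactly one row, so these values can be attached to every operation of that patient without depending on which operation is being looked at. Patients without any recorded height keep an empty BMI instead of being dropped.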