Vital signs and disease progression

vital signs of patients

Challenge We were given a dataset of about 17,000 hospital patients whose vital signs (e.g. body temperature, insulin level, blood sugar level) were measured over a period of 12 hours. During this time span, these measurements led doctors to prescribe further exams (boolean variables) and to measure additional vital signs, e.g. oxygen saturation, blood pressure and heart rate. The following 12 hours were held out as the test set. The task was to train a neural network on the data from the first 12 hours to predict, from the standard vital signs only, which exams doctors would prescribe in the following 12 hours, together with oxygen saturation, blood pressure and heart rate.

Solution We trained ElasticNet and MLPRegressor models to forecast the real-valued and boolean variables. Further challenges were missing data (vital signs are not recorded continuously, and some values are not recorded at all), class imbalance (the proportion of patients prescribed a given exam varies greatly) and feature engineering (some highly relevant quantities are absent from the dataset: for example, the temperature is given, but its change over the 12 hours, which is clearly relevant to how the patient’s condition evolves over the next 12 hours, is not).
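To make the feature-engineering and missing-data points concrete, here is a minimal sketch of how one patient's 12 hourly readings of a single vital sign could be collapsed into a few fixed-size features, including the change over the window. The function name `summarize_series`, the feature names and the example readings are illustrative assumptions, not the code or data we used in the project.

```python
from statistics import mean
from typing import Dict, Optional, Sequence


def summarize_series(values: Sequence[Optional[float]]) -> Dict[str, Optional[float]]:
    """Summarize one vital sign over the observation window.

    `values` holds one reading per hour; hours without a measurement are None.
    All feature names here are hypothetical.
    """
    # Keep only the hours in which the vital sign was actually measured.
    observed = [v for v in values if v is not None]
    if not observed:
        # Never measured: record that fact and leave the other features undefined.
        return {"n_observed": 0, "mean": None, "first": None, "last": None, "delta": None}
    return {
        "n_observed": len(observed),
        "mean": mean(observed),
        "first": observed[0],
        "last": observed[-1],
        # Change between the first and last available reading.
        "delta": observed[-1] - observed[0],
    }


# Made-up temperature readings (degrees Celsius) with gaps.
temperature = [None, 36.5, None, None, 37.0, None, 37.5, None, None, 38.0, None, None]
features = summarize_series(temperature)
print(features)
```

Keeping the number of observed readings as its own feature is one way to let a model use the fact that a value was or was not measured, which may itself be informative in a clinical setting.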

Along with technical skills, my group and I learned how to approach complicated problems: to solve this one we had to visualize the data, test different models such as linear regression and neural networks, make extensive use of cross-validation, and perform feature engineering and data pre-processing. We tackled the task with a range of techniques, including PCA, up-sampling and down-sampling.
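As an illustration of up-sampling and down-sampling for a rarely prescribed exam, the sketch below rebalances a list of boolean labels by returning row indices. The helper names `upsample_minority` and `downsample_majority` and the toy labels are assumptions made for this example; they are not taken from our project code.

```python
import random
from typing import List, Sequence, Tuple


def _split_by_class(labels: Sequence[int]) -> Tuple[List[int], List[int]]:
    """Return (minority_indices, majority_indices) for 0/1 labels."""
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = sorted([positives, negatives], key=len)
    if not minority:
        raise ValueError("both classes must be present")
    return minority, majority


def upsample_minority(labels: Sequence[int], seed: int = 0) -> List[int]:
    """Repeat minority rows (sampling with replacement) until the classes are balanced."""
    rng = random.Random(seed)
    minority, majority = _split_by_class(labels)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    indices = majority + minority + extra
    rng.shuffle(indices)
    return indices


def downsample_majority(labels: Sequence[int], seed: int = 0) -> List[int]:
    """Keep a random subset of majority rows (without replacement) matching the minority count."""
    rng = random.Random(seed)
    minority, majority = _split_by_class(labels)
    indices = minority + rng.sample(majority, len(minority))
    rng.shuffle(indices)
    return indices


# Toy labels: 1 patient out of 10 was prescribed the exam.
labels = [0] * 9 + [1]
up = upsample_minority(labels)
down = downsample_majority(labels)
print(len(up), len(down))
```

When resampling is combined with cross-validation, it is usually applied to the training folds only, so that duplicated rows do not leak into the validation data.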

Marcello Negri
PhD candidate

I am a PhD student in machine learning currently trying to make models more flexible and interpretable.