From Clinical Prediction Models by Ewout W. Steyerberg we have the following:
A calibration plot has predictions on the x axis, and the outcome on the y axis. A line of identity helps for orientation: Perfect predictions should be on the 45° line. For linear regression, the calibration plot results in a simple scatter plot. For binary outcomes, the plot contains only 0 and 1 values for the y axis. Probabilities are not observed directly. However, smoothing techniques can be used to estimate the observed probabilities of the outcome (p(y = 1)) in relation to the predicted probabilities. The observed 0/1 outcomes are replaced by values between 0 and 1 by combining outcome values of subjects with similar predicted probabilities, e.g. using the loess algorithm.
I'm fitting a logistic regression model with a binary outcome. Below is some example code. The calibration curve is going to look weird because the sample is so small. I'm mostly wondering whether the methodology is correct.
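library(tidyverse)

# Toy data: binary outcome and a single predictor
tibble_ex <- tibble(
  event = c(1, 0, 1, 0, 0, 1),
  weight = c(100, 200, 110, 210, 220, 105)
)

# Logistic regression of the outcome on the predictor
model <- glm(event ~ weight, family = 'binomial', data = tibble_ex)

# Add the predicted probabilities to the data
tibble_ex <- tibble_ex %>%
  mutate(pred = predict(model, type = 'response'))

# Calibration plot: smoothed outcome against predicted probability, plus the identity line
tibble_ex %>%
  arrange(pred) %>%
  ggplot(aes(x = pred, y = event)) +
  stat_smooth(method = 'glm', method.args = list(family = binomial), se = FALSE) +
  geom_abline()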
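
For comparison, here is a minimal sketch of how I read the smoothing step in the quoted passage: the 0/1 outcomes are smoothed against the predicted probabilities with loess rather than with a second glm. The data are simulated purely for illustration (the sample size, coefficients, and the names `sim`, `cal_fit`, and `calibration_plot` are my own assumptions, not from the book), and I'm not certain this is exactly what the author intends.

```r
library(ggplot2)

# Simulated data, purely illustrative (assumed sample size and coefficients)
set.seed(1)
n <- 500
sim <- data.frame(weight = rnorm(n, mean = 150, sd = 30))
sim$event <- rbinom(n, size = 1, prob = plogis(6 - 0.04 * sim$weight))

# Logistic regression and its predicted probabilities
model <- glm(event ~ weight, family = binomial, data = sim)
sim$pred <- predict(model, type = "response")

# Smooth the observed 0/1 outcomes against the predicted probabilities.
# loess is not constrained to [0, 1], so values can stray slightly outside.
cal_fit <- loess(event ~ pred, data = sim)
sim$observed <- predict(cal_fit)

# Calibration plot: loess-smoothed outcome vs. prediction, with the identity line
calibration_plot <- ggplot(sim, aes(x = pred, y = observed)) +
  geom_line() +
  geom_abline(intercept = 0, slope = 1, linetype = "dashed") +
  labs(x = "Predicted probability", y = "Observed proportion (loess)")

print(calibration_plot)
```

As far as I can tell, `stat_smooth(method = 'loess', se = FALSE)` on the raw `event` values would draw essentially the same curve without the explicit `loess()` call, though it needs more than a handful of observations to behave sensibly.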
