TY - JOUR
T1 - Clinical stability and propensity score matching in Cardiac Surgery
T2 - Is the clinical evaluation of treatment efficacy algorithm-dependent in small sample size settings?
AU - Bottigliengo, Daniele
AU - Acar, Aslıhan Sentürk
AU - Sciannameo, Veronica
AU - Lorenzoni, Giulia
AU - Bejko, Jonida
AU - Bottio, Tomaso
AU - Cozzi, Emanuele
AU - Vadori, Marta
AU - Soulillou, Jean Paul
AU - Roussel, Jean Christian
AU - Le Torneau, Thierry
AU - Senage, Thomas
AU - Mañez, Rafael
AU - Costa, Cristina
AU - Padler-Karavani, Vered
AU - Scali, Sofia
AU - Carrozzini, Massimiliano
AU - Fiorello, Emilia
AU - Fusca, Samuel
AU - Gerosa, Gino
AU - Baldi, Ileana
AU - Berchialla, Paola
AU - Gregori, Dario
N1 - Publisher Copyright:
© 2019, Indian Journal of Public Health Research and Development. All rights reserved.
PY - 2019
Y1 - 2019
N2 - Background: Propensity score matching is one of the most popular techniques for dealing with treatment allocation bias in observational studies. However, when the number of enrolled patients is very low, the creation of matched sets of subjects may depend heavily on the model used to estimate individual propensity scores, undermining the stability of the resulting clinical findings. In this study, we investigate potential issues related to the stability of the matched sets created by different propensity score models and propose some diagnostic tools to evaluate them. Methods: Matched groups of patients were created using five different methods: Logistic Regression, Classification and Regression Trees, Bagging, Random Forest and Generalized Boosted Model. Differences between subjects in the matched sets were evaluated by comparing both pre-treatment covariates and propensity score distributions. We applied our proposal to a cardio-surgical observational study that aims to compare two different procedures of cardiac valve replacement. Results: Both baseline characteristics and propensity score distributions differed systematically across the matched samples of patients created with the different propensity score models. The most relevant differences were observed for the matched set created by estimating individual propensity scores with the Classification and Regression Trees algorithm. Conclusion: The clinical stability of matched samples created with different statistical methods should always be evaluated to ensure the reliability of final estimates. This work opens the door to future investigations that fully assess the implications of this finding.
AB - Background: Propensity score matching is one of the most popular techniques for dealing with treatment allocation bias in observational studies. However, when the number of enrolled patients is very low, the creation of matched sets of subjects may depend heavily on the model used to estimate individual propensity scores, undermining the stability of the resulting clinical findings. In this study, we investigate potential issues related to the stability of the matched sets created by different propensity score models and propose some diagnostic tools to evaluate them. Methods: Matched groups of patients were created using five different methods: Logistic Regression, Classification and Regression Trees, Bagging, Random Forest and Generalized Boosted Model. Differences between subjects in the matched sets were evaluated by comparing both pre-treatment covariates and propensity score distributions. We applied our proposal to a cardio-surgical observational study that aims to compare two different procedures of cardiac valve replacement. Results: Both baseline characteristics and propensity score distributions differed systematically across the matched samples of patients created with the different propensity score models. The most relevant differences were observed for the matched set created by estimating individual propensity scores with the Classification and Regression Trees algorithm. Conclusion: The clinical stability of matched samples created with different statistical methods should always be evaluated to ensure the reliability of final estimates. This work opens the door to future investigations that fully assess the implications of this finding.
KW - Clinical stability
KW - Low sample size
KW - Propensity Score Matching
KW - Propensity Score Models
UR - http://www.scopus.com/inward/record.url?scp=85064349015&partnerID=8YFLogxK
U2 - 10.2427/13001
DO - 10.2427/13001
M3 - Article
AN - SCOPUS:85064349015
SN - 2282-2305
VL - 16
JO - Epidemiology Biostatistics and Public Health
JF - Epidemiology Biostatistics and Public Health
IS - 1
M1 - e13001
ER -