Yudai Iwasaki, Takahiro Kinoshita, Jumpei Yoshimura, Shuhei Maruyama, Shinichiro Ohshimo, Shuhei Murao, Makoto Watanabe, Kenichiro Uchida, Yutaka Igarashi, Yuji Nishimoto, Shinshu Katayama, Hiroshi Kurosawa, Yoshiaki Inoue, Akira Kodate, Keita Iyama, Shigeaki Inoue, Keisuke Kaneda, Yusuke Ito, Hirotada Kobayashi, Emiko Nakataki, Nobuaki Shime
Critical care (London, England) 30(1) April 12, 2026
BACKGROUND: The Sequential Organ Failure Assessment (SOFA)-2 score was developed to better reflect contemporary critical care practice by incorporating modern organ support modalities and updated thresholds based on recent data. However, the generalizability of this framework to intensive care unit (ICU) populations beyond the development cohort, particularly across organ support subgroups and major disease categories, remains uncertain. We aimed to evaluate the external validity of SOFA-2 using the OneICU database, a large Japanese critical care database with comprehensive domain-level data.

METHODS: Adult ICU stays between February 2013 and August 2025 were included and classified into two cohorts: those with complete SOFA-1 and SOFA-2 component data on the day of ICU admission, and those with complete SOFA-2 data on that day. Discriminatory performance for ICU mortality was evaluated using the area under the receiver operating characteristic curve (AUROC) and compared between SOFA-1 and SOFA-2 using the DeLong test. Subgroup analyses were performed by major organ support device use and across disease categories.

RESULTS: Among 152,883 eligible ICU stays, 67,116 had complete SOFA-1 and SOFA-2 data, and 121,443 had complete SOFA-2 data. SOFA-2 showed a slightly higher AUROC for ICU mortality than SOFA-1 (0.859 vs. 0.853; p < 0.001), although the absolute difference was small. Across subgroups defined by mechanical circulatory support use, SOFA-2 showed higher discrimination than SOFA-1. Discrimination was similar in other device-defined subgroups and in patients readmitted to the ICU. SOFA-2 also demonstrated good discrimination across major diagnostic groups.

CONCLUSIONS: SOFA-2 showed similar discrimination for ICU mortality compared with SOFA-1 and maintained broadly comparable performance across clinically relevant subgroups, supporting its applicability for early severity assessment in heterogeneous ICU populations.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s13054-026-06020-x.
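
ILLUSTRATIVE SKETCH: The comparison described in METHODS (AUROC for each score, then a paired DeLong test on the same ICU stays) can be sketched in plain Python as follows. This is a minimal sketch, not the authors' analysis code: the abstract does not state which software was used, the function names `auroc` and `delong_test` are hypothetical, and the six-patient arrays are made-up toy values, not OneICU data. It assumes at least two deaths and two survivors, and it uses the O(m·n) pairwise form of the estimator, which is fine for illustration but slow for cohorts of this size.

```python
import math

def _split(scores, died):
    """Separate scores of non-survivors (positives) from survivors (negatives)."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    return pos, neg

def _placements(pos, neg):
    """DeLong placement values: V10 per positive case, V01 per negative case."""
    def psi(x, y):
        # 1 if the non-survivor scored higher, 0.5 for a tie, 0 otherwise
        return 1.0 if x > y else 0.5 if x == y else 0.0
    v10 = [sum(psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(psi(x, y) for x in pos) / len(pos) for y in neg]
    return v10, v01

def _cov(a, b):
    """Sample covariance (n - 1 denominator) of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def auroc(scores, died):
    """AUROC as the mean placement value (Mann-Whitney estimator)."""
    pos, neg = _split(scores, died)
    v10, _ = _placements(pos, neg)
    return sum(v10) / len(v10)

def delong_test(scores_a, scores_b, died):
    """Paired DeLong test; returns (auc_a, auc_b, z, two-sided p)."""
    pos_a, neg_a = _split(scores_a, died)
    pos_b, neg_b = _split(scores_b, died)
    v10_a, v01_a = _placements(pos_a, neg_a)
    v10_b, v01_b = _placements(pos_b, neg_b)
    m, n = len(pos_a), len(neg_a)
    auc_a, auc_b = sum(v10_a) / m, sum(v10_b) / m
    # Variance of the AUROC difference from the two covariance matrices
    var = ((_cov(v10_a, v10_a) + _cov(v10_b, v10_b) - 2 * _cov(v10_a, v10_b)) / m
           + (_cov(v01_a, v01_a) + _cov(v01_b, v01_b) - 2 * _cov(v01_a, v01_b)) / n)
    if var <= 0:
        # Test statistic is undefined when the difference has no variance
        return auc_a, auc_b, float("nan"), float("nan")
    z = (auc_a - auc_b) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return auc_a, auc_b, z, p

# Toy data (hypothetical): 1 = died in ICU, 0 = survived
died = [0, 0, 0, 1, 1, 1]
sofa1 = [2, 4, 6, 5, 7, 9]
sofa2 = [2, 3, 4, 5, 7, 9]
print(delong_test(sofa1, sofa2, died))
```

Because both scores are measured on the same stays, the paired form of the test accounts for their correlation. With very large samples, even a small AUROC difference can reach statistical significance, which is consistent with the abstract reporting p < 0.001 alongside a small absolute difference.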