Toward hepatitis C virus elimination using artificial intelligence
Since the introduction of direct-acting antivirals (DAAs), the treatment paradigm for hepatitis C virus (HCV) infection has changed, leading to the World Health Organization’s agenda to reduce new HCV infections by 90% and deaths by 65% between 2016 and 2030 [1,2]. However, despite the high potency of DAAs regardless of HCV genotypes, about 1–3% of patients with chronic hepatitis C still fail to achieve a sustained virologic response (SVR) [3,4]. Decompensated liver cirrhosis, hepatocellular carcinoma (HCC), and high HCV RNA levels are associated with a greater risk of SVR failure [5]. As patients who do not attain SVR should be considered for rescue therapy, predicting and responding to SVR failure in advance can play a critical role in achieving HCV elimination goals.
Under these circumstances, in the article accompanying this editorial, Lu et al. developed and validated an artificial intelligence (AI) model to predict DAA treatment failure using a nationwide cohort in Taiwan [6]. The Taiwan HCV Registry database is a multi-center, prospective cohort that has enrolled over 30,000 patients with chronic hepatitis C receiving DAA treatment with available SVR data. This database includes various baseline demographic information and virologic factors before and after DAA treatment, and a total of 55 host and virologic factors were incorporated in the model development. The authors constructed several models using different machine learning algorithms and showed that a model employing Extreme Gradient Boosting (XGBoost) predicted DAA treatment failure more effectively than the other algorithms and a model based on traditional statistics (i.e., logistic regression). The XGBoost algorithm demonstrated accuracy, specificity, positive predictive value, and negative predictive value greater than 97%. The AI model detected 69.7% of the subjects who failed to achieve SVR among the top five decile subgroups. In a multivariable regression analysis, liver cirrhosis, HCC, poor compliance with DAA, and high hemoglobin A1c levels were identified as independent risk factors for SVR failure. The AI model showed that patients with higher fibrosis-4 index, bilirubin, aspartate aminotransferase, and alpha-fetoprotein levels, as well as lower albumin and platelet levels, were less likely to attain SVR. This finding is consistent with the result of the multivariable regression analysis, as these variables are known risk factors for the development of liver cirrhosis or HCC.
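To make the decile figure concrete, the following is a minimal sketch of how a capture rate across risk deciles can be computed. It assumes that the 69.7% figure refers to the share of all SVR failures found among the patients with the highest predicted risk. The function name `capture_rate` and the ten-patient toy data are hypothetical illustrations, not the procedure or data used by Lu et al.

```python
def capture_rate(risk_scores, failed, top_deciles):
    """Share of all failures found in the `top_deciles` highest-risk deciles."""
    if not 1 <= top_deciles <= 10:
        raise ValueError("top_deciles must be between 1 and 10")
    total_failures = sum(failed)
    if total_failures == 0:
        raise ValueError("at least one failure is needed")
    # Rank patients from highest to lowest predicted risk.
    ranked = sorted(range(len(risk_scores)),
                    key=lambda i: risk_scores[i], reverse=True)
    # Number of patients in the selected deciles (rounded down).
    cutoff = len(ranked) * top_deciles // 10
    captured = sum(failed[i] for i in ranked[:cutoff])
    return captured / total_failures


# Ten made-up patients, listed from highest to lowest predicted risk.
toy_scores = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
toy_failed = [1, 0, 1, 0, 0, 0, 0, 0, 0, 1]  # 1 = did not achieve SVR

# Two of the three failures sit among the three highest-risk patients.
print(capture_rate(toy_scores, toy_failed, 3))
```

A capture rate of this kind depends only on how the model ranks patients, so it can be read alongside threshold-based measures such as specificity and predictive values.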
This model may contribute to an individualized decision-making process for antiviral treatment in patients with chronic hepatitis C. Patients with decompensated cirrhosis are less likely to achieve SVR than those with compensated cirrhosis [7]. International guidelines vary in their recommendations for the initiation of DAA therapy in patients with decompensated liver cirrhosis, based on liver function assessed by the Child-Pugh score or Model for End-Stage Liver Disease (MELD) score [8-11]. In general, patients with a significantly higher MELD score should receive liver transplantation followed by DAA treatment, while those with a lower MELD score and no planned liver transplantation are recommended to be treated with DAAs first. However, patients in the gray zone should be managed on a case-by-case basis. In such cases, the risk of DAA treatment failure predicted by the AI model can help patients and physicians make appropriate decisions. If the risk of SVR failure calculated by the AI model is very high, it may be in the patient’s best interest to defer treatment, given the side effects and costs of DAA therapy. Chronic hepatitis C patients without decompensated liver cirrhosis who are at high risk of DAA treatment failure as predicted by the AI model may also benefit from a more potent combination of antivirals than is currently recommended by clinical guidelines to increase SVR rates and prevent the development of DAA resistance. In addition, high-risk patients who are likely to fail on existing treatment recommendations can be considered for early enrollment in clinical trials.
The risk of DAA treatment failure is likely to be determined by a combination of several variables, such as disease-related factors (e.g., serum HCV RNA level and presence of fibrosis) and host factors (e.g., age, sex, and comorbidities). It is possible to train machine learning models to identify complex non-linear associations between variables, which are challenging to capture through traditional statistical methods. Considering the heterogeneity of risk variables and patient populations, a more comprehensive model using AI may be better at accurately stratifying patients with different levels of risk, thereby providing individualized treatment strategies [12].
The AI model in this study utilized multiple variables from a large cohort and demonstrated impressive performance in predicting DAA treatment failure. The model showed higher predictive power than a model using conventional statistics, as well as another machine learning model that had been developed under similar conditions. A machine learning model developed to predict DAA treatment failure in the HCV-TARGET registry of patients in North America and Europe achieved a c-index of 0.69 in its validation cohort [13]. Although the two cohorts are distinct, which complicates a direct comparison and necessitates cross-validation between them, the model in the current study achieved good discrimination, with an area under the receiver operating characteristic curve (AUROC) of 0.803 in the validation cohort. This impressive performance might be attributed to the larger sample size. A large number of events is required to appropriately train an AI model. Since the rate of DAA failure is relatively low, the authors utilized a nationwide cohort to ensure that the number of DAA failure events was large enough to train the model. In fact, SVR failure was confirmed in 538 individuals, accounting for 1.6% of the entire study population.
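For a binary outcome such as SVR failure, the c-index and the AUROC describe the same quantity: the probability that a randomly chosen patient who failed treatment receives a higher predicted risk than a randomly chosen patient who did not. The sketch below computes it by direct pairwise comparison. The function name `auroc` and the five-patient example are illustrative assumptions and are not drawn from either registry.

```python
def auroc(scores, labels):
    """Concordance probability for a binary outcome (ties count as 0.5)."""
    failures = [s for s, y in zip(scores, labels) if y == 1]
    successes = [s for s, y in zip(scores, labels) if y == 0]
    if not failures or not successes:
        raise ValueError("both outcome classes are required")
    concordant = 0.0
    # Compare every failure with every non-failure.
    for f in failures:
        for s in successes:
            if f > s:
                concordant += 1.0   # failure ranked as higher risk
            elif f == s:
                concordant += 0.5   # tie counts as half
    return concordant / (len(failures) * len(successes))


# Five made-up patients: predicted risk and observed outcome (1 = SVR failure).
example_scores = [0.90, 0.40, 0.35, 0.80, 0.10]
example_labels = [1, 0, 1, 0, 0]

# 4 of the 6 failure/non-failure pairs are ranked correctly.
print(auroc(example_scores, example_labels))
```

On this scale, 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation, which is the frame of reference for the reported values of 0.69 and 0.803.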
While the AI models in this study demonstrated impressive predictive accuracy, it is important to note that model performance decreased significantly in the validation cohort compared with the training cohort. The superior performance of AI models over logistic regression models was maintained in both the training and validation cohorts. However, while the performance of the logistic regression models did not differ significantly between the two cohorts (rather, it slightly increased in the validation cohort), the decrease in the performance of the AI models in the validation cohort suggests the presence of overfitting. In other words, the AI models may be overfitted to the training cohort and less accurate than expected in other populations. Although this model was developed in a large cohort, it is limited by the fact that the study was conducted in a single, ethnically homogeneous country. Therefore, external validation of this AI model in another independent international cohort is required.
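The pattern described here, a flexible model losing performance between training and validation while a rigid model stays stable, can be reproduced on purely synthetic data. The sketch below is only an illustration under stated assumptions: the cohorts are simulated with a fixed seed, a 20% event rate is chosen for convenience, a nearest-neighbour rule stands in for an over-flexible model (it is not XGBoost), and a single marker used directly as the score stands in for a rigid model. All function and variable names are hypothetical.

```python
import random


def auroc(scores, labels):
    """Concordance probability for a binary outcome (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def make_cohort(rng, n, event_rate=0.2):
    """Simulated cohort: one marker, shifted upward by 1 SD in patients with the event."""
    labels = [1 if rng.random() < event_rate else 0 for _ in range(n)]
    marker = [rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]
    return marker, labels


def nearest_neighbour_scores(train_x, train_y, query_x):
    """Over-flexible rule: copy the outcome of the closest training patient."""
    scores = []
    for q in query_x:
        j = min(range(len(train_x)), key=lambda i: abs(train_x[i] - q))
        scores.append(train_y[j])
    return scores


rng = random.Random(0)  # fixed seed so the simulation is repeatable
train_x, train_y = make_cohort(rng, 400)
valid_x, valid_y = make_cohort(rng, 400)

# Rigid model: the marker value itself is the risk score.
rigid_train = auroc(train_x, train_y)
rigid_valid = auroc(valid_x, valid_y)

# Flexible model: memorises the training cohort.
flex_train = auroc(nearest_neighbour_scores(train_x, train_y, train_x), train_y)
flex_valid = auroc(nearest_neighbour_scores(train_x, train_y, valid_x), valid_y)

print("rigid    train/validation:", round(rigid_train, 3), round(rigid_valid, 3))
print("flexible train/validation:", round(flex_train, 3), round(flex_valid, 3))
```

Because the flexible rule reproduces every training outcome exactly, its training AUROC says nothing about new patients. The size of the gap between training and validation is what signals overfitting, and only an independent cohort can show how much of the validation performance carries over.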
In conclusion, this AI model can be used to identify patients with chronic hepatitis C who are susceptible to SVR failure and to recommend more intensive antiviral therapy than is recommended by current guidelines. In patients with decompensated liver cirrhosis, the AI model may also help determine the optimal timing for DAA treatment. However, because its generalizability remains uncertain, the model needs to be validated in another international cohort.
Notes
Authors’ contribution
Conceptualization, J-H Lee; Original draft, MH Hur and J-H Lee; Review and editing, MH Hur and J-H Lee.
Conflicts of Interest
MH Hur has no conflict of interest to disclose. J-H Lee receives research grants from Yuhan Pharmaceuticals and GreenCross Cell, and lecture fees from GreenCross Cell, Daewoong Pharmaceuticals, and Gilead Korea.
Abbreviations
DAAs, direct-acting antivirals
HCV, hepatitis C virus
SVR, sustained virologic response
HCC, hepatocellular carcinoma
AI, artificial intelligence
MELD, Model for End-Stage Liver Disease