Journals
language disorders; multimodal; artificial intelligence
https://doi.org/10.1080/10447318.2025.2514257
Although autonomous vehicles have revolutionized the transportation landscape by enabling driving without direct human intervention, they are not yet perfect. For this reason, it is critically important for users to respond quickly to takeover requests from autonomous driving agents. Based on literature on framing effects in persuasion, this study focused on the efficacy of message framing and construal level theory. An experiment (N = 78 participants) was conducted using a driving simulator, employing a 2 (message framing: gain vs. loss) × 2 (temporal distance: distant vs. close) between-subjects design. The key findings indicate that gain framing led to higher levels of perceived benefit as well as compliance and behavioral intention. In contrast, loss framing resulted in higher levels of perceived risk related to danger and prompted quicker behavioral changes, such as lower levels of distraction and faster responses to takeover requests. Construal level in the messages, by contrast, did not produce significant differences on its own and affected only perceived risk and distraction as a moderator. Discussion and implications are provided, emphasizing the importance of the messages that autonomous car agents provide.
autonomous vehicle; message framing; construal level; public health
https://doi.org/10.1080/10447318.2025.2477744
While human-like social interactions can enhance trust in and acceptance of automated vehicles (AVs), overuse may hinder these benefits, reflecting the “uncanny valley of mind” effect. We hypothesized that the AV agent’s human-like features—calling drivers by their name (Name) and expressing emotions (Emotion)—enhance trust and acceptance individually but may have adverse effects when combined. A 2 × 2 between-subjects experiment (N = 84) examined these effects. Participants in the Name and Emotion combination were more likely to perceive the experiential mind in the AV compared to the Name or Emotion conditions. However, they were less likely to show behavioral trust in the AV than in the Emotion condition, to perceive the AV as useful than in either the Name or Emotion condition, and to show intention to use the AV than in the Name condition. These findings highlight potential trade-offs in designing social AV interactions.
automated vehicle; trust; user acceptance; anthropomorphism; mind perception; uncanny valley
https://doi.org/10.1080/10447318.2025.2491024
This study aims to identify ways to represent a conversational agent in the digital interface that can enhance older adults’ user experience, focusing on both verbal (conversational form) and nonverbal factors (visual presence of conversational agent and background image). A total of 85 older adults participated in an experiment with a 2 (conversational agent: visual presence vs. no visual presence) × 2 (background image: present vs. absent) design, plus an additional condition with neither a conversational form nor manipulation of independent variables. Results highlight the importance of nonverbal factors, especially environmental cues. Displaying a background image significantly increased perceived affective trust, while visual presence of the agent did not show any significant effects. Interestingly, there were interaction effects on perceived social presence, usefulness, and satisfaction. Findings also showed that using a conversational form can increase the likability, social presence, and perceived ease of use of the agent.
older adults; conversational agent; nonverbal; visual presence; background image; user experience
https://doi.org/10.1186/s40537-025-01138-1
Injury management is critical in all sports, directly impacting player performance. Baseball players are particularly susceptible to injuries, as players often compete in 5 to 7 games per week, placing continuous strain on their bodies. Among various injuries, Tommy John Surgery (TJS) poses a notable risk for Major League Baseball (MLB) pitchers. Traditional TJS prediction methods required sensors or video-based motion capture, which are impractical during actual games, and could make predictions only shortly before the injury, such as within 30 pitches. To address these challenges, this study proposes a deep learning (DL) framework that utilizes both classification and regression tasks. Using MLB pitching data (2016–2023), the classification model detects injury risk up to 100 days in advance with an F1-score of 0.73, while the regression model estimates the time remaining until the player’s last pre-surgery game with an R² of 0.79. In addition, to enhance our model’s applicability, we employ an explainable artificial intelligence technique to analyze the influential mechanical features, such as a lowered four-seam fastball release point, that accelerate UCL deterioration and increase TJS risk. These findings provide a practical foundation for early intervention strategies, potentially preserving pitcher health and reducing the need for complex surgical procedures.
Injury Prediction; Deep Learning; Explainable AI
https://doi.org/10.1016/j.tele.2024.102227
This study aims to investigate media bias in news articles related to defense and foreign affairs by applying deep learning models and eXplainable artificial intelligence (XAI) techniques. We collected and analyzed seven, representing five major Korean media outlets, from conservative and liberal perspectives. The objective is to classify political bias and identify the specific words that contribute to this classification. We employed the BERT-base model from the Korean Language Understanding Evaluation and used local interpretable model-agnostic explanations for a comprehensive analysis. Our methodology achieved a remarkable accuracy of 98.2% in classifying the political bias of news articles, demonstrating the model’s effectiveness. The findings revealed distinct biases in coverage and statements across the media outlets: conservative outlets were more likely to emphasize threats and use singular references, while liberal outlets preferred peaceful and inclusive language. This study provides valuable insights into how the political biases of news media influence both the topics covered and the language used, even within the same category and time frame, ultimately shaping public perception.
BERT; Explainable AI; Text Classification
https://doi.org/10.1016/j.patcog.2025.111376
In multimodal visual understanding, fusing RGB images with additional modalities like depth or thermal data is essential for improving both accuracy and robustness. However, traditional approaches often rely on task-specific architectures that are difficult to generalize across different multimodal scenarios. To address this limitation, we propose the Cross-modal Spatio-Channel Attention (CSCA) module, designed to flexibly integrate diverse modalities into various model architectures while enhancing performance. CSCA employs spatial attention to capture interactions between modalities effectively, improving model adaptability. Additionally, we introduce a patch-based cross-modal interaction mechanism that optimizes the processing of spatial and channel features, reducing memory overhead while preserving critical spatial information. These refinements significantly simplify cross-modal interactions, increasing computational efficiency. Extensive experiments demonstrate that CSCA generalizes well across various multimodal combinations, achieving promising performance in crowd counting and image segmentation tasks, particularly in RGB-Depth, RGB-Thermal, and RGB-Polarization scenarios. Our approach provides a scalable and efficient solution for multimodal integration, with the potential for broader applications in future work.
Multimodal Learning; Sensor Fusion; Semantic Segmentation; Crowd Counting
https://doi.org/10.1016/j.patrec.2024.10.011
Test-time adaptation (TTA) refines pre-trained models during deployment, enabling them to effectively manage new, previously unseen data. However, existing TTA methods focus mainly on global domain alignment, which reduces domain-level gaps but often leads to suboptimal performance. This is because they fail to explicitly consider class-wise alignment, resulting in errors when reliable pseudo-labels are unavailable and source domain samples are inaccessible. In this study, we propose a prototypical class-wise test-time adaptation method, which consists of class-wise prototype adaptation and reliable pseudo-labeling. A main challenge in this approach is the lack of direct access to source domain samples. We leverage the class-specific knowledge contained in the weights of the pre-trained model. To construct class prototypes from the unlabeled target domain, we further introduce a methodology to enhance the reliability of pseudo labels. Our method is adaptable to various models and has been extensively validated, consistently outperforming baselines across multiple benchmark datasets.
Test-time adaptation; Class-wise alignment; Class prototypes; Continual learning; Image classification
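The class-wise idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes prototypes are initialized from the rows of the pre-trained classifier's weight matrix, that a pseudo-label counts as reliable when the cosine-similarity gap between the two best classes exceeds a hypothetical `margin`, and that accepted samples update their prototype with a moving average controlled by a hypothetical `momentum`.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def cosine_scores(feature, prototypes):
    """Cosine similarity between one feature and each unit-length prototype."""
    f = normalize(feature)
    return [sum(a * b for a, b in zip(f, p)) for p in prototypes]

def adapt_prototypes(classifier_weights, target_features, margin=0.2, momentum=0.9):
    # Assumption: each classifier weight row stands in for a source class prototype.
    prototypes = [normalize(w) for w in classifier_weights]
    pseudo_labels = []
    for feature in target_features:
        scores = cosine_scores(feature, prototypes)
        ranked = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
        best, runner_up = ranked[0], ranked[1]
        # Assumption: a small top-2 gap marks the pseudo-label as unreliable.
        if scores[best] - scores[runner_up] < margin:
            pseudo_labels.append(None)
            continue
        pseudo_labels.append(best)
        # Move the matching prototype toward the accepted target feature.
        f = normalize(feature)
        mixed = [momentum * p + (1.0 - momentum) * x
                 for p, x in zip(prototypes[best], f)]
        prototypes[best] = normalize(mixed)
    return prototypes, pseudo_labels
```

Samples that receive `None` are simply skipped here; a fuller method could revisit them once the prototypes have moved.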
https://doi.org/10.1038/s41598-024-79034-6
Age-related macular degeneration (AMD) is a major cause of blindness in developed countries, and the number of affected patients is increasing worldwide. Intravitreal injections of anti-vascular endothelial growth factor (VEGF) are the standard therapy for neovascular AMD (nAMD), and optical coherence tomography (OCT) is a crucial tool for evaluating the anatomical condition of the macula. However, OCT has limitations in accurately predicting the degree of functional and morphological improvement following intravitreal injections. Artificial intelligence (AI) has been proposed as a tool for predicting the treatment response of nAMD based on OCT biomarkers. Our study focuses on the development and assessment of an AI model utilizing the DenseNet201 algorithm. The model aims to predict anatomical improvement based on OCT images taken before and during anti-VEGF therapy. The training process involves two scenarios: (1) using only preinjection OCT images and (2) using OCT images taken both before and during anti-VEGF therapy. The outcomes of our investigation, involving 2068 images from a cohort of 517 Korean patients diagnosed with nAMD, indicate that the AI model we introduced surpassed the predictive performance of ophthalmologists. The model exhibited a sensitivity of 0.915, specificity of 0.426, and accuracy of 0.820. Notably, its predictive capabilities were further enhanced with the inclusion of additional OCT images taken after the first and second injections during the loading phase. The treatment prediction performance of the model was the highest when using all input modalities (before injection, and after the first and second injections) and concatenation-based fusion layers. This study highlights the potential of AI in assisting individualized and tailored nAMD treatment.
https://doi.org/10.6109/jkiice.2024.28.10.1144
In metropolitan subway operations, the implementation of express train services is crucial for facilitating the efficient movement of large passenger volumes. However, preliminary feasibility studies for express train operations often rely on outdated standards from the 1990s, which fail to accurately reflect contemporary needs and conditions. To address these limitations, this study employs a Graph Convolutional Network (GCN) utilizing data from subway stations, user reviews, and the interrelationships between stations. This approach aims to provide a more current and comprehensive method for selecting express stop stations, incorporating user feedback. Additionally, the study evaluates efficiency using Data Envelopment Analysis (DEA). The findings demonstrate a significant increase in efficiency on the subway lines between Japan and New York, thereby validating the potential of using review data in the selection of express stop stations. This paper presents a novel set of criteria for effective stop selection that aligns with user needs, promoting more efficient budget allocation and broader public support.
Subway express; Graph; DEA; Review
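For readers unfamiliar with graph convolution, the propagation step that a GCN applies over a station graph can be sketched as follows. This shows only the standard layer form, ReLU(D^-1/2 (A + I) D^-1/2 H W); the paper's actual architecture, features, and weights are not given in the abstract, so the adjacency matrix, feature matrix, and weight matrix here are hypothetical inputs.

```python
import math

def gcn_layer(adjacency, features, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(adjacency)
    # Add self-loops so each station keeps its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg_inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    # Symmetric degree normalization of the adjacency matrix.
    norm = [[deg_inv_sqrt[i] * a_hat[i][j] * deg_inv_sqrt[j] for j in range(n)]
            for i in range(n)]
    n_in, n_out = len(weights), len(weights[0])
    # Aggregate neighbor features, then apply the linear map and ReLU.
    aggregated = [[sum(norm[i][k] * features[k][f] for k in range(n))
                   for f in range(n_in)] for i in range(n)]
    return [[max(0.0, sum(aggregated[i][f] * weights[f][o] for f in range(n_in)))
             for o in range(n_out)] for i in range(n)]
```

With three stations on a line and a single feature set only at the first station, one step spreads that signal to the adjacent station but not yet to the third.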
https://doi.org/10.5909/JBE.2024.29.6.1067
Video anomaly detection has emerged as a prominent deep learning research field due to its extensive applications in security, safety, and quality control. However, existing deep learning-based anomaly detection methods face fundamental limitations: heavy reliance on training data and limited explainability of detection results. To overcome these challenges, we propose a rule-based zero-shot video anomaly detection framework that integrates object detection and semantic segmentation. Our approach defines explicit rules based on object-background relationships and accurately interprets scene structure using pre-trained vision models. This enables effective anomaly detection in new environments without domain-specific training. Through experiments on the ShanghaiTech and NWPU Campus datasets, we demonstrate that our method achieves superior performance to existing approaches without additional training.
Video Anomaly Detection; Object Detection; Semantic Segmentation
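A rule over object-background relationships, as described in the abstract above, might look like the following sketch. Everything in it is an illustrative assumption rather than the paper's rule set: the `Detection` type, the `ALLOWED_BACKGROUND` table, and the choice to read the background class at the bottom-center pixel of each bounding box. The detector and the segmentation model are assumed to have already produced their outputs.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str
    box: tuple  # (x_min, y_min, x_max, y_max) in pixel coordinates

# Hypothetical rule table: background classes each object class may stand on.
ALLOWED_BACKGROUND = {
    "person": {"sidewalk", "crosswalk"},
    "car": {"road"},
    "bicycle": {"road", "bike_lane"},
}

def ground_label(detection, segmentation):
    """Background class at the bottom-center of the box (clamped to the map)."""
    x_min, _, x_max, y_max = detection.box
    row = min(y_max, len(segmentation) - 1)
    col = min((x_min + x_max) // 2, len(segmentation[0]) - 1)
    return segmentation[row][col]

def find_anomalies(detections, segmentation, rules=ALLOWED_BACKGROUND):
    """Return (object label, background label) pairs that violate a rule."""
    anomalies = []
    for det in detections:
        allowed = rules.get(det.label)
        if allowed is None:
            continue  # no rule for this class; this sketch ignores it
        background = ground_label(det, segmentation)
        if background not in allowed:
            anomalies.append((det.label, background))
    return anomalies
```

Because each flagged pair names the object and the background it was found on, the result is readable without further explanation, which is the explainability benefit the abstract points to.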
https://doi.org/10.5391/JKIIS.2024.34.6.527
Knowledge distillation has gained attention as a promising technique that can reduce model size while maintaining the performance of large-scale deep learning models. However, conventional knowledge distillation methods employ a static approach that applies uniform weights to all training samples, failing to consider the difficulty of problems or characteristics of the teacher model. This study proposes a novel method that adaptively adjusts the intensity of knowledge distillation based on the entropy of teacher model outputs. The proposed method estimates the difficulty of each training sample through the output entropy of the teacher model and accordingly adjusts the balance between knowledge distillation and direct learning. Experimental results on the CIFAR-100 dataset show that our method achieves additional accuracy improvements of 0.47% and 0.62% on average compared to conventional knowledge distillation for same and different network architectures, respectively. Notably, the method demonstrates greater performance improvements on samples where the teacher model makes incorrect predictions, proving that it can achieve enhanced generalization capability beyond simple imitation.
Knowledge Distillation; Model Compression; Image Classification
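The per-sample balancing described in the abstract above can be sketched in a few lines. The abstract does not state how entropy maps to the distillation weight, so this sketch assumes one simple choice: the weight `alpha` is one minus the teacher's normalized output entropy, so a confident teacher is imitated more and an uncertain teacher shifts the loss toward the ground-truth label. The function names and the default temperature are hypothetical.

```python
import math

def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def normalized_entropy(probs):
    """Shannon entropy divided by log(K), so the value lies in [0, 1]."""
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))

def adaptive_kd_loss(student_logits, teacher_logits, label, temperature=4.0):
    teacher_soft = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    student_hard = softmax(student_logits)

    # Assumption: low teacher entropy (an "easy" sample) -> more distillation.
    alpha = 1.0 - normalized_entropy(softmax(teacher_logits))
    alpha = min(1.0, max(0.0, alpha))

    # KL(teacher || student) on temperature-softened distributions.
    kd = sum(t * math.log(t / s)
             for t, s in zip(teacher_soft, student_soft) if t > 0.0)
    # Cross-entropy against the ground-truth label ("direct learning").
    ce = -math.log(student_hard[label])
    loss = alpha * (temperature ** 2) * kd + (1.0 - alpha) * ce
    return loss, alpha
```

For a teacher with uniform logits, `alpha` drops to zero and the loss reduces to plain cross-entropy; the opposite mapping (more distillation on high-entropy samples) would be an equally easy variant to try.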
https://doi.org/10.48550/arXiv.2410.15467
Mental health issues have become a critical global concern, with excessive stress being one of the primary contributors. Prolonged stress can lead to serious mental health problems. As such, early detection of stress factors is crucial. Previous studies primarily used text-based supervised learning models to predict stress factors; however, these models have several limitations. They rely on labeled training data, making it difficult to generalize to new stress factors, and they offer limited interpretability of their predictions. This study proposes a stress detection method based on OpenAI’s GPT-4o large language model (LLM) to address these challenges. This method does not require training data and can flexibly adapt to new stress factors using various prompting techniques, such as Zero-shot, Few-shot, Chain-of-Thought (CoT), and Tree-of-Thought (ToT). Additionally, LLMs provide clear explanations of their reasoning processes, making their predictions more trustworthy. Experimental results on the SAD dataset demonstrate that the LLM-based approach achieves comparable performance to general supervised models without the need for labeled data. Additionally, the LLM shows strong interpretability and accurately infers stress factors with fine granularity, making it a promising solution for stress detection in mental health applications.
Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
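To make the prompting styles named in the abstract above concrete, here is a small prompt-builder sketch covering zero-shot, few-shot, and chain-of-thought prompts (Tree-of-Thought needs a search loop and is left out). The wording of the instructions, the function name, and the category names in the usage note are all hypothetical; they are not the prompts or labels used in the study, and no model API is called.

```python
def build_prompt(text, categories, style="zero-shot", examples=()):
    """Assemble a classification prompt in one of three prompting styles."""
    lines = [
        "Classify the main stress factor in the message.",
        "Choose exactly one of: " + ", ".join(categories) + ".",
    ]
    if style == "few-shot":
        # Prepend labeled demonstrations before the message to classify.
        for example_text, example_label in examples:
            lines.append(f"Message: {example_text}")
            lines.append(f"Stress factor: {example_label}")
    elif style == "cot":
        # Ask for intermediate reasoning before the final answer.
        lines.append("Think step by step, then give the label on the last line.")
    elif style != "zero-shot":
        raise ValueError(f"unknown style: {style}")
    lines.append(f"Message: {text}")
    lines.append("Stress factor:")
    return "\n".join(lines)
```

For example, `build_prompt("I can't keep up with deadlines.", ["work", "health", "finances"], "cot")` yields a prompt that requests step-by-step reasoning; adapting to a new stress factor only requires adding it to the category list.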
https://doi.org/10.3837/tiis.2024.12.001
During election periods, many polling agencies survey and distribute approval ratings for each candidate. Historically, individuals had limited options for gauging approval ratings and relied primarily on traditional opinion polls, whereas public opinion is now also expressed through the Internet, mobile SNS, and online communities. Analyzing public opinion expressed on the Internet through natural language analysis allows for determining a candidate's approval rate with comparable accuracy to traditional opinion polls. Therefore, this paper proposes a method of inferring the approval rates of candidates during election periods by synthesizing users' political comments from internet community posting data. To analyze the approval ratings expressed in the posts, we apply data augmentation techniques to the KcBert, KoBert, and KoELECTRA models and select the model that has the highest correlation with the actual polls.
Opinion polls, social media, natural language processing, election predictions
https://doi.org/10.48550/arXiv.2401.05826
Despite stringent data protection regulations such as the General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA), and other country-specific regulations, many websites continue to use cookies to track user activities. Recent studies have revealed several data protection violations, resulting in significant penalties, especially for multinational corporations. Motivated by the question of why these data protection violations continue to occur despite strong data protection regulations, we examined 360 popular e-commerce websites in multiple countries to analyze whether they comply with regulations to protect user privacy from a cookie perspective.
Web security; privacy; cookie; GDPR; CCPA; tracking.
https://doi.org/10.1080/10447318.2024.2425881
Negative feedback can have detrimental effects on students’ self-efficacy and learning experience, yet it is inevitable for students who achieve low outcomes and need improvement. This study investigates the role of pedagogical agents’ self-disclosure in providing empathy for negative feedback. An online experiment was conducted asking participants (N = 183) to interact with a voice-based pedagogical agent in a between-subjects design: 2 (feedback: positive vs. negative) × 2 (agent: self-disclosure vs. no self-disclosure). The agent instructed students on online learning tasks and provided feedback on their task performance. Our findings showed that the agent’s self-disclosure significantly increased students’ perception of intimacy and cognitive trust toward the agent. A significant interaction effect was observed in intimacy, suggesting that the role of self-disclosure is especially pronounced when negative feedback is provided. A significant mediation effect of cognitive trust was also found between self-disclosure and feedback acceptance.
Pedagogical agent; self-disclosure; trust; intimacy; feedback acceptance; voice user interface
https://doi.org/10.1057/s41599-024-04195-8
Adolescent violence has been one of the most serious social concerns for the last few decades. With the rapid development and spread of the Internet and digital technologies, online violence has become another major type of adolescent violence. This study investigates the antecedents of South Korean adolescents’ offline and online violence by employing both the theoretical and empirical foundations of traditional violence literature. The research model was proposed and constructed based on general strain theory and social ecological theory, using a sample of 2,481 first-year middle school students from the 2018 Korean Children and Youth Panel Survey (KCYPS). Structural equation modeling (SEM) results presented the direct effects of emotions and indirect effects of social relationships on adolescents’ delinquency, bullying, and online violence. Overall, the findings of the current study are consistent with those of previous studies and theoretical assumptions. Except for the effect of social withdrawal on delinquency, the emotions had significant effects on violence perpetration. Furthermore, relationships with parents, friends, and teachers showed protective effects against negative emotions. Finally, online violence was significantly affected by all types of social relationships. The findings of this study can provide a better understanding of both online and offline violence in adolescents. In addition, since the results were derived from a nationally representative sample, this study can provide practitioners in South Korea with guidance on how to set proper interventions for adolescents’ social and emotional aspects of violence perpetration.
Online violence; Offline violence; General strain theory; Social ecological theory; Structural equation modeling
https://doi.org/10.1038/s41598-024-75995-w
Patients with end-stage kidney disease (ESKD) frequently experience anemia, and maintaining hemoglobin (Hb) levels within a targeted range using erythropoiesis-stimulating agents (ESAs) is challenging. This study introduces a gated recurrent unit-attention-based module (GAM) for efficient anemia management among patients undergoing chronic dialysis and proposes a novel alert system for anticipating the need for red blood cell transfusions. Data on demographic characteristics, dialysis metrics, drug administration, laboratory tests, and transfusion history were retrospectively collected from patients undergoing hemodialysis at Kangwon National University Hospital between 2017 and 2022. After preprocessing, a final dataset of 252 patients was used for model training. Our model functions in two major phases: (1) Hb level prediction and ESA dose recommendation and (2) transfusion alert framework. The GAM model outperformed traditional machine learning algorithms, including linear regression, XGBoost, and multilayer perceptron, in predicting Hb levels (R-squared value=0.60). The model also demonstrated a recommendation accuracy of 0.78 compared to that of clinical experts, indicating a high degree of concordance with the ESA dosing recommendations. Additionally, the model exhibited considerably high accuracy (0.99) for transfusion alarms. Thus, the GAM model holds promise for improving anemia management in patients with ESKD by optimizing ESA dosages and providing timely transfusion alerts.
Anemia; End-stage kidney disease; Artificial intelligence; Transfusion alert; Erythropoiesis-stimulating agents