Journals
https://doi.org/10.1027/1864-1105/a000466
As organizations develop strategies to leverage the power of interactive media for cultivating relationships with their publics, there has been an increasing emphasis on message interactivity, that is, having threaded conversations with individual stakeholders. This can be quite resource intensive in that it involves not only one-time responses to individual questions, but exchanging a series of messages that are contingent upon preceding messages in the thread. While a minority of the stakeholders interact with organizations, often in a public way through messaging channels such as social media, the vast majority simply bear witness to such interactions without personally participating in the interaction. Does simply viewing others’ interactions affect one’s impression of the organization, and does it vary based on the degree of message interactivity? We studied these questions with an experiment which tested the cueing effect of message interactivity. Data from a 3-condition (message interactivity: low vs. medium vs. high) between-participants experiment (N = 252) show that message interactivity serves as a cue that leads publics to have greater trust, control mutuality, commitment, and satisfaction by promoting a perception of conversationality. Findings also include potential negative effects of incorporating message interactivity in organizations’ online platforms. Theoretical and methodological implications are discussed.
message interactivity, relationship management, mobile sites, theory of interactive media effects (TIME)
https://doi.org/10.1080/15213269.2025.2558036
Given that the nature of training data is the primary cause of algorithmic bias, do laypersons realize that systematic misrepresentation and under-representation of certain races in the training data can affect AI performance in a way that privileges some races over others? To answer this question, we conducted three between-subjects online experiments (N = 769 in total) with a prototype of an AI system that recognizes emotion-based facial expressions. Our results show that, by and large, training data representativeness is not an effective cue to communicate algorithmic bias. Instead, users rely on AI’s performance bias to perceive racial bias in AI algorithms. In addition, the race of the users matters. Black participants perceive the system to be more biased when all facial images used to represent unhappy emotions in the training data are those of Black individuals. This finding highlights a significant human cognitive limitation that should be accounted for when communicating algorithmic bias arising from biases in the training data.
https://doi.org/10.1080/0144929X.2025.2545312
Interactive features have become ubiquitous in mobile apps as they compete for users’ attention. This raises an important question: Is interactivity a ‘dark pattern’, tricking users into giving away their personal information without thinking about consequences? Are some types of interactivity more persuasive than others? We addressed these questions with a 2 (message interactivity: low, high) X 3 (modality interactivity: absence, low, high) between-subjects experiment (N = 216) designed to test hypotheses derived from the Theory of Interactive Media Effects (TIME) and the Heuristic-Systematic Model (HSM) of information processing. Our data show that both modality interactivity and message interactivity had positive effects on attitudes toward the app and intentions to use it by directing users’ attention away from privacy concerns to heightened perceived playfulness. We discuss the ethical implications of the double-edged nature of interactive technology for both users and developers.
Online information disclosure; message interactivity; modality interactivity; perceived playfulness; privacy concerns; interactive design
https://doi.org/10.51548/joctec.7.1.2025.01
This study explores why people use AI-based media, with a focus on identifying major gratifications. Previous studies have pinpointed specific gratifications for specific AI media, but a universal set of gratifications common across most AI interfaces is yet to be identified. As media effects scholars increasingly treat AI as a distinct technology, recognizing common factors shaping user experience with AI is crucial. Guided by the Uses and Gratifications (U&G) theory, we conducted two survey studies (combined N = 1,264), resulting in a reliable and valid 29-item AI Gratifications Scale that loaded cleanly onto seven factors: anthropomorphism, privacy consciousness, informality, seamlessness, accuracy, fairness, and acknowledgment of limitations. This 7-factor scale was invariant across nine major AI functions. Practical implications of these findings for designing and assessing human-AI interaction are discussed.
https://doi.org/10.1016/j.chbah.2025.100128
People often have anxiety toward artificial intelligence (AI) due to lack of transparency about its operation. This study explicates this anxiety by conceptualizing it as a trait, and examines its effect. It hypothesizes that users with higher AI (trait) anxiety would have higher state anxiety when interacting with an AI doctor, compared to those with lower AI (trait) anxiety, in part because it is a deviation from the status quo of being treated by a human doctor. As a solution, it hypothesizes that an AI doctor’s explanations for its diagnosis would relieve patients’ state anxiety. Furthermore, based on the status quo bias theory and an adaptation of the theory of interactive media effects (TIME) for the study of human-AI interaction (HAII), this study hypothesizes that the affect heuristic triggered by state anxiety would mediate the causal relationship between the source cue of a doctor and user experience (UX) as well as behavioral intentions. A pre-registered 2 (human vs. AI) x 2 (explainable vs. non-explainable) experiment (N = 346) was conducted to test the hypotheses. Data revealed that AI (trait) anxiety is significantly associated with state anxiety. Additionally, data showed that an AI doctor’s explanations for its diagnosis significantly reduce state anxiety in patients with high AI (trait) anxiety but increase state anxiety in those with low AI (trait) anxiety; however, these effects of explanations are not significant among patients who interact with a human doctor. Theoretical and design implications of these findings and limitations of this study are discussed.
AI anxiety; State anxiety; Explainable AI; Healthcare AI; Medical AI; HAII-TIME
https://doi.org/10.5573/ieie.2025.62.7.56
RGB-D semantic segmentation is a research field that addresses scene understanding challenges that are difficult to solve using only RGB information by incorporating depth data. This study applies prompt learning techniques to RGB-D semantic segmentation, enhancing performance by adding a minimal number of parameters while maintaining the original model structure. In particular, the post-fusion prompt method is a simple yet effective approach that minimizes information loss and maximizes interaction between the two modalities. The superiority of the post-fusion approach over the pre-fusion method was experimentally validated on the NYUv2 and SUN RGB-D datasets. In the case of the NYUv2 dataset, our method outperformed MultiMAE (Multimodal Multitask Masked Autoencoders), a representative multimodal learning approach, by approximately 2.2% in mIoU. These findings suggest new possibilities for prompt learning in the fusion process of RGB and depth information.
Multimodality, RGB-D, Segmentation, Prompt learning
https://doi.org/10.1136/bmjopen-2024-097236
Introduction: Stress is a major health issue in contemporary society, and mindfulness-based approaches reduce stress and anxiety but face practical barriers to consistent practice; this protocol evaluates a Virtual Reality (VR)-based observation meditation programme with an artificial intelligence (AI) coach (‘Otti’) that delivers real-time empathic, tailored prompts to support present-focused attention and emotion regulation in university students in the United States. A single-centre randomised controlled trial in Pennsylvania will assess immediate psychophysiological effects and user acceptability after a single 15 min session following a standardised Stroop stressor in a university laboratory setting.
Methods and analysis: An a priori power analysis (f=0.25, α=0.05, power=0.80) supports recruitment of 34 students (n=17 per group) in a single-centre randomised controlled design comparing AI-coached VR observation meditation to a no-treatment leisure control within a 30 min visit. Participants complete pre-intervention surveys Perceived Stress Scale-10 (PSS-10), Depression Anxiety Stress Scales (DASS-21), State–Trait Anxiety Inventory (STAI-State, STAI-Trait) and baseline heart rate/Heart Rate Variability (HRV) via smartwatch, undergo the 15 min intervention or control, then complete postintervention surveys and repeated heart rate/HRV recording; effects will be tested using repeated-measures analysis of variance, with heart-rate data exported and preprocessed per the prespecified plan. Primary outcomes include perceived stress (PSS-10), emotional state (DASS-21, STAI-State, STAI-Trait), physiological stress response (heart rate/HRV) and participant satisfaction via a structured postintervention survey (usability, perceived effectiveness, comfort).
Ethics and dissemination: The study received IRB approval from The Pennsylvania State University Institutional Review Board (PSU CATS IRB: STUDY00025978; ClinicalTrials.gov: NCT06704282), and all participants will provide written informed consent prior to procedures. Findings will be disseminated via open access publication, conference presentations and stakeholder-focused briefs, with an anonymised primary-outcome dataset available on reasonable request in line with BMJ Open policies and Standard Protocol Items: Recommendations for Interventional Trials (SPIRIT)/International Committee of Medical Journal Editors (ICMJE) guidance.
Background: Panic attack prediction remains a critical challenge in mental health care due to the high interindividual variability of physiological responses and the limitations of subjective psychological assessments.
Objective: This study aims to develop a multimodal deep learning framework that integrates real-time physiological signals from wearable electrocardiogram (ECG) monitors and psychological assessments to improve the accuracy of panic attack prediction.
Methods: We adapted the ConvNetQuake architecture, originally designed for seismic detection, to extract temporal patterns from ECG signals. The model was pretrained on the PTB-XL ECG dataset and fine-tuned using wearable ECG data collected from adult participants. In parallel, psychological profiles based on the Diagnostic and Statistical Manual of Mental Disorders, 4th Edition criteria and Panic Disorder Severity Scale assessments were encoded as auxiliary inputs. The multimodal framework was evaluated using standard performance metrics.
Results: The proposed model achieved an accuracy of 71.43%, precision of 83.72%, recall of 70.59%, and F1 score of 76.60% in detecting heart rate variability anomalies associated with panic episodes. Experimental comparisons demonstrated that the integration of physiological and psychological modalities significantly outperformed unimodal baselines in prediction reliability.
Conclusions: This study provides empirical support for wearable-based early warning systems for panic attacks. The proposed approach demonstrates the feasibility of just-in-time digital interventions and underscores the potential of wearable artificial intelligence in advancing affective computing and digital psychiatry.
electrocardiogram; heart rate variability; panic attack prediction; mental health; wearable devices; digital mental health
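As a quick consistency check on the metrics reported in the Results above: F1 is the harmonic mean of precision and recall, so the stated F1 score can be recomputed from the other two figures. The sketch below uses only the numbers given in the abstract; the helper name `f1_score` is illustrative and does not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract (converted from percentages).
precision = 0.8372
recall = 0.7059

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2%}")  # F1 = 76.60%
```

The recomputed value matches the reported F1 of 76.60%. Accuracy cannot be checked the same way, because it also depends on the number of true negatives, which the abstract does not give.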
Internet gaming disorder (IGD) affects 3% of the global population and poses an increasing risk due to advancements in technology. However, there is currently no definitive treatment for this condition. IGD is not a primary disorder but rather a result of “self-prescription” in response to emotional stressors. Unlike conventional mental health treatments that focus on the disorder itself, it is crucial to provide alternative activities that can alleviate negative emotions. This paper extends the concept of the self-medication hypothesis and integrates it with cognitive models of cognitive behavioral therapy and mindfulness-based cognitive therapy. In addition, it introduces the mindfulness-based cognitive therapy–game (MBCT-G), a program designed to explore alternative activities through gaming, focusing on the processes of response and reward, which are not typically emphasized in traditional treatments. This study serves as the theoretical foundation for the development of MBCT-G. MBCT-G aims to train individuals in positive coping strategies that alleviate psychological distress, offering a novel approach to treating self-prescription disorders such as IGD.
cognitive behavior therapy; psychosocial intervention; video games; internet gaming disorder; internet addiction; mindfulness; mental health
https://doi.org/10.1080/10447318.2025.2593550
Human-AI collaborative systems are increasingly explored as tools for promoting mental well-being and supporting personal development. We present POCKET-MIND, a personalized digital journaling system powered by a Large Language Model (LLM) that facilitates both emotional exploration and goal pursuit through a novel Dual-Prompt Framework. Unlike traditional journaling apps that treat emotional reflection and goal tracking as separate tasks, POCKET-MIND integrates these dimensions by generating adaptive prompts that help users meaningfully connect their feelings with their personal aspirations. In a one-week exploratory study with 30 young adults, preliminary findings suggest that POCKET-MIND may support emotional articulation, self-reflection, and goal-directed behaviors. While the study had a relatively small sample size, the findings highlight the potential of Human-AI collaborative journaling for personal mental health support. This work contributes to Human-Computer Interaction (HCI) by offering early design insights into adaptive conversational systems that personalize reflective practices and foster user growth through interactive experiences.
Human-AI interaction; personalized systems; large language models (LLMs); adaptive interaction
http://doi.org/10.1177/21582440251395922
Emotional support dialog systems face computational linguistic challenges as they require a deep understanding of explicit utterances and implicit emotional needs. In particular, existing models have shown limitations in effectively capturing subtle emotional contexts, which are essential for providing meaningful emotional support. To address this, we propose Generative Retrieval-Enhanced Emotional Support Conversations (GREEN), an emotional support dialog model using generative retrieval. Inspired by docID, GREEN introduces a Residual Identifier (ResID), enabling the dynamic identification of emotional context and appropriate support strategies from seeker utterances. By approaching emotional support as a context prediction task, our model works to understand both the explicit meaning of utterances and the underlying emotional needs of seekers. GREEN achieves significant improvements over SOTA models on ESConv with over 25% gains in response diversity metrics, 8.3% in content quality (BLEU-4), and 9.8% in strategy prediction accuracy. Our approach integrates generative retrieval with ResID-based context analysis, advancing emotional support dialog systems. For balanced reporting, we note current limitations (ResID stability under quantization/clustering and ambiguity when misidentification occurs) and plan to improve semantic matching and identifier design with broader real-world validation.
emotional support conversation (ESC), generative retrieval, Response ID (ResID)
https://doi.org/10.1371/journal.pone.0325697
Understanding the multiple impacts of green spaces on individual health and overall quality of life is a key factor in urban planning and public health promotion. This study integrated smartphone Wi-Fi and GPS location data, survey data, and green space data to analyze the relationships between green space visitation patterns and sociodemographic characteristics, health, and green space perceptions of 1,715 residents of the Seoul metropolitan area in South Korea. Green space visitation patterns of urban residents were categorized into Non-Visitors (rarely visited green spaces), Weekday Visitors (weekday visits), Weekend Visitors (weekend visits), and Frequent Visitors (weekday and weekend visits). The health status of residents in each group was assessed using the EQ-5D-5L scale, which evaluates overall mental and physical health. The analysis indicated variations in educational background across groups, with the Non-Visitors and Frequent Visitors showing differing distributions. In addition, the Weekend Visitors group had the best mental and physical health, which were significantly different from the Non-Visitors group. Perceptions of green space were significantly more positive for Weekend Visitors and Frequent Visitors than for Non-Visitors. These results suggest that green space usage patterns can be segmented not only by frequency of visits, but also by when and whether they are visited. It is also worth noting the differences in green space visitation by educational background, highlighting the need for environmental education programs and campaigns to mitigate these environmental inequalities. The positive effect of weekend visits, in particular, highlights the value of green spaces for leisure and relaxation. This finding suggests that urban planning can benefit city residents by providing high-quality, easily accessible green spaces.
http://doi.org/10.6109/jkiice.2025.29.3.297
Wearable devices monitor physiological signals like ECG and HRV to support mental health management by providing continuous monitoring of critical metrics. However, panic attack prediction remains in its early stages due to challenges in data collection, variability of physiological signals, and accurately quantifying psychological factors, which are often subjective. This study trains the ConvNetQuake model using the PTB ECG dataset and integrates ECG data from wearable devices with DSM-IV and PDSS survey results to enhance prediction reliability. The model detects HRV anomalies with an accuracy of 71.43%, precision of 83.72%, recall of 70.59%, and an F1 score of 76.60%. By combining physiological signals and psychological data, the proposed approach enables accurate panic attack prediction, supporting early intervention, personalized treatment strategies, and timely care. This integration of multimodal data highlights the transformative potential of wearable technology in improving the quality of life for individuals with panic disorder.
Wearable Devices, Electrocardiogram (ECG), Heart Rate Variability (HRV), Panic Attack Prediction, Mental Health
http://doi.org/10.6109/jkiice.2025.29.4.555
Online gambling can easily expose anyone to the risk of addiction due to its high accessibility, anonymity, diverse games, and immediate reward structure. This study proposes digital therapeutics to reduce the likelihood of addiction recurrence. To this end, previous studies were reviewed to confirm the effects of warning messages and visual stimuli on addiction suppression, as well as techniques for identifying gambling sites through text analysis. A personalized addiction prevention system was proposed by implementing a prototype that detects gambling sites in real time upon user access and presents customized warning phrases and images. Based on non-face-to-face interviews with people recovering from online gambling addiction, the prototype confirmed its value as a digital treatment by helping users immediately recognize gambling-related risk signals and voluntarily suppress addictive behavior. This study also holds academic and social significance by presenting active, personalized intervention as a digital treatment strategy, moving away from passive, blocking-oriented policies.
Online gambling, Deep learning, Large Language Model (LLM), Digital Therapeutics (DTx), Warning system
http://doi.org/10.6109/jkiice.2025.29.3.440
Recommendation systems play a crucial role in enhancing user experience and maximizing business performance across various domains. This study proposes a re-ranking method based on sequential recommendation systems (SASRec, BERT4Rec) utilizing large language models (LLMs) to improve recommendation performance. The proposed methodology demonstrates enhanced performance through the Amazon review dataset and verifies its effectiveness in real business environments using beauty platform Olive Young review data. Additionally, to ensure cost efficiency, small LLMs (sLLMs) such as Llama 3.2 1B were employed, and techniques like Chain-of-Thought (CoT) reasoning and knowledge transfer were applied to optimize the re-ranking performance of sLLMs. This approach effectively models the logical flow of user behavior and significantly improves the quality of recommendation lists. By balancing cost and performance, this study validates the practicality of LLM-based re-ranking across diverse datasets, presenting new possibilities for recommendation systems. It provides valuable insights into enhancing personalized recommendation systems and guiding the design of next-generation recommendation systems.
Sequential Recommendation; Recommendation Reranking; Large Language Models; Small Large Language Models; Knowledge Distillation
http://doi.org/10.6109/jkiice.2025.29.3.440
Human experiences and values are shaped by cultural background, which greatly affects behavior and perspective. Large language models (LLMs) also internalize specific cultural biases from their training data. This study quantitatively evaluates the cultural tendencies of LLMs such as GPT-3.5, GPT-4, Gemini Pro, and LLaMA3.2 7B along the six dimensions of cultural dimensions theory, and proposes a Personalized Bias Framework that can adjust the output to meet user needs. This allows the study to systematically analyze the cultural biases of LLMs and enable user-centered, customized outputs. It also lays the foundation for realizing Responsible AI and can potentially be extended to other types of bias in the future.
Cultural Bias; Hofstede's Cultural Dimensions; Responsible AI; Personalized AI
http://doi.org/10.6109/jkiice.2025.29.11.1435
This study proposes a novel multimodal approach that integrates electroencephalogram (EEG) and speech data to generate personalized spatial audio feedback tailored to the user’s emotional state, with the ultimate aim of establishing its utility as a therapeutic tool. EEG signals, which provide insight into the functional states of the brain, are recognized as biosignals capable of capturing real-time affective responses. Speech data complement this by offering acoustic features such as prosody, pitch, and speech rate that are indicative of emotional expression. By leveraging the complementary characteristics of these two modalities, the proposed system is designed to assist children with Autism Spectrum Disorder (ASD) in recognizing and regulating their emotional states more effectively. The application of spatial audio technology allows emotional fluctuations to be intuitively reflected through dynamic auditory feedback. Importantly, this research marks a pioneering effort in constructing a multisensory feedback environment.
Affective Computing; Brain-Computer Interface (BCI); EEG-Speech Analysis; Multimodal Emotion Recognition; Spatial Audio Processing
https://doi.org/10.1016/j.cose.2025.104364
Voice phishing (vishing) is a sophisticated phone scam that causes significant financial harm to victims. Recently, vishing attacks have become more effective due to the use of vishing malware installed on victims’ devices. Conventional anti-malware solutions, which rely on static analysis of app code and permissions at install time, are circumvented by vishing malware that requests additional code and permissions after installation. We introduce VishielDroid, a novel system for real-time detection of vishing malware on Android devices. By dynamically tracking apps’ runtime permission requests, a critical indicator of malicious behavior specific to vishing malware, VishielDroid outperforms state-of-the-art systems in detection accuracy. Using only 98 features, VishielDroid achieved an F1-score of 99.78% with systematic testing, surpassing other solutions that achieve lower F1-scores (69.27% to 80.25%). The system demonstrated superior robustness across various scenarios: maintaining high performance with reduced training data and imbalanced datasets, achieving a 99.57% F1-score with a reduced feature set despite evasion attempts, and operating effectively across Android versions 8.1 to 12 with minimal modifications. We validated VishielDroid’s practicality through deployment on real devices, confirming marginal memory and battery consumption overheads.
Voice phishing; Phishing detection; Mobile security
https://doi.org/10.5909/JBE.2025.30.6.1
Graph Transformers apply attention to unordered graphs by using node positional encodings (Laplacian or random‐walk based) and injecting edge biases (distance, centrality, relational). However, existing models remain node‐centric and fail to capture global topology, while edge representations are limited to simple connectivity or distance features, making it difficult to quantify each edge’s impact on the graph’s spectral structure. To overcome these limitations, we propose Spectral Edge Encoding (SEE), which combines spectral decomposition of the graph Laplacian with the Rayleigh quotient from perturbation theory to compute how each edge perturbs low‐frequency eigenvalues and convert these changes into global edge‐sensitivity embeddings. We observe improved AUROC on MoleculeNet datasets—including BBBP, ClinTox, and SIDER—demonstrating that the graph transformer effectively learns global structural information.
Edge encoding; Spectral; Graph transformer
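The idea of measuring how an edge perturbs low-frequency eigenvalues can be illustrated with standard first-order perturbation theory. For a unit-norm Laplacian eigenvector v, removing an unweighted edge (i, j) changes the Rayleigh quotient by approximately -(v[i] - v[j])^2. The sketch below is a minimal illustration of that textbook result, not the SEE implementation from the paper: the function name `edge_sensitivity`, the unweighted Laplacian, and the toy graph are all assumptions, and the approximation is only meaningful for non-degenerate eigenvalues.

```python
import numpy as np


def edge_sensitivity(edges, num_nodes, k=1):
    """Estimate first-order eigenvalue shifts from removing each edge.

    Returns an array of shape (num_edges, k). Row e holds the approximate
    change in the k smallest non-trivial Laplacian eigenvalues if edge e
    is removed. Assumes a connected, unweighted, undirected graph.
    """
    # Build the combinatorial Laplacian L = D - A.
    lap = np.zeros((num_nodes, num_nodes))
    for i, j in edges:
        lap[i, i] += 1.0
        lap[j, j] += 1.0
        lap[i, j] -= 1.0
        lap[j, i] -= 1.0

    # eigh returns ascending eigenvalues and orthonormal eigenvectors.
    _, eigvecs = np.linalg.eigh(lap)
    low = eigvecs[:, 1:k + 1]  # skip the constant eigenvector (eigenvalue 0)

    # Removing edge (i, j) perturbs L by -(e_i - e_j)(e_i - e_j)^T, so the
    # Rayleigh quotient gives delta_lambda ~= -(v[i] - v[j])**2.
    src = np.array([i for i, _ in edges])
    dst = np.array([j for _, j in edges])
    return -(low[src] - low[dst]) ** 2


# Hypothetical toy graph: two triangles joined by a single bridge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
sens = edge_sensitivity(edges, num_nodes=6, k=1)
most_sensitive = edges[int(np.argmax(np.abs(sens[:, 0])))]
print(most_sensitive)  # (2, 3), the bridge between the two triangles
```

In this toy graph the bridge dominates the sensitivity of the smallest non-zero eigenvalue, which is the kind of global structural signal that connectivity or distance features alone do not expose.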
https://doi.org/10.5909/JBE.2025.30.6.1
We propose a hybrid recommendation framework that marries the local message-passing strengths of graph neural networks (GNNs) with the global context modeling of Transformers. While GNNs effectively aggregate neighborhood structure, limited propagation depth hampers long-range reasoning; conversely, Transformers capture long-range dependencies via self-attention but do not natively encode graph topology. Our model first accumulates structural signals through graph-based message passing and then refines user and item representations with dedicated Transformer encoders. To inject topology, we introduce spectral positional encoding derived from Laplacian eigenvectors and jointly leverage low- and high-frequency components: low frequencies capture smooth, community-level structure, whereas high frequencies highlight abrupt, local variations such as sparse interactions and boundary effects. This dual-band design balances global and local cues, mitigating oversmoothing and preserving multi-scale information. Experiments on public benchmarks demonstrate consistent gains over strong baselines, with improvements in Recall@K and NDCG@K, validating the effectiveness and generality of the proposed approach.
Recommendation System; GNN; Transformer; Spectral Encoding
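A dual-band spectral positional encoding of the kind described above can be sketched by taking eigenvectors from both ends of the Laplacian spectrum. This is a minimal illustration under stated assumptions, not the paper's implementation: the function name `dual_band_positional_encoding`, the choice of the symmetric normalized Laplacian, the band sizes, and the toy interaction matrix are all hypothetical.

```python
import numpy as np


def dual_band_positional_encoding(adj, num_low=2, num_high=2):
    """Concatenate low- and high-frequency Laplacian eigenvectors per node."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    # Guard against isolated nodes when computing D^{-1/2}.
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    # Symmetric normalized Laplacian: I - D^{-1/2} A D^{-1/2}.
    lap = np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(lap)   # eigenvalues in ascending order
    low = eigvecs[:, 1:1 + num_low]    # smooth, community-level modes
    high = eigvecs[:, -num_high:]      # rapidly varying, local modes
    return np.concatenate([low, high], axis=1)


# Hypothetical user-item interactions: 3 users (rows) by 4 items (columns).
interactions = np.array([
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
])
num_users, num_items = interactions.shape

# Bipartite adjacency over users followed by items.
adj = np.zeros((num_users + num_items, num_users + num_items))
adj[:num_users, num_users:] = interactions
adj[num_users:, :num_users] = interactions.T

pe = dual_band_positional_encoding(adj, num_low=2, num_high=2)
print(pe.shape)  # (7, 4): one 4-dimensional encoding per user or item node
```

Eigenvectors are defined only up to sign, so encodings like this are often combined with random sign flipping during training or with a sign-invariant layer; that choice is outside the scope of this sketch.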
https://doi.org/10.1016/j.tele.2025.102306
The immoral use of algorithms by platform companies has damaged consumer rights, yet some consumers continue to spend money on platforms that discriminate against users. The existing literature does not fully explain consumers’ motivations for using platforms that engage in algorithmic price discrimination. To explore the intrinsic mechanism and boundary conditions of the impact of platform algorithmic price discrimination on consumer purchase intention, 494 consumers in China who had shopped online in the preceding year were selected to participate in a multi-time-point survey conducted from January 2024 to May 2024. Integrating rational choice theory and expectancy violation theory, and focusing on the moral reasoning process and its boundary conditions, the study examined the mediating mechanism between algorithmic price discrimination and consumer purchase intention. Results show that algorithmic price discrimination harms consumer purchase intention; moral decoupling mediates the relationship between algorithmic price discrimination and consumer purchase intention; and competence trust and goodwill trust have significant moderating effects between algorithmic price discrimination and moral decoupling. The conclusions of this study not only reveal the mechanism through which algorithmic price discrimination affects consumer purchase intention, but also provide important strategic insights into how platform regulators can safeguard consumer rights and how platform managers can effectively remedy the negative impact of algorithmic price discrimination.
Algorithmic price discrimination; Moral decoupling; Consumer trust; Consumer purchase intention
https://doi.org/10.1186/s40537-025-01237-z
Knock-In event prediction is one of the most crucial tasks in Equity-Linked Securities (ELS) investment. Simply relying on the contract terms is insufficient for reliable predictions. To address this limitation, this study integrates macroeconomic features from the Federal Reserve Economic Data Monthly Database (FRED-MD) and the Quarterly Database (FRED-QD) with contract terms, thereby capturing broader economic influences. Furthermore, to refine the processing of these macroeconomic signals, we introduce a Time-Feature Blender (TFBlender). Built on attention mechanisms, the TFBlender operates along two paths: time and features. On the time-step token path, it captures both short- and long-term patterns in data, while on the feature token path, multi-head attention analyzes interactions among diverse features. TFBlender achieves a Knock-In F1 of 0.896 and an AUROC of 0.908, accurately detecting Knock-In events while minimizing false alarms. This predictive capability provides investors with early insights into potential ELS risks, enabling more proactive decision making. Additionally, applying SHAP reveals the macroeconomic factors that drive TFBlender’s predictions, helping practitioners focus on key inputs and optimize resources for more efficient modeling. By comprehensively integrating economic features with specialized attention mechanisms, the proposed framework enhances detection reliability, representing a significant advance in ELS risk management.
ELS; Knock-In prediction; FRED; TFBlender; Hybrid time series attention model
https://doi.org/10.1080/21670811.2025.2553157
Virtual reality (VR) journalism can offer immersive experiences, enhancing the audience’s empathy and issue involvement. In the literature, however, the term VR refers to different types of technology, which may obscure the clear effects of VR journalism. Guided by the framework for immersive virtual environment (FIVE), this paper investigated the effects of VR journalism focusing on two different features of VR: device-focused (i.e., head-mounted display (HMD)) and content format-focused (i.e., 360° video) by conducting a 3 (device-focused: HMD, computer, smartphone) X 2 (content-focused: 360° video, fixed view video) between-subjects experiment (N=171). Results showed that participants in the 360° video condition, compared to those in the fixed video condition, experienced a higher level of social presence, which, in turn, increased their empathic concern toward the characters and led them to be more involved in the videos’ topic. The HMD condition, on the other hand, did not show significant differences in helping participants engage in the characters and the topic, compared to computer and smartphone conditions. Theoretical and practical implications of the different features of VR are discussed.
Virtual reality; journalism; 360° view; head-mounted display; social presence; empathic concern
language disorders; multimodal; artificial intelligence
https://doi.org/10.1080/10447318.2025.2514257
Although autonomous vehicles have revolutionized the transportation landscape by enabling driving without direct human intervention, they are not yet perfect. For this reason, it is critically important for users to quickly respond to takeover requests from autonomous driving agents. Based on literature on framing effects in persuasion, this study focused on the efficacy of message framing and construal level theory. An experiment (N = 78 participants) was conducted using a driving simulator, employing a 2 (message framing: gain vs. loss) × 2 (temporal distance: distant vs. close) between-subjects design. The key findings indicate that gain framing led to higher levels of perceived benefit as well as compliance and behavioral intention. In contrast, loss framing resulted in higher levels of perceived risk related to danger and prompted quicker behavioral changes, such as lower levels of distraction and faster responses to takeover requests. Conversely, construal level in the messages did not show significant differences and had an impact only on perceived risk and distraction as a moderator. Discussion and implications are provided emphasizing the importance of the messages that autonomous car agents provide.
autonomous vehicle; message framing; construal level; public health
https://doi.org/10.1080/10447318.2025.2477744
While human-like social interactions can enhance trust in and acceptance of automated vehicles (AVs), overuse may hinder these benefits, reflecting the “uncanny valley of mind” effect. We hypothesized that the AV agent’s human-like features—calling drivers by their name (Name) and expressing emotions (Emotion)—enhance trust and acceptance individually but may have adverse effects when combined. A 2 × 2 between-subjects experiment (N = 84) examined these effects. Participants in the Name and Emotion combination were more likely to perceive the experiential mind in the AV compared to the Name or Emotion conditions. However, they were less likely to show behavioral trust in the AV than in the Emotion condition, to perceive the AV as useful than in either the Name or Emotion condition, and to show intention to use the AV than in the Name condition. These findings highlight potential trade-offs in designing social AV interactions.
automated vehicle; trust; user acceptance; anthropomorphism; mind perception; uncanny valley
https://doi.org/10.1080/10447318.2025.2491024
This study aims to identify ways to represent a conversational agent in the digital interface that can enhance older adults’ user experience focusing on both verbal (conversational form) and nonverbal factors (visual presence of conversational agent and background image). A total of 85 older adults participated in an experiment with a 2 (conversational agent: visual presence vs. no visual presence) × 2 (background image: present vs. absent) design, plus an additional condition with neither a conversational form nor manipulation of independent variables. The results highlight the importance of nonverbal factors, especially environmental cues. Displaying a background image significantly increased perceived affective trust, while visual presence of the agent did not show any significant effects. Interestingly, there were interaction effects on perceived social presence, usefulness, and satisfaction. Findings also showed that using a conversational form can increase the likability, social presence, and perceived ease of use of the agent.
older adults; conversational agent; nonverbal; visual presence; background image; user experience
https://doi.org/10.1186/s40537-025-01138-1
Injury management is critical in all sports, directly impacting player performance. Baseball players are particularly susceptible to injuries, as players often compete in 5 to 7 games per week, placing continuous strain on their bodies. Among various injuries, Tommy John Surgery (TJS) poses a notable risk for Major League Baseball (MLB) pitchers. Traditional TJS prediction methods required sensors or video-based motion capture, which are impractical during actual games and limited in making predictions too close to the injuries, such as within 30 pitches. To address these challenges, this study proposes a deep learning (DL) framework that utilizes both classification and regression tasks. Using MLB pitching data (2016–2023), the classification model detects injury risk up to 100 days in advance with a high prediction performance of 0.73 F1-score, while the regression model estimates the time remaining until the player’s last pre-surgery game with R² of 0.79. In addition, to enhance our model’s applicability, we employ an explainable artificial intelligence technique to analyze the impacting mechanical features, such as a lowered four-seam fastball release point, that accelerate UCL deterioration, increasing TJS risk. These findings provide a practical foundation for early intervention strategies, potentially preserving pitcher health and reducing the need for complex surgical procedures.
Injury Prediction; Deep Learning; Explainable AI
https://doi.org/10.1016/j.tele.2024.102227
This study aims to investigate media bias in news articles related to defense and foreign affairs by applying deep learning models and eXplainable artificial intelligence (XAI) techniques. We collected and analyzed seven, representing five major Korean media outlets, from conservative and liberal perspectives. The objective is to classify political bias and identify the specific words that contribute to this classification. We employed the BERT-base model from the Korean Language Understanding Evaluation and used local interpretable model-agnostic explanations for a comprehensive analysis. Our methodology achieved a remarkable accuracy of 98.2% in classifying the political bias of news articles, demonstrating the model’s effectiveness. The findings revealed distinct biases in coverage and statements across the media outlets: conservative outlets were more likely to emphasize threats and use singular references, while liberal outlets preferred peaceful and inclusive language. This study provides valuable insights into how the political biases of news media influence both the topics covered and the language used, even within the same category and time frame, ultimately shaping public perception.
BERT; Explainable AI; Text Classification
https://doi.org/10.1016/j.patcog.2025.111376
In multimodal visual understanding, fusing RGB images with additional modalities like depth or thermal data is essential for improving both accuracy and robustness. However, traditional approaches often rely on task-specific architectures that are difficult to generalize across different multimodal scenarios. To address this limitation, we propose the Cross-modal Spatio-Channel Attention (CSCA) module, designed to flexibly integrate diverse modalities into various model architectures while enhancing performance. CSCA employs spatial attention to capture interactions between modalities effectively, improving model adaptability. Additionally, we introduce a patch-based cross-modal interaction mechanism that optimizes the processing of spatial and channel features, reducing memory overhead while preserving critical spatial information. These refinements significantly simplify cross-modal interactions, increasing computational efficiency. Extensive experiments demonstrate that CSCA generalizes well across various multimodal combinations, achieving promising performance in crowd counting and image segmentation tasks, particularly in RGB-Depth, RGB-Thermal, and RGB-Polarization scenarios. Our approach provides a scalable and efficient solution for multimodal integration, with the potential for broader applications in future work.
Multimodal Learning; Sensor Fusion; Semantic Segmentation; Crowd Counting
https://doi.org/10.1016/j.patrec.2024.10.011
Test-time adaptation (TTA) refines pre-trained models during deployment, enabling them to effectively manage new, previously unseen data. However, existing TTA methods focus mainly on global domain alignment, which reduces domain-level gaps but often leads to suboptimal performance. This is because they fail to explicitly consider class-wise alignment, resulting in errors when reliable pseudo-labels are unavailable and source domain samples are inaccessible. In this study, we propose a prototypical class-wise test-time adaptation method, which consists of class-wise prototype adaptation and reliable pseudo-labeling. A main challenge in this approach is the lack of direct access to source domain samples. We leverage the class-specific knowledge contained in the weights of the pre-trained model. To construct class prototypes from the unlabeled target domain, we further introduce a methodology to enhance the reliability of pseudo labels. Our method is adaptable to various models and has been extensively validated, consistently outperforming baselines across multiple benchmark datasets.
Test-time adaptation; Class-wise alignment; Class prototypes; Continual learning; Image classification
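The class-wise idea in this abstract can be pictured with a minimal sketch. It rests on assumptions that the abstract does not state: prototypes start from the rows of the pre-trained classifier's weight matrix, a pseudo-label counts as reliable only when the cosine-similarity gap between the two closest prototypes exceeds a margin, and accepted samples update their prototype by an exponential moving average. The names `pseudo_label` and `adapt` and the `margin` and `momentum` parameters are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pseudo_label(feature, prototypes, margin=0.2):
    """Return the nearest class index, or None if the label looks unreliable.

    Assumed reliability rule: the best similarity must beat the runner-up
    by at least `margin`. Requires at least two prototypes.
    """
    sims = [cosine(feature, p) for p in prototypes]
    order = sorted(range(len(sims)), key=lambda k: sims[k], reverse=True)
    best, runner_up = order[0], order[1]
    if sims[best] - sims[runner_up] < margin:
        return None
    return best


def adapt(prototypes, features, momentum=0.9, margin=0.2):
    """Update class prototypes with reliably pseudo-labeled target features."""
    protos = [list(p) for p in prototypes]  # copy so the input stays unchanged
    for f in features:
        k = pseudo_label(f, protos, margin)
        if k is None:
            continue  # skip samples without a reliable pseudo-label
        protos[k] = [momentum * p + (1 - momentum) * x
                     for p, x in zip(protos[k], f)]
    return protos


# Hypothetical 2-class example: prototypes taken from classifier weight rows.
weights = [[1.0, 0.0], [0.0, 1.0]]
updated = adapt(weights, [[0.9, 0.1], [0.5, 0.5]])
```

In this toy run the first feature is assigned to class 0 and nudges its prototype, while the second is equidistant from both prototypes and is skipped.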
https://doi.org/10.1038/s41598-024-79034-6
Age-related macular degeneration (AMD) is a major cause of blindness in developed countries, and the number of affected patients is increasing worldwide. Intravitreal injections of anti-vascular endothelial growth factor (VEGF) are the standard therapy for neovascular AMD (nAMD), and optical coherence tomography (OCT) is a crucial tool for evaluating the anatomical condition of the macula. However, OCT has limitations in accurately predicting the degree of functional and morphological improvement following intravitreal injections. Artificial intelligence (AI) has been proposed as a tool for predicting the treatment response of nAMD based on OCT biomarkers. Our study focuses on the development and assessment of an AI model utilizing the DenseNet201 algorithm. The model aims to predict anatomical improvement based on OCT images before and during anti-VEGF therapy. The training process involves two scenarios: (1) using only preinjection OCT images and (2) utilizing both OCT images before and during anti-VEGF therapy for model training. The outcomes of our investigation, involving 2068 images from a cohort of 517 Korean patients diagnosed with nAMD, indicate that the AI model we introduced surpassed the predictive performance of ophthalmologists. The model exhibited a sensitivity of 0.915, specificity of 0.426, and accuracy of 0.820. Notably, its predictive capabilities were further enhanced with the inclusion of additional OCT images taken after the first and second injections during the loading phase. The treatment prediction performance of the model was the highest when using all input modalities (before injection, and after the first and second injections) and concatenation-based fusion layers. This study highlights the potential of AI in assisting individualized and tailored nAMD treatment.
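For readers relating the three reported metrics to one another, the sketch below shows how sensitivity, specificity, and accuracy follow from a confusion matrix. The counts are hypothetical and are not the study's data. The helper `implied_positive_fraction` relies on the identity accuracy = sensitivity × p + specificity × (1 − p), where p is the fraction of positive cases. If the three reported figures were all measured on the same evaluation set (an assumption), they would correspond to roughly 81% positive cases.

```python
def rates(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # share of true positives detected
    specificity = tn / (tn + fp)          # share of true negatives detected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


def implied_positive_fraction(sensitivity, specificity, accuracy):
    """Solve accuracy = sensitivity * p + specificity * (1 - p) for p.

    Only meaningful when sensitivity != specificity and all three metrics
    come from the same evaluation set.
    """
    return (accuracy - specificity) / (sensitivity - specificity)


# Hypothetical counts, chosen only to illustrate the formulas.
sens, spec, acc = rates(tp=90, fn=10, tn=40, fp=60)

# Applied to the figures quoted in the abstract.
p = implied_positive_fraction(0.915, 0.426, 0.820)
```

This illustrates why accuracy can stay high while specificity is low when most evaluated cases are positives.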
10.6109/jkiice.2024.28.10.1144
In metropolitan subway operations, the implementation of express train services is crucial for facilitating the efficient movement of large passenger volumes. However, preliminary feasibility studies for express train operations often rely on outdated standards from the 1990s, which fail to accurately reflect contemporary needs and conditions. To address these limitations, this study employs a Graph Convolutional Network (GCN) utilizing data from subway stations, user reviews, and the interrelationships between stations. This approach aims to provide a more current and comprehensive method for selecting express stop stations, incorporating user feedback. Additionally, the study evaluates efficiency using Data Envelopment Analysis (DEA). The findings demonstrate a significant increase in efficiency on the subway lines between Japan and New York, thereby validating the potential of using review data in the selection of express stop stations. This paper presents a novel set of criteria for effective stop selection that aligns with user needs, promoting more efficient budget allocation and broader public support.
Subway express; Graph; DEA; Review
10.5909/JBE.2024.29.6.1067
Video anomaly detection has emerged as a prominent deep learning research field due to its extensive applications in security, safety, and quality control. However, existing deep learning-based anomaly detection methods face fundamental limitations: heavy reliance on training data and limited explainability of detection results. To overcome these challenges, we propose a rule-based zero-shot video anomaly detection framework that integrates object detection and semantic segmentation. Our approach defines explicit rules based on object-background relationships and accurately interprets scene structure using pre-trained vision models. This enables effective anomaly detection in new environments without domain-specific training. Through experiments on the ShanghaiTech and NWPU Campus datasets, we demonstrate that our method achieves superior performance to existing approaches without additional training.
Video Anomaly Detection; Object Detection; Semantic Segmentation
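A rule based on object-background relationships might look like the following minimal sketch. Everything specific here is an assumption for illustration: the rule set `FORBIDDEN`, the class names, the choice of the bottom-center pixel of a bounding box as the point where an object touches the background, and the toy segmentation map. A real pipeline would take the detections and the map from pre-trained vision models.

```python
# Hypothetical rule set: (object class, background class) pairs treated as anomalous.
FORBIDDEN = {("car", "sidewalk"), ("bicycle", "sidewalk")}


def background_under(box, seg_map):
    """Background class at the bottom-center of a box (x1, y1, x2, y2).

    seg_map is a 2D list of class names indexed as seg_map[row][col];
    coordinates are clamped to the map bounds.
    """
    x1, y1, x2, y2 = box
    row = min(y2, len(seg_map) - 1)
    col = min((x1 + x2) // 2, len(seg_map[0]) - 1)
    return seg_map[row][col]


def find_anomalies(detections, seg_map, forbidden=FORBIDDEN):
    """Return (object, background) pairs that violate a rule in one frame."""
    anomalies = []
    for label, box in detections:
        region = background_under(box, seg_map)
        if (label, region) in forbidden:
            anomalies.append((label, region))
    return anomalies


# Toy frame: left half road, right half sidewalk.
frame_segmentation = [["road", "road", "sidewalk", "sidewalk"] for _ in range(3)]
frame_detections = [
    ("person", (2, 0, 3, 2)),  # person on the sidewalk: allowed
    ("car", (0, 0, 1, 2)),     # car on the road: allowed
    ("car", (2, 0, 3, 2)),     # car on the sidewalk: violates a rule
]
flagged = find_anomalies(frame_detections, frame_segmentation)
```

Because each flagged pair names the rule it violated, the output doubles as an explanation of the detection, which is the kind of explainability such a rule-based design aims for.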
10.5391/JKIIS.2024.34.6.527
Knowledge distillation has gained attention as a promising technique that can reduce model size while maintaining the performance of large-scale deep learning models. However, conventional knowledge distillation methods employ a static approach that applies uniform weights to all training samples, failing to consider the difficulty of problems or characteristics of the teacher model. This study proposes a novel method that adaptively adjusts the intensity of knowledge distillation based on the entropy of teacher model outputs. The proposed method estimates the difficulty of each training sample through the output entropy of the teacher model and accordingly adjusts the balance between knowledge distillation and direct learning. Experimental results on the CIFAR-100 dataset show that our method achieves additional accuracy improvements of 0.47% and 0.62% on average compared to conventional knowledge distillation for same and different network architectures, respectively. Notably, the method demonstrates greater performance improvements on samples where the teacher model makes incorrect predictions, proving that it can achieve enhanced generalization capability beyond simple imitation.
Knowledge Distillation; Model Compression; Image Classification
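To make the entropy-based balancing concrete, here is a minimal per-sample sketch. The abstract does not give the weighting formula, so the mapping used here is an assumption: the distillation weight is one minus the teacher's entropy normalized by log(C), so confident teacher outputs are imitated more and uncertain ones shift weight toward the ground-truth label. The function names are hypothetical, and temperature scaling is omitted for brevity.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_weight(teacher_probs):
    """Assumed mapping: 1 for a fully confident teacher, 0 for a uniform one."""
    max_entropy = math.log(len(teacher_probs))
    return 1.0 - entropy(teacher_probs) / max_entropy


def adaptive_kd_loss(student_logits, teacher_logits, label):
    """Per-sample loss: w * KL(teacher || student) + (1 - w) * cross-entropy."""
    s = softmax(student_logits)
    t = softmax(teacher_logits)
    cross_entropy = -math.log(s[label])
    kl = sum(tp * math.log(tp / sp) for tp, sp in zip(t, s) if tp > 0)
    w = entropy_weight(t)
    return w * kl + (1.0 - w) * cross_entropy


# Hypothetical 3-class sample with a fairly confident teacher.
loss = adaptive_kd_loss([1.0, 0.5, 0.1], [3.0, 0.2, 0.1], label=0)
```

A batch version would average this loss over samples; other monotone mappings from entropy to weight would fit the same description.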
https://doi.org/10.48550/arXiv.2410.15467
Mental health issues have become a critical global concern, with excessive stress being one of the primary contributors. Prolonged stress can lead to serious mental health problems. As such, early detection of stress factors is crucial. Previous studies primarily used text-based supervised learning models to predict stress factors; however, these models have several limitations. They rely on labeled training data, making it difficult to generalize to new stress factors, and they offer limited interpretability of their predictions. This study proposes a stress detection method based on OpenAI’s GPT-4o large language model (LLM) to address these challenges. This method does not require training data and can flexibly adapt to new stress factors using various prompting techniques, such as Zero-shot, Few-shot, Chain-of-Thought (CoT), and Tree-of-Thought (ToT). Additionally, LLMs provide clear explanations of their reasoning processes, making their predictions more trustworthy. Experimental results on the SAD dataset demonstrate that the LLM-based approach achieves comparable performance to general supervised models without the need for labeled data. Additionally, the LLM shows strong interpretability and accurately infers stress factors with fine granularity, making it a promising solution for stress detection in mental health applications.
Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
https://doi.org/10.3837/tiis.2024.12.001
During election periods, many polling agencies survey and distribute approval ratings for each candidate. Although public opinion is now also expressed through the Internet, mobile SNS, and online communities, individuals have historically had limited options for gauging approval ratings and have relied primarily on traditional opinion polls. Analyzing public opinion expressed on the Internet through natural language analysis allows for determining a candidate's approval rate with accuracy comparable to that of traditional opinion polls. Therefore, this paper proposes a method of inferring the approval rates of candidates during election periods by synthesizing users' political comments from Internet community posting data. To analyze the approval ratings expressed in the posts, we propose building, with data augmentation techniques and the KcBert, KoBert, and KoELECTRA models, the model that has the highest correlation with the actual polls.
Opinion polls, social media, natural language processing, election predictions
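Selecting the model with the highest correlation to actual polls can be sketched as below. The abstract does not name the correlation measure, so Pearson correlation is an assumption here, and the model names and numbers are hypothetical placeholders rather than results from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def best_model(estimates, polls):
    """Name of the model whose estimated series tracks the polls most closely."""
    return max(estimates, key=lambda name: pearson(estimates[name], polls))


# Hypothetical weekly approval ratings (%) from polls, and per-model estimates
# derived from community posts (any scale works, since correlation is scale-free).
polls = [40.0, 42.0, 45.0, 43.0]
estimates = {
    "model_a": [0.40, 0.42, 0.45, 0.43],
    "model_b": [0.45, 0.43, 0.40, 0.42],
}
chosen = best_model(estimates, polls)
```

Correlation rewards tracking the trend rather than matching absolute levels, so a selected model may still need calibration before its output is read as an approval percentage.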
https://doi.org/10.48550/arXiv.2401.05826
Despite stringent data protection regulations such as the General Data Protection Regulation (GDPR), the California Consumer Privacy Act (CCPA), and other country-specific regulations, many websites continue to use cookies to track user activities. Recent studies have revealed several data protection violations, resulting in significant penalties, especially for multinational corporations. Motivated by the question of why these data protection violations continue to occur despite strong data protection regulations, we examined 360 popular e-commerce websites in multiple countries to analyze whether they comply with regulations to protect user privacy from a cookie perspective.
Web security; privacy; cookie; GDPR; CCPA; tracking.
https://doi.org/10.1080/10447318.2024.2425881
Negative feedback can have detrimental effects on the students’ self-efficacy and learning experience, yet it is inevitable for students who receive low outcomes and need improvement. This study investigates the role of pedagogical agents’ self-disclosure in providing empathy for negative feedback. An online experiment was conducted asking participants (N = 183) to interact with a voice-based pedagogical agent in a between-subjects design: 2 (feedback: positive vs. negative) X 2 (agent: self-disclose vs. non-disclose). The agent instructed students on online learning tasks and provided feedback on their task performance. Our findings showed that the agent’s self-disclosure significantly increased students’ perception of intimacy and cognitive trust toward the agent. A significant interaction effect was observed in intimacy, suggesting that the role of self-disclosure is especially pronounced when negative feedback is provided. A significant mediation effect of cognitive trust was also found between self-disclosure and feedback acceptance.
Pedagogical agent; self-disclosure; trust; intimacy; feedback acceptance; voice user interface
https://doi.org/10.1057/s41599-024-04195-8
Adolescent violence has been one of the most serious social concerns for the last few decades. With the rapid development and spread of the Internet and digital technologies, online violence has become another major type of adolescent violence. This study investigates the antecedents of South Korean adolescents’ offline and online violence by employing both the theoretical and empirical foundations of traditional violence literature. The research model was proposed and constructed based on general strain theory and social ecological theory with considerations of 2,481 middle school first-grade student samples from the 2018 Korean Children and Youth Panel Survey (KCYPS). Structural equation modeling (SEM) results presented the direct effects of emotions and indirect effects of social relationships on adolescents’ delinquency, bullying, and online violence. Overall, the findings of the current study are consistent with those of previous studies and theoretical assumptions. Except for the effect of social withdrawal on delinquency, the emotions had significant effects on violence perpetration. Furthermore, relationships with parents, friends, and teachers showed protective effects against negative emotions. Finally, online violence was significantly affected by all types of social relationships. The findings of this study can provide a better understanding of both online and offline violence in adolescents. In addition, since the results were derived from a nationally representative sample, this study can provide practitioners in South Korea with guidance on how to set proper interventions for adolescents’ social and emotional aspects of violence perpetration.
Online violence; Offline violence; General strain theory; Social ecological theory; Structural equation modeling
https://doi.org/10.1038/s41598-024-75995-w
Patients with end-stage kidney disease (ESKD) frequently experience anemia, and maintaining hemoglobin (Hb) levels within a targeted range using erythropoiesis-stimulating agents (ESAs) is challenging. This study introduces a gated recurrent unit-attention-based module (GAM) for efficient anemia management among patients undergoing chronic dialysis and proposes a novel alert system for anticipating the need for red blood cell transfusions. Data on demographic characteristics, dialysis metrics, drug administration, laboratory tests, and transfusion history were retrospectively collected from patients undergoing hemodialysis at Kangwon National University Hospital between 2017 and 2022. After preprocessing, a final dataset of 252 patients was used for model training. Our model functions in two major phases: (1) Hb level prediction and ESA dose recommendation and (2) transfusion alert framework. The GAM model outperformed traditional machine learning algorithms, including linear regression, XGBoost, and multilayer perceptron, in predicting Hb levels (R-squared value=0.60). The model also demonstrated a recommendation accuracy of 0.78 compared to that of clinical experts, indicating a high degree of concordance with the ESA dosing recommendations. Additionally, the model exhibited considerably high accuracy (0.99) for transfusion alarms. Thus, the GAM model holds promise for improving anemia management in patients with ESKD by optimizing ESA dosages and providing timely transfusion alerts.
Anemia; End-stage kidney disease; Artificial intelligence; Transfusion alert; Erythropoiesis-stimulating agents