Multimodal Privacy & Security
Artificial intelligence increasingly operates across rich, multimodal inputs—text, images, audio, video. This complexity introduces unprecedented risks to individual privacy. Our project envisions a future where powerful AI systems can harness diverse data streams while maintaining rigorous, built-in privacy safeguards. We aim to ensure that users retain control over their data, even as AI becomes more perceptive and predictive.
This project develops end-to-end privacy protection pipelines tailored to the reality of multimodal AI environments. It treats privacy not just as a legal requirement, but as a design principle woven into the fabric of intelligent systems.
We begin by building annotation protocols to detect sensitive information across images, text, and audio recordings. These labeled datasets inform the training of modality-specific detection models that target strong discriminative performance (AUC ≥ 0.95).
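As a rough illustration of how such a detector could be validated against the AUC target, the sketch below scores labeled samples and checks the threshold. The annotation format, the toy TextSensitivityDetector, and the keyword heuristic are assumptions made only to keep the example self-contained; they are not the project's actual interfaces or models.

```python
# Minimal sketch: gating a modality-specific detector on the AUC >= 0.95 target.
from dataclasses import dataclass
from typing import List

from sklearn.metrics import roc_auc_score


@dataclass
class LabeledSample:
    content: str          # raw text (an image path or audio clip for other modalities)
    is_sensitive: bool    # gold label from the annotation protocol


class TextSensitivityDetector:
    """Placeholder detector; a trained model would return a calibrated probability."""

    SENSITIVE_HINTS = ("ssn", "passport", "address", "phone")

    def predict_proba(self, content: str) -> float:
        hits = sum(h in content.lower() for h in self.SENSITIVE_HINTS)
        return min(1.0, 0.2 + 0.4 * hits)


def meets_auc_target(detector: TextSensitivityDetector,
                     samples: List[LabeledSample],
                     target: float = 0.95) -> bool:
    """Score every sample and compare the detector's ROC AUC against the target."""
    y_true = [s.is_sensitive for s in samples]
    y_score = [detector.predict_proba(s.content) for s in samples]
    return roc_auc_score(y_true, y_score) >= target


if __name__ == "__main__":
    samples = [
        LabeledSample("My passport number is X1234567", True),
        LabeledSample("The weather is nice today", False),
        LabeledSample("Call me at my phone, 555-0100", True),
        LabeledSample("Lunch is at noon", False),
    ]
    print(meets_auc_target(TextSensitivityDetector(), samples))
```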
Using the labeled data, we develop content-aware, privacy-preserving transformations: facial blurring, voice modulation, and text redaction. These masking techniques are applied dynamically according to the sensitivity levels inferred by the detection models.
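The following sketch shows the gating idea for the text modality: masking is applied only when inferred sensitivity crosses a policy threshold. The regex patterns, threshold value, and redact_text helper are illustrative assumptions standing in for the trained detectors and transformation library.

```python
# Illustrative sketch of sensitivity-gated text redaction.
import re

# Patterns standing in for the trained detectors' findings (emails, phone numbers).
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "phone": re.compile(r"\b\d{3}[-.\s]?\d{3,4}[-.\s]?\d{4}\b"),
}


def redact_text(text: str, sensitivity: float, threshold: float = 0.5) -> str:
    """Apply masking only when inferred sensitivity exceeds the policy threshold."""
    if sensitivity < threshold:
        return text  # low-risk content passes through untouched
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()} REDACTED]", text)
    return text


if __name__ == "__main__":
    msg = "Reach me at jane.doe@example.com or 010-1234-5678."
    print(redact_text(msg, sensitivity=0.8))   # masked
    print(redact_text(msg, sensitivity=0.2))   # passed through unchanged
```

The same gate would dispatch to facial blurring or voice modulation for image and audio inputs; only the text path is shown here.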
Privacy threats often emerge not from single data points, but from correlations across modalities. We design semantic similarity algorithms to trace how sensitive information may be revealed through indirect associations—e.g., image captions leaking names or locations. These systems allow for dynamic monitoring and automatic intervention.
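One way to express this cross-modal tracing is an embedding-similarity check between generated outputs (such as captions) and a registry of protected attributes. In the sketch below, the embed function is a deliberately crude stand-in for the project's actual encoders, and the leakage threshold is an assumed value chosen only for the demonstration.

```python
# Minimal sketch of cross-modal leakage tracing via embedding similarity.
import hashlib
import numpy as np


def embed(text: str, dim: int = 64) -> np.ndarray:
    """Placeholder embedding: hash tokens into a fixed-size bag-of-words vector."""
    vec = np.zeros(dim)
    for token in text.lower().split():
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec


def leakage_score(caption: str, sensitive_attributes: list[str]) -> float:
    """Highest similarity between a generated caption and any protected attribute."""
    cap = embed(caption)
    return max(float(cap @ embed(attr)) for attr in sensitive_attributes)


if __name__ == "__main__":
    attrs = ["Jane Doe", "Gangnam-gu Seoul home address"]
    caption = "A woman named Jane Doe standing outside her home in Gangnam-gu"
    if leakage_score(caption, attrs) > 0.3:   # threshold is an assumption
        print("Potential cross-modal leakage: route caption through redaction.")
```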
All detection and transformation processes are linked to a customizable privacy management system that adapts to jurisdictional and organizational data protection rules (e.g., GDPR, Korean Personal Information Protection Act). Users can configure privacy thresholds and audit logs for compliance assurance.
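To make the policy layer concrete, the sketch below pairs per-jurisdiction thresholds with an append-only audit trail of masking decisions. The jurisdiction names, threshold values, retention periods, and audit-record fields are illustrative assumptions, not a statement of what GDPR or the Korean Personal Information Protection Act require.

```python
# Sketch of a configurable privacy policy layer with audit logging.
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class PrivacyPolicy:
    jurisdiction: str            # e.g., "EU-GDPR" or "KR-PIPA"
    redaction_threshold: float   # sensitivity score above which masking is mandatory
    retention_days: int          # how long raw inputs may be retained


@dataclass
class PrivacyManager:
    policies: Dict[str, PrivacyPolicy]
    audit_log: List[dict] = field(default_factory=list)

    def decide(self, jurisdiction: str, sensitivity: float) -> bool:
        """Return True if the item must be transformed, and record the decision."""
        policy = self.policies[jurisdiction]
        must_mask = sensitivity >= policy.redaction_threshold
        self.audit_log.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "jurisdiction": jurisdiction,
            "sensitivity": sensitivity,
            "action": "mask" if must_mask else "pass",
        })
        return must_mask


if __name__ == "__main__":
    manager = PrivacyManager(policies={
        "EU-GDPR": PrivacyPolicy("EU-GDPR", redaction_threshold=0.4, retention_days=30),
        "KR-PIPA": PrivacyPolicy("KR-PIPA", redaction_threshold=0.5, retention_days=14),
    })
    manager.decide("EU-GDPR", sensitivity=0.7)
    print(json.dumps(manager.audit_log, indent=2))
```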