Multimodal Encoder Visualizer

Video Processing Explanation

App Demonstration

The video above explains and demonstrates the web application I developed for research at DSLab. The work was sponsored by ETRI (Electronics and Telecommunications Research Institute) under the project "Development of Previsional Intelligence Based on Long-term Visual Memory Network".

We conducted research on multimodal video analysis for social forecasting of famous individuals, using a video dataset that we built. The task is to predict, from multimodal analysis of a video, whether it will trigger a spike in the Google Search Volume of the famous individuals who appear in it. This tool visualizes how multimodal features are extracted from a video.
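The write-up does not say how a "spike" in search volume is defined. As a purely illustrative sketch, one way to turn a search-volume series into a binary label is to compare the days after a video is published against a baseline from the days before. The function name `is_spike`, the 7-day `window`, and the threshold factor `k` are all assumptions made for this example, not the project's actual labeling rule.

```python
from statistics import mean, stdev


def is_spike(volumes, publish_index, window=7, k=2.0):
    """Hypothetical spike label (an assumption, not the project's definition).

    Returns True if the peak search volume in the `window` days starting at
    `publish_index` exceeds the pre-publication baseline mean by more than
    `k` sample standard deviations.
    """
    # Baseline: up to `window` days immediately before publication.
    baseline = volumes[max(0, publish_index - window):publish_index]
    # Observation period: up to `window` days from publication onward.
    after = volumes[publish_index:publish_index + window]
    # stdev needs at least two baseline points; with no data we cannot label.
    if len(baseline) < 2 or not after:
        return False
    threshold = mean(baseline) + k * stdev(baseline)
    return max(after) > threshold


# Made-up daily search volumes; the video is published on day index 7.
with_jump = [10, 12, 11, 9, 10, 11, 10, 40, 35, 12]
flat = [10, 12, 11, 9, 10, 11, 10, 11, 10, 12]
print(is_spike(with_jump, publish_index=7))
print(is_spike(flat, publish_index=7))
```

A rule like this makes the prediction target a simple yes/no label per video and person, which is the form the task description implies. The real threshold and window would have to come from the dataset itself.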

Video dataset

Pie chart of the types of famous individuals

LLM Baseline results

Before switching to multimodal inputs, I fed video transcripts to an LLM (Gemini 2.5 Flash) and had it predict the public salience of the public figures, to verify that the research task was feasible.
