CAREER: System Research to Enable Practical Immersive Streaming: From 360-Degree Towards Volumetric Video Delivery
Abstract
Immersive video technologies allow users to freely explore remote or virtual environments. For example, with 360-degree videos, users can view the scene from any orientation; with volumetric videos, users can control not only the view orientation but also the camera position. Such highly immersive content has applications in entertainment, medicine, education, manufacturing, and e-commerce, to name just a few areas. However, current video streaming infrastructure cannot fully support the high bandwidth, low latency, and large storage requirements of these emerging formats. This project aims to address challenges in the efficient transmission and storage of immersive video streams by proposing new, efficient representations of this content. These representations can both be adapted to users' viewing behaviors and be generated at the low latencies required for real-time streaming applications. If successful, the proposed research will enable higher-quality immersive streaming than is possible with current systems, making useful immersive streaming experiences practical.
This project investigates techniques for improving the efficiency of two specific immersive streaming applications. For real-time 360-degree video, the project will build a system that generates area-of-focus projections in real time. Each area-of-focus projection is selected to align its high-quality focus area with a predicted user view, and low-latency generation is achieved using graphics processing units at nearby edge or cloud servers. For volumetric video streaming, the project aims to create a representation of the video that is both storage- and bandwidth-efficient. The representation consists of area-of-focus versions of the video at a selected set of points within the volume, together with patches that cover disoccluded pixels, allowing high-quality video to be delivered to users positioned anywhere in the scene. The proposed system further uses a Hypertext Transfer Protocol Version 2 (HTTP/2) transmission approach to support bandwidth-efficient delivery of this representation. To more precisely measure the user's true experience of immersive streams, the project will also create immersive video streaming datasets and investigate a new quality metric. This metric will use novel approaches to correlate user-perceived quality with visual artifacts introduced by encoding and network transmission.
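To make the area-of-focus selection step concrete, the sketch below predicts a user's view orientation by simple linear extrapolation of recent head orientations and then picks, from a fixed candidate set, the pre-generated projection whose high-quality focus center is closest to that prediction on the sphere. This is a minimal illustration only: the candidate grid, the linear predictor, and all names here are assumptions made for this sketch, and the project's actual prediction and GPU-based projection-generation methods at edge or cloud servers are not represented.

    import math

    # Candidate focus-area centers (yaw, pitch) in degrees -- an assumed
    # coarse grid; the real system would define its own candidate set.
    CANDIDATE_CENTERS = [(yaw, pitch)
                         for yaw in range(-180, 180, 30)
                         for pitch in (-45, 0, 45)]

    def to_unit_vector(yaw_deg, pitch_deg):
        """Map a (yaw, pitch) view orientation to a 3D unit vector."""
        yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
        return (math.cos(pitch) * math.cos(yaw),
                math.cos(pitch) * math.sin(yaw),
                math.sin(pitch))

    def angular_distance(a, b):
        """Great-circle angle (radians) between two view orientations."""
        va, vb = to_unit_vector(*a), to_unit_vector(*b)
        dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(va, vb))))
        return math.acos(dot)

    def predict_view(prev, curr):
        """Naive one-step linear extrapolation of (yaw, pitch)."""
        dyaw = (curr[0] - prev[0] + 180) % 360 - 180  # shortest yaw delta
        yaw = (curr[0] + dyaw + 180) % 360 - 180
        pitch = max(-90.0, min(90.0, curr[1] + (curr[1] - prev[1])))
        return (yaw, pitch)

    def select_projection(predicted):
        """Pick the candidate focus center closest to the predicted view."""
        return min(CANDIDATE_CENTERS,
                   key=lambda c: angular_distance(c, predicted))

    if __name__ == "__main__":
        predicted = predict_view(prev=(10.0, 0.0), curr=(25.0, 5.0))
        print("predicted view:", predicted)           # (40.0, 10.0)
        print("chosen focus center:", select_projection(predicted))

One design point this sketch does capture: because selection reduces to a nearest-neighbor lookup over a fixed candidate set, the per-user work at the server stays small, which is what makes generation at real-time latencies plausible.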
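The volumetric representation can likewise be viewed, in simplified form, as a set of anchor points, each carrying an area-of-focus version of the video, plus disocclusion patches. The sketch below shows one plausible client-side step: select the anchor nearest the user's current position and build an ordered fetch plan of the base segment followed by needed patches. The anchor layout, URL scheme, and patch naming are hypothetical, and the ordered plan merely stands in for HTTP/2 delivery, where all of these requests could be multiplexed on a single connection with the base segment prioritized.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Anchor:
        """A point in the volume with a pre-generated area-of-focus video."""
        anchor_id: str
        x: float
        y: float
        z: float

    def nearest_anchor(anchors, pos):
        """Select the anchor closest to the user's current position."""
        px, py, pz = pos
        return min(anchors,
                   key=lambda a: (a.x - px) ** 2 + (a.y - py) ** 2
                                 + (a.z - pz) ** 2)

    def build_fetch_plan(anchor, segment, patch_ids):
        """Order requests: the base area-of-focus segment first, then the
        disocclusion patches needed for the user's actual viewpoint.
        Over HTTP/2, these could share one multiplexed connection with
        the base segment given highest priority. (URL layout here is
        hypothetical, not the project's actual naming scheme.)"""
        base = f"/volumetric/{anchor.anchor_id}/seg_{segment:05d}.bin"
        patches = [f"/volumetric/{anchor.anchor_id}/seg_{segment:05d}"
                   f"/patch_{p}.bin" for p in patch_ids]
        return [base] + patches

    if __name__ == "__main__":
        anchors = [Anchor("a0", 0.0, 0.0, 0.0),
                   Anchor("a1", 2.0, 0.0, 0.0),
                   Anchor("a2", 0.0, 2.0, 1.0)]
        chosen = nearest_anchor(anchors, pos=(1.6, 0.2, 0.1))
        for url in build_fetch_plan(chosen, segment=7, patch_ids=[3, 8]):
            print(url)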
Artifacts produced by this project, including publications, code, and datasets, will be made publicly available at http://www.cs.binghamton.edu/~yaoliu/ImmersiveStreaming/ and maintained for at least five years after the project's completion.