CAREER: System Research to Enable Practical Immersive Streaming: From 360-Degree Towards Volumetric Video Delivery

Abstract

Immersive video technologies allow users to freely explore remote or virtual environments. For example, with 360-degree videos, users can view the scene from any orientation; with volumetric videos, users can control not only the view orientation but also the camera position. Such highly immersive content has applications in entertainment, medicine, education, manufacturing, and e-commerce, to name just a few areas. However, current video streaming infrastructure cannot fully meet the high bandwidth, low latency, and large storage requirements of these emerging formats. This project aims to address challenges in the efficient transmission and storage of immersive video streams by proposing new, efficient representations of this content. These representations can be adapted to fit users' viewing behaviors and can be generated at the low latencies required for real-time streaming applications. If successful, the proposed research will enable higher-quality immersive streaming than is possible with current systems and, in turn, more useful immersive experiences.

This project investigates techniques for improving the efficiency of two specific immersive streaming applications. For real-time 360-degree video, the project will build a system to generate area-of-focus projections in real time. These area-of-focus projections are selected to align the high-quality focus area with a predicted user view. Low-latency generation is achieved using graphics processing units at nearby edge or cloud servers. For volumetric video streaming, this project aims to create a representation of the video that is both storage- and bandwidth-efficient. The representation consists of area-of-focus versions of the video at a selected set of points within the volume, along with patches that cover dis-occluded pixels, allowing high-quality video to be delivered to users positioned anywhere in the scene. The proposed system further uses a Hypertext Transfer Protocol Version 2 (HTTP/2) transmission approach to support bandwidth-efficient delivery of this representation. To more precisely measure users' true experience of immersive streams, this project will also create immersive video streaming datasets and investigate a new quality metric. This new metric will use novel approaches to correlate user-perceived quality with visual artifacts introduced by encoding and network transmission.
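
To make the view-alignment step concrete, the sketch below (a minimal illustration in Python; the function names and the candidate-orientation grid are assumptions for the example, not the system's actual interface) selects, from a discrete set of candidate area-of-focus orientations, the one whose high-quality region lies closest to a predicted user view, measured by angular distance on the sphere:

    import math
    from typing import List, Tuple

    def view_to_unit_vector(yaw_deg: float, pitch_deg: float) -> Tuple[float, float, float]:
        # Convert a view orientation (yaw/pitch, in degrees) into a
        # unit direction vector on the sphere.
        yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
        return (math.cos(pitch) * math.cos(yaw),
                math.cos(pitch) * math.sin(yaw),
                math.sin(pitch))

    def select_focus_orientation(predicted_view: Tuple[float, float],
                                 candidates: List[Tuple[float, float]]) -> Tuple[float, float]:
        # Choose the candidate orientation with the smallest angular
        # distance to the predicted view, i.e., the largest dot
        # product between the two unit direction vectors.
        p = view_to_unit_vector(*predicted_view)
        def alignment(c: Tuple[float, float]) -> float:
            v = view_to_unit_vector(*c)
            return p[0] * v[0] + p[1] * v[1] + p[2] * v[2]
        return max(candidates, key=alignment)

    # Example: a coarse grid of candidate orientations (assumed here
    # for illustration) and a predicted view at yaw 37, pitch 10.
    grid = [(yaw, pitch) for yaw in range(-180, 180, 30)
                         for pitch in (-45, 0, 45)]
    print(select_focus_orientation((37.0, 10.0), grid))  # -> (30, 0)

In the proposed systems, a selection of this form would run at the edge or cloud server whenever a new view prediction arrives, so that the high-quality focus area of the generated projection tracks the user's predicted viewing direction.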

Artifacts produced as a result of this project, including publications, code, and datasets, will be made publicly available at http://www.cs.binghamton.edu/~yaoliu/ImmersiveStreaming/. These artifacts will be maintained for at least five years after completion of the project.

Publications

Dynamic 6-DoF Volumetric Video Generation: Software Toolkit and Dataset [paper] [project page]
Mufeng Zhu, Yuan-Chun Sun, Na Li, Jin Zhou, Songqing Chen, Cheng-Hsin Hsu, Yao Liu
Proceedings of the 26th IEEE International Workshop on Multimedia Signal Processing (MMSP 2024)
West Lafayette, IN, October 2-4, 2024
RoIRTC: Toward Region-of-Interest Reinforced Real-Time Video Communication [paper] [source code]
Shuoqian Wang, Mengbai Xiao, Yao Liu
Proceedings of the 2024 IEEE International Conference on Multimedia and Expo (ICME 2024)
Niagara Falls, Canada, July 15-19, 2024
A Comparative Study of K-Planes vs. V-PCC for 6-DoF Volumetric Video Representation [paper] [repository]
Na Li, Mufeng Zhu, Shuoqian Wang, Yao Liu
Proceedings of the 16th International Workshop on Immersive Mixed and Virtual Environment Systems (MMVE 2024)
Bari, Italy, April 15, 2024
VertexShuffle-Based Spherical Super-Resolution for 360-Degree Videos [paper] [source code]
Na Li, Yao Liu
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM)
Accepted in February 2024
VQBA: Visual-Quality-Driven Bit Allocation for Low-Latency Point Cloud Streaming [paper]
Shuoqian Wang, Mufeng Zhu, Na Li, Mengbai Xiao, Yao Liu
Proceedings of the 31st ACM International Conference on Multimedia (Full Research Paper) (MM 2023)
Ottawa, Canada, October 29-November 3, 2023
EVASR: Edge-Based Video Delivery with Salience-Aware Super-Resolution [paper] [source code]
Na Li, Yao Liu
Proceedings of the 14th ACM Multimedia Systems Conference (Full Research Paper) (MMSys 2023)
Vancouver, Canada, June 7-10, 2023
Exploring Spherical Autoencoder for Spherical Video Content Processing [paper]
Jin Zhou, Na Li, Yao Liu, Shuochao Yao, Songqing Chen
Proceedings of the 30th ACM International Conference on Multimedia (MM 2022)
Lisbon, Portugal, October 10-14, 2022
A Smartphone Thermal Temperature Analysis for Virtual and Augmented Reality [paper]
Xiaoyang Zhang, Harshit Vadodaria, Na Li, Kyoung-Don Kang, Yao Liu
Proceedings of the 3rd IEEE International Conference on Artificial Intelligence and Virtual Reality (AIVR 2020)
Virtual/Online Event, December 14-18, 2020
SphericRTC: A System for Content-Adaptive Real-Time 360-Degree Video Communication [paper] [source code]
Shuoqian Wang, Xiaoyang Zhang, Mengbai Xiao, Kenneth Chiu, Yao Liu
Proceedings of the 28th ACM International Conference on Multimedia (Full Research Paper) (MM 2020)
Seattle, WA, October 12-16, 2020