Keynote Speakers

Ivan Tashev - MMSP2020

Dr. Ivan Tashev

Partner Software Architect
Microsoft Research, USA

Capture, Representation, and Rendering of 3D Audio

Abstract

Spatial audio is a set of emerging technologies whose main application is in augmented and virtual reality, but it is quickly finding its way into everyday activities, such as listening to stereo sound. The design of virtual and augmented reality devices requires realistic rendering of 3D audio, which creates the need for technologies and algorithms covering the entire chain of capture, processing, and representation of spatial audio.

In this talk we will discuss:

  • Definition and brief history of the spatial audio technologies
  • Techniques for capturing 3D audio using specialized microphone arrays, and its device-independent representation using either high-order ambisonics or sound objects plus a diffuse component
  • We will discuss human spatial hearing, defined by the Head-Related Transfer Functions (HRTFs). As they are individual to each human being, personalization of the HRTFs in modern spatial audio systems is an important component of providing the best experience
  • Rendering of the spatial audio can be done using headphones, or through a set of loudspeakers. We will discuss potential scenarios and applications of both approaches
  • At the end, we will give examples of end-to-end applications with spatial audio and will draw conclusions

Biography

Dr. Ivan Tashev received his Master’s degree in Electronic Engineering in 1984 and his PhD in Computer Science in 1990 from the Technical University of Sofia, Bulgaria. He was an Assistant Professor in the Department of Electronic Engineering of the same university in 1998, when he moved to Microsoft in Redmond, USA. Currently Dr. Tashev is a Partner Software Architect and leads the Audio and Acoustics Research Group in Microsoft Research Labs in Redmond, USA. He has been a senior member of the IEEE since 2006 and a member of the Audio Engineering Society since 2006. He serves as a member and associate member of the IEEE SPS Audio and Acoustics Signal Processing Technical Committee and the IEEE SPS Industrial DSP Standing Committee. Since 2012 he has been an adjunct professor in the Department of Electrical Engineering of the University of Washington in Seattle, USA.

Dr. Tashev has published two scientific books as the sole author, written chapters in two other books, and authored or coauthored more than 70 publications in scientific journals and conferences. He is listed as an inventor on 50 U.S. patent applications, 31 of them already granted. Audio processing technologies created by Dr. Tashev have been incorporated in Microsoft Windows, Microsoft Auto Platform, and the Microsoft Round Table device. Dr. Tashev served as the leading audio architect for Kinect for Xbox and Microsoft HoloLens. More details about him can be found on his web page.

Prof. Hong Hua

Professor of Optical Sciences
University of Arizona
Tucson, USA

Progresses, challenges, and opportunities of head-mounted light field displays for Mixed Reality

Abstract

Despite the high promise of, and the tremendous recent progress in, head-mounted displays (HMDs) for both virtual and augmented reality, developing HMDs that offer uncompromised optical pathways to both the digital and physical worlds without encumbrance and discomfort faces many grand challenges, from both technological and human-factors perspectives. In this plenary address, I will focus in particular on the recent progress, challenges, and opportunities in developing head-mounted light field displays (LF-HMDs), which are capable of rendering true 3D synthetic scenes with proper focus cues to stimulate natural eye accommodation responses and address the well-known vergence-accommodation conflict in conventional stereoscopic displays.

Biography

Hong Hua, Fellow of SPIE and OSA, is currently a Professor with the James Wyant College of Optical Sciences (OSC), The University of Arizona. She has over 25 years of experience in designing and developing near-to-the-eye display technologies and developing virtual reality and augmented reality applications. Her current research interests include various head-worn displays and light field 3D displays, optical engineering, collaborative virtual and augmented environments, and human-computer interaction.

Prof. Thrasyvoulos (Thrasos) N. Pappas

Professor in the Electrical Engineering and Computer Science Department
Northwestern University
Evanston, Illinois

Perceptual Texture Analysis for Multimedia Processing

Abstract

Texture is an important attribute for both human perception and signal analysis. It provides important clues for object detection and material recognition. Visual texture similarity is important for image and video quality, compression, and content-based retrieval. Texture analysis is also important for sense substitution (visual to acoustic-tactile conversion), multimodal interfaces for virtual reality and immersive environments, product design, surveillance and security, environmental monitoring, and medical applications.

We have proposed a new class of structural texture similarity metrics (STSIMs) that account for human visual perception and the stochastic nature of textures. They rely entirely on local image statistics and allow substantial point-by-point deviations between textures that according to human judgment are similar or essentially identical. We have identified three operating domains for evaluating the performance of objective texture similarity metrics, each with different performance goals and testing procedures. We have also proposed ViSiProG (Visual Similarity by Progressive Grouping), a new procedure for collecting subjective similarity data.
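The contrast between point-by-point comparison and statistics-based comparison can be illustrated with a minimal sketch. This is not the STSIM metric itself: the synthetic texture generator, the choice of statistics (mean, standard deviation, and lag-1 autocorrelations computed over a whole patch), and all function names are assumptions made purely for illustration. Two independent realizations of the same random texture differ strongly pixel by pixel, yet their statistics nearly coincide.

```python
import numpy as np

def make_texture(seed, size=64):
    """Hypothetical synthetic texture: white noise smoothed along rows."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal((size, size))
    # 3-tap circular moving average along the horizontal axis
    return (noise + np.roll(noise, 1, axis=1) + np.roll(noise, -1, axis=1)) / 3.0

def patch_statistics(img):
    """Mean, standard deviation, and lag-1 autocorrelation per direction."""
    centered = img - img.mean()
    var = centered.var()
    rho_h = (centered * np.roll(centered, 1, axis=1)).mean() / var
    rho_v = (centered * np.roll(centered, 1, axis=0)).mean() / var
    return np.array([img.mean(), img.std(), rho_h, rho_v])

def pointwise_mse(a, b):
    """Classic point-by-point error between two equally sized patches."""
    return float(np.mean((a - b) ** 2))

def statistic_distance(a, b):
    """Toy dissimilarity: largest gap between the two statistic vectors."""
    return float(np.max(np.abs(patch_statistics(a) - patch_statistics(b))))

# Two independent realizations of the same stochastic texture
tex_a = make_texture(seed=0)
tex_b = make_texture(seed=1)

# Pointwise error, normalized by the texture variance
mse_relative = pointwise_mse(tex_a, tex_b) / tex_a.var()
stat_gap = statistic_distance(tex_a, tex_b)

print(f"relative pointwise MSE: {mse_relative:.3f}")
print(f"largest statistic gap:  {stat_gap:.3f}")
```

In this toy setting the pointwise error is on the order of the texture variance itself, while the statistic gap stays small, which mirrors the idea that textures judged similar by a human observer can still deviate substantially point by point.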

Our current focus is on material identification and characterization. Each material can be characterized by a limited number of exemplars that reflect different environmental conditions. The characterization can also be based on specific attributes, such as roughness, glossiness, and spectral composition, which provide strong clues about material properties and can be estimated over a wide range of conditions.

Biography

Thrasos Pappas received the Ph.D. degree in electrical engineering and computer science from MIT in 1987. From 1987 until 1999, he was a Member of the Technical Staff at Bell Laboratories, Murray Hill, NJ. He joined the ECE Department at Northwestern in 1999. His research interests are in human perception and electronic media, and in particular, image and video quality and compression, image and video analysis, content-based retrieval, medical image analysis, model-based halftoning, and tactile and multimodal interfaces.

Prof. Pappas is a Fellow of the IEEE, SPIE, and IS&T. He has served as Vice President-Publications (2015-17) for the Signal Processing Society of IEEE, Editor-in-Chief of the IEEE Transactions on Image Processing (2010-12), elected member of the Board of Governors of the Signal Processing Society of IEEE (2004-07), chair of the IEEE Image and Multidimensional Signal Processing Technical Committee (2002-03), and technical program co-chair of ICIP-01 and ICIP-09. From 1997 to 2018, he served as co-chair of the SPIE/IS&T Conference on Human Vision and Electronic Imaging. He is currently one of the two founding Editors-in-Chief of the IS&T Journal of Perceptual Imaging.