3rd International Conference

Digital Culture & AudioVisual Challenges

Interdisciplinary Creativity in Arts and Technology

Online, May 28-29, 2021

Speech Controlled Navigation and Interactions in Web Virtual Reality Environments
Date and Time: 29/05/2021 (10:45-12:30)
Despoina Tsavalou, Vasileios Komianos

This paper presents a virtual environment in which users interact with the environment and its elements through speech recognition. Applications that use this technology allow users to navigate easily. Additionally, speech recognition for user input has been shown to be an efficient approach for disabled users (Darabkh, 2018). The presented approach allows users to navigate in virtual reality environments by using speech commands. The application is used in use case tests and evaluation experiments in order to assess its effectiveness. The rest of this extended abstract is structured as follows. Section II describes the objectives of the presented research work. Section III presents the methods employed while conducting this research work. Section IV discusses the conclusions of the present work and outlines directions for its extension and future work.

The core objectives of this work are: (i) to develop methods for interaction and navigation in virtual reality based on speech recognition; (ii) to diagnose factors that negatively affect user experience; (iii) to suggest approaches to virtual reality navigation; and (iv) to explore the effectiveness of interaction and navigation in virtual reality based on speech recognition.

The method employed for the present work includes: (i) a review of the related literature; (ii) research into the available technologies and approaches; (iii) the design and development of the application; and (iv) the testing and evaluation of the developed application.
The literature review provides valuable information on speech interaction design guidelines (Murad, 2018, 2019) which can be applied in the considered application. The research into the available technologies shows that speech recognition functionality in browsers (Adorf, 2013) is an effective solution for experimentation. Nevertheless, the fact that it is not yet adopted by the majority of web browsers, as well as its dependency on server-provided services, means that it is not yet suitable for wide usage.
A web application integrating speech recognition functionality is developed to capture and process the user's voice commands in order to enable speech-controlled interaction in virtual reality environments. The application uses the Web Speech API, which is accessible from the JavaScript programming language and for which there are many resources with easy-to-follow instructions. The virtual reality environment is implemented with the A-Frame framework. The set of available voice commands covers the functionality that users require in order to interact in the virtual reality environment.
The application is used to help identify use cases which will provide further guidelines for future advancements. In addition, the designed tests aim to provide information regarding interaction issues that may affect user experience.
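As an illustration of how recognized speech could drive navigation, the following is a minimal TypeScript sketch and not the application's actual code. The command phrases, step size, turn angle, and all type and function names (Action, Pose, parseCommand, applyAction, startListening) are assumptions made for this example. It relies only on the standard SpeechRecognition interface of the Web Speech API (exposed with a webkit prefix in some browsers) and keeps the command logic separate from the browser so it can be tested on its own.

```typescript
// Illustrative sketch only: the command phrases, step size, and turn angle
// below are assumptions, not the application's actual command set.

type Action =
  | { kind: "move"; step: number }     // positive = forward, negative = back
  | { kind: "turn"; degrees: number }  // positive = left (counterclockwise)
  | { kind: "none" };

// Position on the ground plane plus heading (yaw) in degrees.
interface Pose { x: number; z: number; yaw: number }

// Hypothetical phrase-to-action table.
const COMMANDS: Array<{ phrase: string; action: Action }> = [
  { phrase: "go forward", action: { kind: "move", step: 1 } },
  { phrase: "go back", action: { kind: "move", step: -1 } },
  { phrase: "turn left", action: { kind: "turn", degrees: 30 } },
  { phrase: "turn right", action: { kind: "turn", degrees: -30 } },
];

// Map a recognized transcript to an action; unknown speech yields "none".
function parseCommand(transcript: string): Action {
  const text = transcript.trim().toLowerCase();
  for (const entry of COMMANDS) {
    if (text.indexOf(entry.phrase) !== -1) return entry.action;
  }
  return { kind: "none" };
}

// Return the new pose after an action. With yaw 0 the user faces -Z,
// the default viewing direction in A-Frame / three.js.
function applyAction(pose: Pose, action: Action): Pose {
  if (action.kind === "turn") {
    return { x: pose.x, z: pose.z, yaw: pose.yaw + action.degrees };
  }
  if (action.kind === "move") {
    const rad = (pose.yaw * Math.PI) / 180;
    return {
      x: pose.x - Math.sin(rad) * action.step,
      z: pose.z - Math.cos(rad) * action.step,
      yaw: pose.yaw,
    };
  }
  return pose;
}

// Browser wiring: returns false when the Web Speech API is unavailable.
function startListening(onAction: (action: Action) => void): boolean {
  const g = globalThis as any;
  const Recognition = g.SpeechRecognition || g.webkitSpeechRecognition;
  if (!Recognition) return false;
  const recognition = new Recognition();
  recognition.continuous = true; // keep listening after each result
  recognition.lang = "en-US";    // assumed language for this sketch
  recognition.onresult = (event: any) => {
    const latest = event.results[event.results.length - 1];
    onAction(parseCommand(latest[0].transcript));
  };
  recognition.start();
  return true;
}
```

In an A-Frame scene, the pose returned by applyAction could then be copied onto the position and rotation of a camera rig entity's object3D. Which entity is moved and how the scene is structured are likewise assumptions here.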


This work presents a virtual reality web application integrating speech recognition functionality in order to provide speech interaction with a virtual laboratory environment. The application is used to assess the effectiveness of speech interaction for virtual laboratory experimentation and educational practices. A set of tests is designed in order to shed light on these issues. Future advancements regarding the considered application include the addition of more elements to ensure a robust and fulfilling user experience. Additionally, ways to create a more inclusive and user-friendly environment will be explored, with the aim of making interaction easier.

Darabkh, K. A., Haddad, L., Sweidan, S. Z., Hawa, M., Saifan, R., & Alnabelsi, S. H. (2018). An efficient speech recognition system for arm-disabled students based on isolated words. Computer Applications in Engineering Education, 26(2), 285-301.
Adorf, J. (2013). Web speech API. KTH Royal Institute of Technology.
Murad, C., Munteanu, C., Cowan, B. R., & Clark, L. (2019). Revolution or Evolution? Speech Interaction and HCI Design Guidelines. IEEE Pervasive Computing, 18(2), 33-45.


The Special Session
“Reflections: Bridges between Technology and Culture, Physical and Virtual”
is supported by:
