This game takes a multimodal approach: it combines standard first-person movement with trained machine learning models. Players can activate their webcam so the program can detect the objects they present to it. They can also replay Saria's Song from The Legend of Zelda: Ocarina of Time by saying the respective notes aloud. Once players have solved both tasks, they can escape the room. The game uses the Babylon.js engine and the p5.js and ml5.js libraries, and is a proof of concept for my master's degree.
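
The note task can be thought of as matching a stream of recognized labels against a target sequence. The following is only a minimal sketch of that idea, not the game's actual code: the note sequence `F A B F A B`, the label names, and the helper `createSequenceTracker` are assumptions made for illustration, and the real labels depend on how the sound model was trained.

```javascript
// Assumed note sequence for Saria's Song; the labels are hypothetical and
// depend on the classes the sound model was actually trained on.
const SARIAS_SONG = ["F", "A", "B", "F", "A", "B"];

// Hypothetical helper: tracks how far the player has progressed through a
// target sequence of labels.
function createSequenceTracker(target) {
  let position = 0;
  return {
    // Feed one recognized label. Returns true once the whole sequence has
    // been matched, then starts over.
    push(label) {
      if (label === target[position]) {
        position += 1;
      } else {
        // Simple reset on a wrong note (not full pattern matching); the
        // wrong note may itself be the first note of a new attempt.
        position = label === target[0] ? 1 : 0;
      }
      if (position === target.length) {
        position = 0;
        return true;
      }
      return false;
    },
    // Number of notes matched so far in the current attempt.
    progress() {
      return position;
    },
  };
}

// Usage: a wrong note ("C") resets progress, then the full song is matched.
const tracker = createSequenceTracker(SARIAS_SONG);
const heard = ["F", "A", "C", "F", "A", "B", "F", "A", "B"];
const solved = heard.map((note) => tracker.push(note)).includes(true);
```

In a setup like this, each label reported by the sound model would be passed to `tracker.push`, and the room's exit would unlock once it returns `true` and the webcam task has also been completed.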