For my specialization course at The Game Assembly, I decided to focus on sound propagation in a 3D environment. The objective was to break down sound propagation into manageable components and variables, allowing designers to simulate how audio waves move through a simple 3D game environment.
Length of Path
For this I used Firelight Technologies' audio middleware FMOD and our custom-made engine VOLT. However, the work could be adapted to other audio middleware or game engines.
But how does it work?
The Portal Point components, or simply the Portals, mark the coordinates in the game world that sound can travel through. Designers place these points directly in the world or attach them to prefabs. Each point receives a unique ID upon creation, making it easier for designers to access and debug them.
A Portal Edge connects two nodes. Each Portal Point has a list of edges, each of which holds the ID of the node it connects to as well as the distance between the two points.
The Graph is a collection of nodes that form a network for audio traversal. It includes functions for creating, changing, or deleting nodes and edges. Technically, it is an unweighted undirected graph. Although the edges hold a length, which could be considered a weight, this is not used when pathing in the graph structure.
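As a rough sketch, the three structures described above might look like this in C++. All names here (PortalPoint, PortalEdge, PortalGraph) are illustrative stand-ins, not VOLT's actual API:

```cpp
#include <unordered_map>
#include <vector>

struct Vec3 { float x, y, z; };

// An edge stores the connected node's ID and the distance between the
// two points. The distance is kept but not used when pathing.
struct PortalEdge
{
    int   targetId;
    float length;
};

// A portal point: unique ID, world position, and its list of edges.
struct PortalPoint
{
    int                     id;
    Vec3                    position;
    std::vector<PortalEdge> edges;
};

// The unweighted, undirected graph of portals.
class PortalGraph
{
public:
    int AddPortal(const Vec3& position)
    {
        int id = myNextId++;
        myPortals[id] = { id, position, {} };
        return id;
    }

    void Connect(int a, int b, float length)
    {
        // Undirected: add the edge in both directions.
        myPortals[a].edges.push_back({ b, length });
        myPortals[b].edges.push_back({ a, length });
    }

    const PortalPoint& Get(int id) const { return myPortals.at(id); }

private:
    std::unordered_map<int, PortalPoint> myPortals;
    int myNextId = 0;
};
```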
To determine if the sound can reach the player, a Breadth First Search (BFS) algorithm is used to find the shortest path from the audio source to the listener. The BFS provides a list of all portals that the audio must traverse to reach the player. By iterating through this list and adding all distance lengths together, we obtain the first variable to send to FMOD. This length variable is also used later when calculating the new virtual position of the audio source (more on that later).
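A minimal version of this BFS might look as follows, assuming a simple adjacency-list representation (the `Edge` and `Graph` types here are stand-ins, not the engine's real ones): find the shortest hop-count path, then sum the stored edge lengths along it.

```cpp
#include <algorithm>
#include <queue>
#include <unordered_map>
#include <vector>

struct Edge { int target; float length; };
using Graph = std::unordered_map<int, std::vector<Edge>>;

// Returns the path from `source` to `listener` (inclusive),
// or an empty vector if the sound cannot reach the player.
std::vector<int> FindPathBFS(const Graph& graph, int source, int listener)
{
    std::unordered_map<int, int> parent;
    std::queue<int> frontier;
    frontier.push(source);
    parent[source] = source;

    while (!frontier.empty())
    {
        int node = frontier.front();
        frontier.pop();
        if (node == listener)
            break;
        auto it = graph.find(node);
        if (it == graph.end())
            continue;
        for (const Edge& e : it->second)
        {
            if (parent.count(e.target) == 0)
            {
                parent[e.target] = node;
                frontier.push(e.target);
            }
        }
    }

    if (parent.count(listener) == 0)
        return {}; // no path found

    // Walk the parent chain back from the listener to the source.
    std::vector<int> path;
    for (int node = listener; node != source; node = parent[node])
        path.push_back(node);
    path.push_back(source);
    std::reverse(path.begin(), path.end());
    return path;
}

// Sum the stored edge lengths along the path; this is the
// length value that gets sent to the audio middleware.
float PathLength(const Graph& graph, const std::vector<int>& path)
{
    float total = 0.0f;
    for (size_t i = 0; i + 1 < path.size(); ++i)
        for (const Edge& e : graph.at(path[i]))
            if (e.target == path[i + 1]) { total += e.length; break; }
    return total;
}
```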
So why use the BFS algorithm?
Since this portal system is intended for use indoors and within a limited space, the audio source is usually relatively close to the listener. This increases the likelihood of the algorithm finding a path. The Breadth First Search (BFS) algorithm visits each node in a level-by-level order, starting from a given node and visiting all nodes that are at the next level before moving on to subsequent levels. In contrast, Depth First Search (DFS) visits each node in a depth-wise manner, prioritizing moving to a subsequent level if one exists, and only turning back once a leaf node is reached. As a result, BFS is more suitable for finding the shortest path between two portals, while DFS is ideal for traversing in a sequential way.
Why not Dijkstra's algorithm?
On a basic level, BFS and Dijkstra's algorithm are the same thing, with the main difference being that BFS works on unweighted graphs while Dijkstra's algorithm works on weighted graphs. Running Dijkstra's algorithm on a graph where all weights are equal would find the same paths as BFS. Although my nodes do have a weight that isn't used in pathfinding, the algorithm could easily be changed to Dijkstra's algorithm to take this into consideration.
In terms of time complexity, Dijkstra's algorithm is more expensive than BFS, mainly because it needs to evaluate the edge weights. With a binary-heap priority queue, Dijkstra's algorithm has a worst-case time complexity of O(E log V), while BFS runs in O(V + E).
However, it's worth noting that the time complexity of an algorithm doesn't always translate to better performance in practice. The efficiency of an algorithm also depends on other factors, such as the size and complexity of the graph being traversed, the specific implementation of the algorithm, and the hardware used to execute it.
In this specific case of traversing a relatively small indoor environment, where the nodes are not too far apart, BFS is likely to perform well and give us the desired result without the overhead of computing edge weights. However, if we were dealing with a larger and more complex environment, or if we needed to take into account edge weights or other factors, Dijkstra's algorithm might be more appropriate.
Breadth First Search Node Graph Pathing
What is Diffraction?
Diffraction is the perturbation of an incident waveform by a boundary edge, which causes a secondary wave front to emanate spherically outwards from the edge of diffraction. Put simply, it is sound waves bending around an object.
The Shadow Zone
The space affected by diffraction is divided into three distinct zones:
the Reflection zone, the View zone, and the Shadow zone.
In the shadow zone, we can clearly observe how the edge mimics a point-source sound emitter, with the waveform dispersing outwards. This is the primary zone of interest when simulating diffraction. Referring back to the definition of diffraction, we can think of it as lower-frequency waves being able to bend further around corners than higher-frequency waves.
Image from Game Audio Programming 2 by Guy Somberg
Diffraction in games.
Image from Game Audio Programming 2 by Guy Somberg
How do we use this in games?
In order to incorporate diffraction into games, we need to deviate slightly from the theory. The figure above depicts an emitter placed behind an object that blocks the sound from reaching the listener. In this case, the diffraction angle refers to the angle into the shadow zone of the emitter, and this is the variable that we can send to our audio middleware. This provides sound designers with creative freedom and expression when it comes to diffraction. However, in reality, this is a simplified abstraction since we are disregarding many theoretical factors. Nonetheless, it is still plausible, spatially continuous, and relatively easy to compute for simple shapes.
To find the diffraction angle, we need to determine the positions of the emitter (but in this case: a portal) and the listener, as well as identify the diffraction edge.
To build the diffraction edge, we cast rays back and forth between the listener and the first node in the node graph that is obstructed from the listener. From the intersecting rays and the collider, we extract the normals. With this information, we can determine which edge of the collider lies closest to the nearest portal, giving us all the data we need to construct the diffraction edge.
Diffraction Point & Angle
To complete the calculation of the diffraction angle, we must determine the diffraction point along the edge. To achieve this, we find the closest point on the edge to the ray between the emitter and the listener. From this point, we calculate the angle by computing the arc cosine of the dot product between the incident ray direction (diffraction point - portal position) and the diffracted ray direction (listener position - diffraction point).
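The angle calculation itself is only a few lines. A sketch, with assumed minimal vector helpers (the `Vec3` type and function names are mine, not the engine's):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3  Sub(const Vec3& a, const Vec3& b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static float Dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3  Normalized(const Vec3& v)
{
    float len = std::sqrt(Dot(v, v));
    return { v.x / len, v.y / len, v.z / len };
}

// Angle (in radians) between the incident ray (portal -> diffraction point)
// and the diffracted ray (diffraction point -> listener). Zero means the
// listener sits on the shadow-zone boundary.
float DiffractionAngle(const Vec3& portalPos, const Vec3& diffractionPoint, const Vec3& listenerPos)
{
    Vec3 incident   = Normalized(Sub(diffractionPoint, portalPos));
    Vec3 diffracted = Normalized(Sub(listenerPos, diffractionPoint));

    // Clamp to guard against floating-point drift outside acos' domain.
    float cosAngle = Dot(incident, diffracted);
    if (cosAngle >  1.0f) cosAngle =  1.0f;
    if (cosAngle < -1.0f) cosAngle = -1.0f;
    return std::acos(cosAngle);
}
```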
Isn't math fun!
To emulate that the audio is coming from around the corner, we need to calculate a new "virtual" position of the audio emitter. To do this, we simply use the diffraction point we just calculated and the length variable we calculated in the previous section (Portals). The new virtual position becomes the direction from the listener to the diffraction point with the length variable added. This allows the audio middleware to calculate the attenuation correctly and gives the impression that the sound is coming from a position behind the diffraction edge.
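Under the same assumptions (hypothetical `Vec3` helpers, not the engine's actual types), the virtual position calculation is the listener-to-diffraction-point direction scaled by the total path length from the Portals section:

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3  Sub(const Vec3& a, const Vec3& b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static float Dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3  Normalized(const Vec3& v)
{
    float len = std::sqrt(Dot(v, v));
    return { v.x / len, v.y / len, v.z / len };
}

// Place the emitter along the listener -> diffraction point direction,
// at the full traversal distance, so attenuation matches the path the
// sound actually "travelled" around the corner.
Vec3 VirtualEmitterPosition(const Vec3& listenerPos, const Vec3& diffractionPoint, float pathLength)
{
    Vec3 dir = Normalized(Sub(diffractionPoint, listenerPos));
    return { listenerPos.x + dir.x * pathLength,
             listenerPos.y + dir.y * pathLength,
             listenerPos.z + dir.z * pathLength };
}
```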
The Result from the Diffraction Calculations.
Obstruction & Occlusion
Occlusion is a phenomenon in which both the direct and the reflected sound waves are blocked by some form of geometry in the scene. This can occur when sound waves are unable to penetrate a wall or other solid object, leaving the sound heavily muffled, or even silent, on the other side.
On the other hand, obstruction refers to the situation where only the direct sound is blocked, while the reflected sound is still able to propagate around obstacles. This can result in a quieter, but still audible sound on the other side of an obstacle.
In reality, there are several components that affect how sound is obstructed. However, for this work, we will abstract it down to a percentage value, providing sound designers with more creative freedom. To calculate this value, we simply need points around the listener and audio emitter. We then cast a ray to each of these points and calculate the percentage of obstructed rays to the total sum of rays.
For this calculation, both the listener and the audio emitter have a sound width variable that controls how far the points extend from their original location. Each new point is calculated using right-angled triangle trigonometry.
The Formula for Calculating a Right-Angled Triangle
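The abstracted obstruction value described above can be sketched like this. The actual ray cast is engine-specific, so it is passed in as a callback here; all names are assumptions, not the engine's real API:

```cpp
#include <functional>
#include <vector>

struct Vec3 { float x, y, z; };

// Engine-specific ray cast, injected as a callback:
// returns true if geometry blocks the ray from `from` to `to`.
using RaycastFn = std::function<bool(const Vec3& from, const Vec3& to)>;

// Casts a ray from every listener point to every emitter point and
// returns the fraction of blocked rays, in [0, 1].
float ObstructionPercentage(const std::vector<Vec3>& listenerPoints,
                            const std::vector<Vec3>& emitterPoints,
                            const RaycastFn& isBlocked)
{
    int blocked = 0;
    int total   = 0;
    for (const Vec3& from : listenerPoints)
        for (const Vec3& to : emitterPoints)
        {
            ++total;
            if (isBlocked(from, to))
                ++blocked;
        }
    return total > 0 ? static_cast<float>(blocked) / total : 0.0f;
}
```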
Lastly, we check the material tag and send it to the middleware. Different materials absorb and reflect different frequencies of sound, so we allow the designers to modify the obstructed/occluded audio in the middleware according to their preference.
Counting the Rays For Abstracted Obstruction
At the journey's end
So, was it a success? In my honest opinion, it requires a lot more to achieve convincing audio propagation. But that is one of the things I love about audio programming; the challenges are unique in their own way. Challenges that need their own unique solutions. Solutions I want to discover!
If I were to continue the development of this system, there are some problems that I want to try to solve:
Finding All Paths
The current system only accounts for whether the audio can reach us via the single shortest path. But this is not how audio works in reality. Sound waves travel to us through all unobstructed paths that reach the listener. This means that we either need to change the algorithm to find all paths or switch to a different algorithm entirely (probably some form of depth-first search). However, solving the pathfinding is only half the problem. How would one deal with the new multiple virtual positions and circular pathways? I don't know, but it is an interesting problem to solve.
Room by Room
Another aspect of the system is the user experience of integrating it into the game world. Currently, someone has to place all the portals one by one, ensure there is a line of sight between them that forms a path, tag all the environment with the appropriate material, and make sure the colliders are correct for diffraction. This is an unnecessarily tedious task. If I were to continue developing this system, I would simplify it so that the designer only needs two components. The first, a "Room Component," is a collider/marker for a single room that holds variables for the materials of the roof, floor, and walls. The second, a "Portal Component," serves as a marker for portals, such as doors, windows, or other openings in the walls, and would intersect two rooms.
In its current state, there is only one network that contains all the portals in a level. To improve performance, I plan to break this up into smaller graph units. Each building in the game would have its own portal graph. These smaller graphs could create and delete themselves at runtime, freeing up memory when the player isn't nearby.
I mentioned earlier that game audio has its own unique set of problems that need to be solved, and these problems are often underappreciated by the majority of players. While game audio recognition has improved over the years, it still remains one of the unsung heroes of game development. Nevertheless, I want to contribute to this field and create new game audio experiences, even if they are muted by a significant number of players :(
Hope you've learned something and if you have any questions feel free to contact me!
Thank you for reading!
Get in Touch
I am part of The Game Assembly’s internship program. As per the agreement between the Games Industry and The Game Assembly, neither student nor company may be in contact with one another regarding internships before April 19th. Any internship offers can be made on April 26th, at the earliest.