Here’s a video of a fluid simulation I made:
This post is about fluid simulation using Smoothed Particle Hydrodynamics (SPH), implemented in OpenCL 1.2.
If you don’t want to read any of this and would rather get right to the code, here it is in “SPH_v1”.
This post is not intended to be a tutorial, but a demonstration of my implementation; I will, however, include links to the sources I used, which will hopefully prove helpful to someone else.
SPH:
Wikipedia defines SPH as “a computational method used for simulating the dynamics of continuum media”, which is a fancy way of saying that it is an algorithm that can be used to model anything that flows or has fluid-like behavior (and probably some other stuff too, but that description covers most of it).
The method was introduced in 1977 by two separate papers, one by Gingold and Monaghan and one by Lucy.
The paper you’ll need to read in order to understand its application to fluid simulation in video games/interactive media is “Particle-Based Fluid Simulation for Interactive Applications” by M. Müller, D. Charypar, and M. Gross, which can be found here.
For some background on fluid simulation in general: there are two broad approaches through which a fluid medium is described and hence simulated.
- Eulerian approach: This treats the fluid as a grid, with the resolution of the grid determining how many points in the field are sampled and thus the quality of the resulting simulation. This is simple to implement, and algorithms like the Shallow Water Equations make use of it to great effect while running cheap. The limitations, however, are in imposing boundary conditions for grid-based solutions and in the small timestep required for the simulation not to “explode”.
- Lagrangian approach: This treats the fluid as a collection of discrete particles, where each particle has its own mass, position, and velocity. The solver performs an all-pairs interaction force calculation, modeling two forces and using the principle of superposition (read: adding them together) to arrive at the final force acting on each particle. These forces are the force due to pressure and the force due to viscosity. Surface tension and external forces like gravity can also be included in this force calculation to allow interaction with the fluid system (see the kernel sketch after this list).
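To make that concrete, here is a minimal sketch of what the two all-pairs passes can look like as OpenCL kernels. This is illustrative rather than lifted from SPH_v1: the names, the 2D setup, and the compile-time constants (H, MASS, GAS_K, REST_DENSITY, VISCOSITY, GRAVITY — assumed to be passed as -D build options) are my own, and the normalization constants are the commonly used 2D forms of the poly6/spiky/viscosity smoothing kernels from the Müller paper.

```c
// Pass 1: density and pressure at every particle (brute-force all-pairs).
__kernel void compute_density_pressure(__global const float2 *pos,
                                       __global float *density,
                                       __global float *pressure,
                                       const int n)
{
    int i = get_global_id(0);
    if (i >= n) return;

    const float POLY6 = 4.0f / (M_PI_F * pown(H, 8)); // 2D poly6 normalization
    float rho = 0.0f;
    for (int j = 0; j < n; ++j) {
        float2 rij = pos[i] - pos[j];
        float r2 = dot(rij, rij);
        if (r2 < H * H)
            rho += MASS * POLY6 * pown(H * H - r2, 3);
    }
    density[i]  = rho;
    pressure[i] = GAS_K * (density[i] - REST_DENSITY); // equation of state
}

// Pass 2: pressure + viscosity forces, summed by superposition.
__kernel void compute_forces(__global const float2 *pos,
                             __global const float2 *vel,
                             __global const float *density,
                             __global const float *pressure,
                             __global float2 *force,
                             const int n)
{
    int i = get_global_id(0);
    if (i >= n) return;

    // Treat these constants as illustrative; see the paper for the exact kernels.
    const float SPIKY_GRAD = 30.0f / (M_PI_F * pown(H, 5)); // spiky gradient magnitude (2D)
    const float VISC_LAP   = 40.0f / (M_PI_F * pown(H, 5)); // viscosity Laplacian (2D)

    float2 f = (float2)(0.0f, 0.0f);
    for (int j = 0; j < n; ++j) {
        if (j == i) continue;
        float2 rij = pos[i] - pos[j];
        float r = length(rij);
        if (r > 0.0f && r < H) {
            // Pressure force pushes particles apart along rij (symmetrized form).
            f += (rij / r) * MASS * (pressure[i] + pressure[j])
                 / (2.0f * density[j]) * SPIKY_GRAD * pown(H - r, 2);
            // Viscosity force drags particle i toward its neighbors' velocities.
            f += VISCOSITY * MASS * (vel[j] - vel[i]) / density[j]
                 * VISC_LAP * (H - r);
        }
    }
    // Gravity as a body force, scaled by density so that a = f / rho includes g.
    force[i] = f + (float2)(0.0f, GRAVITY) * density[i];
}
```

A third pass (or a host-side loop) then integrates: v += dt * force / density, pos += dt * v, plus whatever boundary handling you choose.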
The Müller paper also describes surface tension, but this implementation does not include it. Understanding the SPH formulation requires a few prerequisites, like smoothing kernels and the Navier-Stokes equations, but if you want a source that skips all that and directly describes the code to implement it, here’s another link that I found extremely helpful.
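For reference, the core of those prerequisites (as I understand it from the Müller paper) fits in three equations: any field quantity $A$ is interpolated from neighboring particles using a smoothing kernel $W$ of radius $h$, pressure comes from a simple equation of state, and the momentum equation supplies the pressure and viscosity force terms:

$$A(\mathbf{r}) = \sum_j m_j \frac{A_j}{\rho_j} W(\mathbf{r} - \mathbf{r}_j, h)$$

$$p = k(\rho - \rho_0)$$

$$\rho \frac{d\mathbf{v}}{dt} = -\nabla p + \mu \nabla^2 \mathbf{v} + \mathbf{f}^{\text{ext}}$$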
OpenCL:
OpenCL is a compute framework with a C-based language that runs on the GPU (among other devices) in order to implement functionality that benefits from being executed in parallel. The functions that run on the device are called “kernels”; a kernel is executed in parallel by many work-items, which are organized into work-groups that map onto the GPU’s compute units. The hardware specifics are quite complicated, but suffice to say that it’s kind of like a shader that isn’t meant to render anything, but rather to perform computations that involve a lot of math, which GPUs excel at.
Incidentally, fluid simulations require a lot of math, much of it independent per particle, making them a prime candidate for parallel execution.
I’ve chosen OpenCL over the alternative (NVIDIA’s proprietary language CUDA) because I wanted a portable solution that wouldn’t be locked to any single vendor. However, that decision also dictated which version of OpenCL to use (v1.2), as that is the latest version NVIDIA supports across all their hardware (for reference, OpenCL is at v2.2 right now).
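To give a flavor of what the host side looks like, here is a minimal sketch using the OpenCL 1.2 C++ bindings (cl.hpp, which the tutorials below cover). Everything here is illustrative rather than taken from SPH_v1: the file name sph.cl, the kernel name, and the constant values in the build options are placeholders.

```cpp
#define CL_USE_DEPRECATED_OPENCL_1_2_APIS
#include <CL/cl.hpp>
#include <fstream>
#include <sstream>
#include <iostream>
#include <vector>

int main() {
    // Grab the first platform and the first GPU device on it.
    std::vector<cl::Platform> platforms;
    cl::Platform::get(&platforms);
    std::vector<cl::Device> devices;
    platforms[0].getDevices(CL_DEVICE_TYPE_GPU, &devices);

    cl::Context context(devices);
    cl::CommandQueue queue(context, devices[0]);

    // Load the kernel source and build it, injecting simulation constants
    // as -D defines (values here are placeholders, not tuned).
    std::ifstream file("sph.cl");
    std::stringstream ss;
    ss << file.rdbuf();
    cl::Program program(context, ss.str());
    if (program.build(devices,
            "-DH=16.0f -DMASS=65.0f -DGAS_K=2000.0f -DREST_DENSITY=1000.0f "
            "-DVISCOSITY=250.0f -DGRAVITY=-10.0f") != CL_SUCCESS) {
        std::cerr << program.getBuildInfo<CL_PROGRAM_BUILD_LOG>(devices[0]) << "\n";
        return 1;
    }

    // Particle state lives in GPU buffers; one float2 per particle here.
    const int n = 16384;
    cl::Buffer pos(context, CL_MEM_READ_WRITE, n * sizeof(cl_float2));
    cl::Buffer density(context, CL_MEM_READ_WRITE, n * sizeof(cl_float));
    cl::Buffer pressure(context, CL_MEM_READ_WRITE, n * sizeof(cl_float));

    // One work-item per particle: set args, launch over an n-wide NDRange.
    cl::Kernel k(program, "compute_density_pressure");
    k.setArg(0, pos);
    k.setArg(1, density);
    k.setArg(2, pressure);
    k.setArg(3, n);
    queue.enqueueNDRangeKernel(k, cl::NullRange, cl::NDRange(n), cl::NullRange);
    queue.finish();
}
```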
The sources I used in order to learn OpenCL are:
- https://simpleopencl.blogspot.com/2013/06/tutorial-simple-start-with-opencl-and-c.html
- http://enja.org/2010/07/20/adventures-in-opencl-part-1-5-cpp-bindings/
- https://github.com/enjalot/EnjaParticles
- https://www.khronos.org/files/opencl-1-2-quick-reference-card.pdf
It can be a bit of a headache to get OpenCL working, but the result is worth it: the maximum number of particles I could achieve with all calculations on the CPU (at 30 FPS or above) was around 1K, but once I switched all computations to the GPU I was able to max out at around 16K particles while maintaining an appreciable framerate. (My GitHub page says 4K, but that was on an older PC; right now I am running on an i7 with 16GB of RAM and a GTX 970 with 3.5GB of VRAM.)
Areas for improvement:
- My implementation still uses a brute-force all-pairs interaction force calculation, which is O(n²) in the particle count and an obvious place to optimize with spatial partitioning of some sort (see the sketch below).
- I was looking into extending this to 3D and implementing a grid-hashing solution.
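For what it’s worth, the core of the grid-hashing idea is small. A sketch (illustrative 2D OpenCL C, not something implemented in SPH_v1): bin each particle into a cell of side H, hash the cell coordinates into a table, sort particles by hash, and then each particle only tests neighbors in its own cell and the eight surrounding ones, turning the O(n²) loop into something close to O(n).

```c
// Map a particle position to integer cell coordinates (cell side = H).
inline int2 cell_of(float2 p)
{
    return convert_int2(floor(p / H));
}

// Hash the cell into a fixed-size table. Multiplying the coordinates by
// large primes before XOR-ing them is a common spatial-hashing trick.
inline uint hash_cell(int2 c, uint table_size)
{
    return (((uint)c.x * 73856093u) ^ ((uint)c.y * 19349663u)) % table_size;
}
```

After sorting particles by hash (e.g. with a GPU radix sort) and recording where each cell’s particles start, the force kernels replace their 0..n loops with a loop over the nine neighboring cells.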