I just pushed a new "Lite" version of my spherical harmonics shader library for HLSL: https://github.com/TheRealMJP/SHforHLSL
The Lite version does not use templates, and is compatible with FXC or other compilers that don't support templates and operator overloads.
Comments
IMO, if you were projecting a cubemap on the GPU, you would probably want to do it in parallel across many threads. Likely 1 thread per texel, followed by N parallel sum/reduction passes. Or possibly use fp atomics if available.
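To make the "1 thread per texel, then N reduction passes" idea concrete, here is a rough CPU-side sketch in Python rather than HLSL. It is not the library's API: every function name is hypothetical, the face layout is a simplified assumption (not the D3D cubemap convention), and the L1 basis ordering and signs may differ from what SHforHLSL uses. Each call to `texel_contribution` stands in for one GPU thread, and each call to `reduction_pass` stands in for one parallel sum pass.

```python
import math

# Real SH basis constants for bands 0 and 1 (sign convention is an assumption).
Y0 = 0.282095  # 1 / (2 * sqrt(pi))
Y1 = 0.488603  # sqrt(3) / (2 * sqrt(pi))


def texel_direction(face, u, v):
    """Unnormalized direction for face-local coords u, v in [-1, 1].

    Simplified face layout (an assumption, not the D3D cubemap convention).
    """
    return (
        (1.0, v, u), (-1.0, v, -u),
        (u, 1.0, v), (u, -1.0, -v),
        (u, v, 1.0), (-u, v, -1.0),
    )[face]


def texel_contribution(face, x, y, n, radiance_fn):
    """What one 'thread' would compute: a single texel's weighted L1 projection."""
    u = (x + 0.5) * 2.0 / n - 1.0
    v = (y + 0.5) * 2.0 / n - 1.0
    dx, dy, dz = texel_direction(face, u, v)
    len_sq = 1.0 + u * u + v * v
    inv_len = 1.0 / math.sqrt(len_sq)
    d = (dx * inv_len, dy * inv_len, dz * inv_len)
    # Midpoint approximation of the solid angle covered by the texel.
    solid_angle = (2.0 / n) ** 2 / (len_sq * math.sqrt(len_sq))
    scale = radiance_fn(d) * solid_angle
    return (scale * Y0, scale * Y1 * d[1], scale * Y1 * d[2], scale * Y1 * d[0])


def reduction_pass(items):
    """One parallel-style pass: output i is the sum of inputs 2i and 2i + 1."""
    out = []
    for i in range(0, len(items), 2):
        a = items[i]
        b = items[i + 1] if i + 1 < len(items) else (0.0, 0.0, 0.0, 0.0)
        out.append(tuple(p + q for p, q in zip(a, b)))
    return out


def project_cubemap(n, radiance_fn):
    """Project an n x n x 6 cubemap (scalar radiance) onto 4 L1 coefficients."""
    partials = [texel_contribution(f, x, y, n, radiance_fn)
                for f in range(6) for y in range(n) for x in range(n)]
    passes = 0
    while len(partials) > 1:
        partials = reduction_pass(partials)
        passes += 1
    return partials[0], passes
```

Calling `project_cubemap(16, lambda d: 1.0)` on a constant white environment should give an L0 coefficient close to sqrt(4π) and L1 coefficients near zero. On a real GPU each pass would typically reduce by much more than a factor of 2 (for example, a whole thread group at a time through groupshared memory), so far fewer passes would be needed than in this pairwise version.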
Thank you for this fantastic library and all your hard work!