All the calculations could be done beforehand and stored, so the only thing left in the delayed draw is setting the buffer.
I haven’t looked at the code yet, so I’m not sure how much it will save, if anything.
You could also group pixels that are far away from each other into a single call. It’s a compromise, but I think it would maintain the effect.
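Rough sketch of what I mean, not based on the actual code. Every name here (`precompute_frames`, `delayed_draw`, `set_buffer`, `num_batches`) is made up for illustration, and the "effect" is just a placeholder reveal where pixels flip from 0 to 1. Batches are built by striding through the row-major index, which is only a crude stand-in for "far away from each other".

```python
def precompute_frames(width, height, num_batches):
    """Build every intermediate buffer up front (hypothetical reveal effect)."""
    total = width * height
    buffer = bytearray(total)  # 0 = not drawn yet, 1 = drawn
    frames = []
    for k in range(num_batches):
        # Batch k holds pixels k, k + num_batches, k + 2 * num_batches, ...
        # so pixels sharing a call are num_batches apart in row-major order.
        for index in range(k, total, num_batches):
            buffer[index] = 1
        frames.append(bytes(buffer))  # immutable snapshot of this step
    return frames


def delayed_draw(frames, set_buffer):
    """The deferred part: no math left, just hand over each stored buffer."""
    for frame in frames:
        set_buffer(frame)


# Usage: a list's append method stands in for the real "set the buffer" call.
frames = precompute_frames(width=4, height=2, num_batches=4)
shown = []
delayed_draw(frames, shown.append)
```

The tradeoff is memory: this keeps one full snapshot per step, so whether it is a net win depends on how expensive the per-pixel math really is.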
Unless OP used Grok to make this, how is it “their” tool if you use an open source model?