OpenGL ES: while loop causing a Win32 error

The following code throws an error I can't seem to figure out:

varying vec2 v_vTexcoord;
varying vec4 v_vColour;

// only relevant uniforms shown
uniform float angle;
uniform vec2 reso;

void main()
{
	vec2 wallPos = v_vTexcoord;
	vec2 dirVec = vec2(cos(angle)/reso.x, sin(angle)/reso.y);
	while (floor(wallPos.x*reso.x+0.5)/reso.x > 0.0 &&
	       floor(wallPos.x*reso.x+0.5)/reso.x < 1.0 &&
	       floor(wallPos.y*reso.y+0.5)/reso.y > 0.0 &&
	       floor(wallPos.y*reso.y+0.5)/reso.y < 1.0)
	{
		wallPos = vec2(wallPos.x-dirVec.x, wallPos.y-dirVec.y);
	}
	// Rest of my code after this //
}

The error reads:

Win32 function failed: HRESULT: 0x887a0005

Call GR_D3D_Device-> CreateBlendState at line 467 in file \StateManagerM.cpp

As further background, I'm sure the issue is localized to this while loop, since running the contained block just once throws no error (but gives wrong results, of course). Also, in a specific test, the following input variables give the error:

angle: 2.1817
reso: 1932,1068
v_vTexcoord: in the range 0.0-1.0, of course
// dirVec after the first two lines of main(), using the above variables
dirVec = 0.0002969, 0.0007670

I have little experience dealing with OpenGL/Win32 errors, so any advice would be appreciated.

My first thought was that the while loop is not terminating, but to my eye it very much looks like it should. Other than that, could the Win32 error be pointing to a memory issue caused by a large number of loop iterations? The maximum number of iterations that could occur is approximately 2200.

Edit: attached the entire fragment shader, including (what I believe to be) non-error-related code:

It's possible that such a high loop count is causing a time limit to be exceeded. For what it's worth, 0x887A0005 is DXGI_ERROR_DEVICE_REMOVED; on Windows a common cause is the TDR watchdog resetting the GPU when a draw call runs longer than about two seconds, which a long per-fragment loop can easily trigger.

In any case: why use a loop? Are you planning on having something more substantial in the body of the loop at some point? If not, it would be better to simply calculate the number of steps required rather than iterating.

That was my first thought; I'm pretty inexperienced with OpenGL. Is a potentially 2200-iteration loop outside of safe bounds?

In any case: why use a loop? Are you planning on having something more substantial in the body of the loop at some point? If not, it would be better to simply calculate the number of steps required rather than iterating.

Basically, I'm creating an automated light-shaft system for a 2D game, and this is my first attempt. First, another shader creates a texture buffer with data on where light should begin/end based on "angled scans"; this works fine. The shader above uses the while loop to calculate the wall position each pixel corresponds to, and hence which data in the texture buffer it needs to use to decide whether or not to draw light.

What would be the method of determining which data in the texture buffer to use, aside from a while loop? If I didn't explain it well I can draw up a crappy Paint diagram or something lol

I would consider that to be exceptionally high.

Well, this code:

appears to be finding the intersection of a ray with the unit square. In which case, you would get approximately the same result from

float tx = dirVec.x > 0.0 ? (1.0-wallPos.x)/dirVec.x : -wallPos.x/dirVec.x;
float ty = dirVec.y > 0.0 ? (1.0-wallPos.y)/dirVec.y : -wallPos.y/dirVec.y;
wallPos += min(tx, ty) * dirVec;

The final result will differ slightly due to the rounding to the nearest pixel (multiples of 1/reso), but that's easy enough to incorporate if necessary. There's no point in iterating individual steps unless you're actually doing something (e.g. a texture lookup) at each step; if you just need the point where the ray hits the boundary, you can calculate that by determining which wall it hits. The sign of dirVec.x/y tells you whether it will end at 0 or 1; then you can calculate the distance for each axis and use whichever one is smaller.
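The equivalence is easy to sanity-check on the CPU. A minimal Python sketch (my own names, ignoring the floor() rounding to pixel centres for clarity) comparing the step-by-step march with the closed-form jump:

```python
# Hedged sketch: compares the iterative march from the original shader with the
# closed-form ray/unit-square intersection suggested above. Pure Python stand-in
# for the GLSL; names are illustrative, not from the original code.
import math

def march(pos, d, max_steps=10000):
    """Step pos by d until it leaves the open unit square (the while loop)."""
    x, y = pos
    dx, dy = d
    for _ in range(max_steps):
        if not (0.0 < x < 1.0 and 0.0 < y < 1.0):
            break
        x += dx
        y += dy
    return x, y

def closed_form(pos, d):
    """Jump straight to the boundary: distance to the wall each axis will hit,
    take the smaller, advance once."""
    x, y = pos
    dx, dy = d
    tx = (1.0 - x)/dx if dx > 0.0 else -x/dx
    ty = (1.0 - y)/dy if dy > 0.0 else -y/dy
    t = min(tx, ty)
    return x + t*dx, y + t*dy

angle = 2.1817
reso = (1932.0, 1068.0)
# the original loop subtracts dirVec, so the effective march direction is -dirVec
d = (-math.cos(angle)/reso[0], -math.sin(angle)/reso[1])
start = (0.5, 0.5)

a = march(start, d)
b = closed_form(start, d)
print(a, b)  # the two endpoints agree to within one step length
```
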

My bad for taking so long to respond; I've been busy, so I didn't get a chance to keep working until now.

The code you gave works great, thanks for helping out!

I do have a separate problem with the other fragment shader, so I'll just add it here rather than make a new post:

The code below doesn't give the output I need. The base texture I'm using is 1920x1056, and every pixel is either (0,0,0,1), (0.5,0.5,0.5,1) or (1,1,1,1) RGBA. The idea is to take a specific starting pixel and progress at a certain angle, raycasting style. As it iterates through the scan (max 256 iterations, scaled with a specific variable, though most rays take far fewer), it checks the relevant pixel to see whether it is white, black or grey. If it's black, it can "start a light shaft" by adding the start data to a 1D texture; if it's white and a shaft has been started, it ends it in a similar fashion. That process can be repeated twice (limited, but it should be all I need). If the pixel to check leaves the texture (outside of [0,1]) it stops iterating.

varying vec2 v_vTexcoord;
varying vec4 v_vColour;

uniform int size;
uniform int stepNum;
uniform float angle;
uniform vec2 reso;
uniform vec3 cornerStart;

const float PI = 3.14159265359;

void main()
{
	float stepDist = sqrt(2.0)/float(stepNum);
	vec2 pixelToStartOn;
	if (cornerStart.z == 1.0)
	{
		pixelToStartOn = vec2(cornerStart.x, cornerStart.y+sign(sin(angle))*(2.0*v_vTexcoord.x));
		pixelToStartOn = vec2(pixelToStartOn.x+sign(cos(angle))*abs(pixelToStartOn.y-clamp(pixelToStartOn.y,0.0,1.0)), clamp(pixelToStartOn.y,0.0,1.0));
	}
	else
	{
		pixelToStartOn = vec2(cornerStart.x-sign(cos(angle))*(2.0*v_vTexcoord.x), cornerStart.y);
		pixelToStartOn.y += sign(sin(angle))*abs(pixelToStartOn.x-clamp(pixelToStartOn.x,0.0,1.0));
	}
	pixelToStartOn = vec2(clamp(pixelToStartOn.x,0.0,1.0), clamp(pixelToStartOn.y,0.0,1.0));
	vec2 pixelToCheck = pixelToStartOn;
	float tester = 1.0;
	float startPoint1 = tester;
	float endPoint1 = tester;
	float startPoint2 = tester;
	float endPoint2 = tester;
	for (int i = 0; i < stepNum; i++)
	{
		float red = texture2D(gm_BaseTexture, pixelToCheck).r;
		if (startPoint1 == tester)
		{
			if (red <= 0.25) startPoint1 = float(i)/float(stepNum);
		}
		else if (endPoint1 == tester)
		{
			if (red >= 0.75) endPoint1 = float(i)/float(stepNum);
		}
		else if (startPoint2 == tester)
		{
			if (red <= 0.25) startPoint2 = float(i)/float(stepNum);
		}
		else if (endPoint2 == tester)
		{
			if (red >= 0.75)
				endPoint2 = float(i)/float(stepNum);
		}
		pixelToCheck = vec2(pixelToCheck.x+cos(angle)*stepDist, pixelToCheck.y-sin(angle)*stepDist);
		if (pixelToCheck.x != clamp(pixelToCheck.x,0.0,1.0) || pixelToCheck.y != clamp(pixelToCheck.y,0.0,1.0))
			endPoint2 = float(i)/float(stepNum);
	}
	gl_FragColor = vec4(startPoint1, endPoint1, startPoint2, endPoint2);
}

The end result is the texture map having all the RGB values empty, while the A value moves progressively from 0 to about 0.2 from left to right across the 1D texture.

Like I said before, I'm pretty new to OpenGL and I'm kinda at my wits' end debugging this haha; any help would be awesome.

I don’t see anything obvious, but a couple of suggestions:

First, don't use == for comparing floats, particularly with OpenGL ES (which doesn't offer many guarantees when it comes to precision). I'd suggest using a negative value as the "unset" sentinel and testing with < 0.0.

Second: ensure that the texture filters are GL_NEAREST. GL_LINEAR may cause black/white pixels to read as grey, and mipmaps probably won’t work here (also, if you use a mipmap filter without defining all mipmap levels, texture lookups return zeros).

Also: replace [var]angle[/var] with a unit vector (i.e. vec2(cos(angle),sin(angle))) so that you aren’t performing many redundant sin/cos calls in the shader. That’s just an optimisation; it shouldn’t affect the overall behaviour.
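To make the first point concrete: float equality is fragile in any language once arithmetic is involved, which is why a sentinel compared with an inequality is safer. A quick illustration in Python (the sentinel pattern here is my own sketch, not your shader):

```python
# Classic float pitfall: values that are "obviously" equal aren't, bit-for-bit.
print(0.1 + 0.2 == 0.3)  # False

# Safer sentinel pattern: an impossible value tested with an inequality.
UNSET = -1.0  # results are always in [0,1], so any negative value works

def first_hit(values, threshold=0.25):
    """Return the first index fraction where a value crosses the threshold."""
    result = UNSET
    for i, v in enumerate(values):
        if result < 0.0 and v <= threshold:  # "< 0.0" instead of "== UNSET"
            result = i / len(values)
    return result

print(first_hit([0.5, 0.5, 0.0, 0.5]))  # 0.5
```
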

Implemented all those suggestions; the output is still the same :confused:

With debugging, this is a snapshot of how a run would proceed (according to my calculations) with v_vTexcoord = [0.5, 0]:

reso: [1920,1056]
stepNum: 256
angle: 5.23599
cornerStart: [0,1,1]

sin(angle) = -0.86602
cos(angle) =  0.50000

stepDist = sqrt(2)/stepNum = 0.00552
pixelToStartOn = [0,1+sign(sin(5.23599))*(2.0*0.5)] = [0,1+(-1)*(1)] = [0,0]
pixelToStartOn(second line) = [pixel.x+sign(0.5)*abs(pixel.y-clamp(pixel.y)),clamp(pixel.y)] = [0+1*abs(0-0),0] = [0,0]
pixelToCheck(0)   = [0      ,0      ]
pixelToCheck(1)   ~ [0.00276,0.00478]
pixelToCheck(128) ~ [0.35328,0.61184]
pixelToCheck(255) ~ [0.70656,1.22368]

with PixelToCheck(n), n indicates the nth iteration of the for loop
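A quick CPU re-run of that trace (a Python stand-in for the shader arithmetic, assuming the input values listed above) agrees with those figures, and also shows the ray exits the top of [0,1] well before the loop finishes:

```python
# Hedged check of the hand trace above: replay the pixelToCheck update rule
# from the shader and find where the ray leaves the unit square.
import math

angle, stepNum = 5.23599, 256
stepDist = math.sqrt(2.0) / stepNum

x, y = 0.0, 0.0          # pixelToStartOn from the trace
exit_step = None
for i in range(stepNum):
    # same update as the shader: x += cos(angle)*stepDist, y -= sin(angle)*stepDist
    x += math.cos(angle) * stepDist
    y -= math.sin(angle) * stepDist
    if exit_step is None and not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        exit_step = i

print(round(stepDist, 5))  # 0.00552, matching the trace
print(exit_step)           # 209: the ray leaves [0,1] well before step 255
```
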

Hmm, just an idea I'm having: the texture I'm "drawing onto" my 1D map is much larger, so in this case I'm drawing a 1920x1056 onto a 256x1. Is it possible that because of the difference, the use of v_vTexcoord is giving mismatched results? A more generalized question would be: when I perform a texture lookup, are the coordinates given relative to the destination texture, or are they absolute?

Side question: what's the policy here on bumping? I'm really stumped on this, but I also kinda need to get it done soon. I checked the rules post but didn't see anything. Is it the typical 48 hrs?

Texture sampling functions (texture(), texture2D() etc) take normalised coordinates: (0,0) is one corner, (1,1) is the opposite corner.

The size of the destination only matters insofar as it affects the number of discrete values which v_vTexcoord will take. It isn't going to affect stepNum or stepDist.
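In other words, both source sampling and destination coverage live in [0,1]; under nearest filtering, mapping a normalised coordinate to a texel index is just a scale-and-clamp. A rough Python sketch (illustrative only; real GPUs also apply wrap modes etc.):

```python
# Hedged sketch: how a normalised texture coordinate maps to a texel column
# under nearest filtering, regardless of source or destination sizes.
def texel_index(u, width):
    """Map a normalised coordinate u in [0,1] to a texel column index."""
    return min(max(int(u * width), 0), width - 1)

# For a 256-wide target, v_vTexcoord.x at fragment centres takes the values
# (i + 0.5)/256; the destination size only changes how finely [0,1] is sampled.
print(texel_index(0.0, 1920))       # 0 (one corner)
print(texel_index(0.999999, 1920))  # 1919 (opposite corner)
```
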

I’m not aware of any official policy. But it rarely helps. Most regular visitors will view all new threads; if they don’t respond, it’s because they don’t have anything to say on the matter, and bumping isn’t likely to change that.

I get that this is a pretty huge question, but do you have a recommendation for how to code this as a whole? how would you do it?

If you didn't follow, the gist is that I'm ray-casting across an image, using the colour values of that image to determine where the lighting effect should start and end, then outputting it to a 1D texture. Don't need to worry about drawing it for now, just that process of storing the data. At the moment I'm indifferent to the limitation of only being able to record the data of two shafts along a ray.

Don't need anybody to write my entire fragment shader, just a rough through-line to compare to what I've done.

There isn’t anything fundamentally wrong with your approach. At least, not theoretically; it’s possible that you’re triggering a bug with the driver (256x1 probably isn’t a common framebuffer size).

Although: if I was doing this myself, I might try transforming the texture by rendering into a 2D texture with the desired rotation then performing the scan horizontally (possibly using a parallel prefix sum). But I don’t have much experience with ES or mobile GPUs, so I can’t say whether that would actually be a useful approach without trying it.
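As a sanity check, the scan you describe is essentially a small state machine; prototyping it on the CPU first (a Python sketch with illustrative names) can confirm the logic independent of the shader environment:

```python
# Hedged sketch: the per-ray scan described above as a plain state machine.
# lum values: <= 0.25 means black (shaft starts), >= 0.75 means white (shaft
# ends), anything else (grey) changes nothing. Returns up to four positions as
# fractions of the ray length, mirroring the shader's four outputs.
def scan_row(lums):
    n = len(lums)
    points = []            # alternating start, end positions
    expecting_start = True
    for i, lum in enumerate(lums):
        if expecting_start and lum <= 0.25:
            points.append(i / n)
            expecting_start = False
        elif not expecting_start and lum >= 0.75:
            points.append(i / n)
            expecting_start = True
        if len(points) == 4:   # at most two shafts, as in the shader
            break
    return points

row = [0.5]*4 + [0.0]*4 + [1.0]*4 + [0.5]*2 + [0.0]*2
print(scan_row(row))  # [0.25, 0.5, 0.875] -> start1, end1, start2 (no end2)
```
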

I've hit a dead end, so I may try just re-building the whole thing. I'll try that approach out; at the least it will circumvent all the vec3 cornerStart code.

Whew, another update, and I'll flesh it out a bit more. If I can't find a solution I might make a new post, since this issue is different to the original.

This is the new code:

varying vec2 v_vTexcoord;
varying vec4 v_vColour; //unused but w/e

uniform int stepNum;

void main()
{
	float tester = 1.0;
	float startPoint1 = tester+0.1; // +0.1 to EXTRA make sure the float comparison rings true
	float endPoint1 = tester+0.1;
	float startPoint2 = tester+0.1;
	float endPoint2 = tester+0.1;
	vec2 pixel = vec2(0.0, v_vTexcoord.x);
	for (float i = 0.0; i < float(stepNum); i += 1.0)
	{
		vec4 data = texture2D(gm_BaseTexture, pixel);
		float lum = (data.r+data.g+data.b)/3.0;
		if (startPoint1 >= tester)
		{
			if (lum <= 0.25) startPoint1 = i/float(stepNum);
		}
		else if (endPoint1 >= tester)
		{
			if (lum >= 0.75) endPoint1 = i/float(stepNum);
		}
		else if (startPoint2 >= tester)
		{
			if (lum <= 0.25) startPoint2 = i/float(stepNum);
		}
		else if (endPoint2 >= tester)
		{
			if (lum >= 0.75)
				endPoint2 = i/float(stepNum);
		}
		pixel = vec2(pixel.x+(1.0/float(stepNum)), pixel.y);
	}
	gl_FragColor = vec4(startPoint1, endPoint1, startPoint2, endPoint2);
}

Where stepNum is 256.

This code uses this texture as a source:

And draws it onto a 256x1 empty texture. The outcome is:
which is 100% white with alpha of 1.

Extra information: I'm using GMS2 as an environment, which essentially uses OpenGL ES 2.0, I believe, though certain functions are locked by the environment. To my knowledge, texture filtering is controlled by the environment, and I have it switched to linear.

^I'll post that as a separate forum thread if there's no success after like 24/48 hrs

“100% white with alpha of 1” is what you’d get if none of the tests ever triggered. E.g. if stepNum was 0, or lum was always 0.5.

Try storing different variables in gl_FragColor, e.g. v_vTexcoord and 1.0/log(float(stepNum)). Given that you have GML in the way, it's particularly important to check that the inputs are what you think they are. Also try storing the minimum/maximum/average value of lum.

You can’t single-step a shader in a debugger, so you have to come up with other techniques for debugging.

Well, this is interesting: when I use "gl_FragColor = vec4(vec3(v_vTexcoord.x),1.0);" I get this

It's kinda hard to see, but the image does get slightly lighter on the right-hand side, though it is far from white. From the look of it, v_vTexcoord.x appears to go from about 0 to 0.2ish.

my GML code looks like this:

				var shd_stepNum = shader_get_uniform(shd_lightShaftScan,"stepNum");

Which is the most bog standard “set target, set shader, set an int, draw, reset, reset”

edit: shaftLightStepReso is 256, btw, which is verified in the GMS2 debugger.

Are you setting the viewport before rendering to the 256x1 texture?

By itself, that doesn’t explain the all-ones result, so investigate other inputs (and the source texture).

How do you mean? Don't viewports only apply to what section of the room gets drawn, and where onto the app surface? Both of the surfaces being drawn aren't the app surface. Also, when you draw a surface onto a surface in GMS, the coordinates of where to draw are relative to the destination surface, not the room.

Well, if it's scanning only a fraction of the way down, and it needs to encounter black to do anything, that could explain it (at least for this specific rotation).

I'll check the other variables in a sec.

The viewport determines how clip-space X/Y are mapped to the framebuffer. If the viewport is larger than the framebuffer, then only a portion of the scene will actually get drawn.

You typically need to set the viewport whenever you change framebuffers, as the viewport dimensions are normally based upon the framebuffer dimensions. It may be that GM handles this automatically.
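For reference, the mapping in question is the standard viewport transform; a hedged sketch of just the maths (not an actual GL call):

```python
# Viewport transform from the GL spec: clip-space x/y in [-1,1] map to window
# coordinates via the current viewport rectangle (here just the x axis).
def ndc_to_window(ndc_x, vp_x, vp_w):
    return vp_x + (ndc_x + 1.0) * 0.5 * vp_w

# With a 256-wide framebuffer but a stale 1920-wide viewport, most of the
# scene lands outside the 256 columns that actually exist:
print(ndc_to_window(1.0, 0, 1920))  # 1920.0, far beyond column 255
print(ndc_to_window(1.0, 0, 256))   # 256.0, the full framebuffer
```
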

I believe it does, since I've used a similar scale of surfaces (256x1 or 512x1, and 1024x1024 or larger) with another graphical system, and that works fine. In fact, that one uses what should be a less complicated shader.

gl_FragColor = vec4(vec3(1.0/log(float(stepNum))),1.0);

float minLum = 1.0; // initialiser, outside the for loop
if (lum < minLum) minLum = lum; // take the minimum along the line, inside the for loop
gl_FragColor = vec4(vec3(minLum),1.0);

They are vertically extended so they're easier to read.
The second should not be uniform; the min in each row should be either grey or black, with about a quarter to a half being black. But that could still be related to the v_vTexcoord problem.

I'll keep searching for an environment-side problem.


If I squish the source texture down to the same width (not the height) as the 1D texture, it no longer has the limited-v_vTexcoord problem -.-
This looks more or less like what I wanted; if you map it to the rotated surface, it seems to be accurate.

This is still confusing as all hell, since I have another system doing virtually the same thing that does not encounter this problem.

Unless I'm missing some other factor, this could suggest that the calculation of v_vTexcoord for the destination surface is scaled with the source. But that could just be something GMS is doing.

edit: I'll keep testing the system tomorrow, just in case changing the scale of the image has some loss-of-fidelity issues.