Question about postfix_expression inside function_header

andy3 · May 21, 2021, 8:43pm

The GLSL grammar has this production:

function_identifier :
  type_specifier
  postfix_expression

And a postfix_expression can be a lot of things:

postfix_expression:
    primary_expression
    postfix_expression LEFT_BRACKET integer_expression RIGHT_BRACKET
    function_call
    postfix_expression DOT FIELD_SELECTION
    postfix_expression INC_OP
    postfix_expression DEC_OP

When parsing GLSL and building the AST, let’s say we’re parsing a.length().

If I follow the grammar, it looks like function_call should match this lexeme. That means that when building the function_call AST node, the entire identifier for that function is a.length.

It feels weird that the identifier for a function could include the left-hand side of the dot operator. I would instead expect this to parse as a top-level postfix_expression, where the operand of the postfix is a, and the postfix part is a function_call with the identifier .length.

The grammar doesn’t seem to support this second AST, where the top level is a postfix_expression, even though that feels more natural to me.

Additionally, it seems like a function identifier could be a nested postfix_expression, so parsing something like (a()).length() is even weirder, because the entire identifier of the function becomes (a()).length with the Khronos grammar.

I’m fairly new to parsing. Is the grammar supposed to produce the more “natural” (to me) AST of top level postfix_expression, or is it supposed to produce the nested AST of a top level function_call where the identifier can be an arbitrarily nested postfix_expression?
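
To make the two candidate shapes concrete, here is a minimal Python sketch. All class and function names (Identifier, FieldSelection, FunctionCall, MethodCall, to_method_call) are hypothetical and not part of the GLSL specification; the sketch only illustrates that the nested shape can be rewritten into the "natural" one after parsing, so a parser that follows the grammar does not rule out the second AST.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Identifier:
    name: str


@dataclass
class FieldSelection:
    # Models: postfix_expression DOT FIELD_SELECTION
    operand: "Expr"
    field_name: str


@dataclass
class FunctionCall:
    # Models function_call, whose function_identifier may itself be
    # a postfix_expression (the case asked about above).
    callee: "Expr"
    args: List["Expr"] = field(default_factory=list)


@dataclass
class MethodCall:
    # Hypothetical node for the "natural" shape; the grammar has no such rule.
    receiver: "Expr"
    method: str
    args: List["Expr"] = field(default_factory=list)


Expr = Union[Identifier, FieldSelection, FunctionCall, MethodCall]

# Shape 1: nested form, as the grammar reads a.length()
grammar_shape = FunctionCall(callee=FieldSelection(Identifier("a"), "length"))

# Shape 2: receiver and method name kept apart
natural_shape = MethodCall(receiver=Identifier("a"), method="length")


def to_method_call(node: Expr) -> Expr:
    """Rewrite FunctionCall(FieldSelection(x, m), args) as MethodCall(x, m, args)."""
    if isinstance(node, FunctionCall) and isinstance(node.callee, FieldSelection):
        return MethodCall(node.callee.operand, node.callee.field_name, node.args)
    return node
```

The same rewrite applies to (a()).length(): the receiver is simply the inner FunctionCall for a(), so the arbitrarily nested "identifier" ends up as an ordinary receiver expression.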

GClements · May 22, 2021, 12:08am

A grammar isn’t a complete decision procedure which determines whether or not a given input constitutes a valid program. Just because there’s a valid parse, it doesn’t mean that the input constitutes a valid program. If the input is a valid program, then the grammar determines how it’s parsed.

Clearly the grammar can produce the latter case. I can’t see how it could produce the former unless FIELD_SELECTION can include length(), but that would result in an ambiguous parse.
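
As a rough illustration of the latter case, here is a small recursive-descent sketch in Python. It is a simplification under stated assumptions: it covers only identifiers, parentheses, field selection and calls, it collapses the spec's intermediate function-call rules into a single step, and the helper names (tokenize, parse_postfix, parse) are made up for this example.

```python
import re

# Identifiers plus the four punctuation tokens this sketch understands.
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|[().,])")


def tokenize(src):
    src = src.strip()
    pos, out = 0, []
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        out.append(m.group(1))
        pos = m.end()
    return out


def parse_postfix(tokens):
    """Consume tokens from the front and return a nested-tuple AST."""
    tok = tokens.pop(0)
    if tok == "(":
        # Parenthesized primary_expression
        node = parse_postfix(tokens)
        if tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
    elif tok[0].isalpha() or tok[0] == "_":
        node = ("ident", tok)
    else:
        raise SyntaxError(f"unexpected token {tok!r}")

    # Left-recursive postfix rules, handled as a loop.
    while tokens and tokens[0] in (".", "("):
        if tokens.pop(0) == ".":
            node = ("field", node, tokens.pop(0))
        else:
            args = []
            while tokens[0] != ")":
                args.append(parse_postfix(tokens))
                if tokens[0] == ",":
                    tokens.pop(0)
            tokens.pop(0)  # consume ")"
            # Everything parsed so far becomes the callee of the call.
            node = ("call", node, args)
    return node


def parse(src):
    tokens = tokenize(src)
    node = parse_postfix(tokens)
    if tokens:
        raise SyntaxError(f"trailing tokens: {tokens}")
    return node


print(parse("a.length()"))
# ('call', ('field', ('ident', 'a'), 'length'), [])
```

The call node sits at the top and the whole of a.length is its callee, which matches the nested AST described in the question. Whether a given callee is actually valid is then a matter for semantic checks rather than for the grammar.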

To be honest, this approach seems rather lazy. It would have been cleaner to handle methods separately.
