Question about postfix_expression inside function_identifier

The GLSL grammar has this production:

function_identifier :
  type_specifier
  postfix_expression

And a postfix_expression can be a lot of things:

postfix_expression:
    primary_expression
    postfix_expression LEFT_BRACKET integer_expression RIGHT_BRACKET
    function_call
    postfix_expression DOT FIELD_SELECTION
    postfix_expression INC_OP
    postfix_expression DEC_OP 

When parsing GLSL and building the AST, let’s say we’re parsing a.length()

If I follow the grammar, it looks like function_call should match this whole token sequence. That means that when building the function_call AST node, the entire identifier for that function would be a.length.

It feels weird that the identifier for a function could include the left-hand side of the dot operator. I would instead expect this to parse as a top-level postfix_expression, where the operand is a and the postfix part is a function_call with the identifier .length.

The grammar doesn’t seem to support this second AST where the top level is a postfix_expression even though that feels more natural to me.

Additionally, it seems the function identifier can be an arbitrarily nested postfix_expression, so parsing something like (a()).length() is even weirder: with the Khronos grammar, the entire identifier of the function becomes (a()).length.

I’m fairly new to parsing. Is the grammar supposed to produce the more “natural” (to me) AST of top level postfix_expression, or is it supposed to produce the nested AST of a top level function_call where the identifier can be an arbitrarily nested postfix_expression?

A grammar isn’t a complete decision procedure which determines whether or not a given input constitutes a valid program. Just because there’s a valid parse, it doesn’t mean that the input constitutes a valid program. If the input is a valid program, then the grammar determines how it’s parsed.

Clearly the grammar can produce the latter case. I can’t see how it could produce the former unless FIELD_SELECTION can include length(), but that would result in an ambiguous parse.
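To make the "latter case" concrete, here is a rough sketch of a recursive-descent parser for a tiny subset of postfix_expression (identifiers, parentheses, DOT and calls only). It is not the Khronos reference parser; the node names Ident, FieldSelect and FunctionCall are made up for illustration, and the left-recursive productions are handled with a loop. The point is only that a call wraps whatever postfix expression came before it.

```python
import re
from dataclasses import dataclass

# Hypothetical AST node types; the names are illustrative, not from the spec.
@dataclass(frozen=True)
class Ident:
    name: str

@dataclass(frozen=True)
class FieldSelect:
    base: object
    field: str

@dataclass(frozen=True)
class FunctionCall:
    callee: object  # any postfix expression, as function_identifier allows
    args: tuple

TOKEN = re.compile(r"\s*([A-Za-z_]\w*|[().,])")

def tokenize(src):
    src = src.rstrip()
    tokens, pos = [], 0
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse(src):
    tokens = tokenize(src)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        pos += 1
        return tok

    def identifier():
        tok = take()
        if not (tok[0].isalpha() or tok[0] == "_"):
            raise SyntaxError(f"expected identifier, got {tok!r}")
        return tok

    def primary():
        if peek() == "(":
            take("(")
            inner = postfix()
            take(")")
            return inner
        return Ident(identifier())

    def postfix():
        # Left recursion in the grammar becomes a loop over postfix parts.
        expr = primary()
        while peek() in (".", "("):
            if take() == ".":
                expr = FieldSelect(expr, identifier())
            else:
                args = []
                if peek() != ")":
                    args.append(postfix())
                    while peek() == ",":
                        take(",")
                        args.append(postfix())
                take(")")
                # The callee is everything parsed so far, including "a.length".
                expr = FunctionCall(expr, tuple(args))
        return expr

    tree = postfix()
    if peek() is not None:
        raise SyntaxError(f"unexpected trailing token {peek()!r}")
    return tree
```

With this sketch, parse("a.length()") yields a FunctionCall whose callee is FieldSelect(Ident("a"), "length"), and parse("(a()).length()") nests another FunctionCall inside that FieldSelect.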

To be honest, this approach seems rather lazy. It would have been cleaner to handle methods separately.
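If you do want methods handled separately in your own AST, one option is to parse according to the grammar and then rewrite the tree afterwards. The following is just a sketch under that assumption: the node types, including MethodCall, are hypothetical names and nothing in the spec defines them.

```python
from dataclasses import dataclass

# Hypothetical AST node types for illustration only.
@dataclass(frozen=True)
class Ident:
    name: str

@dataclass(frozen=True)
class FieldSelect:
    base: object
    field: str

@dataclass(frozen=True)
class FunctionCall:
    callee: object
    args: tuple = ()

@dataclass(frozen=True)
class MethodCall:
    receiver: object
    method: str
    args: tuple = ()

def to_method_calls(node):
    """Rewrite FunctionCall(FieldSelect(x, m), args) into MethodCall(x, m, args)."""
    if isinstance(node, FunctionCall):
        args = tuple(to_method_calls(a) for a in node.args)
        if isinstance(node.callee, FieldSelect):
            receiver = to_method_calls(node.callee.base)
            return MethodCall(receiver, node.callee.field, args)
        return FunctionCall(to_method_calls(node.callee), args)
    if isinstance(node, FieldSelect):
        return FieldSelect(to_method_calls(node.base), node.field)
    return node  # Ident and anything else is left unchanged
```

For a.length(), the grammar-shaped tree FunctionCall(FieldSelect(Ident("a"), "length")) becomes MethodCall(Ident("a"), "length"), which is the shape the question calls more natural. Whether a given method call is actually valid is still a job for semantic checking, not for the rewrite.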
