.spec request: XML

Not a request for OpenGL itself, but for the .spec files:

Rather than using the current format (for which I cannot find any public documentation), use XML [since there are SO many XML parsers out there, and tiny ones too] so that one can easily generate the header files or other bits oneself. This idea was floated in the 4.1 feedback forum by someone else, but I’d like to post it here.

This is an old idea, but I guess you are right to raise it again. It’s a crazy amount of work for whoever does it, but I just want to say: +1

I don’t think it’s THAT much work (just converting to XML).
The problem is that the existing toolchain the Khronos guys use would have to be rewritten as well, and this may be a pain.

As nice as that would be, there is one hurdle that needs to be dealt with. It’s easy to take the function and enum data from the spec files and transform it into an arbitrary form. My own GL extension code is a series of Lua scripts that, as part of its operations, stores the specification data as Lua script files. Kinda like how JSON is really just JavaScript that evaluates to an object containing its data. It’d be easy to transform this intermediate form into XML, JSON, or any other format you could come up with.
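To illustrate how little is involved in that last step, here is a minimal Python sketch (not the Lua scripts mentioned above). The dictionary layout, the field names, and the XML element and attribute names are all made up for the example; only the idea of one intermediate form feeding several output formats comes from the post.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical intermediate form: one dict per function, roughly what a
# script might hold after parsing a .spec entry. Field names are invented.
functions = [
    {
        "name": "Accum",
        "return": "void",
        "params": [
            {"name": "op", "type": "AccumOp", "dir": "in"},
            {"name": "value", "type": "Float32", "dir": "in"},
        ],
    },
]


def to_xml(funcs):
    """Serialize the intermediate form to an XML string."""
    root = ET.Element("functions")
    for f in funcs:
        attrs = {"name": f["name"], "return": f["return"]}
        func_el = ET.SubElement(root, "function", attrs)
        for p in f["params"]:
            ET.SubElement(func_el, "param", dict(p))
    return ET.tostring(root, encoding="unicode")


def to_json(funcs):
    """Serialize the same data to JSON."""
    return json.dumps(funcs, indent=2)


xml_text = to_xml(functions)
json_text = to_json(functions)
```

Both outputs come from the same in-memory data, so picking a format is a small decision compared with getting the data right.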

The problem is that the spec files don’t just contain function and enumeration information. They contain a lot of “passthru” statements, which are expected to be regurgitated verbatim when processing the files. That is, the files themselves are designed around the backend parsing tools that generate the extension headers. The .spec format itself is not arbitrary; it has specific expectations on how the user will be using its data.

Anyone could easily come up with what would be a good, readable, editable, and parseable XML format for most of the spec data. But for the passthru stuff, I don’t know.

The problem is that the existing toolchain the Khronos guys use would have to be rewritten as well, and this may be a pain.

Please. Given a solution to the above problem, the formats should be entirely interchangeable. That is, they should be able to regenerate the .spec files from the .xmlspec files, and then run their existing toolchain.
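A rough Python sketch of what such a reversible conversion could look like, passthru lines included. The sample text is a simplified imitation of the .spec layout, not a copy of the real files, and the XML element names (`spec`, `passthru`, `function`, `prop`) are invented here. It also assumes a single tab between a property name and its value, which the real files may not guarantee.

```python
import xml.etree.ElementTree as ET

# Simplified, assumed imitation of .spec syntax (not taken from gl.spec).
SAMPLE = (
    "passthru: /* sample comment copied into the header */\n"
    "Accum(op, value)\n"
    "\treturn\tvoid\n"
    "\tparam\top AccumOp in value\n"
    "\tparam\tvalue Float32 in value\n"
)


def spec_to_xml(text):
    """Convert the simplified spec text to an XML element tree."""
    root = ET.Element("spec")
    current = None
    for line in text.splitlines():
        if line.startswith("passthru:"):
            # Keep the payload verbatim so it can be emitted unchanged later.
            ET.SubElement(root, "passthru").text = line[len("passthru:"):]
            current = None
        elif line.startswith("\t") and current is not None:
            key, _, value = line[1:].partition("\t")
            ET.SubElement(current, "prop", {"key": key}).text = value
        elif line.strip():
            name, _, rest = line.partition("(")
            attrs = {"name": name, "args": rest.rstrip(")")}
            current = ET.SubElement(root, "function", attrs)
    return root


def xml_to_spec(root):
    """Regenerate the simplified spec text from the XML tree."""
    lines = []
    for el in root:
        if el.tag == "passthru":
            lines.append("passthru:" + (el.text or ""))
        elif el.tag == "function":
            lines.append(f"{el.get('name')}({el.get('args')})")
            for prop in el:
                lines.append(f"\t{prop.get('key')}\t{prop.text or ''}")
    return "\n".join(lines) + "\n"


# Go through an actual XML string, then back to spec text.
xml_text = ET.tostring(spec_to_xml(SAMPLE), encoding="unicode")
round_tripped = xml_to_spec(ET.fromstring(xml_text))
```

Here the passthru payload is stored as opaque text in document order, which is what makes the round trip exact for this sample. Whether that is clean enough for the real files is the open question.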

They contain a lot of “passthru” statements, which are expected to be regurgitated verbatim when processing the files.

Is this a problem?
As I see it, most of the ‘passthru’ statements in gl.spec are comments; the others are #include/typedefs that could just be lumped together.

My own GL extension code is a series of Lua scripts that, as part of its operations, stores the specification data as Lua script files.

One reason why I would prefer XML to other formats is that you can have a schema that everyone can validate against, before putting inconsistent stuff in the file …

As I see it, most of the ‘passthru’ statements in gl.spec are comments; the others are #include/typedefs that could just be lumped together.

Some of them contain what could be data, depending on how one coded the conversion scripts to generate header files. I did as you suggested in my code, but I’m not sure it would allow for a clean back-and-forth conversion between the XML and .spec formats.

One reason why I would prefer XML to other formats is that you can have a schema that everyone can validate against, before putting inconsistent stuff in the file

I wasn’t saying that my format was better or anything. Lua is a bare-bones scripting language, so it doesn’t come with an XML parser. So it was easier to just write the data structure out as a Lua file. And since this was just an intermediary form for my own personal needs, it’s not like it mattered very much.

The XML description would be for more than just generating a C header: it would also carry the information needed to generate bindings beyond C. Such bindings will need to know how many bytes a function is expected to write when passed a pointer. For all functions except those explicitly mentioned in GL_robustness, that data is “sort-of-ish but may or may not be correct” in the current spec files.
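As a sketch of the kind of data meant here, suppose each output parameter named the argument that gives its element count. The XML layout below, including the `dir` and `count` attributes, is purely hypothetical and not any existing format. Python is used only to show how a binding generator might consume it.

```python
import xml.etree.ElementTree as ET

# Hypothetical description of glGenTextures: "count" names the parameter
# holding the number of elements written. This schema is invented.
XML = """
<function name="GenTextures">
  <param name="n" type="GLsizei" dir="in"/>
  <param name="textures" type="GLuint" dir="out" count="n"/>
</function>
"""

# Sizes in bytes of the GL types used in this example (both are 32-bit).
TYPE_SIZES = {"GLuint": 4, "GLsizei": 4}


def bytes_written(function_xml, args):
    """Return {param name: byte count} for every output parameter."""
    func = ET.fromstring(function_xml)
    sizes = {}
    for p in func.findall("param"):
        if p.get("dir") == "out":
            count = args[p.get("count")]
            sizes[p.get("name")] = count * TYPE_SIZES[p.get("type")]
    return sizes


# A call with n = 8 writes 8 GLuint values through the pointer.
result = bytes_written(XML, {"n": 8})
```

A binding for a managed language could use such a number to size or bounds-check the buffer before calling into GL, which is only safe if the count data is actually correct.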

My request is in no way original, and the issues, if one digs through the forums, go a little beyond “just get the function prototypes so I can write my C-wrapper for debugging”.

Right now though, the spec format is implicitly defined by a set of awk (and Perl?) scripts that are really old. That cannot be a good thing.

such bindings will need to know how many bytes a function is expected to write when passed a pointer. For all those except those explicitly mentioned in GL_robustness, that data is “sort-of-ish but may or may not be correct” in the current spec files.

So you seem to be focused on this particular aspect of the spec files: the ability to tell what gets written. Well, here’s my question. If the current spec files actually have this data, but it’s unreliable because the ARB doesn’t reliably update it, why would you expect this to be different in a new format?

XML doesn’t magically make the information users enter into it more reliable. If the spec maintainers don’t properly use the tools they already have, if they can’t be relied upon to keep the spec functioning as-is, why would you expect this to change with an XML format?

There are 2 issues at heart here:

  1. Parsing the spec file. This is the thrust of my unoriginal request. XML, though ugly for a human to read, is simple to parse and simple to extend with new fields. Human-friendly XML editors abound.

  2. Data quality. That some of the data in the .spec files is flaky is likely because some of the fields are not normally used by anyone or anything.

Now, if the XML files are completely generated by scripts, then (2) will not be fixed (duh). But it is likely that a fair amount of hand editing and nudging will be needed to get the XML from the .spec, and such hand editing will likely end up fixing the “bugs” in the spec file.

In truth I am more focused on getting the syntax/format of the spec files to be something well defined beyond “it’s an input to some really old scripts”. XML is nice in that it is likely going to be around for a long time, with a well-defined standard and syntax. Part of making the XML is deciding what is needed to describe a GL entry point. Beyond C, we need much more than just a function declaration. Just deciding what to put in the XML and then generating the XML data will likely force a check of the unreliable data in the .spec file.

But it is likely that a fair amount of hand editing and nudging will be needed to get the XML from the .spec, and such hand editing will likely end up fixing the “bugs” in the spec file.

But that really is the point. If the ARB doesn’t have the bandwidth to go through the .spec files and fix the “bugs” in them as it currently stands, what makes you think that they can use a non-automated solution for generating an XML format from the .spec files? As you point out, this will require, at a minimum, the same amount of work needed to fix the .spec files in the first place.

That’s why it must be an automated process, with reversibility; otherwise, it’s simply too much work to get done.

I know it’s not really what you wanted, but here’s a first pass at an XML definition of the spec files. It comes with a Relax NG schema to validate it.

The only information it doesn’t contain from the spec files is the enumerator type grouping information from the enum.spec files. That was incomplete anyway, and not updated.

Update: The ARB updated the .spec files to GL 4.1 + extensions. I have therefore run my scripts over them and generated new XML extensions.

They can be found at the same download link above.