COLLADA write so slow

Licu · September 22, 2006, 4:25am

We are trying to export large 3D data (millions of polygons) to COLLADA. The files are 50-150MB. There are few objects exported but they are very large.

We are using COLLADA DOM 1.4.1

We’ve managed to speed up the COLLADA database construction code by preallocating the arrays, but saving (DAE::Save) is still very slow. It takes around 5 minutes to write a 100 MB file, which is quite strange (our application saves the same data in a few seconds, and it uses an XML format too). This makes COLLADA export practically unusable. Is there a way to speed up the save? Can we expect optimizations in the future?
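For readers wondering what the preallocation amounts to, here is a minimal sketch using std::vector rather than the DOM's own array types (whose API is not shown in this thread). `Vec3` and `flattenPositions` are hypothetical names made up for the illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an exporter's vertex type; not a COLLADA DOM type.
struct Vec3 { float x, y, z; };

// Flatten vertices into one float list, reserving the full size up front so
// the vector never has to reallocate while it is being filled.
std::vector<float> flattenPositions(const std::vector<Vec3>& vertices) {
    std::vector<float> out;
    out.reserve(vertices.size() * 3);  // one allocation instead of repeated growth
    for (const Vec3& v : vertices) {
        out.push_back(v.x);
        out.push_back(v.y);
        out.push_back(v.z);
    }
    return out;
}
```

The same idea applies to any growable array: size it once when the element count is known in advance.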

Thanks

alorino · September 22, 2006, 6:22pm

I’ll look into this asap.

-Andy

Licu · September 29, 2006, 1:30am

Any news on progress? We really depend on this optimization, since using COLLADA is practically impossible for us in its current state because of the large data sets we work with.

Thanks

Kirin · September 29, 2006, 2:38am

How about writing an XSL transformation to convert your XML format into COLLADA? This could speed up your whole process.

alorino · September 29, 2006, 11:55am

Sorry. Yes I have made progress.
I am going to be updating the sourceforge subversion soon and making a new release package sometime next week.

I don’t think it is going to make your files save in a few seconds like you say your custom XML does, but I did make the write about 30-40% faster. I tested with a 90 MB model with just under a million polygons. The program just did a DOM load followed by a DOM save. That used to take ~1 min 45 s. Now it takes ~1 min 15 s.

-Andy

alorino · September 29, 2006, 1:10pm

I also wanted to say that almost all of the save time is taken up by sprintf and the libxml xmlwriter functions. I don’t think I will be able to optimize the save much further than this.

-Andy

Licu · October 2, 2006, 12:15am

To Kirin: it is not about converting from our format to COLLADA. We provide exporters/importers to/from COLLADA. The application is commercial; it is not an in-house tool.

Alorino, thanks for the update; let us know when the repository is updated. As for sprintf, I understand you cannot optimize it, but I suggest writing your own conversion (since you know what type of data is there; for example, for float_array you know you have a list of floats) and sending the data to the XML library as a string. For our own format we use tinyxml and do just that: we send preformatted data. Again, I don’t know whether this is possible for COLLADA, given the way it works with libxml.
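To make the suggestion concrete, here is a rough sketch of formatting a whole float list into one string before it goes to the XML library. `formatFloatArray` is a hypothetical helper, not part of the COLLADA DOM, libxml, or tinyxml. It still uses snprintf for each value, so it only illustrates handing over one preformatted string; a hand-written float formatter could be swapped in for the per-value step.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Format a whole float list into one space-separated string.
// "%.9g" prints up to 9 significant digits, enough to round-trip a 32-bit float.
std::string formatFloatArray(const std::vector<float>& values) {
    std::string out;
    out.reserve(values.size() * 12);  // rough guess at the average token length
    char token[32];
    for (std::size_t i = 0; i < values.size(); ++i) {
        int len = std::snprintf(token, sizeof(token), "%.9g",
                                static_cast<double>(values[i]));
        if (len <= 0 || len >= static_cast<int>(sizeof(token))) {
            continue;  // skip a value that failed to format
        }
        if (!out.empty()) out += ' ';
        out.append(token, static_cast<std::size_t>(len));
    }
    return out;
}
```

The resulting string would then be passed to the XML writer in a single call as the text content of the array element, instead of one call per value.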

Thanks

alorino · October 2, 2006, 5:36pm

That’s a very good idea. Looking into the code, it seems the DOM has places for exactly that kind of thing. (I have been the lead programmer of the DOM for the last year, but I am not the original author. The reason for a lot of what is there isn’t apparent until something like this comes along.)

I will work on that sometime soon. I don’t know exactly when, but it’s a great idea that will speed up DOM load and save tenfold. Also, scanf and printf aren’t mapped 1 to 1 with XML Schema types, and that causes some problems. That’s an issue I ran into today, and it is solved by the same solution.
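One well-known example of that mismatch, though not necessarily the one hit here: XML Schema writes the special xs:float values as INF, -INF and NaN, while printf produces text like "inf" and "nan". A custom conversion can handle these explicitly. The sketch below uses a hypothetical `toXsFloat` helper that is not part of the DOM.

```cpp
#include <cmath>
#include <cstdio>
#include <string>

// Convert a float to its xs:float lexical form.
// Special values are spelled the way XML Schema requires, not the way printf does.
std::string toXsFloat(float value) {
    if (std::isnan(value)) return "NaN";
    if (std::isinf(value)) return value > 0 ? "INF" : "-INF";
    char buf[32];
    // 9 significant digits are enough to round-trip a 32-bit float.
    std::snprintf(buf, sizeof(buf), "%.9g", static_cast<double>(value));
    return buf;
}
```

Reading needs the mirror image of this: recognize the three special tokens first, then fall back to ordinary number parsing.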

-Andy

Licu · January 3, 2007, 11:54pm

Any news on the optimization progress? As it is now (from the COLLADA SVN), the write is still too slow for large data, and this makes COLLADA really impractical for some applications. Now that we’ve completed the import too, I have found that reading documents is also very slow. Any progress on this matter would be really appreciated. Otherwise we will need to find other export/import solutions for our application.

Thanks

alorino · January 4, 2007, 1:21pm

Hello,
I have a little bit of news about making things faster.

I have added an “experimental” feature to the COLLADA DOM. It externalizes the <float_array> and <int_array> data of <source> elements, storing it as 32-bit binary data in a separate file (with a .raw extension). The DOM can do this automatically on save and will convert the data back into standard COLLADA <float_array>s or <int_array>s on load. It gives a considerable reduction in file size as well as performance gains (load gains a lot more than save, unfortunately).

Here’s some data I got using a 90 MB model with a simple test program built with the release DLL build of the DOM (no animation, so it’s all model data):
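To give an idea of why raw data loads so much faster than text, here is a rough sketch of writing and reading floats as 32-bit binary. It is not the DOM’s actual .raw code or layout; `writeRawFloats` and `readRawFloats` are hypothetical names, and the sketch assumes native byte order and that the element count is known from elsewhere (for example from the .dae document).

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Write floats as raw 32-bit values in the machine's native byte order.
bool writeRawFloats(const std::string& path, const std::vector<float>& values) {
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(values.data()),
              static_cast<std::streamsize>(values.size() * sizeof(float)));
    return static_cast<bool>(out);
}

// Read `count` floats back; returns an empty vector on failure.
std::vector<float> readRawFloats(const std::string& path, std::size_t count) {
    std::vector<float> values(count);
    std::ifstream in(path, std::ios::binary);
    if (!in || !in.read(reinterpret_cast<char*>(values.data()),
                        static_cast<std::streamsize>(count * sizeof(float)))) {
        values.clear();
    }
    return values;
}
```

There is no text parsing or formatting at all here, only a block copy to and from disk. A file written this way is not portable between machines with different byte orders unless a byte order is fixed by convention.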

Normal load - 22 seconds
Load then save normal - 80 seconds
Load then save with raw - 65 seconds
Raw load - 9 seconds

The combined size of the .dae and .raw files ends up being 70.5 MB.

Unfortunately, using this .raw format would sacrifice the ability to interchange with other tools. Only tools built with the DOM would read the files correctly (and they would all have to be rebuilt with the newest DOM SVN). We will ask FeelingSoftware to adopt this .raw extension, but if they do, it will take a while before their implementations support it. If you only need this for internal tools, this may be a viable solution in your toolchain.

I have not had time to work on replacing scanf and printf in the DOM. Since the latest release of the DOM and Refinery I have been working on a different COLLADA project and have not yet had time to go back and work on that. When that gets done, theoretically, there will be a large gain in both load and save which will also make the .raw support even faster.

The COLLADA DOM is open source, and we would be happy to take contributions that yield considerable performance gains *wink wink*

-Andy