Optimising Meshes and Animations.

Processing all animations on the GPU is pretty cool (would anyone like a blog post on this?), but it does lead to some issues when you try to use the 3D engine in a game. These issues reared their heads during integration of the Bullet Physics/Collision library.

When an MS3D model is loaded, each joint is collapsed to the origin and then transformed at render time based on the joint's current rotation/translation key-frame. Since I do all of these transformations in a vertex shader, the CPU never sees the transformed vertices and therefore does not know what shape the mesh has taken this frame. How can collision detection run on the CPU if the CPU does not know the bounds of any of the animated objects?
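To make the GPU/CPU split concrete, here is a minimal CPU mirror of what the shader does per vertex. The types and names are illustrative, not the engine's real ones; the key point is that an MS3D vertex is bound to a single joint, so posing it is one matrix multiply.

```cpp
#include <array>
#include <vector>

// Minimal math types for illustration — the real engine has its own.
struct Vec3 { float x, y, z; };
using Mat4 = std::array<float, 16>;  // column-major, like GLSL

// Transform a point by a 4x4 matrix (w assumed to be 1).
inline Vec3 transformPoint(const Mat4& m, const Vec3& v) {
    return {
        m[0] * v.x + m[4] * v.y + m[8]  * v.z + m[12],
        m[1] * v.x + m[5] * v.y + m[9]  * v.z + m[13],
        m[2] * v.x + m[6] * v.y + m[10] * v.z + m[14],
    };
}

// CPU mirror of the vertex shader: each MS3D vertex references exactly one
// joint, so posing it is a single multiply with that joint's current matrix.
Vec3 poseVertex(const Vec3& bindPos, int jointIndex,
                const std::vector<Mat4>& jointPalette) {
    return transformPoint(jointPalette[jointIndex], bindPos);
}
```

On the GPU this runs once per vertex per frame essentially for free; the problem is that the posed positions never come back to the CPU.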

Calculating this data at load time is possible but much too slow. One of my animated meshes is made up of 2,177 vertices, 58 joints and 1,000 frames. To calculate all the required data during load I need to:

  • Calculate 58,000 (58 joints × 1,000 frames) transformation matrices.
  • Transform 2,177,000 (that's nearly 2.2 million) vertices (2,177 vertices × 1,000 frames) to calculate the bounds of the mesh for each frame.
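The two steps above amount to a single brute-force pass over every frame. A sketch, with assumed names — `jointMatrixAt` stands in for the engine's key-frame evaluation, which isn't shown:

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <functional>
#include <limits>
#include <vector>

struct Vec3 { float x, y, z; };
using Mat4 = std::array<float, 16>;  // column-major
struct Aabb { Vec3 min, max; };

inline Vec3 transformPoint(const Mat4& m, const Vec3& v) {
    return {m[0]*v.x + m[4]*v.y + m[8]*v.z  + m[12],
            m[1]*v.x + m[5]*v.y + m[9]*v.z  + m[13],
            m[2]*v.x + m[6]*v.y + m[10]*v.z + m[14]};
}

// For every frame: build each joint's matrix, pose every vertex, and grow
// that frame's AABB. For the mesh above this is 58 × 1,000 = 58,000 matrices
// and 2,177 × 1,000 = 2,177,000 vertex transforms — the ~21-second workload.
std::vector<Aabb> computePerFrameBounds(
        const std::vector<Vec3>& bindVerts,
        const std::vector<int>& vertJoint,  // one joint per vertex (MS3D)
        int jointCount, int frameCount,
        const std::function<Mat4(int joint, int frame)>& jointMatrixAt) {
    std::vector<Aabb> bounds;
    bounds.reserve(frameCount);
    for (int f = 0; f < frameCount; ++f) {
        std::vector<Mat4> palette(jointCount);
        for (int j = 0; j < jointCount; ++j)
            palette[j] = jointMatrixAt(j, f);

        const float inf = std::numeric_limits<float>::max();
        Aabb box{{inf, inf, inf}, {-inf, -inf, -inf}};
        for (std::size_t i = 0; i < bindVerts.size(); ++i) {
            const Vec3 p = transformPoint(palette[vertJoint[i]], bindVerts[i]);
            box.min = {std::min(box.min.x, p.x), std::min(box.min.y, p.y),
                       std::min(box.min.z, p.z)};
            box.max = {std::max(box.max.x, p.x), std::max(box.max.y, p.y),
                       std::max(box.max.z, p.z)};
        }
        bounds.push_back(box);
    }
    return bounds;
}
```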

This is pretty slow, especially when you have multiple different animated objects. In my engine the mesh in question takes 21.267 seconds to process. I'm sure I could optimise the code involved and bring that time down, but I don't think it would be possible to get the improvements required to make it usable (e.g. less than 1 second to load).

How does one get around this? In a previous post I think I stated that the fastest way to do something is not to have to do it at all. With that thought in mind, it's worth considering, for a given animated mesh, which parts of the output data vary each time it is processed. Pretty quickly you realise that the answer is NONE. Every time I process a given animated mesh, the output is the same as the previous time I processed it, and it will be the same the next time. This means that a lot of the work can be calculated offline and stored in a simple-to-read format for access at run-time.

I have created my own raw binary 3D format and a conversion tool which allows me to export from .ase, .obj, .3ds, .md3, .md5 and .ms3d. For simple meshes that do not have animations (.ase and .obj), the vertex and index data is output along with any materials used for rendering and the bounds information for the static object. When the mesh is animated, I append joint information (a transformation matrix for each joint for each frame) and the bounds information for each frame. Loading now takes 0.168 seconds. No code optimisation would have achieved that performance.
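The appeal of a raw binary format is that loading becomes little more than a handful of `fread` calls. The layout below is purely hypothetical — the post doesn't document the real format, so the magic value and field names are mine — but it shows the shape of the idea:

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical on-disk layout; field names and magic are illustrative only.
#pragma pack(push, 1)
struct MeshHeader {
    char     magic[4];     // e.g. "RAW3"
    uint32_t vertexCount;
    uint32_t indexCount;
    uint32_t jointCount;   // 0 for static meshes (.ase, .obj)
    uint32_t frameCount;   // 0 for static meshes
};
#pragma pack(pop)

// After the header the payload is flat arrays, each readable with one fread:
// vertices, indices, materials, then — if animated — jointCount * frameCount
// 4x4 matrices followed by frameCount per-frame bounds.
bool readHeader(const char* path, MeshHeader* out) {
    FILE* fp = std::fopen(path, "rb");
    if (!fp) return false;
    const bool ok = std::fread(out, sizeof(MeshHeader), 1, fp) == 1;
    std::fclose(fp);
    return ok;
}
```

Because the arrays are stored exactly as the engine consumes them, there is no parsing step at all, which is where the 21-seconds-to-0.168-seconds difference comes from.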

The performance does come at a cost, and that cost is size. The original MS3D file is 1.9 MB, but my new format holds a lot more information and bloats the file to 4.4 MB. This may not seem like a lot to a desktop programmer, but when you're targeting a mobile device it can be. At the time of writing, Apple caps App Store downloads over 3G/4G at 50 MB. Ideally I want to keep my app below this size so as not to deter casual, impromptu downloaders from accessing my game.

The plan of action is to compress the assets before packaging them into the app (the 4.4 MB file mentioned above compresses to 1.8 MB). A decision still has to be made with regard to the extraction process: should the assets be extracted to disk and saved uncompressed the first time the user plays the game, or should they remain compressed and be extracted in memory each time the user plays? Some analysis of memory usage and processing time needs to be completed before a solution is chosen.