Friday, February 15, 2013

(Mostly) Painlessly Migrating a 3D Game Engine to TypeScript

Introduction

At Turbulenz we are developing a platform for high-quality online games.  Part of our technology offering is a JavaScript library of about 100,000 lines of code, covering many areas of game functionality, from rendering APIs that wrap WebGL to an optimized 3D physics engine and interfaces to online services.

A production-quality game using the engine can run to well over the same amount of code again (our pre-alpha game PolyCraft already has about 85,000 lines of runtime JavaScript). The burden of maintaining JavaScript projects at this scale is already known to be significant, and this has been echoed in our experience over 3 years of developing our game engine, as well as several 3D and 2D game titles.

Catching coding errors early can make a huge difference to programmer productivity.  We regularly use static analysis tools such as jshint, as well as automated testing, to try to identify problems as code changes are made. However, the lack of type information in JavaScript limits the class of problems that tools can identify, and automated testing can never feasibly cover all execution paths.

TypeScript is one of several projects that attempt to introduce static typing, allowing offline tools to identify a larger class of errors and to provide richer functionality in the IDE.  This article describes the method we used to migrate our JavaScript code base to TypeScript, while keeping development active.

Static Typing Solutions

TypeScript is not the only project that attempts to tackle the problems associated with weak typing in JavaScript. I cannot claim it is better than any of the other solutions out there, but there were a few factors that made us commit developer time to checking out TypeScript.
  • Ease of migration. As a small company with a reasonably large existing code base, it is difficult to justify a full port to another language.  Even if development resources were not a problem, a full port would be a big commitment to make before being sure our code was compatible with static typing.  With TypeScript we were able to proceed in steps, gradually merging changes into the mainline as we went along.
  • Complexity. This is not so much the code itself as our configurations.  A subsection of our engine has multiple implementations. In the default case everything runs directly in the browser using HTML5 and related standards. For older browsers with limited support for modern standards we provide a plugin that includes a native implementation of some of our lowest-level libraries. TypeScript provided a clear way to deal with multiple implementations of the same interface, where one of those implementations is hidden away as native code.
  • Timing. There happened to be a lot of news and activity around TypeScript just as we were discussing the shortcomings of JavaScript both internally and with external developers. This had some influence on our decision to try static typing at the time we did, and to do it with TypeScript.
Again, I'm not claiming that TypeScript is the only way to address these issues.  If conditions had been different we might have been tempted by one of the other available solutions, particularly if we were just starting out with a new code base.

Migration Path

Initially, it was not clear whether TypeScript would be entirely suitable for our project.  Many questions needed answering:  Was it stable?  Was it compatible with how we define and instantiate JavaScript classes?  Were our API and code even amenable to static typing, or had we embraced dynamic typing to the point of no return?

It was important to be able to quickly try it out and catch any show-stopping problems early.  At the time there were relatively few public accounts of TypeScript being used in production, so we could not rule out the possibility of later finding either a bug or language feature that made it impossible for us to proceed.  In the worst case we would need to revert everything back to pure JavaScript.

We were also keen to eliminate any impact on developers in terms of interface and performance.  I was already reasonably sure that we had enough control over the generated JavaScript to avoid sacrificing performance, but it was not clear whether changes to the public API would also be necessary.

Finally, there was the issue of integration.  Development of the engine would have to continue while we simultaneously “ported” the code.  Handling development changes and keeping merge conflicts (and mistakes) to a minimum would be vital until we were confident enough to adopt TypeScript in the main code line.

As TypeScript is a superset of JavaScript, it seemed to offer as smooth a migration path as we could expect from any solution. After becoming reasonably familiar with the language and the compiler we ended up migrating in the following way, at each stage validating that we were doing the right thing.

Step 1 - A trivial build

The first step I took was to simply rename each of our .js files to .ts and create a trivial build step to convert each of these new .ts files to JavaScript.  The code was left unchanged at this point (apart from a few small workarounds for bugs in the TypeScript compiler), the JavaScript library had the same layout as before, and everything could be merged back into the main code line.
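To give an idea of scale, this entire first build step amounts to little more than a pattern rule along these lines (a sketch only; our real build invokes our custom compiler with different paths and flags, but --out is a standard tsc option):
# Sketch: compile each renamed .ts file to a .js file in place
%.js : %.ts
	tsc --out $@ $<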

The advantage of this apparently trivial step was that regular development could continue while we gradually expanded the type definitions in the same files.  This approach is only viable because TypeScript extends rather than replaces JavaScript, and it turned out to be very important.  If we were porting to a whole new language we would have had to hand-pick changes to the original files into the ported version, and could only have switched development over to the new files once the port was complete and fully tested.  No doubt some static typing solutions allow a port to be done in sections, merging individual modules back to the main line as they are ready, but having an almost effort-free way of keeping all changes in the same files meant we could rely on the version control software to handle almost all of the merges automatically.

A secondary advantage of switching to .ts files and introducing a build step was to give developers a chance to iron out any problems with workflow and IDEs.  Previously, no build step was required, and for our developers working on the engine the code that appeared in the browser debugger was the original JavaScript source itself.

In practice, for various reasons (including limited available time for the project) we ended up keeping much of the work in a branch for long periods of time.  In hindsight, we could have integrated earlier, but it felt like too big a leap into the unknown at that stage.  Thankfully, git was aware of the file renaming and handled the integrations from the main line extremely well. If the change management and integrations had not been as easy as they were I am quite sure our investigation into static typing would have ended here.

Step 2 - Type checking

At this stage, the build did not catch any errors apart from syntax errors.  The next step was to enable all the type checks. When this type-checking build passed we could be more confident about the type-correctness of the code in our engine.

For the simple build I added an --ignoretypeerrors flag to the compiler, making it output error messages and exit with a non-zero value only if there was a syntax error in the code.  For the type-checking build I had to add --failonerror, which stops the compiler writing output files if there are type errors in the code.  Without this, build systems may not re-run the compiler during a rebuild after errors have occurred.  (The default compiler always outputs a .js file but exits with 1 unless there are no type errors.  Our custom version is available here.)

In this stricter mode it was no longer possible to blindly build each .ts file into a corresponding .js file. Since some files relied on types declared in other files, the build had to be aware of dependencies between different parts of the code and build everything in the correct order.  As a simple (fictional) example, if the 'engine' module references classes in the 'platform' module, 'platform' must be built (and the 'platform.d.ts' file generated) before 'engine' can be compiled.

This may seem like an overcomplicated approach.  Indeed, for many projects it would be possible to compile all the .ts files in a single step. In our case we were forced to divide up the code in this way for a couple of reasons:
  • Some of the code represented the 'canvas' (HTML5) implementation of the low-level engine that may or may not be replaced by our browser plugin.
  • Building all the .ts files at once would result in a single huge .js file. It is important that our engine remain modular so that game developers can include only the parts that they need and keep the download size of their final code as small as possible.
This stricter build mode is referred to as 'modular' in our system, since it allows developers to express the build as modules and dependencies between them. It then resolves the list of .ts and .d.ts files required for each module and handles building everything in the correct order.  As an example of a module specification, the following is an extract from the build files for our SDK:
utilities_src :=

platform_src := platform.ts
platform_deps := utilities

jsengine_src := $(wildcard jsengine/*.ts)
jsengine_deps := platform
From this, the system automatically determines that in order to build jsengine, say, it must generate the type declarations for utilities and platform, and use both of those declaration files in the build command for jsengine.
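Concretely, the compile steps derived from this specification look something like the following (illustrative only; --declaration and --out are standard tsc flags, and the custom flags described above are omitted):
tsc --declaration --out platform.js utilities.d.ts platform.ts
tsc --declaration --out jsengine.js utilities.d.ts platform.d.ts jsengine/*.ts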

Step 3 - Adding static type information

After Step 2, we had two builds that could be run side-by-side. A "crude" build that generated the production version of our library (with exactly the same file layout and functionality as the original JavaScript), and a stricter build which would, when it passed, generate .js files and .d.ts declaration files for each module.

Note that the crude build could always be relied upon to produce working JavaScript, so throughout this process all samples and applications could be built, and the generated code could be tested in exactly the same way as before.

As one would expect, the modular build initially failed at a very early stage. I had already added a lot of type information while experimenting with the compiler and adding build steps, but a lot more was required to fix all the compiler errors.  So the next step was to get this stricter build to run without errors.

The plan was to make the modular build pass as soon as possible by adding only the minimal amount of type information.  We could then enable type checks on the build machine and ensure that at least any new code would be consistent with existing type declarations.  Later we could go back and add more and more type information as time allowed.

It turned out to be a lot of work just to make this build pass, and some of the changes were invasive enough that I decided to keep them out of the main line until the build succeeded.  Potentially this stage could have been achieved by just adding 'interface' declarations for each of our classes, and I initially started with this approach. However, I quickly switched to doing wholesale conversions to TypeScript classes.  This generated much better .d.ts files and avoided a few problems with symbols not being found by the compiler.

The compiler may be much more robust now, but at the time everything was more stable using classes in .ts files.  It was something that we would end up doing anyway, and in most cases it was possible to do this without reordering functions within a file, so integrations from the main line remained surprisingly smooth.

The details here may be informative for people who are new to TypeScript, but some may want to skip the remainder of this section.

Our original classes took this form:
function MyClass() { };
MyClass.prototype.method = function ()
{
    return this.y;
};
MyClass.create = function()
{
    var myClass = new MyClass();
    myClass.x = 123;
    return myClass;
};
Note that there are two member variables, x and y, mentioned here. The following is the minimal set of declarations that will satisfy TypeScript when building the code above:
interface MyClass
{
    x: number;
};
declare var MyClass :
{
    create(): MyClass;
    new(): MyClass;
    prototype: any;
};
(We also needed to make the MyClass constructor function return this.)

The declare statement deals with the global constructor MyClass. Here the static create function and the constructor are declared, as is the existence of the prototype, which at this stage is given type 'any' for simplicity. The interface refers to instances of the class. We have had to declare the x member, referenced in the static create function, but not the y member referenced in the method. (This is presumably because the prototype is marked as any and therefore could have any members.)
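Even these minimal declarations let the compiler catch simple mistakes in client code. For example (illustrative):
var instance = MyClass.create();  // ok: create() is declared on the constructor
var n: number = instance.x;       // ok: x is declared on the interface
instance.z = 456;                 // error: 'z' is not declared anywhere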

The advantage of this form is that it can be dropped at the top of the file and does not usually require any changes to the existing body of code. In theory we could just flesh these declarations out as required and keep the TypeScript and old JavaScript separated for the time being. Integrations become trivial and if we decided to move back to JavaScript it would just be a matter of deleting a few lines.

However, the declaration of the global constructor did not make it into the generated .d.ts file, and in some cases TypeScript could not find certain methods or properties when compiling dependent code. This may have been fixed in the latest version of the compiler, but switching to classes was the simplest way to fix these problems at the time and made everything much more reliable. The resulting code:
class MyClass
{
    x: number; y: number;
    method()
    {
        return this.y;
    };
    static create()
    {
        var myClass = new MyClass();
        myClass.x = 123;
        return myClass;
    };
};
was a bit more of a commitment in terms of code changes. The main code body has to change slightly, and we now have to declare all members.  However, we have not had to re-order any functions, so comparing with the original it's clear that most upstream changes can be merged with relatively little hassle.
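For reference, compiling this class with declaration output enabled produces a .d.ts file roughly along these lines (illustrative; the exact output depends on the compiler version):
declare class MyClass
{
    x: number;
    y: number;
    method(): number;
    static create(): MyClass;
}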

So I made the leap to classes for almost all of our code. By the time the modular build finally passed I had made more changes than I had hoped, but a large amount of the final type information was in place and the whole process had revealed several bugs. The next step was to build our application code against the generated declarations to catch any remaining holes in the type information.

Step 3.5 - IDE support

As it became clearer that TypeScript was effective at catching problems and reducing the cost of code maintenance, we started to look more seriously at the logistics of developing our game engine entirely in TypeScript.  Inevitably, the issue of IDE support came up very quickly.

The TypeScript plugin for Visual Studio provides excellent tooling when configured correctly; however, our developers work across Windows, Linux and Mac OS X, and use a variety of editors. Editors that provide good support for JavaScript do not always have the same level of support for TypeScript. In many cases TypeScript support is in development for these editors, so we can expect the situation to improve drastically in the future. The TypeScript site has links to integrations for Vim, Emacs and Sublime Text, which gives developers a reasonable amount of choice. Our build system can be launched in a syntax-checking mode (building just enough to validate a given file and outputting errors only for that file), which can be used for on-the-fly syntax checking (as with emacs flymake) or for validate-on-save checks.

Debugging is also more difficult with TypeScript, since the browser is reading and executing generated code rather than the original source. However, the TypeScript compiler can already generate source maps for browsers that support them, so we can expect this situation to improve as well.

So for now there has been a slight decrease in functionality of the development environment for some of our engineers, depending on the editors used, but overall productivity should improve as we have stronger checks much earlier in the development workflow.

Step 4 - Building applications

This part of the process is still going on. We currently have a fairly complete set of .d.ts files that will be shipped with the next version of our SDK, and we are gradually transitioning all of our sample and application code to TypeScript.

It has caught several bugs in the application code, as well as a few missing pieces of the TypeScript declarations. One common problem was optional parameters. A method whose second parameter should be optional, but is declared without the '?' modifier:
method(requiredParam: Type1, optionalParam: Type2): number
{
    …
};
will not be flagged unless client code tries to call it without the optional parameter. Therefore to actually find many of these issues we must build with as much application and test code as possible.
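For example, a declaration that forgets the '?' on a parameter the implementation treats as optional looks perfectly valid on its own; the mistake only surfaces at a call site that omits the parameter (the names here are hypothetical):
declare class Renderer
{
    // should be: setClearColor(color: number[], alpha?: number): void;
    setClearColor(color: number[], alpha: number): void;
}
declare var renderer: Renderer;
renderer.setClearColor([0, 0, 0], 1.0);  // compiles fine; the declaration looks plausible
renderer.setClearColor([0, 0, 0]);       // error: only this call reveals the missing '?'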

Once our TypeScript declarations are released in the SDK, we hope that developers will try building their games against them and give us feedback.  If everything continues to go well then we certainly plan to migrate more of our projects to TypeScript.

Tuesday, February 12, 2013

Emacs etags with TypeScript

As an emacs user working with TypeScript, the current lack of etags support is frustrating.  I found that the following command actually produces somewhat usable TAGS files:

    find . -iname '*.ts' | \
      etags -l c++ -r '/interface[ \t]+\([^ \t]+\)/\1/' -

That is, the files are parsed with the default C++ parser, and we use a regex to also extract tags for interfaces.  It's far from perfect, but it enables quick jumping to class and interface definitions.

Thursday, October 27, 2011

The move to Mercurial

oyatsukai recently migrated all code from the old Subversion repository to Mercurial, and there has been a little bit of interest about what we did and why / how we did it.

Why move?
In 2011 this probably doesn't even need an explanation.  Subversion still doesn't handle branching in a sensible way and distributed version control is just so much more flexible.

With the core engine technology evolving quickly to handle more game titles, we were aware we needed to start using branches in some form, if only to maintain releases and keep them shielded from changes in the shared libraries.  I was using git-svn with our old svn repo, keeping local release branches of Final Freeway for Android.  It was time to start doing this across other games and platforms, keeping everything on the server where it should be.

Why Mercurial (hg)?
As a pure revision control system for code, my first choice would have been git.  Partly because I already know it, but mainly because it's just better.  It's true that, barring some subtle differences like how branches are handled, under the hood the two systems seem to be very similar, and what you can do with one generally seems possible with the other.  But what stands out for me is that with git the default commands do exactly what you want in the common case.  git pull does what I generally want to do when I sync from the server.  On the other hand, hg pull will usually leave you in a state you don't want to be in unless you add the correct combination of flags (see the comparison below).  It can't be good design that the most common operations tend to require extra parameters.  However, it was clear that either git or hg would be a better solution than svn.
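To illustrate the difference when syncing from the server:
git pull       # fetches and merges in one step
hg pull        # fetches only; the working copy is left untouched
hg pull -u     # needs the extra flag to behave like git pull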

Of course, it wasn't just a simple case of choosing the system we personally prefer.  Since oyatsukai is a game developer there are a lot of binary assets to deal with.  This complicates things, because neither git nor hg is particularly good with large binaries.  We also didn't want artists to have to deal with anything more complex than SVN (which can already be a burden for them sometimes), and the hg and git GUI interfaces are not that great.

The feature of hg that made us choose it was the ability to reference external subversion repositories.  Git has 'submodules' for pulling in external repositories, but they have to be git repositories.  The hg subrepos feature, on the other hand, is excellent, allowing you to include hg, svn or git repositories.

Final Solution
So we went with hg on the basis that, like git, it gives us proper support for branching, merging, rebasing, etc and also allows us to keep art in an svn repository.

Since svn can sync sub-sections of a repository, we have always given artists an svn url that includes only the stuff they need - the art data and the tools to edit and preview their changes.  We use hg subrepos to pull this same url into the source tree for developers, and as far as the artists are concerned nothing changed.  Not having to migrate them over to another system really saved some headaches.
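The wiring for this is a single line in the developers' .hgsub file, something like the following (the path and url here are hypothetical):
art = [svn]https://svn.example.com/project/art/trunk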

A side-effect of using this system is that there is a bit of separation between art data and code.  A given hg revision includes the version number of the assets in svn, and when there are art changes someone has to manually tell hg to reference a new revision.  However, this means that we can adopt new art into the development repo at whatever frequency we like.  Until we are happy with the state of the art repo, the development line keeps referencing a known good version of the asset data.  Similarly, when we make changes to the art tools we can 'release' them to the artists when they are ready by just submitting them to the svn subrepo, and the artists don't have to be concerned about any intermediate code changes.

I think overall it's working well.  Moving from svn to hg required learning a new interface, but with several titles in development and more recently projects that involve sharing sections of code with other developers, we are making good use of the flexibility of the system.  It was certainly worth the effort.