Gant Software Systems

Managing Large T4-Based Code Generation Projects

This past weekend, I spent a great deal of time reworking some parts of our data access layer for Agulus that have been problematic in the past. Mostly that meant getting rid of places where we use ExecuteScalar and places where we work with DataTables. Both are very sensitive to changes in the underlying database schema, and the errors don’t surface until runtime, which really stinks. I managed to figure out how to get all the metadata I needed to replace this functionality (some things could be easier...) and started editing my templates to achieve this goal.

As I did so, I started reflecting on how much I’ve learned, mostly the hard way, about managing larger sets of T4 templates while keeping things maintainable. I’ve not seen a lot of guidance floating around on how to deal well with this stuff, so here’s a list of a few things I’ve figured out (so far). Most of it sounds very much like the guidance you’d expect when building something with ASP.NET MVC, which makes a lot of sense if you think about it, since the two share many concerns. None of these are earth-shattering, but people tend to forget that code that generates code should be maintained at the same quality level as code that is actually being shipped.

  1. Throw exceptions as soon as possible. Debugging templates is... not a great experience, at least compared to debugging most things in Visual Studio. It also frequently doesn’t work at all if you plan to output multiple files and have to interact with Visual Studio. One of the best ways to avoid needing the debugger is to detect invalid state as early as possible and immediately throw an exception with a descriptive error message.
  2. Log things heavily, and build in the option to dump the log into comments in the head of your output file. This was probably the biggest win for me, although I’m still working on getting everything logged properly. It’s really nice to have a switch you can flip to have this stuff dumped out so you can actually see what’s going on. Be sure to put it in the header of the output file, as putting it in the footer means that it won’t be rendered if the code generation process crashes (which somewhat defeats the point of having logging).
  3. Use class feature blocks and build actual classes that represent the data that will be rendered in the view. More than likely, you are generating code from some sort of metadata. In my case, that data came from SQL Server and wasn’t really in an optimal format for being rendered to views. When I started out, I tried to use it the way it came from the data source, rather than sticking it into an object model that was optimized for what I was doing. This backfired rather drastically. It’s far easier to go ahead and build appropriate classes and work with those in the rendered templates than it is to try and clean up funky metadata. Which brings up the next point.
  4. Accept that metadata is probably going to be crap at some level, no matter how clean it appears at first. When you start out, you might assume that your metadata is clean, that it is shaped exactly the way you need for what you are trying to do, and that it will tolerate the extensions you plan to make over time. The operative word here is “assume”. Like a new house, it’s easy to see only the surface appearance until you live there. My best advice here is to sanitize ALL the data and check it heavily for problems before you render anything from it.
  5. Have a master control file that handles the larger steps: reading in the metadata, delegating the processing of that metadata into appropriate objects, and rendering the templates in the proper order. This helps significantly with being able to reason about what is going on in your code at a higher level, so that you can deal with structural problems separately from parsing, rendering, and metadata cleanup problems. It also makes it easier to move major pieces around as required, and easier to profile the generation process and determine what is slowing it down.
  6. Break separate sections of output into their own .ttinclude files. You can break up output into separate files, wrapping the component in question in a function. This tends to make it easier to call the template in a loop from the main control file and makes it a lot easier to find particular pieces of problematic output when they arise.
  7. Avoid conditional statements in rendering code where possible. Much like placing logic inside of MVC views, placing logic inside of rendered templates tends to result in pain. In my case, I found that I tended to do a lot of conditional statements inside of code blocks. Usually these are better replaced by variables on the backing class for the template being rendered, as soon as they are used more than once. This avoids a lot of irritating little errors.
  8. Avoid LINQ queries in rendering code where possible. When a template loop should only run for a certain subset of items, I found it is usually better to expose that subset as a variable on the backing class than to derive it with a LINQ statement in the template. Over time, as I added more functionality, the same queries kept repeating themselves in multiple places. Again, this was a place where introducing variables really helped clean up the code.
  9. Don’t use a variable for part of a chunk of text. I also discovered a lot of cases where I was using a variable name as part of a larger chunk of text. A good example would be where I did something like declaring a typed list, where the generic parameter type was provided by a template variable. Instead of doing that, I would generally suggest treating the larger piece as a variable and handling it accordingly, building up its value from the other properties on the object.
  10. When a template block (the bit between the <#+ and #>) starts a line, put it at the very beginning of that line, with no whitespace preceding it. That drastically reduces how screwed up the whitespace is in the output code, which makes it a lot easier to deal with.
  11. Judiciously use ReSharper’s warning suppression on generated methods. You should look over the generated code and make sure that it meets good coding standards. However, given that you are generating code, it’s frequently possible to end up with some big methods that ReSharper’s (and other tools’) analysis will choke on. You may need to carefully suppress warnings just to keep from having a tremendous number of warnings in the generated section (which can make you or your team start ignoring warnings).
  12. Don’t forget coding techniques that don’t require code generation. Base classes, inheritance, interfaces, generics, callbacks, and even reflection are still very valid ways of promoting code reuse and abstraction. I strongly suggest not using code generation to replace any of them, as the compiler is much better at working with the above than the T4 editor/debugger is at making code generation simple.
  13. Don’t forget that you can use partial classes and empty virtual methods on base classes. When working with generated code, it is often very helpful to have a means of overriding generated behavior. I’ve typically found that the easiest way for me to accomplish this usually involves having an empty method (or default behavior) implemented in a base class, that I can then override in a partial, non-generated class. Whatever you do, make sure that you don’t have to apply overrides by default. That gets old very quickly when you are generating code.
  14. Make a working sample of the output before messing with T4. It can be hard enough to get code working right if it’s annoying enough that you don’t want to write it in the first place. You don’t want to mix that with the difficulty of a templating system.
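
Points 1, 3, 4, 7, 8 and 9 all push toward the same shape: validate the metadata once, up front, and hand the template a class that already holds everything it needs to print. The following is only a rough sketch of that idea in C#. `ColumnModel`, `TableModel`, and all of their members are hypothetical names made up for illustration, and the code assumes a compiler that supports C# 6 features such as string interpolation. Classes like these could sit in a class feature block or in an assembly the template references.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical backing class for one column; names are illustrative only.
public sealed class ColumnModel
{
    public ColumnModel(string name, string clrType, bool isPrimaryKey)
    {
        // Point 1: fail immediately, with a message that says what is wrong.
        if (string.IsNullOrWhiteSpace(name))
            throw new ArgumentException("Column metadata is missing a name.");
        if (string.IsNullOrWhiteSpace(clrType))
            throw new ArgumentException($"Column '{name}' has no CLR type mapping.");

        // Point 4: sanitize before anything is rendered.
        Name = name.Trim();
        ClrType = clrType.Trim();
        IsPrimaryKey = isPrimaryKey;
    }

    public string Name { get; }
    public string ClrType { get; }
    public bool IsPrimaryKey { get; }

    // Point 9: expose the whole chunk of text, so the template never has to
    // stitch "List<" + type + ">" together itself.
    public string ListTypeName => $"List<{ClrType}>";
}

// Hypothetical backing class for one table.
public sealed class TableModel
{
    public TableModel(string name, IEnumerable<ColumnModel> columns)
    {
        if (string.IsNullOrWhiteSpace(name))
            throw new ArgumentException("Table metadata is missing a name.");
        if (columns == null)
            throw new ArgumentNullException(nameof(columns), $"Table '{name}' has a null column list.");

        Name = name.Trim();
        Columns = columns.ToList();
        if (Columns.Count == 0)
            throw new InvalidOperationException($"Table '{Name}' has no columns.");

        // Point 8: run the query once here instead of repeating it in templates.
        KeyColumns = Columns.Where(c => c.IsPrimaryKey).ToList();
        // Point 7: a precomputed flag instead of a condition inside rendering code.
        HasKey = KeyColumns.Count > 0;
    }

    public string Name { get; }
    public IReadOnlyList<ColumnModel> Columns { get; }
    public IReadOnlyList<ColumnModel> KeyColumns { get; }
    public bool HasKey { get; }
}
```

With something like this in place, a template loop can iterate over `table.KeyColumns` directly and emit `<#= column.ListTypeName #>` as a single value, and bad metadata blows up in the constructor rather than halfway through rendering.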

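The override pattern from point 13 can be sketched roughly as follows. This is an assumption-laden illustration, not our actual generated code: `GeneratedRepositoryBase`, `CustomerRepository`, `OrderRepository`, and `BeforeInsert` are hypothetical names, and the "files" are shown together in one listing only so that the sample compiles on its own.

```csharp
using System;
using System.Collections.Generic;

// Hand-written once and shared by every generated class (hypothetical name).
public abstract class GeneratedRepositoryBase
{
    // Empty default: generated classes work without any override at all.
    protected virtual void BeforeInsert(IDictionary<string, object> values) { }
}

// ---- Would live in a generated file, rewritten on every template run ----
public partial class CustomerRepository : GeneratedRepositoryBase
{
    public IDictionary<string, object> BuildInsert(string name)
    {
        var values = new Dictionary<string, object> { ["Name"] = name };
        BeforeInsert(values); // hook point for hand-written behavior
        return values;
    }
}

// ---- Would live in a hand-written file the generator never touches ----
public partial class CustomerRepository
{
    protected override void BeforeInsert(IDictionary<string, object> values)
    {
        // Custom behavior survives regeneration because it is not in the generated half.
        values["Name"] = ((string)values["Name"]).Trim();
    }
}

// ---- Generated, with no hand-written half: the empty default applies ----
public partial class OrderRepository : GeneratedRepositoryBase
{
    public IDictionary<string, object> BuildInsert(string reference)
    {
        var values = new Dictionary<string, object> { ["Reference"] = reference };
        BeforeInsert(values);
        return values;
    }
}
```

The point of the empty virtual method is that `OrderRepository` needs no hand-written code at all, while `CustomerRepository` gets its custom behavior from a file that regeneration cannot overwrite.
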
As I said, I’ve been climbing the learning curve a lot with using T4 templates to cut out swathes of irritating and repetitive code. The notes above are guidelines I’ve arrived at organically over time. I hope they help you too.