Temporary file directories now come in two flavours
Published 2013-12-05 by
We’re now providing web applications with two variants of temporary file directories: one that is shared between boxes and a faster one that is stored locally on each node of a freistilbox cluster.
Our main goal with freistilbox is giving website developers maximum performance at minimum effort. Storage access is an important aspect in web performance tuning and it’s a good idea to avoid expensive disk operations whenever possible. That’s why we decided to store temporary files created by web applications locally. It’s obvious that writing to a local disk is by far faster than shipping them over a network connection to a shared storage.
We chose that approach under the premise that temporary files are only created and used during a single content request, for example for uploading a file, aggregating CSS code or compressing data. Under this condition, it doesn’t matter when the next request is handled by another cluster box that has its own separate temporary file directory.
Support requests we got over the recent weeks were a clear indication that this premise was wrong. As it turns out, there are situations where temporary files are expected to persist beyond the lifetime of a single content request.
As an example, there was the customer who noticed that batch operations of the “views_export” Drupal module delivered incomplete data. We found out that the batch process saves intermediate results to the temp space. Since the batch runs were distributed over their boxes, so were the result files. At the end of the batch process, the box doing the final run only found the files that were created on this particular box and so returned corrupted data.
In order to make sure that even temporary files are visible to all boxes in a consistent way, we decided to relocate them to the shared file storage. The default temporary file directory available at ../tmp, relative to the document root, is now shared.
Obviously this has a significant impact on performance: the data still has to be written to and read from disk, but now on multiple separate storage servers; and the data transfer over the network comes on top. For those customers that don’t need a shared temporary file directory but depend on speedy file handling, we now also provide an alternative in the form of ../tmp_local.
Oh, and we also put a cleanup process in place that makes sure that temporary files stay true to their name: Files that haven’t been touched for a week are removed automatically.