Optimising Zend_Config

Zend Framework is a popular MVC framework written in PHP. There has been substantial development on it over the past couple of years, and it is now often used in enterprise environments to power websites belonging to large blue chip companies. However, as a site grows in terms of size, complexity and traffic, the limitations of Zend become increasingly apparent.

One such limitation that I’ve recently seen is the performance of Zend_Config. This class underpins the mechanism by which developers provide configuration to the application, and all this config is passed around in the form of Zend_Config objects. There are two problems with this:

  1. Zend_Config uses a lot of function calls
  2. It is easy to misuse Zend_Config

For the purposes of this article, I’m referring specifically to INI config files parsed with Zend_Config_Ini. This is a very common format for Zend applications to use – it is familiar to developers and infrastructure support teams – and is the case where the performance issues become apparent.

Parsing INI files

Zend_Config_Ini is a class that can parse INI files and return Zend_Config objects. It will take an INI file with a structure such as:

foo.bar.abc = 123
foo.bar.def = 456
bar = "hello world!"

and return a Zend_Config object which can be accessed like:

$var = $config->foo->bar->abc; // 123

This makes it really easy and convenient to use, hence why it is so popular.

PHP has native support for parsing INI files, using parse_ini_file() and parse_ini_string(). However, neither of these functions supports dot-separation of the keys, so this is implemented in Zend’s code. To make matters worse, Zend’s implementation of this uses recursion within the PHP, meaning it can be quite a heavy method.
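To illustrate what the native parser returns, here is a minimal sketch that feeds the sample INI from above into parse_ini_string(). The variable names are only illustrative. Each dotted name comes back as a single flat key, which is why the nesting has to be built in userland code.

```php
<?php
// The sample INI from the article, held in a string for convenience.
$ini = <<<'INI'
foo.bar.abc = 123
foo.bar.def = 456
bar = "hello world!"
INI;

// The native parser treats the dots as ordinary characters in the key name.
$flat = parse_ini_string($ini);

$keys = array_keys($flat);        // 'foo.bar.abc', 'foo.bar.def', 'bar'
$hasNested = isset($flat['foo']); // false: there is no nested 'foo' element
```

Note that in the default scanner mode the values come back as strings, so 123 is returned as '123'.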

Zend_Config_Ini also makes full use of sections within INI files, to allow developers to create different environments and apply overrides, for example:

[production]
db.foo.hostname = livedb
db.foo.username = username
db.foo.password = password

[development : production]
; brings in all of the settings from production
db.foo.hostname = developmentdb ; just override the hostname

Again, this is a really useful feature because it means you can very quickly set up new environments for development, testing, etc. However to do this, it has to parse the whole of the production and development sections, and then merge them together. As with parsing the file in the first place, this merge is heavy.
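As a rough illustration of the inheritance step, the sketch below resolves a section and its parent on the flat keys returned by parse_ini_string() with section processing enabled. The resolveSection() helper is hypothetical and is not how Zend_Config_Ini is implemented; it only shows the kind of parse-then-merge work involved.

```php
<?php
$ini = <<<'INI'
[production]
db.foo.hostname = livedb
db.foo.username = username
db.foo.password = password

[development : production]
db.foo.hostname = developmentdb
INI;

// Hypothetical helper: find the named section, resolve its parent
// (if the header has the form "child : parent"), and overlay the child.
function resolveSection(array $sections, $name)
{
    foreach ($sections as $header => $values) {
        $parts = array_map('trim', explode(':', $header));
        if ($parts[0] !== $name) {
            continue;
        }
        $parent = isset($parts[1]) ? resolveSection($sections, $parts[1]) : array();
        // Child values replace parent values that share the same key.
        return array_replace($parent, $values);
    }
    throw new InvalidArgumentException("Unknown section: $name");
}

// With the second argument set to true, section headers become array keys.
$sections = parse_ini_string($ini, true);
$dev = resolveSection($sections, 'development');
```

Here $dev contains all three db.foo.* keys, with only the hostname taken from the development section.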

It’s worth noting that often production doesn’t extend anything else, so this problem isn’t as apparent in live, but it’s still important to consider it if you have, for example, a CLI environment extending production.

Obviously a large site would likely cache the config, but this still doesn’t help the uncached experience. Pre-warming a cache can help, but isn’t ideal. Moreover, storing and fetching a large Zend_Config object in APC is still heavier than doing the same with an array, and a JSON-encoded or serialized array is processed far more quickly and cheaply than a Zend_Config object.
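A minimal sketch of caching the config as a plain JSON-encoded array follows. It uses a temporary file as a stand-in for APC so that it runs anywhere, and loadCachedConfig() is a hypothetical helper rather than anything provided by Zend.

```php
<?php
// Hypothetical helper: return the config array from a JSON cache file,
// building and storing it when the cache is empty.
function loadCachedConfig($cacheFile, $builder)
{
    if (is_file($cacheFile)) {
        $cached = json_decode(file_get_contents($cacheFile), true);
        if (is_array($cached)) {
            return $cached; // cache hit: a plain array, no objects to rebuild
        }
    }

    $config = call_user_func($builder); // cache miss: do the expensive parse
    file_put_contents($cacheFile, json_encode($config), LOCK_EX);

    return $config;
}

// Count how often the (pretend) expensive build actually runs.
$builds = 0;
$builder = function () use (&$builds) {
    $builds++;
    return array('db' => array('foo' => array('hostname' => 'livedb')));
};

$cacheFile = sys_get_temp_dir() . '/config-cache-' . uniqid() . '.json';
$first = loadCachedConfig($cacheFile, $builder);  // builds and stores
$second = loadCachedConfig($cacheFile, $builder); // served from the cache
unlink($cacheFile);
```

The first request still pays the full build cost, which is the uncached experience the rest of this article is concerned with.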

Magic Getters

Zend_Config uses magic getters to access the data, so when profiling it, you will see it spawn lots of these function calls. When you use object-access notation on a Zend_Config object (e.g. $config->foo;), this actually hits the magic get method ($config->__get('foo');) which in turn calls the non-magic get method ($config->get('foo');). Obviously, this process adds in unnecessary function calls, and hence additional execution time.
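To make the call chain concrete, here is a small sketch using a hypothetical stand-in class, CountingConfig, that mimics the __get() to get() delegation described above and counts the calls. It is not Zend's code, only an illustration of the pattern.

```php
<?php
// Hypothetical stand-in that mimics the __get() -> get() delegation.
class CountingConfig
{
    public static $calls = 0;
    private $data = array();

    public function __construct(array $data)
    {
        foreach ($data as $key => $value) {
            $this->data[$key] = is_array($value) ? new self($value) : $value;
        }
    }

    public function get($name, $default = null)
    {
        self::$calls++;
        return array_key_exists($name, $this->data) ? $this->data[$name] : $default;
    }

    public function __get($name)
    {
        self::$calls++;
        return $this->get($name);
    }
}

$config = new CountingConfig(array('foo' => array('bar' => array('abc' => 123))));

CountingConfig::$calls = 0;
$var = $config->foo->bar->abc;   // three levels, two calls per level
$calls = CountingConfig::$calls; // 6
```

The equivalent lookup on a plain nested array involves no userland function calls at all.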

toArray

I mentioned earlier that one of the problems with Zend_Config is how people use it. A good example of this is its toArray() method. If you profile (using, for example, XHProf) a call to this method on a large config object, you will see just how many function calls it causes, and the large amount of time it takes.

A simple method like this is often used without consideration – and using it on the whole config object inside a loop can cause significant performance implications.

Optimising

It goes without saying, having seen the above, that a simple optimisation is just to ensure that toArray() is only used when necessary, and on as small a chunk of the config object as possible.
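The difference can be sketched with a hypothetical stand-in class, NodeStub, that counts its own toArray() calls. It is not Zend_Config, but it recurses in the same general way, one call per nested node.

```php
<?php
// Hypothetical stand-in for a nested config object; counts toArray() calls.
class NodeStub
{
    public static $toArrayCalls = 0;
    private $children = array();

    public function __construct(array $data)
    {
        foreach ($data as $key => $value) {
            $this->children[$key] = is_array($value) ? new self($value) : $value;
        }
    }

    public function __get($name)
    {
        return $this->children[$name];
    }

    public function toArray()
    {
        self::$toArrayCalls++;
        $out = array();
        foreach ($this->children as $key => $value) {
            $out[$key] = ($value instanceof self) ? $value->toArray() : $value;
        }
        return $out;
    }
}

$config = new NodeStub(array(
    'db'    => array('foo' => array('hostname' => 'livedb', 'username' => 'username')),
    'cache' => array('ttl' => 60),
));
$rows = array('a', 'b', 'c');

// Wasteful: the whole tree (4 nodes) is converted on every iteration.
NodeStub::$toArrayCalls = 0;
foreach ($rows as $row) {
    $all = $config->toArray();
    $host = $all['db']['foo']['hostname'];
}
$wasteful = NodeStub::$toArrayCalls; // 3 iterations * 4 nodes = 12

// Better: convert only the sub-tree that is needed, once, outside the loop.
NodeStub::$toArrayCalls = 0;
$db = $config->db->foo->toArray();
foreach ($rows as $row) {
    $host = $db['hostname'];
}
$hoisted = NodeStub::$toArrayCalls; // 1
```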

But there are many more optimisations that can be made. The environment we use is running PHP 5.3, which means we can use PHP’s array_replace_recursive() function. Using this, and some carefully thought out code, we replaced Zend’s INI parser with the following function. This only works within the realm of one section, and assumes you are starting with the associative array provided by parse_ini_file() or parse_ini_string():

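    /**
     * Expand the flat, dot-separated keys from parse_ini_file() or
     * parse_ini_string() into a nested array. Handles a single section only.
     */
    public static function parseIniDataToArray(array $data) {
        $config = array();

        foreach ($data as $k => $v) {
            // Wrap the value from the innermost key outwards, so that
            // 'foo.bar.abc' => 123 becomes array('foo' => array('bar' => array('abc' => 123)))
            $ks = array_reverse(explode('.', $k));
            foreach ($ks as $kss) {
                $v = array($kss => $v);
            }
            $config[] = $v;
        }

        // Merge the single-path arrays into one tree (requires PHP 5.3+)
        return (0 === count($config)) ? array() : call_user_func_array('array_replace_recursive', $config);
    }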

Once we have this nested array, we can pass it into Zend_Config to create our config object.
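As a usage sketch, the following runs the same logic over the sample INI from the start of the article. The function is reproduced as a plain (non-static) function purely so that the snippet is self-contained.

```php
<?php
// Same logic as the method above, as a standalone function for this example.
function parseIniDataToArray(array $data)
{
    $config = array();

    foreach ($data as $k => $v) {
        $ks = array_reverse(explode('.', $k));
        foreach ($ks as $kss) {
            $v = array($kss => $v);
        }
        $config[] = $v;
    }

    return (0 === count($config)) ? array() : call_user_func_array('array_replace_recursive', $config);
}

$ini = <<<'INI'
foo.bar.abc = 123
foo.bar.def = 456
bar = "hello world!"
INI;

$nested = parseIniDataToArray(parse_ini_string($ini));

// $nested is now:
// array(
//     'foo' => array('bar' => array('abc' => '123', 'def' => '456')),
//     'bar' => 'hello world!',
// )
```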

However, given what we’ve seen about the performance overheads of Zend_Config in general, we opted to move away from it altogether. Initially, the plan was to use a normal array, but object-access notation (and some type-hinting of the object) was littered throughout the codebase.

Instead, we opted for an interim solution that would provide some backwards compatibility with Zend_Config, but also support migration towards an array for even better performance in future. Using the following class solves both these problems:

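class Config
    extends ArrayObject
{

    public function __construct(array $array) {
        // Wrap nested arrays so that every level supports both notations
        foreach ($array as $key => &$value) {
            if (is_array($value)) {
                $value = new static($value);
            }
        }
        unset($value); // break the reference left over from the loop
        // ARRAY_AS_PROPS makes $config->foo equivalent to $config['foo']
        parent::__construct($array, ArrayObject::ARRAY_AS_PROPS);
    }

    public function toArray() {
        // Casting an ArrayObject to an array yields its stored elements
        $opt = (array) $this;
        foreach ($opt as &$val) {
            if ($val instanceof self) {
                $val = $val->toArray();
            }
        }
        unset($val);
        return $opt;
    }

}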

The advantage of this is that the config can be accessed in the existing way, as an object:

$var = $config->foo->bar->abc; // 123

or using array access, meaning that code can be slowly migrated towards array access notation instead of doing it all in one massive change:

$var = $config['foo']['bar']['abc']; // 123
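Both notations come from ArrayObject's ARRAY_AS_PROPS flag, which the class passes to the parent constructor. A minimal sketch with plain ArrayObject instances, nested by hand and with illustrative variable names, shows the two styles reading and writing the same storage:

```php
<?php
$flags = ArrayObject::ARRAY_AS_PROPS;

// Nest by hand here; the Config class does this wrapping in its constructor.
$config = new ArrayObject(array(
    'foo' => new ArrayObject(array('abc' => 123), $flags),
), $flags);

$viaProperty = $config->foo->abc;  // object-access notation
$viaArray = $config['foo']['abc']; // array-access notation

// A write through one notation is visible through the other.
$config->foo->abc = 456;
$afterWrite = $config['foo']['abc'];
```

Because the two styles are interchangeable, call sites can be migrated one at a time.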

It’s worth noting that converting it to an array is still not cheap (although cheaper than before). However, this only needs to be done where an array is specifically type-hinted, something we are trying to circumvent in other ways.

To put all of this in perspective, making the above changes (new INI file parser and custom config object with array access notation) reduced the number of function calls in the uncached bootstrap process by 75%, and the time by nearly 60%.

I would recommend anyone using Zend Framework for a complex or performance-critical site to look carefully at the overhead of Zend_Config. It is often loaded during the bootstrap of the application, so even trivial pages hitting the framework can incur this overhead. Although its effects can be mitigated by caching, using the techniques in this article can help improve the uncached experience too.

11 thoughts on “Optimising Zend_Config”

    • Hey Matt, could you **please** post a screenshot of your Zend_Config cumulative exec time? You can do this via Zend Studio profiling, not sure via other methods. It would be a great testament for your cause ;o

  1. Pingback: Matt Knight Blog: Optimierung Zend_Config | PHP Boutique

  2. This breaks some compatibility/features of Zend_Config, as it’s just not parse_ini_file()/parse_ini_string().

    You could gain the same performance boost by using an opcode-cache (e.g. XCache) for the parsed Zend_Config object. This keeps support for sections, etc. and with minor additions you may even get an extendable Config-Object which supports multiple config-files, optionally with sections and super fast Config-Creation, as there is in fact no Zend_Config execution needed when you’re creating your cache-object as a deployment-target.

  3. I can definitely see why you are looking at it this way – we do cache our configuration which actually helps out greatly and essentially removes many of the complexities that you see here. A cached copy is always a faster copy. We keep anything from a configuration file inside of the shared memory to ensure the speed.

  4. Matt, you should be caching the results of Zend_Config on production. This is what every sane person should be doing for high traffic volume web apps. Large software projects shouldn’t be building configuration data at runtime on every request. It should be stored away in memory.

    The subtle attack on Zend Framework in terms of performance is unnecessary and, frankly, cheapens the tone of this discussion. Remember that Zend Framework requires PHP 5.2 or greater and the majority of the team is spending most of their time working on Zend Framework 2 at the moment.

  5. You are so completely right :-)

    When we discovered the performance implications of Zend_Config (thanks to XHProf), we decided to use Zend_Config only to generate a config cache. The cache generator uses Zend_Config to read the INIs and generate native PHP arrays.

  6. Just to address some of the comments above about caching….

    I completely agree that caching is essential. No site with any significant traffic should run without config caching – as I mentioned at the end of the first section in the article. However, this doesn’t address the issue of the uncached (assuming no pre-warming) experience.

    The situation I was talking about was two-fold – firstly how to parse the INI files more efficiently, and secondly why using an array (or ArrayObject) is better than a Zend_Config object.

    It’s a very valid point that Zend is aimed at PHP 5.2 or greater, hence why I wrote this as an article discussing an alternative to Zend_Config_Ini, and not submitting it as a patch. There are many users of Zend Framework 1.x out there, using PHP 5.3, for whom this could significantly improve the performance of their application.
