
Benchmarking PHP Resource Loading: JSON vs. Serialized vs. Raw PHP vs. APCu vs. MySQL

During recent discussions with one of our customers, the subject of performance was raised. They want us to build a site for them which they expect will have to cope with up to 100 requests per second, and they wondered how they could ensure performance would remain good. Apart from advising them to get a beefy server, we came up with the idea of offloading some of the database requests, which were essentially read-only, to files which would then be parsed by the PHP code. This sounded like a good idea, but we had to admit that we had no idea what exactly the performance implications would be. Since we don't like not knowing the answers to such questions, we wrote some test code to measure this.


We took a 17KB JSON file and converted it to a serialized PHP representation, an APCu cache entry, and raw PHP code (essentially a PHP array structure); a sketch of this conversion step follows below. To measure how MySQL stacked up in this comparison, we also added a SELECT statement against a table containing a few rows of page data (not quite 17KB, but close).
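
The conversion itself was a one-off step along these lines (a minimal sketch; the file names and paths here are assumptions, not our exact script):

        // Hypothetical one-off conversion; file paths are assumptions
        $data = json_decode(file_get_contents("../test/languages.json"), true);

        // Serialized PHP representation of the same array
        file_put_contents("../test/languages.serialized", serialize($data));

        // Raw PHP: a file that returns the array when include'd
        file_put_contents("../test/languages.php",
            "<?php\nreturn " . var_export($data, true) . ";\n");

We then executed the following code to get an idea of what the performance implications of the different loading strategies are: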


        $num = 100000;

        // JSON: read the file and decode it on every iteration
        $start = microtime(true);
        for ($i = 0; $i < $num; $i++) {
            $s = file_get_contents("../test/languages.json");
            $arr = json_decode($s, true);
        }
        $end = microtime(true);
        print "JSON total time: " . ($end - $start) . " sec\n";
        print "JSON time per iteration: " . number_format(($end - $start) / $num, 8) . " sec\n";

        // Serialized PHP: read the file and unserialize it
        $start = microtime(true);
        for ($i = 0; $i < $num; $i++) {
            $s = file_get_contents("../test/languages.serialized");
            $arr = unserialize($s);
        }
        $end = microtime(true);
        print "Serialized total time: " . ($end - $start) . " sec\n";
        print "Serialized time per iteration: " . number_format(($end - $start) / $num, 8) . " sec\n";

        // Raw PHP: include a file that returns the array structure
        $start = microtime(true);
        for ($i = 0; $i < $num; $i++) {
            $s = include "../test/languages.php";
        }
        $end = microtime(true);
        print "PHP array total time: " . ($end - $start) . " sec\n";
        print "PHP array time per iteration: " . number_format(($end - $start) / $num, 8) . " sec\n";

        // APCu: cache the array once, then fetch it from shared memory
        apcu_add('test', $s);
        $start = microtime(true);
        for ($i = 0; $i < $num; $i++) {
            $s = apcu_fetch('test');
        }
        $end = microtime(true);
        print "APCu total time: " . ($end - $start) . " sec\n";
        print "APCu time per iteration: " . number_format(($end - $start) / $num, 8) . " sec\n";

        // MySQL: run the SELECT against the page table via the model class
        $start = microtime(true);
        for ($i = 0; $i < $num; $i++) {
            $pages = \App\Model\Page::get();
        }
        $end = microtime(true);
        print "SQL total time: " . ($end - $start) . " sec\n";
        print "SQL time per iteration: " . number_format(($end - $start) / $num, 8) . " sec\n";

In short, each test loads the same data item 100,000 times using the given loading method. Here's how the different techniques stack up:



                     MySQL            JSON             Serialized       APCu             Raw PHP
Total time           34.18 sec        10.37 sec        3.51 sec         2.68 sec         1.14 sec
Time per iteration   0.00034182 sec   0.00010375 sec   0.00003511 sec   0.00002686 sec   0.00001144 sec

So there you go. Switching from MySQL to raw PHP would give us roughly a 30-fold performance improvement for that specific request (34.18 sec vs. 1.14 sec over 100,000 iterations). This, of course, is only possible for read-only data which does not have to be queried using SQL filters, transforms, or anything else of that nature. While this is not a practical approach for most things, for certain types of workloads it may be a strategy worth investigating. APCu is a close second and may be the better choice for most common workloads, since you're not generating dynamic PHP code (which is a bit scary).
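
If we went the APCu route, a read-through cache is probably how we'd wire it up in practice: try the cache first and fall back to the file on a miss. Here's a minimal sketch (the cache key and function name are our own illustrative choices, not code from the benchmark):

        // Minimal read-through APCu cache; the key 'languages' is illustrative
        function loadLanguages(): array
        {
            $data = apcu_fetch('languages', $hit);
            if (!$hit) {
                // Cache miss: load from the raw PHP file and prime the cache
                $data = include "../test/languages.php";
                apcu_add('languages', $data);
            }
            return $data;
        }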


