Optimizing Doctrine in long running Jobs

I was recently building a long-running PHP (CLI) job that exported large amounts of data from a MySQL database using Doctrine. One problem I noticed was that the more data it exported, the more memory it consumed. That was odd, because the job only reads from the database, builds new objects, sends them off to another service and then discards them. Analyzing memory consumption is not that easy, especially in larger programs that shuffle a lot of data around.

A simple yet effective method is to output the current memory usage at different places in the code:

echo memory_get_usage(true) . PHP_EOL;
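To make readings taken at different places easier to compare, the one-liner can be wrapped in a small helper that labels each checkpoint. This is only a minimal sketch: the function name `memoryCheckpoint` and the `range()` call standing in for real work are hypothetical and not part of the original job.

```php
<?php
// Hypothetical helper (not from the original job): formats a labelled
// memory reading so that checkpoints are easy to compare in the output.
function memoryCheckpoint(string $label): string
{
    // true = memory reserved from the system, as in the one-liner above
    $current = memory_get_usage(true);
    // Highest value reached since the script started
    $peak = memory_get_peak_usage(true);

    return sprintf(
        '[%s] current: %.2f MiB, peak: %.2f MiB',
        $label,
        $current / 1048576,
        $peak / 1048576
    );
}

echo memoryCheckpoint('before export') . PHP_EOL;

// Stand-in for real work: allocate some data, then release it again.
$rows = range(1, 100000);
echo memoryCheckpoint('after loading rows') . PHP_EOL;

unset($rows);
echo memoryCheckpoint('after discarding rows') . PHP_EOL;
```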

Printing the memory usage confirmed that it was indeed growing over the runtime of the job. I suspected Doctrine was the culprit, because the Entity Manager keeps every entity it fetches from the database in memory in case it is needed later. The component that tracks these managed entities is called the Unit of Work. I added some more debug output that periodically prints the class names of the entities Doctrine has under management and how many of each:

// Get Doctrine from the Symfony Dependency Injection Container
$doctrine = $this->getContainer()->get('doctrine.orm.entity_manager');

// Check what's actually inside the unit of work
$unitOfWork = $doctrine->getUnitOfWork();
echo 'Total number of entities: ' . $unitOfWork->size() . PHP_EOL;
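// The identity map is keyed by entity class name; each value holds the managed instances
foreach ($unitOfWork->getIdentityMap() as $className => $entities) {
    echo $className . ' : ' . count($entities) . PHP_EOL;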
}

Some entity classes did indeed have close to a million instances under management. I knew at which point in the job I would no longer need some of them, so I decided to remove all entities of specific types from Doctrine's Unit of Work:

$doctrine->clear('Path\To\Namespace\Entity');

This did decrease the memory consumption, but not nearly as much as I had hoped or expected. I then remembered that Doctrine can keep a log of every SQL query it has executed. With lots of big entities and elaborate relationships between them, this could quickly amount to several million very large SQL queries, each stored as a string. Fortunately, the SQL logger can be disabled:

$doctrine->getConnection()->getConfiguration()->setSQLLogger(null);

It turned out that this solved my memory problem: the SQL logger had indeed been consuming a lot of RAM. A nice side effect was that the entire job now ran about twice as fast as before. So disabling SQL logging in Doctrine seems to be a good idea in general for long-running jobs. You can still leave it enabled when the software runs in debug mode or with a verbose flag, if you need it for debugging purposes.
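To get a feeling for why a query log can eat so much memory, here is a small self-contained simulation. It does not use Doctrine at all: `InMemoryQueryLog`, `memoryGrowth` and the query text are made up for illustration, and the sketch assumes PHP 7.4 or newer. It only shows the general effect of keeping every query string in an array for the lifetime of a job.

```php
<?php
// Stand-in for a query logger that keeps every statement in memory.
// This is NOT Doctrine's class, only an illustration of the effect.
final class InMemoryQueryLog
{
    /** @var string[] */
    public array $queries = [];

    public function log(string $sql): void
    {
        $this->queries[] = $sql;
    }
}

// Runs $count fake queries and returns how many bytes are still allocated
// when the loop has finished. Pass null to simulate a disabled logger.
function memoryGrowth(?InMemoryQueryLog $logger, int $count): int
{
    $before = memory_get_usage();

    for ($i = 0; $i < $count; $i++) {
        // Made-up query text; every iteration builds a new string.
        $sql = 'SELECT * FROM export_item WHERE id = ' . $i
            . ' /* ' . str_repeat('x', 200) . ' */';

        if ($logger !== null) {
            $logger->log($sql);
        }
    }

    // The logger (if any) is still alive here, so its stored strings count.
    return memory_get_usage() - $before;
}

$withLogger = memoryGrowth(new InMemoryQueryLog(), 50000);
$withoutLogger = memoryGrowth(null, 50000);

printf("with logger:    %d bytes\n", $withLogger);
printf("without logger: %d bytes\n", $withoutLogger);
```

With the logger in place, memory grows with the number of queries. Without it, each query string is freed as soon as the next one replaces it.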

What language would/do you use to build a web application?

There was an interesting poll on Hacker News a couple of days ago that I decided to bookmark for later and check out again after the results had mostly stabilized. The question asked was “What language would/do you use to build a web application?”. It’s been almost two weeks since that poll, so I think it’s safe to assume the results won’t change much anymore. Here they are in a nice bar graph, courtesy of Google Docs:

[Figure: bar graph “Web Application Development Languages”]

To a regular reader of Hacker News, the results aren’t really surprising. Python in particular is very popular among users there. A little surprise at first sight may be that PHP came in so close to the top, in third place. When reading Hacker News, you could get the impression that everybody hates PHP. But I think that’s a very common misconception. Not many people hate the language; it’s just that the people who do tend to be very vocal about it.

Now, this is just a snapshot of opinion, and you cannot derive any trends from it. Personally, though, I would guess that Java and C# are declining, slowly followed by PHP, Ruby and Python, while languages like JavaScript and Go are rising towards the top. It would be interesting to have a similar poll in a year or so.

Also, I have no idea who the one person is that would/does use Visual Basic to build a web application, but I salute that brave soul.