Mission-critical integer increment operations in PHP.
--
Sometimes you might want a simple database system that uses the OS file system to increment or decrement a balance or score, but in PHP and other web server languages this can be problematic.
Why would you want to avoid a widely used database system such as MySQL? In some cases a fully-fledged database system is overkill. Say you have a simple web game such as Neopets (oh, the cringe of the late ’90s) and you just wish to make sure all player balances are accounted for correctly when handled by a PHP thread spawned by NGINX or Apache, with no external calls to database systems; the OS file system does the heavy lifting. After all, the file system already indexes files in folders, and it’s usually pretty efficient: you can have a folder named “balances” with each file in it representing a player name. This can be simpler and faster in some cases, akin to a cached Redis query without the Redis server. But the crux of the problem is that when you use a server-side scripting language such as PHP, these little threaded processes, spun up per remote request to your server, cannot linger around long enough to ensure the data has been written successfully; otherwise they risk filling the allocated thread pool and, worse, making your site slow and unresponsive. These threads should execute and end as fast as possible, and just reading and writing to file in the blind faith that it will rarely fail will produce terrible results, with a significantly high failure rate.
The answer to this problem in PHP is to spawn a new process to handle the file increment/decrement task for you, and then let the PHP thread close as usual in record-breaking time. To do this, the PHP exec() function can be used, but more on that later.
This means that we are going to need a specialised process we can call which takes two arguments: the file we intend to increment and a signed integer for how much to increment by. So, if our program were named inc, the expected invocations would be:

inc balances/player_john -3
inc balances/player_sally +3

to decrement John’s balance by 3 or increment Sally’s by 3, respectively.
I had this problem myself when working on a prototype game, StarTrader.io, and as a result, back in February, created such a program in C, which I have shared below.
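A minimal sketch of such a program (an approximation of mine, not the shared code verbatim) might look like the following, assuming flock()-based advisory locking and values stored as human-readable strings; the faw (“failed write”) flag mirrors the behaviour discussed further down:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/file.h>

int main(int argc, char **argv)
{
    if (argc != 3)
        return 1; /* usage: inc <file> <signed delta, e.g. -3 or +3> */

    const long delta = strtol(argv[2], NULL, 10);
    char buf[64];

    while (1) {
        /* a fresh open + lock on every iteration, in case a previous
         * handle was invalidated (discussed further down) */
        const int fd = open(argv[1], O_RDWR);
        if (fd == -1) { sleep(1); continue; }
        if (flock(fd, LOCK_EX) == -1) { close(fd); sleep(1); continue; }

        const ssize_t r = read(fd, buf, sizeof(buf) - 1);
        if (r < 0) { /* failed read: release and start over */
            flock(fd, LOCK_UN);
            close(fd);
            sleep(1);
            continue;
        }
        buf[r] = '\0';

        const long value = strtol(buf, NULL, 10) + delta;
        const int n = snprintf(buf, sizeof(buf), "%ld", value);

        /* faw == 1: a failed write may leave a corrupt value behind, so
         * keep the lock and retry until the write finally succeeds */
        int faw = 1;
        do {
            if (lseek(fd, 0, SEEK_SET) != -1 && ftruncate(fd, 0) != -1
                && write(fd, buf, (size_t)n) == (ssize_t)n)
                faw = 0;
            else
                sleep(1);
        } while (faw == 1);

        flock(fd, LOCK_UN);
        close(fd);
        return 0;
    }
}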
This program can be compiled with gcc increment.c -o inc. If you don’t have the GCC compiler installed, you can install it on Ubuntu/Debian with sudo apt install gcc. Once compiled, copy it to your /usr/bin folder with sudo cp inc /usr/bin/inc.
To execute this program from PHP so that the exec() function does not linger until the process ends, you will need to call it like so:

exec('nohup /usr/bin/inc /var/www/html/balances/player_john -3 > /dev/null 2>&1 &');
To understand this command you will need to look into the nohup command, which launches a Linux process immune to hangup signals so that it keeps running even after the shell that spawned it exits, and the > /dev/null 2>&1 redirection, which discards both the standard output and the error output. The final ampersand (&) tells the shell to run the process in the background. The full combination launches the process and immediately returns to the shell, so exec() returns straight away rather than lingering around blocking up the PHP-FPM thread pool or similar.
This newly spawned process will then linger around for as long as it takes until the intended file-based integer increment/decrement operation has succeeded. There can be multiple of these processes running against a single file, and thanks to the file lock/unlock, none of the increment operations will tread on one another’s toes, so to speak. As long as your disk is not completely borked, you should not expect to see one of these processes linger around ‘forever’.
This program also stores the data in the file as a string, so reading it into PHP is simply:

$balance = intval(file_get_contents('balances/player_john'));

If the value were stored as raw binary you would need the PHP unpack() function instead. It’s a preference, and generally, for these kinds of languages, I would assume most developers prefer human-readable strings for convenience.
Why do I unlock and close the file handle at the beginning of each while iteration, just to open it back up and lock it again, rather than doing so once before the while loop starts? Well, say that someone were to modify the file while our file handle is open: the lock is only advisory, so a user or process could ignore it and modify, or even replace, the file, invalidating the handle we originally opened. Again, it’s a preference; if you are sure that your file handle will never be invalidated, you can save some performance by restructuring the code, but is the risk to reward ratio worth it? Probably not. Otherwise, this gist has the alternate variant (see the sketch below).
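As a rough idea, that alternate structure might look like this (again an approximation of mine, not the gist itself), with the handle opened and locked exactly once up front:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/file.h>

int main(int argc, char **argv)
{
    if (argc != 3)
        return 1; /* usage: inc <file> <signed delta> */

    const long delta = strtol(argv[2], NULL, 10);
    char buf[64];
    int fd;
    ssize_t r;

    /* open and lock exactly once, before the retry loop */
    while ((fd = open(argv[1], O_RDWR)) == -1)
        sleep(1);
    while (flock(fd, LOCK_EX) == -1)
        sleep(1);

    /* faw == 0 side: retry the read on the same handle */
    while ((r = pread(fd, buf, sizeof(buf) - 1, 0)) < 0)
        sleep(1);
    buf[r] = '\0';

    const long value = strtol(buf, NULL, 10) + delta;
    const int n = snprintf(buf, sizeof(buf), "%ld", value);

    /* faw == 1 side: never unlock until the write has succeeded */
    while (ftruncate(fd, 0) == -1
           || pwrite(fd, buf, (size_t)n, 0) != (ssize_t)n)
        sleep(1);

    flock(fd, LOCK_UN);
    close(fd);
    return 0;
}

The saving is marginal: one open()/flock() pair for the whole run instead of one per retry, at the cost of trusting the handle to stay valid.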
However, on the faw (failed write) condition, we cannot unlock the file, as it currently contains a corrupt value and we risk another process incrementing that corrupt value, which would result in a catastrophic failure. So it is vital that we keep looping in a locked state until the write succeeds and the file contains a valid value for the next process to increment upon. It is possible that the file handle could become invalidated during this final faw == 1 write loop, locking the process into an infinite loop and writing corrupt data to the file indefinitely. In that case you would need to have made a backup of the original file value at the successful read() part of program execution, and to add a timeout to the faw == 1 loop so that after x tries the program restores the file’s original state and exits. But, in our case, we’re taking that risk because it is incredibly rare that this would happen. One could also add a timeout to the faw == 0 part of the loop, but that’s even less likely to fail forever. Still, you can never be too careful when it’s mission critical! So here is variant 3, which is variant 2 but with the sleep() function replaced by a doStikeout() function that triggers an evil goto statement when the strikeout exceeds its limit, causing the program to exit. The exception is when the write() fails: in that instance the program makes what will almost certainly be a futile attempt to write back the original file state; because write() was already failing, it is very unlikely the restoring write() will succeed either… but why not try? Really, you should never unlock the file until the write succeeds, hence having added the ALLOW_WRITE_FAIL definition; ordinarily, the process might keep the file locked until it at least restores the original state and then schedule the increment to reoccur at a later epoch (there is no time, only ticks lol).
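As a hedged approximation of variant 3 (not the shared code itself): doStikeout() is rendered here as a macro so that a “call” can trigger the goto, and the strike limit of 60 is a placeholder of mine; only the faw loops, doStikeout() and ALLOW_WRITE_FAIL names come from the description above.

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/file.h>

#define ALLOW_WRITE_FAIL 1 /* permit striking out of the faw == 1 loop too */
#define STRIKE_LIMIT 60    /* placeholder: x tries before giving up */

static int strikes = 0;

/* sketched as a macro so a "function call" can trigger the evil goto */
#define doStikeout() \
    do { if (++strikes > STRIKE_LIMIT) goto strikeout; sleep(1); } while (0)

int main(int argc, char **argv)
{
    if (argc != 3)
        return 1; /* usage: inc <file> <signed delta> */

    const long delta = strtol(argv[2], NULL, 10);
    char buf[64], orig[64] = "";
    ssize_t r = 0;
    int fd = -1, n = 0;

    while ((fd = open(argv[1], O_RDWR)) == -1)
        doStikeout();
    while (flock(fd, LOCK_EX) == -1)
        doStikeout();

    /* back up the original value at the successful read() */
    while ((r = pread(fd, orig, sizeof(orig) - 1, 0)) < 0)
        doStikeout();
    orig[r] = '\0';

    n = snprintf(buf, sizeof(buf), "%ld", strtol(orig, NULL, 10) + delta);

    /* faw == 1 loop: stay locked until the write succeeds... */
    while (ftruncate(fd, 0) == -1
           || pwrite(fd, buf, (size_t)n, 0) != (ssize_t)n) {
#if ALLOW_WRITE_FAIL
        doStikeout(); /* ...or until we strike out */
#else
        sleep(1);
#endif
    }

    flock(fd, LOCK_UN);
    close(fd);
    return 0;

strikeout:
    /* almost certainly futile: try to restore the original file state
     * before giving up (the write was already failing, after all) */
    if (fd != -1 && r > 0 && ftruncate(fd, 0) == 0)
        (void)pwrite(fd, orig, (size_t)r, 0);
    return 1;
}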
Either way, we don’t want to go too crazy here, so we just make a decision to fail on a strikeout or loop in a locked state until the write succeeds.
Generally speaking, you can run variant 1 and very likely never see the process lock up in your lifetime. It’s just food for thought at the end of the day.
This brings us back to the risk to reward ratio: sometimes a particular condition is so rare that it is comparable to non-ECC memory flipping a random bit and corrupting the program execution, resulting in a corrupt output file or just losing our increment entirely. We could add redundancy checks for a case such as this too, but is the high performance penalty worth the marginally reduced risk? Those are decisions you have to make based on the scenario you are working with.
If you are interested in learning more about how cosmic rays can randomly flip bits in computer memory, I personally found this YouTube video and this Hackaday article quite entertaining on the matter.