How a benign web crawler found and exploited a critical vulnerability in my server.
This is a brief and partly mundane account of how the Huawei AspiegelBot (PetalBot) web crawler discovered and exploited a serious vulnerability in the back-end services of a website I operate. What started as a simple security vulnerability found by a web crawler quickly escalated into a worldwide attack from hordes of faceless wanna-be hackers and script kiddies running run-of-the-mill exploit and directory scanning software.
It all started when I noticed what seemed to be some kind of Denial-of-Service (DoS) attack against the server I use to host the VF Cash website and services. These attacks were particularly odd because they not only exhausted the server's bandwidth but also locked up the server entirely, requiring a restart to regain any kind of operation or remote access. I thought it strange, but busy with other tasks I just rebooted the server. The problem persisted for about three days, at a frequency of once or twice per day, before I came to realise what exactly the problem was...
I run a REST API that takes URL parameters, and I had not disallowed bot access to the endpoint via the robots.txt file. That is bad enough in its own right, as the last thing I need is web crawlers putting unnecessary stress on my back-end services with random, nonsensical queries.
However, in this instance, it would seem that the web crawler bot "PetalBot" had discovered the VF Cash REST API via the FAQ hyperlink on the main page, which, on top of the robots.txt mistake, I had also failed to mark as a "nofollow" link.
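Both of those omissions are small fixes. A sketch of the robots.txt side, assuming the API lives under a path like /api/ (the real endpoint path is not shown in this post):

```
# robots.txt: ask well-behaved crawlers to skip the API endpoints
User-agent: *
Disallow: /api/
```

The link side is a single attribute: `<a href="/faq" rel="nofollow">FAQ</a>` tells compliant crawlers not to follow the link. Neither measure stops a malicious client, of course; they only keep benign crawlers like PetalBot away.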
Upon discovering the selection of example URLs provided for the REST API, PetalBot decided it would be a good idea to give them a try, and of all possible inputs, it chose "scan" as the input for the balance endpoint. This was particularly problematic because it caused the server to launch an IPv4-wide scan of over 4 billion internet-connected addresses every time the endpoint was executed in this particular manner, which PetalBot seemed to particularly like to do, daily.
Port scanning over 4 billion internet-connected addresses is not ideal for a number of reasons. Primarily, it is an intensive process that takes, well, you know, a few minutes of CPU time to say the least, and it also consumes basically all of your internet bandwidth, particularly when you take into consideration how many of the scanned machines might respond, or even port scan you back to see what's going on. What started as a simple security vulnerability found by a web crawler was now quickly escalating into a worldwide attack from hordes of faceless wanna-be hackers and script kiddies running run-of-the-mill exploit and directory scanning software, whom I was inadvertently making aware of my existence on an, at minimum, daily basis.
The REST API essentially takes URL parameters using PHP and converts them to strings using the escapeshellarg() function, making them safe to use in a shell_exec() call, which executes a binary program with those strings as execution arguments. Once the binary finishes executing, it returns the necessary data to the PHP script, which in turn outputs it to an HTML page.
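A minimal sketch of that pipeline, assuming a hypothetical parameter name and binary path (neither appears in this post):

```php
<?php
// Sketch of the described pipeline. The parameter name "address"
// and the binary path "/usr/local/bin/vfcash" are assumptions.
$address = $_GET['address'] ?? '';

// escapeshellarg() quotes the input so it cannot break out of the
// shell argument, but it cannot know that the *content* of the
// argument is a registered command rather than an account address.
$safe = escapeshellarg($address);

// The quoted string is still passed straight through to the binary.
$output = shell_exec('/usr/local/bin/vfcash ' . $safe);

echo '<pre>' . htmlspecialchars($output ?? '') . '</pre>';
```

Note that escapeshellarg() only prevents shell injection; it does nothing about the binary itself interpreting the argument as a command, which is exactly the gap described next.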
The problem was that, when designing this binary program, I figured the most frequent and common operation it would be used for was checking account balances. For that reason, if only one execution argument was specified and the input did not match any of the other registered commands, the binary assumed the input was an account address for a balance check. A small oversight on my part meant that attackers could enter any of the other registered one-argument commands as an "account address" via the REST API balance check, and the server would execute the specified command rather than the intended balance check. Most of these commands were already accessible via the REST API anyway, but the scan option, which was never really implemented as a serious option, more so a last resort, was particularly harmful: it would lock up the server in a CPU-intensive process lasting some minutes while also exhausting all internet bandwidth, making it really the perfect DoS attack vector. And the first person, or in this case robot, to discover and exploit it in the wild was a web crawler.
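The flawed fallback can be sketched as follows. This is a hypothetical reconstruction in C, assuming the binary is written in C; the command name "scan" comes from the post, but the function names and the hex-only address format are my own placeholders:

```c
#include <string.h>
#include <ctype.h>

/* The flawed logic: any single argument that is not a registered
   command is assumed to be an account address. But "scan" IS a
   registered command, so a crawler submitting it as an "address"
   launches the IPv4-wide port scan instead of a balance check. */
const char *dispatch(const char *arg)
{
    if (strcmp(arg, "scan") == 0)
        return "scan";      /* the DoS vector */
    /* ...other registered one-argument commands elided... */
    return "balance";       /* fallback: treat the input as an address */
}

/* One possible guard: only fall back to a balance check when the
   input actually looks like an address (hex-only here, as a stand-in
   for the real address format). */
static int looks_like_address(const char *s)
{
    if (*s == '\0')
        return 0;
    for (; *s; s++)
        if (!isxdigit((unsigned char)*s))
            return 0;
    return 1;
}

const char *dispatch_guarded(const char *arg)
{
    if (strcmp(arg, "scan") == 0)
        return "scan";
    if (!looks_like_address(arg))
        return "rejected";  /* ambiguous input never hits the fallback */
    return "balance";
}
```

The design point is that a catch-all fallback and a command namespace must not share an input space; validating the shape of the fallback input removes the ambiguity. The public-facing balance endpoint, of course, should additionally never expose the command namespace at all.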
So what started as just a few arguably minor oversights quickly escalated to expose one of the most critical denial-of-service attack vectors my server had to offer, and to none other than your average search-engine web crawler.
Just my luck. Really, if you think about it.