How to Detect Bad Bot Traffic

In this episode, I’m going to show you how to detect bad bot traffic.

Bot traffic is described as any non-human traffic to a website or app. There are good bots out there, like bots that crawl content for Google, digital assistants, chatbots, and social bots. Bad bots, however, are those that scrape content, spread spam, or carry out credential stuffing attacks. It is believed that over 40% of internet traffic is bot traffic, and a significant share of that is malicious.

These malicious bots violate a website’s Terms of Service or robots.txt rules. There are also bots that carry out cybercrime such as identity theft or account takeover. Some of these practices are illegal, but a bot can still be malicious without doing anything illegal.
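As a rough illustration of what "violating robots.txt rules" means in practice, here is a minimal sketch that checks whether a requested path is off limits for a given user agent. It uses Python’s standard `urllib.robotparser`. The helper names `load_rules` and `violates_robots`, the sample rules, and the `ExampleBot` user agent are all hypothetical, and a well-behaved visitor can still hit a disallowed path by accident, so treat a match as a hint rather than proof.

```python
from urllib.robotparser import RobotFileParser


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse the text of a robots.txt file into a rule set."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def violates_robots(parser: RobotFileParser, user_agent: str, path: str) -> bool:
    """Return True if this user agent is not allowed to fetch this path."""
    return not parser.can_fetch(user_agent, path)


# Hypothetical rules: every crawler is asked to stay out of /private/.
rules = load_rules("User-agent: *\nDisallow: /private/\n")
print(violates_robots(rules, "ExampleBot", "/private/report"))
print(violates_robots(rules, "ExampleBot", "/blog/post"))
```

A client that keeps requesting disallowed paths is ignoring the rules the site published, which is one signal worth logging.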

To disguise themselves, bots may be distributed through a botnet, meaning copies of the bot are running on multiple devices. Because each device has its own IP address, botnet traffic comes from multiple different IP addresses, which makes it difficult to identify and block the source of the malicious bot traffic.

Excessive bot traffic can overwhelm a web server’s resources, slowing or stopping services for a real human trying to use a website or application. Sometimes this is intentional and takes the form of a DoS or DDoS attack.

Malicious bot activity can include credential stuffing, web/content scraping, DoS or DDoS attacks, brute force password cracking, inventory hoarding, spam content, email address harvesting, and click fraud. Advertising companies are good at detecting bots, so if a website is monetizing through ads and there is click fraud, there is potential for the advertising company to ban the website and its owner from its network.

Bot traffic can also skew a publisher’s analytics. Distorted metrics make it difficult to measure the actual performance of a website.

Google Analytics allows you to exclude all hits from known bots and spiders. Additionally, if the source of a bot can be identified, you can provide a specific list of IPs for Google Analytics to ignore.
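The idea of ignoring hits from a list of IPs can be sketched in a few lines. This is not Google Analytics code or its API. It is a generic illustration using Python’s standard `ipaddress` module, and the function names, the `hits` structure, and the example addresses (taken from documentation-only ranges) are assumptions made for the example.

```python
import ipaddress


def build_exclusions(entries):
    """Turn strings like '203.0.113.0/24' or '198.51.100.7' into network objects."""
    return [ipaddress.ip_network(entry, strict=False) for entry in entries]


def is_excluded(ip, exclusions):
    """Return True if the IP falls inside any excluded network."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in exclusions)


def filter_hits(hits, exclusions):
    """Keep only the hits whose source IP is not on the exclusion list."""
    return [hit for hit in hits if not is_excluded(hit["ip"], exclusions)]


# Hypothetical exclusion list: one whole range and one single address.
exclusions = build_exclusions(["203.0.113.0/24", "198.51.100.7"])
hits = [
    {"ip": "203.0.113.9", "path": "/"},
    {"ip": "198.51.100.7", "path": "/pricing"},
    {"ip": "192.0.2.1", "path": "/blog"},
]
print(filter_hits(hits, exclusions))
```

Only the hit from 192.0.2.1 survives the filter, because the other two addresses are on the exclusion list.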

While this will filter most bots out of your reports, it can’t catch all of them. It also does nothing to stop harmful bots: most malicious bots are trying to do something other than disrupt traffic analytics, so these measures only preserve analytics data.

A number of other tools can help stop abusive bot traffic. CAPTCHAs can be a first step in trying to block bots, though they will mostly deter only basic bots.

Another way is rate limiting, or limiting network traffic. It puts a cap on how often someone can repeat an action, like logging in, within a certain timeframe. A network engineer can also go through log files to look for suspicious network requests in a website’s traffic and gather the offending IP addresses to be blocked. This process is extremely tedious, though.
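To make rate limiting concrete, here is a minimal sliding-window sketch that caps how many times one client can repeat an action, such as a login attempt, within a time window. The class name, the limit of 3 attempts per 60 seconds, and keying by IP address are all assumptions for illustration. A production setup would more likely use the rate limiting built into a web server, CDN, or framework.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowRateLimiter:
    """Allow at most max_actions per window_seconds for each client key."""

    def __init__(self, max_actions: int, window_seconds: float):
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        # Maps a client key (for example an IP) to timestamps of its recent actions.
        self._events = defaultdict(deque)

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        """Record the action and return True, or return False if the cap is reached."""
        now = time.monotonic() if now is None else now
        events = self._events[key]
        # Forget actions that have aged out of the window.
        while events and now - events[0] >= self.window_seconds:
            events.popleft()
        if len(events) >= self.max_actions:
            return False
        events.append(now)
        return True


# Hypothetical policy: 3 login attempts per minute per IP.
limiter = SlidingWindowRateLimiter(max_actions=3, window_seconds=60)
for second in (0, 1, 2, 3):
    print(second, limiter.allow("203.0.113.9", now=second))
```

With this policy the first three attempts are allowed and the fourth is rejected, while a different IP address still has its full allowance. Keying by IP alone will not stop a botnet, since each device gets its own allowance.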

The best way to stop malicious bot traffic is through a bot management system. A good bot management system should be able to detect bad bot activity and differentiate it from both real user activity and helpful bot activity.

Additionally, sites with limited inventory can be targeted by inventory hoarding bots; these bots go to e-commerce sites and load tons of merchandise into their shopping carts, making that merchandise unavailable for purchase by legitimate customers. This can also trigger unnecessary restocking of inventory.

To identify bot traffic, web engineers can look directly at network requests to their sites. Common signs of bot traffic include abnormally high pageviews, an abnormally high bounce rate, surprisingly high or low session duration, junk conversions, and a spike in traffic from an unexpected location.
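One of the signals above, abnormally high request volume, can be checked by looking directly at network requests in a server log. The sketch below counts requests per IP and flags anything over a threshold. It assumes the log lines start in Common Log Format (client IP first, then the timestamp in brackets, then the quoted request). The function names, the sample lines, and the threshold are hypothetical and would need tuning against a site’s normal traffic.

```python
import re
from collections import Counter

# Matches the start of a Common Log Format line:
# client IP, identity, user, [timestamp], "METHOD path ...
LOG_LINE = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)')


def requests_per_ip(log_lines):
    """Count requests per client IP, skipping lines that do not parse."""
    counts = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match:
            counts[match.group("ip")] += 1
    return counts


def suspicious_ips(log_lines, threshold):
    """Return the IPs that made more than `threshold` requests, sorted."""
    counts = requests_per_ip(log_lines)
    return sorted(ip for ip, count in counts.items() if count > threshold)


# Hypothetical log sample: one client hammering /login, one ordinary visitor.
sample = ['203.0.113.9 - - [10/Oct/2023:13:55:36 +0000] "GET /login HTTP/1.1" 200 512'] * 5
sample.append('198.51.100.7 - - [10/Oct/2023:13:55:40 +0000] "GET / HTTP/1.1" 200 1024')
print(suspicious_ips(sample, threshold=3))
```

A high count alone does not prove a bot, since offices and mobile carriers often share one IP address, so the flagged IPs are candidates for a closer look rather than an automatic block list.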