Once used primarily by search engines, bots now have a variety of uses, both good and bad. The good bots are primarily search engine crawlers and other similar bots used for aggregating or monitoring content. These bots obey the website owner’s rules as specified in the robots.txt file, publish ways to verify that they are who they claim to be, and work in a way that avoids overwhelming the websites and applications they visit.
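To make the robots.txt point concrete, here is a minimal sketch of how a well-behaved crawler might check a site's rules before fetching a page. It uses Python's standard `urllib.robotparser`; the robots.txt content, the `ExampleBot` user agent, and the example.com URLs are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is an assumed user agent name, not a real crawler.
allowed_public = parser.can_fetch("ExampleBot", "https://example.com/products")
allowed_private = parser.can_fetch("ExampleBot", "https://example.com/private/report")

# A polite crawler would also wait this many seconds between requests.
delay = parser.crawl_delay("ExampleBot")

print(allowed_public, allowed_private, delay)
```

A bad bot simply skips this check, which is why robots.txt is a courtesy protocol rather than a security control.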
Bad bots are built to perform various malicious activities. They range from basic scrapers that try to get some data off an application (and are easily blocked), to advanced persistent bots that behave almost like human beings and look to evade detection as much as possible. These bots attempt attacks that range from web and price scraping to inventory hoarding, account takeover attacks, distributed denial-of-service (DDoS) attacks, and much more.
Barracuda researchers have been tracking bots on the internet and their effect on applications for several years. In analyzing traffic patterns for the first six months of 2023, they identified several interesting trends.
From January 2023 to June 2023, bots made up nearly 50% of internet traffic, with bad bots making up 30% of traffic. That’s down from 2021 when Barracuda research found that bad bots made up 39% of internet traffic.
North America was the source of 72% of bad bot traffic in the first half of 2023. Roughly two-thirds (67%) of bad bot traffic came from hosting providers, while 33% was from residential and other IP addresses. Most bad bot traffic comes from the two large public clouds, AWS and Azure, which skews the geographic data toward North America.
Let’s take a closer look at what’s driving these trends and where the traffic is coming from.
The e-commerce bot bubble
When PlayStation 5 launched in 2020, people quickly realized that it was out of stock everywhere — except with unauthorized resellers who used e-commerce bots to quickly buy up all of the available PS5s and then resell them at a much higher price. This brought bad bots into the limelight, and from late 2020 to mid-2022 we saw a significant amount of bad bot traffic from these types of e-commerce bots.
We saw a lot of people using bad bots to buy any newly launched, limited-edition item: sneakers, clothes, Funko Pops, and more. Bot forums were loaded with people trying to figure out ways to get around restrictions and anti-bot protections, and many were making real money. This trend finally ended in late 2022, when the bottom dropped out of the sneaker resale market after inflation started rising.
This decrease in traffic from e-commerce bots was likely the main driver for the drop in bad bot traffic from 39% of internet traffic in the first half of 2021 to 30% in the first half of 2023.
Sources of bot traffic
In their analysis, Barracuda’s researchers also uncovered interesting insights into where bad bot traffic is coming from. The U.S. is the country of origin for almost three-quarters (72%) of bad bot traffic. The next four regions are the United Arab Emirates (12%), Saudi Arabia (6%), Qatar (5%), and India (5%). However, the traffic source is skewed toward the U.S. because 67% of bad bot traffic comes from public cloud data centers’ IP ranges.
From our sample set, most of the bot traffic comes in from the two large public clouds, AWS and Microsoft Azure, in roughly equal measure. This could be because it is easy to set up an account for free with either provider and then use the account to set up bad bots. The concentration in known cloud IP ranges also makes it relatively simple to identify and block these bots. If your application does not expect traffic from a specific data center IP range, you can consider blocking it, similar to geo-IP based blocking.
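Blocking by data center IP range can be sketched in a few lines. The example below is an illustration only, not how any particular WAF implements it: the `is_blocked` helper is hypothetical, and the CIDR ranges are reserved documentation addresses standing in for real cloud provider ranges, which you would normally load from the provider's published lists.

```python
import ipaddress

# Placeholder ranges (reserved documentation addresses), NOT real cloud ranges.
# In practice these would come from the ranges your providers publish.
BLOCKED_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "198.51.100.0/24")
]


def is_blocked(client_ip: str) -> bool:
    """Return True if the client IP falls inside any blocked range."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in network for network in BLOCKED_RANGES)


print(is_blocked("192.0.2.15"))   # inside the first placeholder range
print(is_blocked("203.0.113.7"))  # outside both placeholder ranges
```

Before blocking a range outright, it is worth checking whether legitimate partners, monitoring services, or your own infrastructure send traffic from it.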
Barracuda’s researchers also saw a significant amount of bad bot traffic (33%) coming from residential IP addresses. Much of this comes from bot creators trying to hide in residential traffic by routing requests through proxies that use someone else’s IP address to bypass IP blocks.
Attackers have been using this tactic for some years now, particularly for things like web scraping or other bot attacks. If attackers are doing something malicious, they don’t want to do it from their own IP address due to traceability, so they end up using services that provide anonymous residential IP ranges.
This can sometimes lead to residential IP users ending up in “CAPTCHA hell,” unable to pass CAPTCHAs from Google or Cloudflare because their IP was used by one of these attackers and flagged for malicious activity.
Increasing attacks on APIs
The more serious bot threat groups are still operating, getting more sophisticated, and causing serious damage. Bots are getting cleverer, and as a result account takeover attacks, including attacks against APIs, are increasing. Attacks against APIs are growing because they are relatively under-protected and easier to attack with automation because they are made for automation.
These account takeover attacks generally start with a brute-force attack or a credential stuffing/password spraying attack. In a brute-force attack, cybercriminals keep trying permutations and combinations of credentials until they find one that succeeds. For example, an attacker would use a list of common usernames (like admin or administrator) and passwords (like hunter123 or password) and keep iterating until they are successful. In credential stuffing, attackers start with known good credentials from a data breach and rely on people reusing their passwords on other sites. These attacks are more successful and get to that success sooner because password reuse is so common.
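The difference between the two attack styles shows up in failed-login logs. The sketch below is a rough heuristic under stated assumptions, not a production detector: the `classify_sources` function, the `min_attempts` threshold, and the 0.5 ratio cutoff are all invented for illustration. It labels a source as brute-force-like when it hammers a few usernames repeatedly, and as stuffing-or-spraying-like when nearly every attempt uses a different username.

```python
from collections import defaultdict


def classify_sources(failed_logins, min_attempts=5):
    """Label noisy sources from an iterable of (source_ip, username) failures.

    Thresholds are illustrative assumptions and would need tuning on real data.
    """
    attempts = defaultdict(int)
    usernames = defaultdict(set)
    for source_ip, username in failed_logins:
        attempts[source_ip] += 1
        usernames[source_ip].add(username)

    labels = {}
    for source_ip, count in attempts.items():
        if count < min_attempts:
            continue  # too few failures to say anything
        # Share of attempts that targeted a distinct username.
        distinct_ratio = len(usernames[source_ip]) / count
        if distinct_ratio > 0.5:
            labels[source_ip] = "stuffing-or-spraying-like"
        else:
            labels[source_ip] = "brute-force-like"
    return labels


# Synthetic events using documentation IP addresses.
events = (
    [("192.0.2.1", "admin")] * 6
    + [("192.0.2.2", f"user{i}") for i in range(6)]
    + [("192.0.2.3", "alice")] * 2
)
labels = classify_sources(events)
print(labels)
```

Per-source counting like this only catches attacks that come from a small number of addresses; bots that rotate through residential proxies need signals beyond the IP.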
Defenses like rate limits and multifactor authentication (MFA) can help detect and stop brute-force attacks, so attackers will then try things like low-and-slow bots to bypass rate limits and other techniques like phishing and MFA-bombing to bypass MFA. Unfortunately, many organizations do not have proper rate limits and monitoring in place, which can lead to bigger problems, as it did with the Optus breach in 2022.
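To show why rate limits stop fast attacks but not low-and-slow bots, here is a minimal sliding-window limiter. It is a sketch under assumed values: the `SlidingWindowLimiter` class, the limit of 3 attempts, and the 60-second window are hypothetical, and timestamps are passed in explicitly so the behavior is easy to follow.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_attempts per key within the last window_seconds."""

    def __init__(self, max_attempts: int, window_seconds: float):
        self.max_attempts = max_attempts
        self.window_seconds = window_seconds
        self._attempts = defaultdict(deque)

    def allow(self, key: str, now: float) -> bool:
        recent = self._attempts[key]
        # Drop attempts that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_attempts:
            return False
        recent.append(now)
        return True


limiter = SlidingWindowLimiter(max_attempts=3, window_seconds=60)

# A fast burst: four login attempts within three seconds.
burst = [limiter.allow("fast-bot", t) for t in (0, 1, 2, 3)]

# A low-and-slow bot: one attempt every 30 seconds stays under the limit.
slow = [limiter.allow("slow-bot", t) for t in (0, 30, 60, 90)]

print(burst)
print(slow)
```

The burst is cut off at the fourth attempt, while the slower bot is never blocked. That gap is why rate limiting needs to be paired with monitoring and bot detection that look at behavior over longer periods.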
Effective defenses
When it comes to protecting against bot attacks, organizations can be overwhelmed at times due to the number of solutions required. The good news is that solutions are consolidating into Web Application and API Protection (WAAP) services. To protect your business, as well as your data, analytics, and inventory, you need to invest in WAAP technology that identifies and stops bad bots. This will improve both user experience and overall security.
- Put proper application security in place. Install a web application firewall or WAF-as-a-Service solution and make sure it is properly configured with rate limiting and monitoring in place. This is an important first step to make sure your application security solution is working as intended.
- Invest in bot protection. Make sure the application security solution you choose includes anti-bot protection so it can effectively detect and stop advanced automated attacks.
- Take advantage of machine learning. With a solution that uses the power of machine learning, you can effectively detect and block hidden almost-human bot attacks. Be sure to turn on credential stuffing protection to prevent account takeover as well.