AhrefsBot: What Is It & How Does It Work

This guide explains AhrefsBot.

Below, you’ll find out what AhrefsBot is, how it works, and how to control this web crawler on your website using directives the bot obeys.

What Is AhrefsBot?

AhrefsBot is a web crawler that compiles and indexes the link database for the Ahrefs digital marketing toolset. AhrefsBot crawls the web to fill the link database with new links and checks the status of existing links to provide up-to-the-minute data for Ahrefs users.

There are currently more than 12 trillion links in the database that AhrefsBot has crawled on the Internet. This link data is used by digital marketers and search engine optimization (SEO) specialists to plan, execute, and monitor their online marketing campaigns.

AhrefsBot is considered a good bot: it is used for marketing purposes and obeys robots.txt rules, including Crawl-Delay directives. It identifies itself with this User-Agent string: Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/).
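As a rough illustration of how the User-Agent string above could be recognized in server logs, here is a minimal Python sketch. The function name `ahrefsbot_version` and the regular expression are assumptions made for this example and are not part of any Ahrefs tooling. Keep in mind that a User-Agent header can be spoofed, so a match only suggests that a request came from AhrefsBot.

```python
import re

# Assumed pattern, based only on the User-Agent string quoted in this guide.
# The version number (7.0 here) may differ in other releases of the bot.
AHREFSBOT_PATTERN = re.compile(r"AhrefsBot/(\d+(?:\.\d+)*)", re.IGNORECASE)


def ahrefsbot_version(user_agent: str):
    """Return the AhrefsBot version found in a User-Agent header, or None."""
    match = AHREFSBOT_PATTERN.search(user_agent)
    return match.group(1) if match else None


if __name__ == "__main__":
    ua = "Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)"
    print(ahrefsbot_version(ua))
```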

How Does AhrefsBot Work?

AhrefsBot works by automatically visiting publicly accessible web pages and following links on those pages. The process of crawling from link to link enables AhrefsBot to find new URLs and dead links on the Internet to keep its database fresh with link data.
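To make the idea of crawling from link to link more concrete, here is a small Python sketch of the link discovery step. It is not how AhrefsBot is implemented. The class `LinkExtractor` and the function `extract_links` are hypothetical names, and the sketch only shows how a crawler could collect the URLs found in anchor tags on one page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the page.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every link found in the given HTML, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links
```

A crawler would then add each returned URL to a queue of pages to visit, which is how new URLs and dead links are eventually found.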

The AhrefsBot crawls more than 6 billion web pages every 24 hours and updates the link index every 15-30 minutes. It uses an algorithmic process to determine the crawl budget for each website. Based on the crawl rate limit and demand it assigns to the website, the AhrefsBot will crawl a different number of web pages during each visit to fetch the link data.

AhrefsBot is also programmed to avoid crawling a website too quickly, because overloading a server can lead to timeouts and server errors. It also does not collect or store any information about the websites it analyzes. Nor does AhrefsBot trigger ad views or show up as visitor traffic in Google Analytics.

According to the Imperva Incapsula Bot Traffic Report, AhrefsBot is one of the most active web spiders used by commercial enterprises to crawl websites and retrieve information for digital marketing purposes. AhrefsBot works continuously to give online marketers better insight into the indexing and ranking algorithms of search engines like Google, Yahoo, and Bing so they can better optimize their websites and SEO campaigns.

Controlling AhrefsBot On Your Website

You can control AhrefsBot through your website’s robots.txt file, either to change how often the crawler visits your website or to block it entirely from crawling links on your domain.

Changing AhrefsBot Crawl Frequency

You can change the AhrefsBot crawl frequency by specifying the minimum acceptable delay between two consecutive requests in the robots.txt file using these directives:

User-agent: AhrefsBot
Crawl-Delay: [value]

The Crawl-Delay value is a time in seconds. For example, Crawl-Delay: 5 asks AhrefsBot to wait at least 5 seconds between two consecutive requests.
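If you want to check that a Crawl-Delay rule is written the way you intended, one option is to parse the file with Python’s standard `urllib.robotparser` module. The sketch below is only a local sanity check. The helper name `crawl_delay_for` and the sample file contents are assumptions for this example, and the parser is Python’s own, not the one AhrefsBot uses.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt contents using the example value from this guide.
ROBOTS_TXT = """\
User-agent: AhrefsBot
Crawl-Delay: 5
"""


def crawl_delay_for(robots_txt: str, user_agent: str):
    """Return the Crawl-Delay (in seconds) that applies to user_agent, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


if __name__ == "__main__":
    print(crawl_delay_for(ROBOTS_TXT, "AhrefsBot"))
```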

Blocking AhrefsBot from Your Website

You can block AhrefsBot from crawling your website by adding the following directives to the robots.txt file:

User-agent: AhrefsBot
Disallow: /

AhrefsBot always respects the Disallow directive, which instructs the spider not to crawl the website. This also prevents AhrefsBot from storing link data about the website in its database, making that data unavailable to Ahrefs users. However, AhrefsBot needs time to pick up the Disallow directive if it is a newly added change in your robots.txt file. Once discovered, AhrefsBot will honor the Disallow directive during the next scheduled crawl.
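Before relying on a Disallow rule, you may want to confirm that it blocks only the bot you intended. The following Python sketch uses the standard `urllib.robotparser` module for that check. The helper `is_allowed`, the sample rules, and the example.com URL are assumptions for illustration, and the result reflects how Python’s parser reads the file rather than a guarantee of how any particular crawler behaves.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt contents that block AhrefsBot from the whole site.
BLOCK_AHREFSBOT = """\
User-agent: AhrefsBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/blog/post"
    print(is_allowed(BLOCK_AHREFSBOT, "AhrefsBot", page))  # blocked by Disallow: /
    print(is_allowed(BLOCK_AHREFSBOT, "OtherBot", page))   # no rule applies to it
```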

You’ll also want to disconnect Google Analytics and Search Console from your Ahrefs account if you’ve set those up. Otherwise, Ahrefs can still access your private website data for reporting purposes.

Note: You can also visit this related tutorial on how to block SemrushBot if you want to prevent that bot from crawling your website. Or read this introductory guide on SemrushBot explaining what it is and how it works.

AhrefsBot IP List

AhrefsBot crawls from a set of IP ranges and individual IP addresses, listed below, that you can whitelist or blacklist to control its access to your website. If you need help blacklisting these IP addresses for your website, then check out this related guide on how to block AhrefsBot with sample code you can copy and paste into your website’s root .htaccess file.

AhrefsBot IP Ranges

AhrefsBot Individual IP Addresses

AhrefsBot IP Addresses for Cloudflare

If you’re using Cloudflare, AhrefsBot may be blocked by the Cloudflare firewall. You can try using the IP ranges above or the individual IP addresses below to lift the restriction by adding them to the firewall whitelist.
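If you manage a whitelist or blacklist in your own code, checking whether a visitor’s IP falls inside a listed range can be sketched with Python’s standard `ipaddress` module. The addresses in this example are placeholders taken from the ranges reserved for documentation; they are not real AhrefsBot IPs, and the names `build_ip_list` and `is_listed` are hypothetical.

```python
import ipaddress

# Placeholder values from the reserved documentation ranges,
# NOT real AhrefsBot addresses. Replace them with the published list.
EXAMPLE_RANGES = ["192.0.2.0/24"]
EXAMPLE_ADDRESSES = ["198.51.100.7"]


def build_ip_list(ranges, addresses):
    """Turn CIDR ranges and single IPs into a list of network objects."""
    networks = [ipaddress.ip_network(r) for r in ranges]
    # A single address is treated as a network containing one host.
    networks += [ipaddress.ip_network(a) for a in addresses]
    return networks


def is_listed(ip: str, networks) -> bool:
    """Return True if ip belongs to any network in the list."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


if __name__ == "__main__":
    ip_list = build_ip_list(EXAMPLE_RANGES, EXAMPLE_ADDRESSES)
    print(is_listed("192.0.2.55", ip_list))
```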

AhrefsBot Summary

I hope you enjoyed this guide on AhrefsBot.

As you discovered, AhrefsBot is a web crawler that compiles and indexes the link database for the Ahrefs digital marketing toolset. AhrefsBot works continually by crawling the web to fill its link database with new links and checking the status of existing links to find dead URLs. This process provides up-to-the-minute data for Ahrefs users. You can change how often AhrefsBot crawls your website, or prevent it from accessing your site, through the robots.txt file or by IP address.