Crawl budget

April 05, 2023


When a page is published on your website, is it indexed and ranked by Google?

For a page to appear in search results and drive traffic to your website, Google must first crawl it. In Google's own words: "Crawling is the entry point for websites into Google's search results."

However, because Google does not have unlimited time and resources to crawl every page on the web at all times, not every page gets crawled.

This is what SEOs refer to as the crawl budget, and optimizing it can be crucial for the growth of your company website.

Read on to learn what a crawl budget is and how to make the most of it.

What is a crawl budget?

The crawl budget is the amount of time and resources a search engine allocates to crawling a specific website. In other words, it's the maximum number of pages a search engine can crawl on your website within a given timeframe. The crawl budget can vary for different search engines (or crawlers).

Google states that you shouldn't worry about the crawl budget unless:

  • Your website has more than 1 million unique pages whose content changes about once a week.
  • You have a medium-sized website with around 10,000 pages whose content changes frequently (daily).
  • You run a news website.
  • Search Console classifies most of your website's URLs as "Discovered - currently not indexed".

Each website receives a different crawl budget due to these two factors:

1. Crawl demand. This is determined by the number of pages, how often they are published or updated, and how popular they are.

2. Crawl rate limit. This is influenced by server capacity, crawl limits set by the website owner (in Search Console), the search engine's own crawl limit, and other factors. Google also adjusts the crawl rate automatically: it slows down when the server responds slowly and speeds up when the server responds quickly.

Should I be worried about the crawl budget?

If you're working on smaller websites, you might not need to worry about the crawl budget. According to Google, "Most publishers don't need to worry about the crawl budget. If a website has fewer than a few thousand URLs, it's usually crawled efficiently."

However, if you are working on large websites, especially those that automatically generate pages based on URL parameters, you should prioritize activities that help Google understand what to crawl and when.
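To get a feel for how many crawlable URLs your parameters generate, you can group a list of URLs by their path and count the variants. The following is a minimal sketch, not a definitive tool: the function name `parameter_variants`, the `min_variants` threshold, and the example.com URLs are assumptions made for illustration, and the URL list would come from your own crawl or server logs.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlsplit


def parameter_variants(urls, min_variants=2):
    """Group URLs by scheme, host and path, ignoring the query string.

    Returns (base URL, number of distinct URLs, parameter names) rows,
    sorted so the base URLs with the most variants come first.
    """
    variants = defaultdict(set)  # base URL -> distinct full URLs seen
    params = defaultdict(set)    # base URL -> parameter names seen
    for url in urls:
        parts = urlsplit(url)
        base = f"{parts.scheme}://{parts.netloc}{parts.path}"
        variants[base].add(url)
        query_pairs = parse_qsl(parts.query, keep_blank_values=True)
        params[base].update(name for name, _ in query_pairs)
    rows = [
        (base, len(found), sorted(params[base]))
        for base, found in variants.items()
        if len(found) >= min_variants
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical URLs for illustration only.
    sample = [
        "https://example.com/shoes?color=red",
        "https://example.com/shoes?color=blue&sort=price",
        "https://example.com/shoes",
        "https://example.com/about",
    ]
    for base, count, names in parameter_variants(sample):
        print(base, count, names)
```

Base URLs with many variants are candidates for a closer look: they may be serving the same content under many addresses.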

Why is crawl budget important for SEO?

An entire branch of search engine optimization is dedicated to this situation. The goal is to guide Googlebot so that the available crawl budget is spent effectively and the high-quality pages that matter most to the website owner get indexed.

Less important pages must be identified first. These are primarily pages with poor content or little information, as well as broken pages that return a 404 error code.

These pages must be excluded from crawling so that the crawl budget remains available for the higher-quality pages. Subsequently, the important subpages must be structured so that they are prioritized for crawling by spiders.
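One way to find such pages is to look at what Googlebot actually requests on your server. The sketch below assumes access logs in the common "combined" format and simply trusts the user agent string, which can be spoofed, so treat the numbers as a rough indication only. The function name and the sample log lines are made up for illustration.

```python
import re
from collections import Counter

# Matches the request and status part of a combined-format log line,
# e.g. "GET /some/path HTTP/1.1" 404 512
LOG_PATTERN = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def googlebot_status_counts(log_lines):
    """Count Googlebot requests per status code and per broken path."""
    by_status = Counter()
    broken_paths = Counter()
    for line in log_lines:
        # Assumption: the user agent string is trusted as-is.
        if "Googlebot" not in line:
            continue
        match = LOG_PATTERN.search(line)
        if match is None:
            continue
        status = int(match.group("status"))
        by_status[status] += 1
        if status in (404, 410):
            broken_paths[match.group("path")] += 1
    return by_status, broken_paths


if __name__ == "__main__":
    # Hypothetical log lines for illustration only.
    bot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    sample = [
        f'66.249.66.1 - - [05/Apr/2023:10:00:00 +0000] "GET /old-page HTTP/1.1" 404 512 "-" "{bot}"',
        f'66.249.66.1 - - [05/Apr/2023:10:00:05 +0000] "GET /products HTTP/1.1" 200 2048 "-" "{bot}"',
    ]
    by_status, broken_paths = googlebot_status_counts(sample)
    print(dict(by_status))
    print(broken_paths.most_common(10))
```

Paths that Googlebot keeps requesting and that keep returning 404 or 410 are the first candidates to fix, redirect, or stop linking to.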

Here are some simple ways to maximize your website's crawl budget.

Best practices

Improve page load speed

Improving your website's page load speed can lead to Googlebot crawling more URLs on your site. In fact, Google says:

"A fast-loading website improves the user experience and also increases the crawl rate." In other words, slow pages consume valuable Googlebot time. If your pages load quickly, Googlebot has time to visit and index more of them.

Use internal links

Googlebot prioritizes pages with many external and internal links pointing to them. Ideally, every single page on your website would have backlinks, but that's unrealistic in most cases. That's why internal linking is so important: your internal links direct Googlebot to all the pages on your website that you want indexed.

Flat website architecture

According to Google: "URLs that are more popular on the internet tend to be crawled more often to keep them fresher in our index."

And in Google's world, popular means link authority.

Therefore, use a flat website architecture, meaning one in which every page can be reached within a few clicks of the homepage. A flat architecture ensures that all pages of your website receive a certain level of link authority.
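How flat a site is can be estimated by measuring click depth, the smallest number of clicks needed to reach each page from the homepage. Here is a minimal sketch using a breadth-first search. The `links` dictionary (page to the pages it links to) is a hypothetical stand-in for the output of your own site crawl, and the function names are made up for this example.

```python
from collections import deque


def click_depths(links, homepage):
    """Return the minimum number of clicks from the homepage to each page."""
    depths = {homepage: 0}
    queue = deque([homepage])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def unreachable_pages(links, homepage, all_pages):
    """Return known pages that no chain of internal links leads to."""
    reachable = click_depths(links, homepage)
    return sorted(set(all_pages) - set(reachable))


if __name__ == "__main__":
    # Hypothetical internal link graph for illustration only.
    links = {"/": ["/a", "/b"], "/a": ["/c"], "/c": ["/"]}
    print(click_depths(links, "/"))
    print(unreachable_pages(links, "/", ["/", "/a", "/b", "/c", "/d"]))
```

Pages with a large depth value sit far from the homepage, and pages reported as unreachable cannot be found through internal links at all.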

Avoid orphan pages

Orphan pages are pages that have no internal or external links pointing to them. Google has a very hard time finding orphan pages. So, if you want to get the most out of your crawl budget, make sure that at least one internal or external link points to every page of your website.

Limit duplicate content

Limiting duplicate content makes sense for many reasons.

As it turns out, duplicate content can hurt your crawl budget. Google doesn't want to waste resources by indexing multiple pages with the same content. So, make sure that 100% of your website's pages consist of unique, high-quality content. This isn't easy for a website with more than 10,000 pages, but it's essential if you want to get the most out of your crawl budget.
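On a large site, a first pass at finding duplicates can be automated by hashing each page's text. The sketch below only catches pages whose text is identical after whitespace and case are normalized; near-duplicates need other techniques. The `pages` dictionary (URL to extracted text) and the function names are assumptions for illustration.

```python
import hashlib
import re
from collections import defaultdict


def content_fingerprint(text):
    """Hash the text after collapsing whitespace and lowercasing it."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_duplicate_groups(pages):
    """Return groups of URLs whose normalized text is identical."""
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[content_fingerprint(text)].append(url)
    return sorted(sorted(urls) for urls in groups.values() if len(urls) > 1)


if __name__ == "__main__":
    # Hypothetical page texts for illustration only.
    pages = {
        "/shoes": "Red running shoes",
        "/shoes?ref=mail": "Red  running shoes\n",
        "/about": "About our company",
    }
    print(find_duplicate_groups(pages))
```

Each group it returns is a set of URLs you could consolidate into one page or point at a single preferred version.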

How can you increase your crawl budget?

Here are some ways to optimize your website's crawl budget.

1. Speed up your server and reduce page load times

The server response time and page load time directly affect crawling. It works roughly like this:

When Googlebot crawls your website, it first downloads the resources and then processes them. The faster your server responds to Google's crawl requests, the more pages Googlebot can crawl on your website.

Therefore, use a fast and reliable web hosting service and a Content Delivery Network (CDN) to improve the server's initial response time.

At the same time, you can reduce page load times by:

  • Using robots.txt to prevent large but non-critical resources from being crawled.
  • Avoiding long redirect chains.
  • Removing heavy and poorly coded themes and plugins to reduce page bloat.
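To check whether response time is a bottleneck in the first place, you can time a few representative URLs. This is a rough sketch using only Python's standard library: it measures the time until the response headers arrive and the time until the full body is downloaded, as seen from wherever the script runs, which will differ from what Googlebot experiences. The function name and the `opener` parameter (there so the function can be tried without network access) are assumptions for this example.

```python
import time
import urllib.request


def measure_response(url, opener=urllib.request.urlopen):
    """Time one request: seconds until headers arrive and until the body is read."""
    start = time.perf_counter()
    with opener(url) as response:
        # urlopen returns once the status line and headers have been received.
        headers_received = time.perf_counter()
        body = response.read()
    finished = time.perf_counter()
    return {
        "seconds_to_headers": headers_received - start,
        "seconds_total": finished - start,
        "bytes": len(body),
    }


if __name__ == "__main__":
    # Replace with a URL from your own website.
    print(measure_response("https://example.com/"))
```

Running it several times and comparing pages gives a more reliable picture than a single measurement.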

2. Add more links

The number of links pointing to a page tells Google something about the importance of that page. Googlebot prioritizes crawling pages with more backlinks and internal links.

Therefore, you can increase your crawl budget by adding more external and internal links to your pages. While acquiring backlinks from external sites takes time and isn't entirely under your control, you can start with the easier option – internal linking.
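To decide where to add internal links first, it helps to count how many internal links each page already receives. The following sketch assumes you have a `links` dictionary (page to the pages it links to) from your own crawl; the function names and the `max_inbound` threshold are hypothetical.

```python
from collections import Counter


def inbound_link_counts(links):
    """Count, for every page, how many other pages link to it."""
    counts = Counter({page: 0 for page in links})
    for source, targets in links.items():
        # Repeated links from the same page count once; self-links are ignored.
        for target in set(targets):
            if target != source:
                counts[target] += 1
    return counts


def weakly_linked_pages(links, max_inbound=1):
    """Return pages with at most `max_inbound` inbound internal links."""
    counts = inbound_link_counts(links)
    return sorted(page for page, n in counts.items() if n <= max_inbound)


if __name__ == "__main__":
    # Hypothetical internal link graph for illustration only.
    links = {"/": ["/a", "/b"], "/a": ["/b", "/b"], "/b": ["/"], "/c": []}
    print(weakly_linked_pages(links, max_inbound=0))
```

Pages that appear in the result and matter to your business are good targets for new internal links from related, well-linked pages.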

3. Fix broken links and reduce redirects

Too many broken internal links (404 or 410 response codes) and redirected URLs (3xx) can waste your website's crawl budget. Although Google gives such URLs a low crawl priority once they have gone unchanged for some time, it's better to fix them, both to optimize your crawl budget and for overall website maintenance.

Once you have found broken internal links, you can either restore the page under the same URL or redirect the URL to another relevant page.

As for redirects, check for unnecessary redirects and redirect chains, and replace them with a direct link to the final URL.
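Finding the links to replace can be sketched as follows. The example assumes you already have a `redirects` mapping (source URL to redirect target) exported from your server configuration or a crawl. The function names and the `max_hops` limit are made up for illustration.

```python
def resolve_redirect(url, redirects, max_hops=10):
    """Follow a redirect mapping and return the full chain, starting at `url`."""
    chain = [url]
    while chain[-1] in redirects:
        target = redirects[chain[-1]]
        if target in chain or len(chain) > max_hops:
            raise ValueError(f"redirect loop or overly long chain starting at {url}")
        chain.append(target)
    return chain


def links_to_update(internal_links, redirects):
    """Map each redirected internal link to the final URL it should point to."""
    updates = {}
    for link in internal_links:
        chain = resolve_redirect(link, redirects)
        if len(chain) > 1:
            updates[link] = chain[-1]
    return updates


if __name__ == "__main__":
    # Hypothetical redirect rules for illustration only.
    redirects = {"/old": "/newer", "/newer": "/newest"}
    print(resolve_redirect("/old", redirects))
    print(links_to_update(["/old", "/about"], redirects))
```

Updating internal links to the final URL saves crawlers the intermediate requests, and a chain with more than one hop is a hint that the redirect rules themselves can be shortened.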

4. Use the Indexing API if possible

Another way to get your pages crawled faster is to use Google's Indexing API. It lets you notify Google directly when you add, remove, or update pages on your website.

However, the Indexing API is currently only available for specific use cases, namely pages with job postings or live video broadcasts. If this applies to your website, you can use it to keep your URLs up to date in Google's index and search results.
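As a rough illustration, a notification is a small JSON document sent with an authenticated POST request. The sketch below only builds the request and does not send it. The endpoint and the `URL_UPDATED` / `URL_DELETED` types reflect Google's Indexing API documentation as we understand it, so check the current documentation before relying on them. The `access_token` is assumed to be an OAuth 2.0 token for a service account that is authorized for your property, obtained elsewhere, and the function name is made up for this example.

```python
import json
import urllib.request

# Assumption: publish endpoint as described in Google's Indexing API docs.
INDEXING_ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"


def build_notification(url, access_token, deleted=False):
    """Build (but do not send) a request telling Google a URL changed or was removed."""
    body = {"url": url, "type": "URL_DELETED" if deleted else "URL_UPDATED"}
    return urllib.request.Request(
        INDEXING_ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {access_token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical URL and placeholder token for illustration only.
    request = build_notification("https://example.com/jobs/42", "YOUR_ACCESS_TOKEN")
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
    # Sending it would be: urllib.request.urlopen(request)
```

In practice, Google's client libraries take care of authentication and are usually the more convenient route.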
