Crawling is the process by which a search engine bot fetches a URL and downloads its content. It is the first step before a page can be indexed.
More precisely, an automated bot requests a URL and downloads its content along with resources like images and JavaScript. Without crawling, content cannot be processed or indexed at all.
Which URLs a crawler may fetch can be controlled through the robots.txt file. Its main purpose is to manage crawler traffic and avoid overloading servers.
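As a rough illustration of how a crawler might consult robots.txt before fetching a URL, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content, the user agent name "ExampleBot", and the example.com URLs are assumptions made up for this example. Real search engine crawlers may interpret rules differently in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/public/page.html",
        "https://example.com/private/page.html",
    ):
        print(url, can_crawl(ROBOTS_TXT, "ExampleBot", url))
```

Under these assumed rules, the URL under /private/ is disallowed and the other one is allowed. Note that this only decides whether a URL may be fetched, not whether it ends up in the index.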
It is important to distinguish crawling from indexing: crawling only means fetching a page, not adding it to the search index. For search engines to read instructions such as noindex, a page must be crawlable. If it is blocked via robots.txt, the crawler never fetches the page, so those signals cannot be detected reliably.
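The dependency described above can be sketched in a few lines of Python: a robots meta tag is only visible to a bot that is allowed to fetch the page in the first place. This is a simplified model, not how any particular search engine works. The function name readable_directives, the RobotsMetaParser class, the user agent "ExampleBot", and the sample rules and HTML are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives += [
                d.strip().lower() for d in content.split(",") if d.strip()
            ]


def readable_directives(robots_txt, user_agent, url, html):
    """Return the robots meta directives a bot could read, or None if blocked.

    The html argument stands in for the response body the bot would download.
    """
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(user_agent, url):
        # Blocked by robots.txt: the page is never fetched,
        # so a noindex inside it stays invisible to the bot.
        return None
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical sample data for illustration.
SAMPLE_RULES = "User-agent: *\nDisallow: /private/"
SAMPLE_HTML = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'

if __name__ == "__main__":
    print(readable_directives(SAMPLE_RULES, "ExampleBot",
                              "https://example.com/public/a", SAMPLE_HTML))
    print(readable_directives(SAMPLE_RULES, "ExampleBot",
                              "https://example.com/private/a", SAMPLE_HTML))
```

In this sketch the blocked URL yields None rather than an empty list, to keep "no directives present" distinct from "directives could not be read".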
Crawling = fetching a URL
controlled via robots.txt
prerequisite for indexing and reading robots directives