Glossary

Crawler/Spider/Bot

A crawler, also called a spider or bot, is a program that systematically visits web sites and collects information from their pages to add to a search engine's index. Most major search engines run crawlers regularly to keep their indexes up to date.
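
As an illustration, the sketch below (standard-library Python, with https://example.com and the page count used only as placeholders) shows the basic loop a crawler follows: fetch a page, record something about it, extract its links, and queue those links for later visits.

    import urllib.request
    from html.parser import HTMLParser
    from urllib.parse import urljoin

    class LinkExtractor(HTMLParser):
        """Collects the href target of every <a> tag on a page."""
        def __init__(self, base_url):
            super().__init__()
            self.base_url = base_url
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(urljoin(self.base_url, value))

    def crawl(start_url, max_pages=10):
        """Visit pages breadth-first, recording each page and its outgoing links."""
        seen, queue, index = set(), [start_url], {}
        while queue and len(index) < max_pages:
            url = queue.pop(0)
            if url in seen:
                continue
            seen.add(url)
            try:
                with urllib.request.urlopen(url, timeout=10) as response:
                    html = response.read().decode("utf-8", errors="replace")
            except OSError:
                continue  # skip pages that fail to load
            parser = LinkExtractor(url)
            parser.feed(html)
            index[url] = len(html)      # stand-in for the data a real index would store
            queue.extend(parser.links)  # follow the links discovered on this page
        return index

    if __name__ == "__main__":
        # "https://example.com" is a placeholder start page
        print(crawl("https://example.com"))

A production crawler would add politeness delays, deduplication of near-identical pages, and persistent storage, but the fetch-parse-queue cycle above is the core idea.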

When a site owner notifies a search engine that a site is new or has been updated, a crawler is typically scheduled to visit it. The crawler then fetches the selected pages and indexes them.

Web crawlers can also be used by sites themselves to refresh their content or add pages to their own indexes. Although crawlers attempt to index every site, covering the entire web is impossible because of its sheer size.

Crawlers consume server resources on every page they visit, so many sites provide a robots.txt file that specifies which areas of the site crawlers may index, keeping the drain on resources to a minimum. Crawlers can also be used to validate HTML code and check hyperlinks.
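
A well-behaved crawler checks a site's robots.txt rules before fetching each page. The sketch below (standard-library Python, with https://example.com and the crawler name "MyCrawler" as placeholder values) shows how that check might look.

    import urllib.robotparser

    # A site's robots.txt might contain rules like these, telling all crawlers
    # to stay out of /private/:
    #
    #   User-agent: *
    #   Disallow: /private/
    #
    # Python's urllib.robotparser reads those rules and answers per-URL queries.
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url("https://example.com/robots.txt")  # placeholder site
    rp.read()

    for url in ("https://example.com/index.html",
                "https://example.com/private/data.html"):
        allowed = rp.can_fetch("MyCrawler", url)
        print(url, "->", "allowed" if allowed else "disallowed")

Checking can_fetch() before each request lets the crawler skip disallowed areas entirely, which is exactly how robots.txt reduces the load a crawl places on a site.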