Last updated: July 17, 2021
What is CheckLiveUrlBot?
CheckLiveUrlBot is software that checks whether a given site is live.
CheckLiveUrlBot uses the following data:
- IP address:
- User agent: CheckLiveUrlBot/1.0 (+https://gimpsy.com/faq/checkliveurlbot)
When does CheckLiveUrlBot visit my site?
CheckLiveUrlBot visits your site in two cases:
- If you wish to suggest a site to the Gimpsy directory. Only sites with a live URL can be accepted.
- If your site is listed in the Gimpsy directory, CheckLiveUrlBot visits it periodically (about once per month or less). If the site does not respond to 15 consecutive attempts, it will be suspended from the Gimpsy listing until we receive confirmation from you that the site has been repaired.
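The rule for listed sites can be sketched as a small failure counter. This is only an illustration under stated assumptions: the class and method names are hypothetical, and the FAQ does not say how Gimpsy tracks attempts internally. The only detail taken from the text is the threshold of 15 failed attempts in a row.

```python
from dataclasses import dataclass

# Threshold stated in the FAQ; everything else in this sketch is hypothetical.
MAX_CONSECUTIVE_FAILURES = 15


@dataclass
class ListingStatus:
    consecutive_failures: int = 0
    suspended: bool = False

    def record_check(self, responded: bool) -> None:
        """Record the outcome of one liveness check."""
        if self.suspended:
            return  # assumed: stays off the listing until the owner confirms a repair
        if responded:
            self.consecutive_failures = 0  # a success breaks the run of failures
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= MAX_CONSECUTIVE_FAILURES:
            self.suspended = True

    def confirm_repaired(self) -> None:
        """Called once the site owner confirms the site works again."""
        self.consecutive_failures = 0
        self.suspended = False
```

Because a successful check resets the counter, occasional downtime does not add up over time; only an unbroken run of failures takes a site off the listing.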
What data is crawled on my site?
CheckLiveUrlBot only checks that the URL is live by requesting the HTTP headers from the server where your site is hosted. It does not download any content from your site, so it generates almost no traffic.
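A headers-only check of this kind can be approximated with an HTTP HEAD request, which asks the server for the response headers without a body. The sketch below is not Gimpsy's implementation. The function name, the user agent string, the timeout, and the rule that any status below 400 counts as live are all assumptions made for illustration.

```python
import http.client
from urllib.parse import urlsplit

# Hypothetical user agent for this sketch; not the real bot's string.
USER_AGENT = "example-liveness-check/0.1"


def is_live(url: str, timeout: float = 10.0) -> bool:
    """Return True if the server answers a HEAD request with a status below 400.

    The status rule is an assumption; the FAQ does not define "live" precisely.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)

    # Rebuild the request target from the path and optional query string.
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query

    try:
        # HEAD returns the status line and headers only, never the page body.
        conn.request("HEAD", target, headers={"User-Agent": USER_AGENT})
        response = conn.getresponse()
        return response.status < 400
    except (OSError, http.client.HTTPException):
        # Connection refused, DNS failure, timeout, or a malformed response.
        return False
    finally:
        conn.close()
```

Since no body is transferred, each check costs the server only a few hundred bytes of headers.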
Why is it important to grant CheckLiveUrlBot access to my site?
If CheckLiveUrlBot cannot access your site for any reason, including but not limited to internet connection issues, internal server errors, or blocklisting of CheckLiveUrlBot by your internet service provider, it treats your site as inaccessible. In that case you cannot suggest a new site to the Gimpsy directory, and already listed sites are at risk of being suspended.
Can I exclude CheckLiveUrlBot?
No. If you wish to use the Gimpsy service, you should grant CheckLiveUrlBot access to your site.
How do I verify CheckLiveUrlBot?
Before you decide to block CheckLiveUrlBot, be aware that the user agent string used by CheckLiveUrlBot can be spoofed by other crawlers. It is important to verify that the problematic request is actually coming from Gimpsy. The best way to verify that a request actually comes from CheckLiveUrlBot is to use a reverse DNS lookup on the source IP of the request. Below is the correct verification result:
# host 188.8.131.52
184.108.40.206.in-addr.arpa domain name pointer crawler.gimpsy.net.
# host crawler.gimpsy.net
crawler.gimpsy.net has address 220.127.116.11
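The same two-step check (reverse lookup, then forward lookup to confirm) can be scripted. The following is a minimal sketch, not an official tool: the function name and the injectable lookup parameters are hypothetical, and the only value taken from this page is the hostname crawler.gimpsy.net. It uses socket.gethostbyname_ex, which resolves IPv4 addresses only.

```python
import socket


def verify_crawler_ip(
    ip: str,
    expected_host: str = "crawler.gimpsy.net",  # hostname shown in the FAQ
    reverse_lookup=socket.gethostbyaddr,        # injectable so it can be tested offline
    forward_lookup=socket.gethostbyname_ex,     # IPv4 only
) -> bool:
    """Return True if ip reverse-resolves to expected_host and that host resolves back to ip."""
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:
        return False  # no PTR record or the lookup failed

    # Normalize: PTR answers may carry a trailing dot or mixed case.
    if hostname.rstrip(".").lower() != expected_host:
        return False

    try:
        addresses = forward_lookup(expected_host)[2]
    except OSError:
        return False

    # The forward lookup must include the original address; otherwise the
    # PTR record could have been set by whoever controls the IP range.
    return ip in addresses
```

The forward confirmation matters because a reverse DNS record alone is controlled by the owner of the IP address, so a spoofing crawler could point its own address at any hostname.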