
Automated management of Google crawl errors in FirstNaukri


SEO plays an important role in driving traffic to our websites. At FirstNaukri, we get many hits from organic search, which makes SEO one of the biggest drivers of traffic for us.

To ensure better rankings on the Google search results page, we must ensure a maximum crawl rate, a minimum number of crawl errors, and a minimum number of "not found" (404) URLs.

Problem statement: detect the errors Google's search engine encounters while crawling our site, on a regular basis, so that we can find and fix them accordingly.

To solve this, we either have to go to the Webmaster Tools console and manually check the errors, or use the developer API exposed by Google to access the various details on the fly. We have integrated the API in FirstNaukri in order to keep an automated tab on the things mentioned below.

We can run queries over our Google Search data to see how often firstnaukri.com appears in Google Search results, with what queries, whether from desktop or smartphones, and much more. We can use the results to improve our website’s search performance.
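
For example, once an authorised Google_Service_Webmasters client is available (the setup steps are described below), a search-analytics query can be sketched as follows; the site URL, date range, dimensions and row limit are illustrative values, not our production settings.

    // Sketch: top queries and devices sending Google Search traffic to the site.
    $request = new Google_Service_Webmasters_SearchAnalyticsQueryRequest();
    $request->setStartDate('2016-01-01');
    $request->setEndDate('2016-01-31');
    $request->setDimensions(array('query', 'device'));  // split results by search query and device
    $request->setRowLimit(25);

    $response = $webmasterObj->searchanalytics->query('http://www.firstnaukri.com/', $request);
    foreach ($response->getRows() as $row) {
        // Each row carries the dimension values plus clicks, impressions, CTR and position.
        printf("%s | clicks: %d, impressions: %d\n",
            implode(' / ', $row->getKeys()), $row->getClicks(), $row->getImpressions());
    }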

We signed up with Google Webmaster Tools, enabled the API, and created a credential to use it via the Google Developer Console for our project. We created a Google service account and downloaded the service_account_details.p12 file to use in our project for authorisation.

A .p12 file is a binary-format file that stores the server certificate, any intermediate certificates, and the private key in a single encryptable file.

The PHP client library was downloaded from https://developers.google.com/webmaster-tools/v3/libraries.
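
After extracting the library, its autoloader has to be included before any of the Google_* classes are used; the exact path depends on the library version, so the path below is only indicative.

    require_once '/path/to/google-api-php-client/src/Google/autoload.php'; // adjust to where the library was extracted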

Steps to use the Search Console API:

  1. In order to use the API, we have to get our credentials from the .p12 file.
    $credentials = new Google_Auth_AssertionCredentials($devId, [Google_Service_Webmasters::WEBMASTERS_READONLY], file_get_contents('/path/to/p12/config/file'));
    Here $devId is the developer service account id, which will be in the form abcde@xyzw-1234.mno.gserviceaccount.com.
  2. We have to create a Google_Client object in order to access the services.
    $client = new Google_Client();
    $client->setAccessType('offline');
    $client->setAssertionCredentials($credentials);
    if ($client->getAuth()->isAccessTokenExpired()) {
        $client->getAuth()->refreshTokenWithAssertion();
    }
  3. In order to get the various crawling details of our website, we have to create an object of the Google_Service_Webmasters class (a consolidated sketch of all three steps follows this list).
    $webmasterObj = new Google_Service_Webmasters($client);
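
Putting the three steps together, a minimal end-to-end sketch looks like this; the autoload path, service account id and .p12 path are placeholders, not our actual configuration.

    require_once '/path/to/google-api-php-client/src/Google/autoload.php'; // path depends on library version

    $devId   = 'abcde@xyzw-1234.mno.gserviceaccount.com';  // service account id (placeholder)
    $keyFile = '/path/to/p12/config/file.p12';              // downloaded .p12 key (placeholder)

    // Build service-account credentials with read-only access to Search Console data.
    $credentials = new Google_Auth_AssertionCredentials(
        $devId,
        [Google_Service_Webmasters::WEBMASTERS_READONLY],
        file_get_contents($keyFile)
    );

    $client = new Google_Client();
    $client->setAccessType('offline');
    $client->setAssertionCredentials($credentials);
    if ($client->getAuth()->isAccessTokenExpired()) {
        $client->getAuth()->refreshTokenWithAssertion();
    }

    $webmasterObj = new Google_Service_Webmasters($client);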

We can retrieve a time series of the number of URL crawl errors per error category and platform, and also get a list of our site's sample URLs for a specified crawl error category and platform. Once we get this information, we can fix the errors as soon as possible and mark them as fixed in Google in order to avoid being penalised.
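
As a rough sketch of how this retrieval can be done with the same PHP client (method names as generated in the client library; the category, platform and URL values below are illustrative assumptions), we query the per-category counts, list the sample URLs, and mark a URL as fixed once it is resolved. Note that marking a URL as fixed requires the full read/write WEBMASTERS scope rather than the read-only scope used above.

    $siteUrl = 'http://www.firstnaukri.com/';

    // Time series of URL crawl-error counts for one category and platform.
    $counts = $webmasterObj->urlcrawlerrorscounts->query($siteUrl, array(
        'category' => 'notFound',   // e.g. notFound, serverError, soft404, ...
        'platform' => 'web',        // web, mobile or smartphoneOnly
        'latestCountsOnly' => true,
    ));

    // Sample URLs for the given crawl-error category and platform.
    $samples = $webmasterObj->urlcrawlerrorssamples->listUrlcrawlerrorssamples($siteUrl, 'notFound', 'web');
    foreach ($samples->getUrlCrawlErrorSample() as $sample) {
        printf("%s (first detected: %s, last crawled: %s)\n",
            $sample->getPageUrl(), $sample->getFirstDetected(), $sample->getLastCrawled());
    }

    // After fixing a URL, tell Google it is resolved (hypothetical relative path).
    $webmasterObj->urlcrawlerrorssamples->markAsFixed($siteUrl, 'jobs/some-fixed-page', 'notFound', 'web');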

We can also get all the information about our sitemaps (e.g. whether a sitemap has been indexed).
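
A sketch of that lookup, again with method and field names as generated in the PHP client (an assumption worth verifying against the library version in use):

    // List all sitemaps submitted for the site, with their status.
    $sitemaps = $webmasterObj->sitemaps->listSitemaps($siteUrl);
    foreach ($sitemaps->getSitemap() as $sitemap) {
        printf("%s pending: %s, errors: %d, warnings: %d\n",
            $sitemap->getPath(),
            $sitemap->getIsPending() ? 'yes' : 'no',
            $sitemap->getErrors(), $sitemap->getWarnings());
    }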

We retrieved and analyzed this data to figure out how to reduce crawl errors.

We get the counts for each error category (as shown in the screenshots below) and, for each category, details of every affected URL along with its last crawled date and first detected date.

[Screenshots: crawl error counts by category, and per-URL error details with last crawled and first detected dates]

In this way, we got to know all the URLs that were error prone as seen by Google: cached URLs, invalid URLs, deleted URLs, and URLs that need to be redirected to a specific location.

The next step is well known, i.e. to roll out further fixes for the same :-).

The benefit of the above process is that errors can be caught and fixed before crawlers start penalising us, which in turn improves our search rankings further and leads to an increase in traffic to our site.

For us, errors were reduced by 50% in the first go.

Posted in SEO
