Robots.txt configuration in Magento 2. Nofollow and Noindex

The robots exclusion standard, also known as the robots.txt file, is important for your website or store when communicating with search engine crawlers. This standard defines how to tell bots which pages of your site should be excluded from scanning or, vice versa, opened for crawling. That’s why the robots.txt file is significant for correct website indexation and overall search visibility.

By default, Magento 2 allows you to generate and configure a robots.txt file. You can either keep the default indexation settings or specify custom instructions for different search engines.

To configure the robots.txt file, follow these steps:

1. Open the «Stores» tab, select «Configuration», and then open the «Design» section.

Magento robots.txt

2. Open the «Search Engine Robots» section and set the «Default Robots» option to one of the values from the drop-down list.

robots.txt Magento 2

3. Enter custom instructions into the robots.txt file.

Magento robots

4. Hit «Save Config» to complete the operation.

We recommend using the following custom robots.txt for your Magento 2 store:

User-agent: *
Disallow: /*?
Disallow: /index.php/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/
Disallow: /wishlist/
Disallow: /admin/
Disallow: /catalogsearch/
Disallow: /checkout/
Disallow: /onestepcheckout/
Disallow: /customer/
Disallow: /review/product/
Disallow: /sendfriend/
Disallow: /enable-cookies/
Disallow: /LICENSE.txt
Disallow: /LICENSE.html
Disallow: /skin/
Disallow: /js/
Disallow: /directory/
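Before deploying a ruleset like the one above, you can sanity-check it locally with Python's standard `urllib.robotparser` module. A minimal sketch: `example.com` is a placeholder domain, and since the stdlib parser implements the original prefix-matching standard (wildcard patterns such as `/*?` are not interpreted), only plain path prefixes are tested here.

```python
from urllib import robotparser

# A subset of the recommended rules; the stdlib parser does plain
# prefix matching, so wildcard lines like "Disallow: /*?" are omitted.
rules = """\
User-agent: *
Disallow: /checkout/
Disallow: /customer/
Disallow: /catalogsearch/
Disallow: /wishlist/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Checkout and account pages are blocked for all crawlers...
print(rp.can_fetch("*", "https://example.com/checkout/cart/"))     # False
# ...while ordinary product pages remain crawlable.
print(rp.can_fetch("*", "https://example.com/some-product.html"))  # True
```

This is handy for catching typos in directive paths before they reach production, where a bad rule can silently deindex whole sections of the store.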

Let’s consider each group of commands separately.

Prevent search engine robots from crawling user account and checkout pages:
Disallow: /checkout/
Disallow: /onestepcheckout/
Disallow: /customer/
Disallow: /customer/account/
Disallow: /customer/account/login/

Blocking native catalog and search pages:
Disallow: /catalogsearch/
Disallow: /catalog/product_compare/
Disallow: /catalog/category/view/
Disallow: /catalog/product/view/

Sometimes webmasters also block pages with filter and sorting parameters:
Disallow: /*?dir*
Disallow: /*?dir=desc
Disallow: /*?dir=asc
Disallow: /*?limit=all
Disallow: /*?mode*

However, it is more reasonable to use the canonical tag on these pages.
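For reference, a canonical tag is just a link element in the head of the filtered page that points back to the clean category URL; the domain and path below are placeholders:

```html
<!-- Placed in the <head> of a filtered URL such as /shoes.html?dir=asc&limit=all -->
<link rel="canonical" href="https://example.com/shoes.html"/>
```

This way the filtered variants remain crawlable, but search engines consolidate ranking signals onto the canonical URL instead of treating each filter combination as a duplicate.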

Blocking CMS directories:
Disallow: /app/
Disallow: /bin/
Disallow: /dev/
Disallow: /lib/
Disallow: /phpserver/
Disallow: /pub/

These commands are not necessary. Search engines are smart enough to avoid including CMS files in their index.

Blocking duplicate content:
Disallow: /tag/
Disallow: /review/

Don’t forget about domain and sitemap pointing:
Host: (www.)domain.com
Sitemap: http://www.domain.com/sitemap_en.xml

Note that the Host directive was only ever honored by Yandex and has since been deprecated, so for most stores the Sitemap line is the one that matters.

Meta robots tags: NOINDEX, NOFOLLOW

After configuring the robots.txt file, you can turn your attention to the Nofollow and Noindex tags. These tags are used to distribute page weight and hide unnecessary parts of a page from crawlers.

Noindex hides a part of text or the whole page from indexation.
Nofollow is an attribute of the <a> tag that prevents the transfer of page weight to an unverified source. In addition, you can use Nofollow on pages with a large number of external links.

To apply Nofollow or Noindex to your current configuration, you can either update the robots.txt file or use the meta name="robots" tag.

All possible combinations:

<meta name="robots" content="index, follow"/>
<meta name="robots" content="noindex, follow"/>
<meta name="robots" content="index, nofollow"/>
<meta name="robots" content="noindex, nofollow"/>
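In Magento 2 you can also set this meta tag per page type through layout XML rather than editing templates directly. A sketch under assumptions: the handle below (`catalogsearch_result_index`) targets the native search results page, and the file path in the comment is an example theme location, not a fixed requirement of your setup.

```xml
<?xml version="1.0"?>
<!-- e.g. app/design/frontend/<Vendor>/<theme>/Magento_CatalogSearch/layout/catalogsearch_result_index.xml -->
<page xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="urn:magento:framework:View/Layout/etc/page_configuration.xsd">
    <head>
        <!-- Keep search result pages out of the index but let crawlers follow their links -->
        <meta name="robots" content="NOINDEX,FOLLOW"/>
    </head>
</page>
```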

Magento 2 meta robots

Add the following code to the robots.txt file in order to hide specific pages:

User-agent: *
Disallow: /myfile.html

Alternatively, you can prohibit indexation with this code:

<html>
<head>
<meta name="robots" content="noindex, follow"/>
<title>Site page title</title>
</head>
</html>

Important notice:

The Noindex and Nofollow tags have several advantages over blocking a page through robots.txt:

1. Robots.txt only prevents a page from being crawled during a scheduled website crawl. However, the page can still be discovered and crawled through links from other websites.
2. If a page has inbound links, all link juice will be passed on to other pages of the website through that page’s internal links.

Using the instructions above, you will be able to manually configure the robots.txt file of your Magento 2 store, hide unnecessary parts of the site from crawlers, and distribute page weight.

You can simplify working with the robots.txt file by using a third-party Magento 2 SEO plugin.

SEO Suite Ultimate

The first Magento 2 SEO solution. It eliminates duplicate content issues, improves website indexation, and makes your store search engine and user friendly.

$299

SEO Meta Templates

Advanced SEO attribute templates to easily optimize product and category page meta data, as well as the short and detailed descriptions.

$99
