Exclude Web Pages From Search Indexes
How to exclude web pages from search indexes
Sometimes you need to prevent a page from being indexed by Google or by our on-site search engine.
There are a few places where this update must be made:
1. Robots.txt
This is the basic step: edit /public/robots.txt and add the page or directory to exclude at the bottom, once for the US locale and once for the CA locale:
    Disallow: /en-US/sales/special-offers/*
    Disallow: /en-CA/sales/special-offers/*
or
    Disallow: /en-US/sales/special-offers/specific-page
    Disallow: /en-CA/sales/special-offers/specific-page
2. Sitemap
The sitemap is dynamically generated by app/controllers/pages_controller.rb.
In the pages controller you will see a large hash called LAYOUT_MAP defined at the bottom.
For the page you want to exclude, instead of specifying the layout as a plain string, use a hash and set the :skip_site_map attribute.
Here is an example with a regular layout for one page and a layout plus the skip_site_map parameter for another:
    LAYOUT_MAP =
      ...
      'sales/veterans' => '/pages/simple_page_summary',
      'sales/special-offers/carpetone' => { layout: '/pages/simple_page_summary', skip_site_map: true },
      ...
3. NOINDEX, NOFOLLOW
For good measure, also add the ROBOTS instructions to the page's meta tags, in the page's html.erb file:
    <% content_for :head do %>
      <meta name="robots" content="noindex,nofollow,noarchive,nosnippet,noodp">
    <% end %>
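Going back to the Sitemap step: because LAYOUT_MAP can hold either a plain string or a hash for each page, it may help to see how the two forms could be treated uniformly. The following is only a minimal sketch under assumptions. The helper names normalize_layout_entry and sitemap_paths are hypothetical and are not taken from pages_controller.rb, and the real controller may handle this differently. Only the two map entries come from the example above.

```ruby
# Hypothetical sketch: the real logic lives in app/controllers/pages_controller.rb
# and may differ. Only these two entries come from the example in step 2.
LAYOUT_MAP = {
  'sales/veterans' => '/pages/simple_page_summary',
  'sales/special-offers/carpetone' => {
    layout: '/pages/simple_page_summary',
    skip_site_map: true
  }
}.freeze

# Normalize an entry (String or Hash) into a Hash that always has both keys.
# A plain string is treated as "use this layout and keep the page in the sitemap".
def normalize_layout_entry(entry)
  return { layout: entry, skip_site_map: false } if entry.is_a?(String)

  { layout: entry[:layout], skip_site_map: entry.fetch(:skip_site_map, false) }
end

# Return the page paths that should still be listed in the generated sitemap.
def sitemap_paths(layout_map)
  layout_map.reject { |_path, entry| normalize_layout_entry(entry)[:skip_site_map] }.keys
end

puts sitemap_paths(LAYOUT_MAP).inspect
```

With the two entries above, only 'sales/veterans' would remain in the sitemap list. After deploying, it is worth loading the generated sitemap and confirming the excluded page is really gone.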