Is your sitemap in the correct format and accessible in the browser, but Google Search Console (GSC) still reports the error “Sitemap can be read, but has errors. Your Sitemap appears to be an HTML page. Please use a supported sitemap format instead.”? You’re in the right place.
In this tutorial, I will share 4 solutions to fix the “Your Sitemap appears to be an HTML page. Please use a supported sitemap format instead” error in Google Search Console.
The “Your Sitemap appears to be an HTML page” error can be caused by a caching plugin, a security plugin, CDN caching settings, or an Alias Record added to connect a CDN to your website. A cache plugin may cache your XML sitemap and serve it as an HTML page, and a security plugin may block Google from fetching the sitemap.
This error can hurt your site’s organic traffic, because while it persists many of your URLs, including your main domain, may drop out of the index. So it is important to fix it as quickly as possible.
Below are the full details of the error as shown in Google Search Console:
Sitemap can be read, but has errors
Sitemap is HTML
Your Sitemap appears to be an HTML page. Please use a supported sitemap format instead.
When the sitemap cannot be processed because of “Sitemap can be read, but has errors” and you run a Live Test on one of your URLs with the URL Inspection tool in Google Search Console, it shows the following details:
URL is not available to Google
This page cannot be indexed. Pages that aren’t indexed can’t be served on Google. See the details below to learn why it can’t be indexed.
Page cannot be indexed: Soft 404
URL will be indexed only if certain conditions are met
The following topics are covered in this post:
- Test & Validate Your XML Sitemap
- How to Remove Sitemap in Google Search Console
- How to Fix Your Sitemap Appears to be an HTML Page
01. Test & Validate Your XML Sitemap on Browser
In this section, I will show you how to test and validate your XML sitemap to confirm that it is accessible, error-free, and in the right format.
01. Open the Sitemap on Browser
- Open your sitemap in the browser. Its URL is usually your_domain/sitemap.xml or your_domain/sitemap_index.xml. For example, techworld/sitemap.xml.
- Here I have opened my blog’s XML sitemap, generated by Yoast SEO, in the browser. It is accessible and lists the XML files for posts, pages, categories, and tags.
- Right-click on an empty area of the page and click View page source.
- You can see that the format of the sitemap is correct. It is a pure XML file, not an HTML page. You can check the page source of the other sitemaps in the same way.
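The same check can be scripted. The block below is a minimal sketch in Python, not part of any plugin or of Google’s tooling: it looks at the root element of the response body and reports whether it looks like a sitemap or like an HTML page. The function names `classify_sitemap_body` and `fetch_and_classify` are my own, chosen only for illustration.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Root elements defined by the sitemap protocol.
SITEMAP_ROOTS = {"urlset", "sitemapindex"}


def classify_sitemap_body(body: bytes) -> str:
    """Return 'sitemap', 'html', or 'unknown' for a response body."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # HTML is often not well-formed XML, so fall back to a text sniff.
        head = body[:512].lower()
        if b"<!doctype html" in head or b"<html" in head:
            return "html"
        return "unknown"
    # Namespaced tags look like '{namespace}urlset'; keep the local name only.
    local_name = root.tag.rsplit("}", 1)[-1].lower()
    if local_name in SITEMAP_ROOTS:
        return "sitemap"
    if local_name == "html":
        return "html"
    return "unknown"


def fetch_and_classify(url: str, timeout: float = 10.0) -> str:
    """Download a sitemap URL and classify what the server returned."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return classify_sitemap_body(resp.read())
```

Calling `fetch_and_classify` with your own sitemap URL should return `sitemap` when the server sends real XML. A result of `html` suggests that something between your site and the visitor is replacing the XML file.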
02. Validate XML Sitemap
In this section, I will show you how to use Validate XML Sitemap, an SEO tool, to validate your XML sitemaps.
- Go to the Validate XML Sitemap page.
- Enter your website’s sitemap URL.
- Click the VALIDATE SITEMAP button.
- You can see that the sitemap is valid and there are no issues, warnings, or errors.
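If you prefer a local check, the sketch below does a rough structural validation in Python. It is not what the online validator does internally, and it covers only the basics: the file must be well-formed XML, the root must be `urlset` or `sitemapindex` in the sitemap protocol namespace, and every entry must carry a `<loc>` with an http(s) URL. The function name `check_sitemap` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol (sitemaps.org).
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def check_sitemap(xml_bytes: bytes) -> list:
    """Return a list of problems found; an empty list means none were found."""
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    if root.tag not in (f"{{{NS}}}urlset", f"{{{NS}}}sitemapindex"):
        return [f"unexpected root element: {root.tag}"]

    problems = []
    entries = list(root)
    if not entries:
        problems.append("no entries")
    for i, entry in enumerate(entries, start=1):
        loc = entry.find(f"{{{NS}}}loc")
        text = (loc.text or "").strip() if loc is not None else ""
        if not text:
            problems.append(f"entry {i} has no <loc>")
        elif not text.startswith(("http://", "https://")):
            problems.append(f"entry {i} has a non-http <loc>: {text}")
    return problems
```

An empty list here only means these basic checks passed. It does not replace the validator or the report in Google Search Console.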
02. How to Remove Sitemap in Google Search Console
In this section, I will show you how to remove a sitemap in Google Search Console. After applying any of the solutions below, you may need to remove your existing sitemap in Google Search Console and submit it again.
- Go to Google Search Console and click on your sitemap.
- Click the More options… icon in the top-right corner.
- Click Remove sitemap.
03. How to Fix Your Sitemap Appears to be an HTML Page
Here are 4 solutions that you can try one by one to fix the sitemap error in Google Search Console.
- Exclude Sitemap URL in Cache Plugin Settings
- Exclude Sitemap URL in Security Plugin Settings
- Exclude Sitemap URL in CDN Caching Settings
- Remove CDN Alias Record from Host Records
01. Exclude Sitemap URL in Cache Plugin Settings
Open your cache plugin settings and check whether the plugin is caching your sitemap. If it is, exclude the sitemap URL(s). When the sitemap is cached, the cache plugin may serve it as an HTML page.
Here I have excluded the sitemap URLs generated by the Yoast plugin in the WP Fastest Cache plugin. The main sitemap URL is sitemap_index.xml.
After excluding the sitemap URL in your cache plugin settings, remove the existing sitemap in Google Search Console and submit it again.
02. Exclude Sitemap URL in Security Plugin Settings
If you have installed a security plugin such as Sucuri Security, WordFence, All-in-One WordPress Security and Firewall, Block Bad Queries (BBQ), Defender, WordPress File Monitor Plus, WP Hide & Security Enhancer, Titan Anti-spam & Security, WP Cerber Security, NinjaFirewall, etc., check its settings. It might be blocking Google from fetching your sitemap.
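One rough way to spot this kind of blocking is to request the sitemap twice, once with a browser-like User-Agent and once with Googlebot’s published User-Agent string, and compare the results. The Python sketch below does that. It is only an approximation: a security plugin may identify the real Googlebot by IP address instead of the User-Agent, in which case this test shows nothing. The helper names `fetch_summary` and `compare` and the browser User-Agent value are my own assumptions.

```python
import urllib.error
import urllib.request

# Googlebot's published desktop User-Agent token.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# An arbitrary browser-like User-Agent, used only for comparison.
BROWSER_UA = (
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36"
)


def fetch_summary(url: str, user_agent: str, timeout: float = 10.0) -> tuple:
    """Return (status code, content type) for a URL fetched with a User-Agent."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.headers.get_content_type()
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; report them as results.
        return err.code, err.headers.get_content_type()


def compare(browser: tuple, bot: tuple) -> str:
    """Describe how the bot response differs from the browser response."""
    if browser == bot:
        return "same response for both user agents"
    if browser[0] != bot[0]:
        return f"status differs: browser {browser[0]}, bot {bot[0]}"
    return f"content type differs: browser {browser[1]}, bot {bot[1]}"
```

For example, `compare(fetch_summary(url, BROWSER_UA), fetch_summary(url, GOOGLEBOT_UA))` with your sitemap URL. A 403 or an HTML content type only on the bot side points toward a security rule.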
03. Exclude Sitemap URL in CDN Caching Settings
Just as your cache plugin can cache your XML sitemap, your CDN can cache it too, and as a result it may serve an HTML page instead of the XML file.
If your site is connected to a CDN, first check its cache settings to see whether it is caching the sitemap, and then go to the Cache Exceptions/Exclude section and exclude your sitemap URLs.
Here I have added the paths of my XML sitemaps to the Namecheap SuperSonic CDN Cache Exception settings. The CDN will not cache any of these URLs (paths).
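To see whether a cache is still answering for the sitemap after you add the exceptions, you can look at the response headers. The sketch below is a small Python helper, assuming you already have the headers as a dictionary (for example from your browser’s developer tools). `Age` and `Content-Type` are standard HTTP headers. `X-Cache` is a common but non-standard header, so your CDN may use a different name. The function name `cache_hints` is hypothetical.

```python
from typing import List, Mapping


def cache_hints(headers: Mapping[str, str]) -> List[str]:
    """Return human-readable hints that a response was cached or is HTML."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    hints = []

    # A positive Age means the response has been sitting in a cache.
    age = lowered.get("age", "").strip()
    if age.isdigit() and int(age) > 0:
        hints.append(f"Age is {age}s, so the response came from a cache")

    # Non-standard header that many CDNs and proxies set.
    x_cache = lowered.get("x-cache", "")
    if "hit" in x_cache.lower():
        hints.append(f"X-Cache reports a hit: {x_cache}")

    # A sitemap should not be served with an HTML content type.
    content_type = lowered.get("content-type", "")
    if content_type.lower().startswith("text/html"):
        hints.append(f"Content-Type is {content_type}, not XML")

    return hints
```

An empty list is a good sign but not proof, because a CDN can cache without exposing any of these headers.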
04. Remove CDN Alias Record from Host Records
Your CDN Alias Record can also cause the “Your Sitemap appears to be an HTML page” error in Google Search Console. Removing the CDN Alias Record and adding an A Record with your site’s IP address, so that the domain points to your cPanel hosting, can get rid of the sitemap error.
When you connect your site to a CDN, an Alias Record is added to the Host Records to point your domain to a hostname, for example, hostname.eng.sun.com.
Here is how to remove the CDN Alias Record that was causing the sitemap error in Google Search Console from the Host Records in Namecheap.
- Log in to your Namecheap account.
- Go to Domain List from the sidebar.
- Click on the Manage button of the domain for which you have set up CDN.
- Open the Advanced DNS tab.
- Find your CDN’s Alias Record and remove it from the Host Records. Removing this ALIAS record disconnects your site from the CDN.
- Here the CDN’s host name is alpha.supersonic.ai.
Type: ALIAS Record, Host: @, Value: a03l73sdfsdffsqnhsg1g3t.alpha.supersonic.ai, TTL: 1 min
- After removing the CDN record, create an A Record in the Host Records with your site’s IP address to point the domain to your cPanel hosting (where your site is hosted). You can find your site’s IP address in your cPanel account.
- The Host is @, the Value is your IP address, and the TTL is Automatic.
Type: A Record, Host: @, Value: IP Address, TTL: Automatic
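DNS changes can take a while to propagate, so it helps to confirm that the domain resolves to your hosting IP before you resubmit the sitemap. The Python sketch below is one way to do that with the standard library. The function names are my own, and the domain and IP address in the usage note are placeholders, not values from this tutorial.

```python
import socket
from typing import Callable, Set


def resolve_ipv4(hostname: str) -> Set[str]:
    """Return the set of IPv4 addresses the hostname currently resolves to."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, (address, port)).
    return {info[4][0] for info in infos}


def points_to(
    hostname: str,
    expected_ip: str,
    resolver: Callable[[str], Set[str]] = resolve_ipv4,
) -> bool:
    """Check whether the hostname resolves to the expected IP address."""
    return expected_ip in resolver(hostname)
```

For example, `points_to("your_domain", "203.0.113.10")` returns `True` once the new A Record is visible to your resolver. If it returns `False`, the old ALIAS record may still be cached somewhere, so wait and try again.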
After you try a solution, resubmit your sitemap in Google Search Console. If one of the causes above was behind the error, the sitemap will now be read successfully.
You can see here that all the sitemaps have been read in Google Search Console and the Status is Success.