Technical SEO should be the foundation of any SEO strategy to improve organic search visibility. I started auditing sites for SEO around 2010 and have run more than 100 SEO audits over the past 10 years, both in-house and as one-off projects.
- Understanding HTTP Status Codes
- Site Architecture & Internal Links
- URL Structure
- HTTPS Website Security
- Optimize Robots.txt File
- On-Page HTML Element Checks
- Page Speed Checks & Improvements
- Check your XML Sitemap for Issues
- Check Mobile-First Indexing Best Practices
- Check issues with Canonical Tags
- Check Issues with Pagination Implementation
- Validate Schema Markup
- A few other things to watch out for.
- Google Search Console Reports
- My Favourite Free & Paid Technical SEO Tools
Understanding HTTP Status Codes
As part of a technical SEO audit, the first thing to pay attention to is the HTTP status codes of your site's pages and resources. A status code is issued by a server in response to a browser's request. There are over 60 different status codes, each with its own meaning. The ones you will come across most often during technical SEO audits are 2xx (success), 3xx (redirection), and the problematic 4xx (client error) and 5xx (server error) codes, as shown in the table below.
|Status Code|What it means|
|301|Permanent redirect: requested resource moved permanently to another location.|
|302|Temporary redirect: requested resource moved temporarily to another location.|
|403|Forbidden: access to the requested resource is forbidden.|
|404|Not Found: requested resource not found at the location.|
|410|Gone: requested resource permanently gone from the location.|
|500|Internal Server Error: a generic server error.|
|503|Service Unavailable: the server is overloaded or undergoing maintenance.|
To identify the status codes returned by your website, you can use a number of different methods, such as:
- The Ayima Redirect Path browser extension, to spot-check individual pages.
- A website crawler such as Screaming Frog, Botify or Deepcrawl, to crawl your entire site.
When auditing websites for technical SEO issues, look specifically for internal links that resolve to 301s, 302s, 404s or 410s, for 5xx errors, and for long chains of redirects. Most 3xx and 4xx issues can be resolved by updating internal links to point directly at the correct, final URL instead of relying on redirects. Speak to the IT team who maintain the servers if you encounter a large number of server errors.
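As a rough illustration of the triage described above, here is a minimal Python sketch that buckets a crawl export by status code class. The `crawl` list is made-up sample data standing in for an export from whichever crawler you use, and the function names are my own.

```python
from collections import Counter


def classify_status(code: int) -> str:
    """Map an HTTP status code to the broad class used in an audit."""
    if 200 <= code < 300:
        return "ok"
    if 300 <= code < 400:
        return "redirect"
    if 400 <= code < 500:
        return "client_error"
    if 500 <= code < 600:
        return "server_error"
    return "other"


def summarise_crawl(rows):
    """Count status classes and list every URL that did not return 2xx."""
    counts = Counter(classify_status(status) for _, status in rows)
    to_review = [(url, status) for url, status in rows
                 if classify_status(status) != "ok"]
    return counts, to_review


# Hypothetical crawl export: (URL, status code) pairs.
crawl = [
    ("https://example.com/", 200),
    ("https://example.com/old-page", 301),
    ("https://example.com/missing", 404),
    ("https://example.com/api", 503),
]

counts, to_review = summarise_crawl(crawl)
for url, status in to_review:
    print(status, url)
```

The `to_review` list is the starting point for updating internal links (3xx and 4xx) or for a conversation with your IT team (5xx).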
Site Architecture & Internal Links
Having a clear site structure with optimal internal linking is key for search engines to be able to fully understand and effectively index your site. Optimal site architecture and a good internal linking strategy can benefit users and search engine bots. For users, good site information architecture will allow them to navigate the website making it easier to discover additional pages and keeping them within the site. It will also help search engine bots to clearly crawl the site to understand the site structure hierarchy. Internal links help in spreading link equity and PageRank around the website.
Optimise your internal links to reduce the crawl depth of key pages and spread link equity in the most effective way possible. Follow the best practices below to optimise the flow of link equity through the site:
- Reduce the amount of duplicate content to ensure link equity isn’t wasted on these duplicate pages.
- Remove low-quality pages from your site to avoid wasting the flow of link equity to these pages.
- Use the rel="nofollow" attribute at the link level to signal to crawlers that the link should be ignored. This will stop you from passing link equity to the nofollowed pages.
- Add pages or folders to the robots.txt file to block them from being crawled and from receiving link equity.
- Avoid having too many links per page to prevent diluting your link equity.
- Fix your internal link redirects, especially redirect chains that cause more hops for a bot to reach the final page. This can dilute the spread of link equity to the final page.
- Find pages not in the site structure by combining analysis from your site crawl report, log files, search console and analytics data.
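To make the redirect-chain point concrete, here is a small sketch that follows each internal link target through a redirect map and reports the final destination and number of hops. The `redirects` dictionary is hypothetical data; in practice you would build it from your crawler's redirect report.

```python
def follow_redirects(start, redirects, max_hops=10):
    """Return the URLs visited from `start` until one no longer redirects.

    `redirects` maps a URL to its redirect target. The walk stops early
    on a loop or once `max_hops` hops have been followed.
    """
    chain = [start]
    seen = {start}
    while chain[-1] in redirects and len(chain) <= max_hops:
        target = redirects[chain[-1]]
        chain.append(target)
        if target in seen:  # redirect loop
            break
        seen.add(target)
    return chain


def link_fixes(link_targets, redirects):
    """For every redirecting link target, report where it ends up."""
    fixes = {}
    for target in link_targets:
        chain = follow_redirects(target, redirects)
        if len(chain) > 1:
            fixes[target] = {"final": chain[-1], "hops": len(chain) - 1}
    return fixes


# Hypothetical redirect map and internal link targets.
redirects = {"/a": "/b", "/b": "/c", "/x": "/y"}
fixes = link_fixes(["/a", "/x", "/c"], redirects)
for old, info in fixes.items():
    print(f"{old} -> {info['final']} ({info['hops']} hop(s))")
```

Any target with more than one hop is a redirect chain; updating the internal link to the `final` URL removes the hops altogether.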
Internal links can be present on the navigation menu, body content, footer, sidebar, related links section on blog articles or related products section on product description pages. The navigation menu and the body content links usually link to the most popular pages of the site. It’s no surprise that for a site, the homepage gets the most external links. So, it’s important to optimise the internal links on the homepage to link to most pages of your site. This will pass the link equity to the deeper pages of the site architecture.
There are a number of ways to review the internal links on a website:
- Manually, by clicking through the links on your site to check that everything is OK from a user perspective.
- Use crawling software such as Screaming Frog, Botify or Deepcrawl to pull the internal links for a large number of your site pages. Ensure all your internal links point to 200 status code pages and don’t go through redirects.
- Use tools like Sitebulb or Screaming Frog to visualise your site’s internal linking structure and crawl depth.
- Review your site’s internal linking structure for orphan pages and include them in your main internal linking architecture if the page is useful. An easy way to spot these is to find pages that are in your sitemap but are not linked from any page on the site.
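Crawl depth and orphan pages can both be worked out from a simple link graph. The sketch below assumes you already have a mapping of each page to the pages it links to (the `links` dictionary and the sitemap list are invented sample data) and uses a breadth-first search from the homepage.

```python
from collections import deque


def click_depths(links, home):
    """Breadth-first search from the homepage: page -> minimum click depth."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def orphan_pages(sitemap_urls, depths):
    """Sitemap URLs that cannot be reached by following internal links."""
    return sorted(set(sitemap_urls) - set(depths))


# Hypothetical internal link graph: page -> pages it links to.
links = {
    "/": ["/women", "/men"],
    "/women": ["/women/dresses"],
    "/women/dresses": ["/women/dresses/maxi-dresses"],
}
sitemap_urls = ["/", "/women", "/women/dresses/maxi-dresses", "/sale/old-landing"]

depths = click_depths(links, "/")
orphans = orphan_pages(sitemap_urls, depths)
print(depths)
print(orphans)
```

Pages with a high depth value are candidates for extra internal links, and anything in `orphans` either needs linking into the site structure or removing from the sitemap.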
Review unique internal links to your most important pages from time to time. Maximise internal linking to pages with the highest search demand and contribution to revenue. A good internal linking structure to your key pages helps bots decide what pages are most important.
URL Structure
URLs act as a minor ranking factor. Pay attention to your URL structures. Make them human-readable so that both search engines and users can easily understand the destination page and the structure of your website.
Let’s take an example of a faceted URL from the House of Fraser website – https://www.houseoffraser.co.uk/women/dresses/maxi-dresses. The URL format makes it clear to both users and search engines what the page is about and what they are likely to find on it. Make things as clear as possible to search engines by laying out your URLs in an ordered format. You can also see the journey the user has followed to reach that specific maxi dresses page.
When it comes to the importance of URL structure versus click depth, Google's John Mueller has said that click depth determines page importance more than the URL structure does. You must ensure that your key pages are linked as closely to the homepage as possible.
HTTPS Website Security
Google considers a website’s security as a ranking factor. Your site must run on HTTPS as this is the secure version. When it comes to HTTPS, you must ensure the following:
- Internal links must point to the HTTPS version.
- Images load via HTTPS.
- Ad networks must load via HTTPS.
- Automatic redirects must be in place from HTTP to HTTPS.
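The first three checks in the list above can be spot-checked by scanning a page's HTML for links and resources that still use plain HTTP. This is only a minimal sketch using Python's standard library; the sample HTML and the class name are illustrative, and it does not cover resources loaded by CSS or JavaScript.

```python
from html.parser import HTMLParser


class InsecureResourceFinder(HTMLParser):
    """Collect href/src attribute values that still use plain http://."""

    URL_ATTRS = {"href", "src"}

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in self.URL_ATTRS and value and value.lower().startswith("http://"):
                self.insecure.append((tag, value))


# Hypothetical page fragment.
html = """
<a href="http://example.com/about">About</a>
<img src="https://example.com/logo.png">
<script src="http://ads.example.net/tag.js"></script>
"""

finder = InsecureResourceFinder()
finder.feed(html)
for tag, url in finder.insecure:
    print(f"<{tag}> still loads {url}")
```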
Optimize Robots.txt File
A robots.txt file is located in the root folder of the site. Crawlers such as Googlebot, Bingbot and YandexBot request the robots.txt file before accessing other areas of your site, to understand what can and can't be crawled. The file is advisory, so some crawlers may still choose to ignore its instructions.
You can use robots.txt to block crawlers from certain sections of a website, which prevents crawl budget being wasted on sections you don't want crawled. The two most common directives in a robots.txt file are 'Allow' and 'Disallow', which instruct the User-agents they are listed under. Make sure that this file does not exclude any important sections of the site from being crawled. The robots.txt file can also include the XML sitemap URL, using the Sitemap directive, to aid URL discovery.
Use the robots.txt Tester in the Search Console to test your robots.txt rules. Please note that Google does not support noindex directives within robots.txt files.
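You can also sanity-check a robots.txt file locally before it goes live. The sketch below uses Python's built-in `urllib.robotparser` against an invented robots.txt and an invented list of important URLs. Bear in mind that Python's parser applies the first matching rule in file order, which can differ from how Google resolves conflicting Allow and Disallow rules, so treat it as a quick check rather than a replacement for Google's own tester.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content.
ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Disallow: /search

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def blocked_urls(parser, urls, user_agent="Googlebot"):
    """Return the URLs that the given user agent is not allowed to crawl."""
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Pages you expect to be crawlable, plus one that should be blocked.
important_urls = [
    "https://example.com/women/dresses",
    "https://example.com/women/dresses/maxi-dresses",
    "https://example.com/checkout/basket",
]

blocked = blocked_urls(parser, important_urls)
print("Blocked:", blocked)
print("Sitemaps:", parser.site_maps())  # site_maps() needs Python 3.8+
```

If a key category or product URL ever shows up in `blocked`, a Disallow rule is excluding an important section of the site.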
On-Page HTML Element Checks
When you are auditing a website, it is important to check that key SEO elements such as page titles, meta descriptions and headings are well optimised on your important pages. Also, review your canonical tags and hreflang tags (for international sites) for correctness. There are a number of ways to check this:
- View the page source of the pages whose elements you want to check.
- Or inspect the DOM in Chrome DevTools by right-clicking on a page and clicking Inspect to view the rendered page source. Use the search functionality to find the SEO elements.
- Use browser extensions such as SEO META in 1 CLICK to check the metadata, image alt text, canonicals, headers and other SEO elements of a page.
- You can also use website crawlers such as Screaming Frog, Botify or Deepcrawl to check the SEO elements across the site when auditing.
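If you want to script a quick spot check of these elements, the following sketch pulls the title, meta description, canonical, hreflang tags and H1s out of raw (unrendered) HTML using only the standard library. The class name, the `audit` helper and the checks it applies are my own illustrative choices, not a standard, and the sample page is made up.

```python
from html.parser import HTMLParser


class OnPageElements(HTMLParser):
    """Extract a handful of SEO elements from unrendered HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = None
        self.canonical = None
        self.hreflangs = {}
        self.h1s = []
        self._capture = None  # tag whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag in ("title", "h1"):
            self._capture = tag
            if tag == "h1":
                self.h1s.append("")
        elif tag == "meta" and (a.get("name") or "").lower() == "description":
            self.meta_description = a.get("content")
        elif tag == "link":
            rel = (a.get("rel") or "").lower()
            if rel == "canonical":
                self.canonical = a.get("href")
            elif rel == "alternate" and a.get("hreflang"):
                self.hreflangs[a["hreflang"]] = a.get("href")

    def handle_endtag(self, tag):
        if tag == self._capture:
            self._capture = None

    def handle_data(self, data):
        if self._capture == "title":
            self.title += data
        elif self._capture == "h1":
            self.h1s[-1] += data


def audit(html):
    """Return the parsed elements plus a list of missing-element issues."""
    page = OnPageElements()
    page.feed(html)
    issues = []
    if not page.title.strip():
        issues.append("missing title")
    if not page.meta_description:
        issues.append("missing meta description")
    if not page.h1s:
        issues.append("missing h1")
    if not page.canonical:
        issues.append("missing canonical")
    return page, issues


# Hypothetical page source.
SAMPLE = """<html><head>
<title>Maxi Dresses | Example Store</title>
<meta name="description" content="Shop maxi dresses.">
<link rel="canonical" href="https://example.com/women/dresses/maxi-dresses">
<link rel="alternate" hreflang="en-gb" href="https://example.com/women/dresses/maxi-dresses">
</head><body><h1>Maxi <span>Dresses</span></h1></body></html>"""

page, issues = audit(SAMPLE)
print(page.title, page.h1s, issues)
```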
Page Speed Checks & Improvements
Fast pages convert and rank better. The benefits are beyond just SEO and are much more important in the mobile-first world. Pages with longer loading times tend to have higher bounce rates. This can have a significant impact on your website performance and rankings.
There are various online tools available to help you analyse your site speed and provide in-depth analysis and recommendations. Common quick fixes include:
- Choosing a good web hosting company.
- Improving server response time.
- Optimising images. Having lots of heavy images on your web pages not only affects overall performance for a user, it can also affect the ability of the page to rank in Google. Reduce image file sizes using image compression techniques. You can use a lossless optimizer such as ImageOptim or FileOptimizer to make your images download faster without losing quality.
- Eliminating render-blocking resources.
- Fixing internal redirects.
- Enabling server and browser caching, and using a CDN.
- and more…
The most common tools SEOs use to test how fast web pages perform are Google PageSpeed Insights / Google Lighthouse, GTmetrix / WebPageTest and the speed report within Google Search Console.
In Google PageSpeed Insights, for example, when you run a test on a webpage, everything in green and orange is arguably OK, while the problems you should focus on fixing are usually highlighted in red. In the speed report within Search Console, you can see a high-level view of the number of slow, moderate and fast URLs on your site. You can dig deeper to check the exact details of each issue for both desktop and mobile.
These tools present detailed opportunities along with the potential load-time savings from fixing each issue. My recommendation would be to use a combination of these speed check tools to get the best possible picture, as each tool can highlight a couple of unique opportunities.
Check your XML Sitemap for Issues
An XML sitemap is a way of telling search engines about the site URLs (pages, videos, images etc) you wish to be indexed in search results. An HTML sitemap, on the other hand, helps users navigate the site. Each entry in the XML sitemap must contain the URL; the last modified date is optional but recommended. Entries can also contain other optional fields, such as alternate language versions for an international site. A clear sitemap quickly shows search engine crawlers the key pages you want them to discover sooner, which is especially beneficial for large sites.
Ensure your site has a valid XML sitemap or sitemap index containing only indexable, 200 status code, self canonicalised site URLs and submit it to the search console.
There are a number of tools that can be used to create XML sitemaps. Most CMSs come with out-of-the-box dynamic XML sitemap generators, so the sitemap stays up to date when pages are added or removed. In good news announced in July 2020, WordPress 5.5 gained a built-in XML sitemap feature, which will be included in all future updates.
You could use plugins, free online tools or site crawlers to find XML sitemap issues such as inclusions of non 200 status code URLs / non-self canonicalised URLs / non-indexable URLs within your XML sitemap. As a part of auditing the sitemap for standard issues as explained here, you may even come across orphaned URLs present in the XML sitemap that you could then link within your site architecture.
To check if a site has an XML sitemap or sitemap index, check the site’s robots.txt and look for the sitemap declaration.
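The sitemap checks described in this section can be sketched in a few lines of Python. The example below parses a small, made-up sitemap with the standard library and compares it against hypothetical crawl data (status code, canonical and indexability per URL); the field names in `crawl` are assumptions about how you might store your own crawl export.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap content.
SITEMAP_XML = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2020-11-01</lastmod></url>
  <url><loc>https://example.com/old-page</loc></url>
  <url><loc>https://example.com/dresses?colour=red</loc></url>
  <url><loc>https://example.com/hidden-landing</loc></url>
</urlset>"""


def sitemap_entries(xml_text):
    """Return (loc, lastmod) pairs from a urlset sitemap."""
    root = ET.fromstring(xml_text)
    entries = []
    for url in root.findall("sm:url", NS):
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=NS)
        entries.append((loc, lastmod))
    return entries


def sitemap_issues(entries, crawl):
    """Flag sitemap URLs that are not indexable, 200, self-canonicalised pages."""
    issues = []
    for loc, _ in entries:
        page = crawl.get(loc)
        if page is None:
            issues.append((loc, "not found in crawl (possible orphan)"))
        elif page["status"] != 200:
            issues.append((loc, f"status {page['status']}"))
        elif page["canonical"] != loc:
            issues.append((loc, "not self-canonicalised"))
        elif not page["indexable"]:
            issues.append((loc, "non-indexable"))
    return issues


# Hypothetical crawl data keyed by URL.
crawl = {
    "https://example.com/": {"status": 200, "canonical": "https://example.com/", "indexable": True},
    "https://example.com/old-page": {"status": 301, "canonical": None, "indexable": False},
    "https://example.com/dresses?colour=red": {
        "status": 200, "canonical": "https://example.com/dresses", "indexable": False},
}

entries = sitemap_entries(SITEMAP_XML)
issues = sitemap_issues(entries, crawl)
for loc, problem in issues:
    print(problem, "->", loc)
```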
Check Mobile-First Indexing Best Practices
It is a no-brainer that users are becoming more active on mobile than on desktop, so it’s important to adopt effective mobile-first SEO strategies and be mobile-friendly. Mobile-first indexing, which Google began rolling out in March 2018, essentially means Google uses the mobile version of a site for indexing and ranking, to better help its primarily mobile users find exactly what they are looking for in the SERPs. In practice, Google primarily crawls the mobile version of the website using the smartphone user agent, and indexes and ranks you based on that.
Since the 1st of July 2019, all new sites have been on the mobile-first index by default. But this does not mean the site follows mobile best practices. Sites that existed before this date are evaluated and moved to the mobile-first index if they follow mobile best practices. Site owners are notified through Google Search Console when their site is moved across to the mobile-first index. Follow the Google developers documentation if you want to read up in depth on mobile-first indexing best practices.
What mobile best practices should you follow?
- Ensure you have the same content on your mobile and desktop site. If you have a good responsive design (Google recommends this), you should be OK. Review the following valuable content on both the desktop and mobile versions of the site to ensure parity:
  - Menu links.
  - Main body content.
  - Links in the footer and sidebar.
  - Schema markup.
- Ensure your site is mobile friendly without any usability issues on mobile devices.
Check issues with Canonical Tags
A canonical tag tells search engines the preferred version of a page URL, to which the content should be attributed. Pages whose canonical tag points to a different URL are generally treated as non-indexable. Canonical tags are extremely useful for preventing problems with duplicate pages, as they effectively tell search engines which version should be indexed in the SERPs. However, because Google treats canonical tags as a hint and not a directive, you can’t rely on them completely.
Canonicals Best Practice
- Implement a self-referential canonical tag on the unique pages you want to index. This tells the search engines that the page itself is the preferred version you want to index.
- On eCommerce sites, to deal with duplication, canonical tags can be used on your faceted/filter pages to canonicalise to the category page, which stops filter pages from targeting the same terms as your category page. Check out my SEO guide for eCommerce sites or WooCommerce based eCommerce sites here.
- Don’t canonicalise a page to a redirect page, a non-indexable page or a non-200 status code page. These provide mixed signals to search engines.
- Ensure all your pages within the paginated series contain a self-referential canonical tag. It is a common pitfall on eCommerce sites to see all the paginated pages within the paginated series canonicalise to the 1st page.
To identify the canonical tags for a page, you could:
- Inspect the DOM (right-click and click Inspect). Search for “canonical”.
- Check Page Source. As opposed to the DOM, this is the unrendered HTML.
- You can also crawl a page using a crawler such as Screaming Frog to investigate canonical tags for your site pages. It is easy to review missing self-referential canonical tags, canonical tags pointing to another page or incorrect implementations of a canonical tag through a site crawl.
- You can also use the Search Console to inspect canonical tags, either by:
  - Inspecting a single URL using the URL Inspection feature.
  - Reviewing the excluded URLs in the coverage report. View the “Duplicate without user-selected canonical” section for duplicate pages that declare no canonical, and the “Duplicate, Google chose different canonical than user” section to identify issues where Google has selected a different canonical URL to the one set. Sometimes Google can get this wrong.
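The canonical best practices above translate into a few simple rules you can run over crawl data. This sketch assumes a hypothetical `pages` dictionary with a status code, declared canonical and indexability flag per URL; it flags missing canonicals and canonicals that point at redirected, broken or non-indexable targets.

```python
def canonical_issues(pages):
    """Flag canonical tags that are missing or point at a bad target."""
    issues = []
    for url, page in pages.items():
        target = page["canonical"]
        if target is None:
            issues.append((url, "missing canonical tag"))
            continue
        if target == url:
            continue  # self-referential canonical, nothing to check
        destination = pages.get(target)
        if destination is None:
            issues.append((url, "canonical target not in crawl"))
        elif destination["status"] != 200:
            issues.append((url, f"canonical target returns {destination['status']}"))
        elif not destination["indexable"]:
            issues.append((url, "canonical target is non-indexable"))
    return issues


# Hypothetical crawl data keyed by URL.
pages = {
    "/dresses": {"status": 200, "canonical": "/dresses", "indexable": True},
    "/dresses?colour=red": {"status": 200, "canonical": "/dresses", "indexable": False},
    "/about": {"status": 200, "canonical": None, "indexable": True},
    "/shoes": {"status": 200, "canonical": "/footwear-old", "indexable": False},
    "/footwear-old": {"status": 301, "canonical": None, "indexable": False},
}

problems = canonical_issues(pages)
for url, problem in problems:
    print(url, "->", problem)
```

Here the faceted URL canonicalising to its category page passes, while `/shoes` is flagged because its canonical points at a redirecting URL.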
Check Issues with Pagination Implementation
Pagination is typically implemented either as traditional numbered pagination or with a ‘Load More’ button, which is my preference as it provides an intuitive experience for the user. The first thing that comes to everyone’s mind when we think of pagination is rel=prev/next markup, which used to be an important consideration for defining the paginated series for Google. However, in early 2019, Google announced that it had not used rel=prev/next for indexation for a while, so the markup is no longer crucial. It can be left on the pages and does no harm if implemented correctly. Incorrect implementation of pagination can cause spider/bot traps.
To identify incorrect rel=prev/next markup implementation, you could:
- Go to the final page of the paginated series and inspect the rel=prev/next link tags. If the last page contains a value for rel=next, then it is incorrectly implemented.
- You can also use a crawler to check if the Pagination is correctly implemented for a site. On Screaming Frog, for example, you can check the Pagination section once the crawl analysis is complete. Issues such as non-indexable paginated URLs / non-self-referential paginated URLs can be easily identified using Screaming Frog. The latter is a common issue on eCommerce sites where the paginated series of a product lister page usually canonicalise back to page 1.
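Both pagination checks mentioned above (a stray rel=next on the last page, and paginated pages canonicalising back to page 1) are easy to express in code. The sketch below works on a hypothetical, ordered list of pages from one paginated series; the dictionary keys are my own naming.

```python
def pagination_issues(series):
    """Check an ordered paginated series for two common mistakes."""
    issues = []
    last_index = len(series) - 1
    for index, page in enumerate(series):
        if index == last_index and page["rel_next"]:
            issues.append((page["url"], "last page declares rel=next"))
        if page["canonical"] != page["url"]:
            issues.append((page["url"], "canonical is not self-referential"))
    return issues


# Hypothetical paginated product listing.
series = [
    {"url": "/dresses", "canonical": "/dresses",
     "rel_prev": None, "rel_next": "/dresses?page=2"},
    {"url": "/dresses?page=2", "canonical": "/dresses",
     "rel_prev": "/dresses", "rel_next": "/dresses?page=3"},
    {"url": "/dresses?page=3", "canonical": "/dresses?page=3",
     "rel_prev": "/dresses?page=2", "rel_next": "/dresses?page=4"},
]

found = pagination_issues(series)
for url, problem in found:
    print(url, "->", problem)
```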
I highly recommend you watch the “The State of Pagination and Infinite Scroll on The Web” video from the BrightonSEO 2019 conference.
Validate Schema Markup
Implementing schema markup / structured data can enhance your site’s appearance in the SERPs. By defining what you would like to see for some elements in the structured data you can standardise the display of your brand in SERPs. Some benefits of implementing schema include displaying rich search results, rich cards (on mobile), knowledge graph, breadcrumbs, carousels and more in SERPs. Common schema use cases include news articles or blog posts, product schema for your eCommerce product pages, breadcrumbs, recipes, reviews, events etc.
The popular schema markup formats are JSON-LD (Google’s preference) and Microdata. Microdata consists of HTML attributes added to the markup throughout a page, whereas JSON-LD is a structured JSON object produced and injected into a page in one piece. Where possible, use JSON-LD schema markup as it’s generally easier to implement and maintain. Click here to view an example of Product structured data using JSON-LD, RDFa & Microdata.
To validate if the schema markup is implemented correctly, use the Structured Data Testing Tool.
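Before running pages through a validator, it can help to confirm that the JSON-LD on the page is at least well-formed JSON. This minimal sketch extracts every `application/ld+json` script block from a page and tries to parse it. The sample HTML is invented, and the sketch only checks syntax; it does not check which properties a given rich result type requires.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect and parse JSON-LD script blocks from a page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []   # successfully parsed JSON-LD objects
        self.errors = []   # parse error messages for malformed blocks

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError as exc:
                self.errors.append(str(exc))


# Hypothetical page with one valid and one malformed JSON-LD block.
html = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Maxi Dress"}
</script>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "BreadcrumbList",
</script>
</head><body></body></html>"""

extractor = JsonLdExtractor()
extractor.feed(html)
print([block.get("@type") for block in extractor.blocks])
print(len(extractor.errors), "malformed block(s)")
```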
A few other things to watch out for
- Review the faceted navigation of your eCommerce sites for common issues. If not handled correctly, faceted navigation can cause duplication, massively eat up your crawl budget and dilute your main pages’ link equity to low-value pages.
- Perform Google searches using the ‘site’ command for your domain – site:domain.com. Review the SERP listings and look for issues.
- Tracking issues
- Hacked pages.
- Blocked resources.
- Hidden on-page content or links.
- Pages wrongly canonicalised.
- Unexpected robots.txt changes
Monitor Google Search Console
Google Search Console is a free, invaluable tool for site owners and SEOs. GSC includes 16 months of search traffic data, with key reports such as index coverage, server errors, sitemaps, speed reports (including the new Core Web Vitals reports), links, mobile usability and much more. These reports can help you monitor, troubleshoot and fix site issues. In November 2020, Google released a new and improved version of the crawl stats report that you can use within Search Console to find issues on your site.
Personally, on a day-to-day basis, I use the Search Console for the following:
- Analyse website search query impressions, clicks and position on Google search.
- Monitor Sitemap issues.
- Review index coverage reports.
- Proactively fix site issues upon receiving alerts over email.
- Use the URL inspection tool to analyse the indexation and crawling issues of your pages.
Fix what Google is telling you. Google has put together search console training videos on their YouTube channel to teach you how to monitor your site traffic and make informed decisions to optimise your site’s search appearance on Google SERP to ultimately increase your organic traffic. Don’t forget to connect your site to Bing Webmaster Tools as they have been revamping a lot of their offerings recently.
My Favourite Free & Paid Technical SEO Tools
- Webmaster Tools: Google Search Console, Bing Webmaster Tools
- Log analysis + cloud-based site crawlers: Deepcrawl, Oncrawl, Botify.
- Desktop website crawlers: Screaming Frog SEO Spider, Sitebulb.
- Page Speed tools: WebPageTest, GTmetrix, Google PageSpeed Insights (Also install the Lighthouse extension)
- Image Compression Tools: ImageOptim, FileOptimizer
- Site Audit Tools: SEMRush / Ahrefs.
- Browser Extensions: Ayima Redirect Path browser extension, SEO META in 1 CLICK.
- Site monitoring tools: Robotto, Little Warden, VisualPing