Single Page Application SEO. Nginx & Apache With Rendertron

Andriy Mishenin
Nov 30, 2020

A Single Page Application (SPA) is a great way to improve user experience. Web portals built as SPAs are fast and responsive. These days this approach is the de facto standard for SaaS platforms, social networks, and other feature-rich web applications.

When it comes to SEO, many believe a SPA is a bad idea. SPAs render pages in the browser using JavaScript, which means that search engine crawlers cannot read the website’s content unless they can correctly run all of its JavaScript code. Processing JavaScript is expensive because of the computing power it requires, and few search engine bots can do it. Those that can’t rely on the initial HTML, which usually contains only default metadata and placeholders instead of real content.

Luckily, Google can execute JavaScript and supports major JavaScript libraries and frameworks when indexing single page applications. Other search engine bots either don’t support JavaScript or have minimal support and cannot run sophisticated JavaScript frameworks and libraries. The same is true for social networks, messengers, and other applications that generate link previews or read website content for other purposes.

However, there are workarounds to make sure your project can be indexed even by simple crawlers.

Dynamic Rendering

Dynamic rendering means serving crawlers a pre-rendered version of the site. When a bot visits a website, it receives a static HTML version of every requested page instead of the dynamic SPA. This static HTML, prepared on the server, is the same as the resulting HTML (DOM) a browser generates after running all included SPA scripts. This way, search engine bots can reach the content without any difficulties.

You do not need to program this feature yourself, since there are headless browsers that can do the rendering on the server. For example, Rendertron runs as a server and uses headless Chrome to render requested pages.

Note that a SPA page rendered on a server is only a snapshot of the related SPA page. It is well suited for bots to extract data, but it will not be a working SPA if loaded into a browser.

Dynamic Rendering with Rendertron on Nginx & Apache

First, install Rendertron on your server, run it in Docker, or deploy it to Google Cloud. To make sure it operates correctly, test the installation by requesting a page through its /render/ endpoint, for example: https://render-tron.appspot.com/render/https://trackabi.com/.
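The test URL above is simply the Rendertron address, the /render/ path, and the address of the page to render. A minimal Python sketch of that pattern follows; the function name render_url is hypothetical, and the two addresses are the ones from the example above.

```python
def render_url(rendertron_base: str, page_url: str) -> str:
    """Return the Rendertron address that renders page_url.

    Follows the <Rendertron address>/render/<page address> pattern
    used in the example above.
    """
    # Drop a trailing slash on the base so the result has no "//render/".
    return f"{rendertron_base.rstrip('/')}/render/{page_url}"


# Both addresses are taken from the example above.
print(render_url("https://render-tron.appspot.com", "https://trackabi.com/"))
```

Opening the printed address in a browser should show the rendered HTML snapshot of the page if the installation works.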

Now the server should route all bot requests through Rendertron, while normal users access the site directly. The server can identify a bot by its user agent. Here are the user agent names of some common bots: bingbot, yandex, baiduspider, twitterbot, facebookexternalhit, rogerbot, linkedinbot, embedly, SkypeUriPreview, quora link preview, showyoubot, outbrain, pinterest, slackbot, vkShare, TelegramBot, WhatsApp.

To let SEO checkers analyze the website, include their user agents as well. You can find the user agent name of the SEO checker you use in the web server’s access log. Here are the user agent names of a few popular SEO checkers: W3C_Validator, RSiteAuditor, SiteCheckerBotCrawler, SeoSiteCheckup, SeobilityBot.

You can also include the word “bot” to catch unknown bots that happen to contain that word in their names.
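To make the matching logic concrete, here is a minimal Python sketch of the same check a web server would perform: a case-insensitive search of the User-Agent header for any of the names listed above. The names BOT_FRAGMENTS, BOT_PATTERN, and is_bot are hypothetical and exist only for this illustration.

```python
import re

# User agent fragments from the lists above; "bot" is the catch-all.
BOT_FRAGMENTS = [
    "bot", "bingbot", "yandex", "baiduspider", "twitterbot",
    "facebookexternalhit", "rogerbot", "linkedinbot", "embedly",
    "SkypeUriPreview", "quora link preview", "showyoubot", "outbrain",
    "pinterest", "slackbot", "vkShare", "TelegramBot", "WhatsApp",
    "W3C_Validator", "RSiteAuditor", "SiteCheckerBotCrawler",
    "SeoSiteCheckup", "SeobilityBot",
]

# One case-insensitive alternation. re.escape keeps every fragment,
# including the spaces in "quora link preview", a literal match.
BOT_PATTERN = re.compile(
    "|".join(re.escape(fragment) for fragment in BOT_FRAGMENTS),
    re.IGNORECASE,
)


def is_bot(user_agent: str) -> bool:
    """Return True if the user agent contains any known bot fragment."""
    return BOT_PATTERN.search(user_agent or "") is not None


print(is_bot("Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"))
print(is_bot("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
             "(KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36"))
```

Because of the catch-all “bot” fragment, this check also matches Googlebot and any other user agent that contains that word, so those visitors receive the rendered snapshot as well.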

Set Up Rendertron With Apache

Make sure mod_rewrite and mod_proxy_http are enabled in the Apache configuration. Then add conditional URL rewriting either in a .htaccess file, the VirtualHost configuration, or the main configuration file.

RewriteEngine On
# Match bot user agents case-insensitively ([NC]). The pattern is quoted because "quora link preview" contains spaces.
RewriteCond %{HTTP_USER_AGENT} "bot|bingbot|yandex|baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|SkypeUriPreview|quora link preview|showyoubot|outbrain|pinterest|slackbot|vkShare|TelegramBot|WhatsApp|W3C_Validator|RSiteAuditor|SiteCheckerBotCrawler|SeoSiteCheckup|SeobilityBot" [NC]
# Proxy ([P]) matching requests through Rendertron.
RewriteRule ^(.*)$ https://YOUR-RENDERTRON-URL/render/https://YOUR-WEBAPP-ROOT-URL$1 [P,L]

Set Up Rendertron With Nginx

In your server configuration (nginx.conf), map the $http_user_agent to a custom variable indicating whether you consider this user agent a bot.

map $http_user_agent $is_bot {
    default 0;
    '~*bot' 1;
    '~*bingbot' 1;
    '~*yandex' 1;
    '~*baiduspider' 1;
    '~*twitterbot' 1;
    '~*facebookexternalhit' 1;
    '~*rogerbot' 1;
    '~*linkedinbot' 1;
    '~*embedly' 1;
    '~*SkypeUriPreview' 1;
    '~*quora link preview' 1;
    '~*showyoubot' 1;
    '~*outbrain' 1;
    '~*pinterest' 1;
    '~*slackbot' 1;
    '~*vkShare' 1;
    '~*TelegramBot' 1;
    '~*WhatsApp' 1;
    '~*W3C_Validator' 1;
    '~*RSiteAuditor' 1;
    '~*SiteCheckerBotCrawler' 1;
    '~*SeoSiteCheckup' 1;
    '~*SeobilityBot' 1;
}

In your site configuration, add the following to send requests via Rendertron whenever the current user agent is a bot.

server {
    listen 80;
    server_name example.com;
    # ... other configuration...

    # Send bot requests to the Rendertron proxy location below.
    if ($is_bot = 1) {
        rewrite ^(.*)$ /rendertron/$1;
    }

    location /rendertron/ {
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $remote_addr;
        # $request_uri is the original request URI, so Rendertron renders the page the bot asked for.
        proxy_pass http://YOUR-RENDERTRON-URL/render/$scheme://$host:$server_port$request_uri;
    }
}

Using Nginx With Rendertron In Docker

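When running Nginx and Rendertron in Docker within the same Docker network, you may want to refer to Rendertron by the name of its Docker service. However, something like proxy_pass http://rendertron/render/$scheme://$host:$server_port$request_uri; will not work out of the box: because the proxy_pass value contains variables, Nginx resolves the host name at request time, and without a resolver directive it cannot resolve the rendertron host. To overcome this, you can define ‘rendertron’ as an upstream server. Nginx looks the name up among the defined upstream groups first, so the proxy_pass above then works as written.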

upstream rendertron {
    server rendertron:3000;
}
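Whichever web server you use, one way to check the setup is to request the same page twice, once with a bot user agent and once with a regular one, and compare the responses. The following Python sketch assumes nothing beyond the standard library; the function names and the example.com address in the usage note are hypothetical placeholders.

```python
import urllib.request


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Prepare a GET request that identifies itself with user_agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url: str, user_agent: str, timeout: float = 30.0) -> str:
    """Download url with the given user agent and return the body as text."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def compare_responses(url: str) -> tuple[int, int]:
    """Return the response sizes for a bot and for a regular visitor."""
    as_bot = fetch_html(url, "bingbot")
    as_user = fetch_html(url, "Mozilla/5.0")
    return len(as_bot), len(as_user)
```

Calling compare_responses("https://example.com/") against your own site should typically return two different sizes: the bot response carries the rendered content, while the regular response is the SPA shell with placeholders.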

Server-Side Rendering (SSR)

Unlike dynamic rendering, server-side rendering produces a fully functional application that can operate like any SPA rendered in a browser. The application runs on the server, is “paused,” and is sent to the client (a browser). JavaScript applications that can run on both a client and a server are called isomorphic or universal. Once loaded into the browser, the app continues to operate. This ability is the main difference compared with the dynamic rendering explained earlier.

Server-side rendering is somewhat complicated, and its complexity increases rapidly as the complexity of the SPA increases. Rendering a big application on the server is resource-intensive and may turn into a bottleneck if the server is not powerful enough.

For ReactJS, for example, there are several popular solutions that help perform SSR.

Summary

A SPA is an excellent way of building web-based applications. Although SEO for a SPA requires some extra steps, they are not that complicated and are doable with basic experience in Apache or Nginx.

Originally published at https://trackabi.com

Andriy Mishenin

A Senior Software Engineer and a Certified Project Manager who is truly passionate about web and mobile applications design and development