Single Page Applications (SPA) are powered by client-side rendering templates, providing a very dynamic experience for end users. Recently, Google announced that they would crawl web pages and execute JavaScript just like normal users, allowing websites powered by SPA frameworks (Angular, Ember, Vue, etc.) to be crawled without penalty from Google.
Search engines are not the only crawlers that matter for your site's visibility, however. The bots behind rich social sharing rely on meta tags and remain blind to JavaScript.
In this tutorial, we will build an alternate routing and rendering module for your Express and Node.js server. You can use it with most SPA frameworks, and it enables rich sharing of your website on Twitter, Facebook, and Pinterest.
This tutorial deals only with web bots that extract information for social sharing. Don't try this technique with search engine crawlers. Search engine companies may take this behavior seriously and treat it as spam or fraud, so your rankings may drop quickly.
Likewise, with rich social sharing, make sure the content you present is consistent with what users see and what bots read. Failure to maintain this consistency may result in restrictions on social media sites.
If you post an update on Facebook and include a URL, a Facebook bot will read the HTML and look for OpenGraph meta tags. Here is an example of the Envato Tuts home page:
Inspecting the page, here are the relevant tags in the head tag that generated this preview:
Pinterest uses the same protocol as Facebook and OpenGraph, so their sharing works in much the same way.
On Twitter, this concept is called "cards", and Twitter has several different variations, depending on how you want to present your content. Here's an example of a Twitter card from GitHub:
This is the HTML that generates this card:
Note: GitHub uses a technique similar to the one described in this tutorial. The page HTML is slightly different in the tag whose name attribute is set to twitter:description. I had to change the user agent, as described later in this article, to get the correct meta tags.
If you only need one title, description, or image for your entire website, adding meta tags is not a problem. Just hardcode the values into the head of your HTML document. However, you are probably building a more complex website, and you want rich social sharing to vary based on the URL (which is probably managed through a wrapper around the HTML5 History API that your framework manipulates).
A first try might be to build a template and add the values to the meta tags just as you would any other content. Because the bots that extract this information currently don't execute JavaScript, you end up with the raw template tags instead of the intended values when someone shares the page.
To make the website readable by bots, we'll build middleware that detects the user agent of social sharing bots, plus an alternate router that serves those bots the correct content without involving the SPA framework.
Clients (bots, web crawlers, browsers) send the User Agent (UA) string in the HTTP header of every request. This should identify the client software; while web browsers have a wide variety of UA strings, bots tend to be more or less stable. Facebook, Twitter, and Pinterest publish the user-agent strings of their bots as a courtesy.
In Express, the UA string is available on the request object as req.headers['user-agent']. I'm using a regular expression to identify the bots I want to serve alternate content to, and we'll put that check in middleware. Middleware is like a route, but it doesn't need a path or method, and it (usually) passes the request on to another middleware or route. In Express, routes and middleware run in the order they are registered, so place this middleware above any other routes in your Express app.
    app.use(function(req, res, next) {
      var ua = req.headers['user-agent'];

      if (/^(facebookexternalhit)|(Twitterbot)|(Pinterest)/gi.test(ua)) {
        console.log(ua, ' is a bot');
      }

      next();
    });
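The regex above looks for "facebookexternalhit" at the beginning of the UA string, or for "Twitterbot" or "Pinterest" anywhere in it, because the ^ anchor applies only to the first alternative. The i flag makes the match case-insensitive. If the test passes, the UA is logged to the console.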
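As a minimal sketch, the same check can be pulled into a standalone function, which makes it easy to try against sample strings outside Express. Here the alternatives share one group so that `^` covers all three names, and the `g` flag is dropped because `test()` does not need it. The name `isSocialBot` and the sample UA strings are illustrative assumptions, not part of the original server.

```javascript
// Hypothetical helper: true when the UA string starts with a known bot name.
var BOT_PATTERN = /^(facebookexternalhit|Twitterbot|Pinterest)/i;

function isSocialBot(ua) {
  // A missing header arrives as undefined; treat that as "not a bot".
  return typeof ua === 'string' && BOT_PATTERN.test(ua);
}

// Sample strings for illustration only; check each service's documentation
// for the UA strings its bots currently send.
console.log(isSocialBot('Twitterbot/1.0'));
console.log(isSocialBot('facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)'));
console.log(isSocialBot('Mozilla/5.0 (X11; Linux x86_64) Chrome/120.0'));
console.log(isSocialBot(undefined));
```

Anchoring every name to the start is stricter than the original pattern, so a bot that embeds its name later in the string would need an unanchored alternative.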
This is the entire server:
    var express = require('express'),
        app = express(),
        server;

    app.use(function(req, res, next) {
      var ua = req.headers['user-agent'];

      if (/^(facebookexternalhit)|(Twitterbot)|(Pinterest)/gi.test(ua)) {
        console.log(ua, ' is a bot');
      }

      next();
    });

    app.get('/', function(req, res) {
      res.send('Serve SPA');
    });

    server = app.listen(
      8000,
      function() {
        console.log('Server started.');
      }
    );
In Chrome, navigate to your new server (it should be at http://localhost:8000/). Open DevTools and turn on Device Mode by clicking the smartphone icon in the upper left corner of the developer pane.
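In the device toolbar, enter "Twitterbot/1.0" in the UA edit box.
Now, reload the page.
At this point, you should see "Serve SPA" in the page, but if you look at the console output of your Express app, you should see:
    Twitterbot/1.0 is a bot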
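Now that we can identify bots, let's build an alternate router. Express can use multiple routers, which are often used to partition routes by path. In this case, we're going to use a router in a slightly different way. Routers are essentially middleware, so they accept req, res, and next, just like any other middleware. The idea here is to create a different set of routes that have the same paths.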
    nonSPArouter = express.Router();

    nonSPArouter.get('/', function(req, res) {
      res.send('Serve regular HTML with metatags');
    });
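Our middleware needs to change as well. Instead of just logging that the client is a bot, we now send the request to the new router and, importantly, only pass it along with next() if the UA test fails. In short, bots get one router, and everyone else gets the standard router that serves the SPA code.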
    var express = require('express'),
        app = express(),
        nonSPArouter = express.Router(),
        server;

    nonSPArouter.get('/', function(req, res) {
      res.send('Serve regular HTML with metatags');
    });

    app.use(function(req, res, next) {
      var ua = req.headers['user-agent'];

      if (/^(facebookexternalhit)|(Twitterbot)|(Pinterest)/gi.test(ua)) {
        console.log(ua, ' is a bot');
        nonSPArouter(req, res, next);
      } else {
        next();
      }
    });

    app.get('/', function(req, res) {
      res.send('Serve SPA');
    });

    server = app.listen(
      8000,
      function() {
        console.log('Server started.');
      }
    );
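If we test using the same routine as above, with the UA set to Twitterbot/1.0, the browser will show this on reload:
    Serve regular HTML with metatags
With the standard Chrome UA, you'll get:
    Serve SPA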
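The bot and non-bot branches can also be exercised without a browser. The sketch below wraps the UA switch in a small factory and drives it with stub request and response objects. The names `botSwitch` and `simulate` are hypothetical, and the stubs implement only the one `send` method this example needs, not Express's full API.

```javascript
// Hypothetical factory: builds the UA-switching middleware around any bot handler.
function botSwitch(botHandler) {
  return function (req, res, next) {
    var ua = req.headers['user-agent'] || '';

    if (/^(facebookexternalhit)|(Twitterbot)|(Pinterest)/i.test(ua)) {
      botHandler(req, res, next); // bots go to the alternate handler
    } else {
      next();                     // everyone else falls through to the SPA route
    }
  };
}

// Drive the middleware with stub objects standing in for Express's req and res.
function simulate(ua) {
  var body = null;
  var req = { headers: { 'user-agent': ua } };
  var res = { send: function (text) { body = text; } };

  var middleware = botSwitch(function (req, res) {
    res.send('Serve regular HTML with metatags');
  });

  // The "next" callback plays the role of the SPA route.
  middleware(req, res, function next() {
    res.send('Serve SPA');
  });

  return body;
}

console.log(simulate('Twitterbot/1.0'));
console.log(simulate('Mozilla/5.0 (Windows NT 10.0) Chrome/120.0'));
```

In the real server, the same factory could be mounted as `app.use(botSwitch(nonSPArouter))`, since an Express router is itself a `(req, res, next)` function.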
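As we discussed above, rich social sharing relies on meta tags inside the head of your HTML document. Since you're building a SPA, you may not even have a template engine installed. For this tutorial, we'll use Jade. Jade is a fairly simple templating language in which spaces and tabs are significant and closing tags are not needed. We can install it by running:
    npm install jade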
In our server source code, add this line before app.listen.
    app.set('view engine', 'jade');
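Now, we're going to supply only the information we want to serve to the bots. We'll modify nonSPArouter. Since we've set the view engine with app.set, res.render will do the Jade rendering.
Let's set up a small Jade template to serve to social sharing bots: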
    doctype html
    html
      head
        title= title
        meta(property="og:url" name="twitter:url" content= url)
        meta(property="og:title" name="twitter:title" content= title)
        meta(property="og:description" name="twitter:description" content= descriptionText)
        meta(property="og:image" content= imageUrl)
        meta(property="og:type" content="article")
        meta(name="twitter:card" content="summary")
      body
        h1= title
        img(src= img alt= title)
        p= descriptionText
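Most of this template is meta tags, but you can also see that I've included the information in the body of the document. At the time of writing, none of the social sharing bots seem to look at anything beyond the meta tags, but it's good practice to include the information in a reasonably human-readable form in case any kind of human checking is implemented at a later date.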
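To make the template's output more concrete, here is a rough sketch that builds roughly equivalent head markup with plain string concatenation instead of Jade. The functions `escapeHtml` and `renderBotHead` are hypothetical helpers written for this illustration. The exact markup Jade emits may differ in attribute order and whitespace.

```javascript
// Hypothetical helper: escape characters that would break out of an attribute or tag.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: takes the same locals the template uses
// (title, url, descriptionText, imageUrl) and returns the head's contents.
function renderBotHead(locals) {
  return [
    '<title>' + escapeHtml(locals.title) + '</title>',
    '<meta property="og:url" name="twitter:url" content="' + escapeHtml(locals.url) + '">',
    '<meta property="og:title" name="twitter:title" content="' + escapeHtml(locals.title) + '">',
    '<meta property="og:description" name="twitter:description" content="' + escapeHtml(locals.descriptionText) + '">',
    '<meta property="og:image" content="' + escapeHtml(locals.imageUrl) + '">',
    '<meta property="og:type" content="article">',
    '<meta name="twitter:card" content="summary">'
  ].join('\n');
}

console.log(renderBotHead({
  title: 'Bot Test',
  url: 'https://example.com/',
  descriptionText: 'This is designed to appeal to bots',
  imageUrl: 'https://example.com/placeholder.png'
}));
```

Escaping matters here because titles and descriptions often come from user-supplied data, and an unescaped quote would cut the content attribute short.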
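Save the template to your app's views directory (where Express looks for templates by default) and name it bot.jade. The filename without an extension ("bot") will be the first argument of the res.render function.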
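While it's always a good idea to develop locally, you will need to expose your app in its final location to fully debug your meta tags. A deployable version of our small server looks like this: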
    var express = require('express'),
        app = express(),
        nonSPArouter = express.Router(),
        server;

    nonSPArouter.get('/', function(req, res) {
      var img = 'placeholder.png';

      res.render('bot', {
        img : img,
        url : 'https://bot-social-share.herokuapp.com/',
        title : 'Bot Test',
        descriptionText : 'This is designed to appeal to bots',
        imageUrl : 'https://bot-social-share.herokuapp.com/' + img
      });
    });

    app.use(function(req, res, next) {
      var ua = req.headers['user-agent'];

      if (/^(facebookexternalhit)|(Twitterbot)|(Pinterest)/gi.test(ua)) {
        console.log(ua, ' is a bot');
        nonSPArouter(req, res, next);
      } else {
        next();
      }
    });

    app.get('/', function(req, res) {
      res.send('Serve SPA');
    });

    app.use('/', express.static(__dirname + '/static'));
    app.set('view engine', 'jade');

    server = app.listen(
      process.env.PORT || 8000,
      function() {
        console.log('Server started.');
      }
    );
Also notice that I'm using the express.static middleware to serve the image from the /static directory.
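Once you've deployed the app to a publicly accessible location, you should verify that your meta tags work as intended.
First, you can test with the Facebook Debugger. Enter your URL and click Fetch new scrape information.
You should see something like this:
Next, you can move on to testing your Twitter card with the Twitter Card Validator. For this one, you need to be logged in with your Twitter account.
Pinterest provides a debugger, but this example won't work out of the box because Pinterest only allows "rich pins" on URLs other than your home page.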
In a real implementation, you'll need to handle the integration of your data source and routing. It's a good idea to review the routes specified in your SPA code and create an alternate version of everything you think might be shared. Once you've established the routes that might be shared, also set meta tags in your main template, which will serve as a fallback if someone shares a page you didn't expect.
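One way to organize the data side is a lookup that returns per-route metadata and falls back to site-wide defaults for anything unexpected. This is only a sketch under stated assumptions: `DEFAULT_META`, `ROUTE_META`, `metaForPath`, the `/articles/1` path, and the `https://example.com` base URL are all hypothetical, and a real app would query its own data source instead of a hardcoded table.

```javascript
// Hypothetical site-wide defaults, used when a shared URL has no entry of its own.
var DEFAULT_META = {
  title: 'Bot Test',
  descriptionText: 'This is designed to appeal to bots',
  img: 'placeholder.png'
};

// Hypothetical per-route data; a real app would look this up in its data source.
var ROUTE_META = {
  '/articles/1': {
    title: 'First article',
    descriptionText: 'Summary of the first article',
    img: 'article-1.png'
  }
};

// Build the locals object that the bot template expects for a given path.
function metaForPath(baseUrl, path) {
  var known = Object.prototype.hasOwnProperty.call(ROUTE_META, path);
  var meta = Object.assign({}, DEFAULT_META, known ? ROUTE_META[path] : {});

  meta.url = baseUrl + path;
  meta.imageUrl = baseUrl + '/' + meta.img;
  return meta;
}

console.log(metaForPath('https://example.com', '/articles/1'));
console.log(metaForPath('https://example.com', '/some/unplanned/page'));
```

A route on nonSPArouter could then pass the result straight to the template, for example `res.render('bot', metaForPath(baseUrl, req.path))`.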
While Pinterest, Facebook, and Twitter make up a large portion of the social media market, there are other services you may want to integrate. Some services publish the names of their social sharing bots, while others may not. To determine the user agent, you can console.log the UA string and examine the console output. Try this on a non-production server first, since picking out one user agent on a busy site can be difficult. From there, you can modify the regular expression in our middleware to catch the new user agent.
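As the list of bots grows, hand-editing the regular expression gets error-prone. A minimal sketch of one alternative is to build the pattern from an array of names. The functions `escapeRegExp` and `buildBotMatcher` are hypothetical helpers, and "LinkedInBot" and "Slackbot" are included only as examples of names you might add. Confirm the actual strings against your own logs or each service's documentation.

```javascript
// Escape characters that have special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: builds a case-insensitive pattern that matches
// any of the given names at the start of the UA string.
function buildBotMatcher(names) {
  return new RegExp('^(' + names.map(escapeRegExp).join('|') + ')', 'i');
}

var botMatcher = buildBotMatcher([
  'facebookexternalhit',
  'Twitterbot',
  'Pinterest',
  'LinkedInBot', // example addition, verify before relying on it
  'Slackbot'     // example addition, verify before relying on it
]);

console.log(botMatcher.test('Slackbot-LinkExpanding 1.0 (+https://api.slack.com/robots)'));
console.log(botMatcher.test('Mozilla/5.0 (Macintosh) Safari/605.1.15'));
```

Because the pattern is anchored to the start of the string, a bot that puts its name later in the UA string would need the `^` removed or its own entry.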
Rich social media sharing is a great way to drive people to your beautiful single-page application-based website. By selectively directing bots to machine-readable content, you can provide the bots with the right information.