# apostrophe-site-map

This module generates XML and plaintext sitemaps for sites powered by the [Apostrophe](https://apostrophenow.org) CMS.

It serves two purposes: [white-hat SEO](https://support.google.com/webmasters/answer/183668?hl=en&ref_topic=6080646&rd=1) and content strategy.

## SEO with sitemaps

A frequently updated and accurate XML sitemap allows search engines to index your content more quickly and spot new pages immediately. But *an out-of-date sitemap is worse than nothing and will damage your site's SEO.*

This module generates a sitemap that includes all of the pages on your site that are visible to the public, including "pieces" such as events and blog posts. It does so dynamically, with a short cache lifetime, so your sitemap does not go out of date.

## How to use it

* Install the module.

`npm install --save apostrophe-site-map`

* Configure it in `app.js`, as one of your modules.

```javascript
{
  // You should configure `baseUrl` to ensure full URLs in your sitemap
  baseUrl: 'http://example.com',
  modules: {
    'apostrophe-site-map': {
      // array of doc types you do NOT want
      // to include, even though they are
      // accessible on the site. You can also
      // do this at the command line.
      excludeTypes: []
    }
  }
}
```

#### Alternative configuration

If you would rather not set or override `baseUrl` for the whole site, or you want to keep the site without a `baseUrl`, you can set `baseUrl` in the module's own configuration instead:

```javascript
{
  // No baseUrl here
  modules: {
    'apostrophe-site-map': {
      baseUrl: 'http://example.com',
      excludeTypes: []
    }
  }
}
```

* Just launch your site as you normally would. In development that might just be:

```
node app
```

* Access `http://localhost:3000/sitemap.xml` (in production, of course, the hostname is different).

> **AN IMPORTANT WARNING: if you ALREADY have a STATIC public/sitemap.xml file, THAT FILE WILL BE SENT INSTEAD.** Remove it. Also, SITEMAPS ARE CACHED for one hour by default, so you won't see changes instantly. Read on for how to change the cache lifetime, and what you can realistically expect from Google.

### Clearing the cache, and changing the cache lifetime

To better support multiple-server environments, this module now serves sitemaps directly and caches them in your database. That way we don't have to worry about whether a static file exists in a given environment, running the same task on multiple servers, etc.

By default sitemaps are cached for 1 hour. You can change this by specifying the `cacheLifetime` option to this module, in seconds. However, don't get too excited: Google [usually does not check a sitemap more often than a few times a month](https://webmasters.stackexchange.com/questions/43874/how-often-does-gwt-check-dynamic-sitemaps).

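As a minimal sketch of the option described above, the following configuration lowers the cache lifetime to 30 minutes. The `cacheLifetime` option and its unit (seconds) come from this README; the `siteMapConfig` variable name and the chosen value are only illustrative, and in a real project the object would be what you pass to Apostrophe in `app.js`.

```javascript
// Illustrative only: `siteMapConfig` is a hypothetical name for the
// object you would pass to Apostrophe in app.js.
const siteMapConfig = {
  baseUrl: 'http://example.com',
  modules: {
    'apostrophe-site-map': {
      // Cache lifetime in seconds: 30 minutes instead of the default hour
      cacheLifetime: 30 * 60
    }
  }
};
```
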
You can clear the cache at any time with this command line task:

```
node app apostrophe-site-map:clear
```

This will force a new sitemap to be generated on the next request.

### Generating the sitemap ahead of time

You can use this command line task to update the sitemap in Apostrophe's cache at any time, rather than waiting for it to expire after an hour and generate again on the next request:

```
node app apostrophe-site-map:map --update-cache
```

If your site has many pages and pieces, generating the sitemap dynamically may take a long time. Scheduling the above task to run at least twice an hour via a [cron job](https://www.howtogeek.com/101288/how-to-schedule-tasks-on-linux-an-introduction-to-crontab-files/) guarantees that a search engine will never be forced to wait when requesting your sitemap. If you have enough content, search engines may hang up before your sitemap is generated, so this task is very useful.

### Generating sitemaps as static files

If you wish, you can generate a sitemap as a static file.

Just run this task:

```
node app apostrophe-site-map:map
```

When `--update-cache` is not given, this task generates an XML sitemap and displays it on the console. This is mostly useful for content strategy purposes. If your goal is to serve the sitemap to search engines, see above for a better way.

## How to tell Google about your sitemap

Create a `public/robots.txt` file if you do not already have one and add a Sitemap line. Here is a valid example for a site that doesn't have any other `robots.txt` rules:

```
Sitemap: http://EXAMPLE.com/sitemap.xml
```

You can also have other `robots.txt` directives if you wish.

On Google's next crawl of your site it should pick up on the presence of the sitemap.

## Changing the priority of pages and pieces

By default, an XML sitemap will assign a priority to a page based on its depth. The home page has a priority of 1.0 (the highest), a subpage of the home page 0.9, and so on.

Pieces receive a priority of 0.7; however if they have a `startDate` property (i.e. they are events) in the future, they bump up to 0.8, and if they have a `startDate` in the past they bump down to 0.6.

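The default rules above can be sketched as two small functions. This is not the module's actual code, just an illustration of the arithmetic; the function names, the rounding step, and the floor of 0.1 for very deep pages are assumptions.

```javascript
// Sketch of the default priority rules described above.
// Not the module's own implementation; all names are hypothetical.

// Pages: 1.0 for the home page (level 0), minus 0.1 per level of depth.
// Assumption: the value never drops below 0.1 for very deep pages.
function defaultPagePriority(level) {
  const raw = 1.0 - 0.1 * level;
  // Round to one decimal place to avoid floating point noise
  return Math.max(0.1, Math.round(raw * 10) / 10);
}

// Pieces: 0.7 by default, 0.8 for upcoming events, 0.6 for past events.
function defaultPiecePriority(piece, now = new Date()) {
  if (!piece.startDate) {
    return 0.7;
  }
  return new Date(piece.startDate) > now ? 0.8 : 0.6;
}

console.log(defaultPagePriority(0)); // home page
console.log(defaultPagePriority(2)); // two levels below the home page
console.log(defaultPiecePriority({ startDate: '2000-01-01' })); // past event
```
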
**You can also set the priority yourself.** Once you install this module you will discover that there is a new "sitemap priority" field in "page settings," and when editing a piece via the edit dialog box. You can set this field to any number between 0.0 and 1.0, with 1.0 being the highest.

As of this writing, Google suggests that they may use the priority to rank the importance of pages *relatively within your site.* **Please do not set all the priorities to 1.0. It will only hurt your chances of communicating which pages are most important to Google.**

## Content strategy

You can also use this module just to generate a map of your site for your own study:

```
node app apostrophe-site-map:map --format=text --indent
```

The result is a very informative depth-first list of pages. Note the use of leading spaces to indicate depth:

```
/
  /about
    /about/people
    /about/ducklings
  /products
    /products/cheesemaker
```

You'll want to pipe that to a text file and consider printing it.

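If you save the text map to a file, a short script can help you study it. The sketch below relies only on what is described above, namely that leading spaces indicate depth, and does not assume a particular indent width. The function name `parseSiteMapText` and the sample input are hypothetical.

```javascript
// Parse the output of `--format=text --indent` into a list of entries.
// `indent` is the raw count of leading spaces, used as a proxy for depth.
function parseSiteMapText(text) {
  return text
    .split(/\r?\n/)
    .filter((line) => line.trim().length > 0)
    .map((line) => ({
      url: line.trim(),
      indent: line.length - line.trimStart().length
    }));
}

// Hypothetical sample input; in practice, read the file you piped the map to
const sample = '/\n  /about\n    /about/people\n  /products\n';
const entries = parseSiteMapText(sample);

// List the most deeply nested URLs first
entries
  .slice()
  .sort((a, b) => b.indent - a.indent)
  .forEach((entry) => console.log(entry.indent, entry.url));
```
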
*The displayed "depth" of pieces won't always correspond directly to the pieces-pages that display them.* You might want to exclude them when generating content strategy maps.

## Warning: watch out for your custom stuff!

This module does the best it can.

It'll list your published pages, and your published pieces. And it'll rank future events higher than past events.

But it doesn't know anything about the custom URLs, independent of Apostrophe's usual mechanisms, that you're generating in your own creative and amazing modules.

If that's a concern for you, create `lib/modules/apostrophe-site-map/index.js` in your project, subclass the module, and override the `custom` method to output information about **additional** URLs. *Note: if you have multiple locales via `apostrophe-workflow` this method is called once per locale.* This method now receives `req, locale, callback` if written to accept three arguments.

It's straightforward: all you have to do is pass Apostrophe page objects, or anything else with an `_url` property and a `siteMapPriority` property, to `self.output`.

Here's a simple example. Note the use of `self.host` to get the "stem" of the URL (`http://mysite.com`).

For regular pages in the page tree, `level` starts at `0` (the home page) and increments from there for nested pages. For your own "pages," just keep that in mind. The higher the `level`, the lower the `priority` will be in the XML sitemap. Or pass the `siteMapPriority` property explicitly.

> This feature is **not** for changing priorities of existing pages and pieces. It is for your custom routes and dispatch URLs that the module cannot discover on its own. See the "page settings" dialog box or the edit dialog box for a field that lets you set the priority of an ordinary page or piece.

```javascript
// lib/modules/apostrophe-site-map/index.js, at project level, not in node_modules
module.exports = {
  construct: function(self, options) {
    self.custom = function(req, locale, callback) {
      // Discover something via the database, then...
      self.output({
        _url: 'http://mysite.com/myspecialplace',
        // Defaults to 0.5 if not set and a `level` property
        // cannot be used to infer it
        siteMapPriority: 0.9
      });
      return callback(null);
    };
  }
};
```

Note that `req` only has the same privileges as an anonymous site visitor. If you call `find` methods with it, you will only see what typical site visitors see. This is good, because **you don't want Google to index restricted pages.**

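As a variation on the example above, here is a sketch that outputs several custom URLs and lets the module infer the priority from `level` rather than setting `siteMapPriority`. The `customPaths` array and its contents are hypothetical stand-ins for whatever your own modules would discover, `siteMapExtension` is just a local name, and `http://mysite.com` is a placeholder.

```javascript
// lib/modules/apostrophe-site-map/index.js, at project level.
// Hypothetical list of custom routes; in a real project these would
// more likely come from your own database queries.
const customPaths = [
  { path: '/catalog', level: 1 },
  { path: '/catalog/archive', level: 2 }
];

const siteMapExtension = {
  construct: function(self, options) {
    self.custom = function(req, locale, callback) {
      customPaths.forEach(function(entry) {
        self.output({
          _url: 'http://mysite.com' + entry.path,
          // No siteMapPriority here: the priority is inferred from `level`
          level: entry.level
        });
      });
      return callback(null);
    };
  }
};

module.exports = siteMapExtension;
```
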
## How to exclude stuff

"I don't want thousands of blog posts in my sitemaps." OK, so do this in `app.js` when configuring the module:

```javascript
{
  'apostrophe-site-map': {
    excludeTypes: [ 'apostrophe-blog-post' ]
  }
}
```

You may specify multiple doc types to exclude. You may also exclude page types the same way by adding their doc type to the array, e.g., `styleguide`.

You can also do this at the command line, which is helpful when generating a map just for content strategy purposes:

```
node app apostrophe-site-map:map --format=text --indent --exclude-types=apostrophe-blog
```

Alternatively, you can set the `sitemap` option to `false` when configuring any module that extends `apostrophe-custom-pages` or `apostrophe-pieces`.

You can also explicitly set it to `true` if you wish to have sitemaps for a piece type that is normally excluded, like `apostrophe-users`. Of course this will only help if they have a `_url` property when fetched, usually via a corresponding module that extends `apostrophe-pieces-pages`.

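A minimal sketch of the `sitemap` option in a `modules` configuration might look like the following. The `press-releases` module is a hypothetical piece type assumed to extend `apostrophe-pieces`, and `moduleConfig` is just a local name for illustration.

```javascript
// Illustrative `modules` configuration for app.js.
// `moduleConfig` is a hypothetical name.
const moduleConfig = {
  // Hypothetical piece type that extends apostrophe-pieces:
  // leave its pieces out of the sitemap entirely
  'press-releases': {
    sitemap: false
  },
  // Normally excluded; only appears if its docs have a `_url` when fetched
  'apostrophe-users': {
    sitemap: true
  }
};
```
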
## Removing the `siteMapPriority` field globally

You may wish to not include the `siteMapPriority` field on any pieces or pages. To do this, add a `noPriority` option set to `true` when configuring `apostrophe-site-map` in your `app.js`:

```javascript
{
  'apostrophe-site-map': { noPriority: true }
}
```

## Integration with the `apostrophe-workflow` module

If you are using the `apostrophe-workflow` module, the sitemap module will automatically fetch content for the live versions of all configured locales.

By default, the result will be emitted as a single sitemap. [According to Google, this is OK, although you must claim all of the sites under a single identity in the Google webmaster console.](https://support.google.com/webmasters/answer/75712?hl=en) However, if you would prefer a separate sitemap file for each hostname found in the absolute URLs, you can set the `perLocale` option to `true` when configuring the module.

Or, if you're generating static sitemaps at the command line, you can pass the `--per-locale` option.

When you set the `perLocale` option, sitemaps are served by the module from `/sitemaps/fr.xml`, `/sitemaps/en.xml`, etc., and a sitemap index is served from `/sitemaps/index.xml`. **Make sure you list `/sitemaps/index.xml` for your Sitemap directive in `robots.txt`**.

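To make the pieces above concrete, here is a small sketch that enables `perLocale` and builds the matching `robots.txt` line. The `perLocale` option and the `/sitemaps/index.xml` path come from this README; the variable names and the `http://example.com` base URL are placeholders.

```javascript
// Illustrative only: `perLocaleConfig` is a hypothetical name for the
// object you would pass to Apostrophe in app.js.
const perLocaleConfig = {
  baseUrl: 'http://example.com',
  modules: {
    'apostrophe-site-map': {
      // Serve one sitemap per locale plus an index at /sitemaps/index.xml
      perLocale: true
    }
  }
};

// The Sitemap directive in public/robots.txt should point at the index
const robotsLine = 'Sitemap: ' + perLocaleConfig.baseUrl + '/sitemaps/index.xml';
console.log(robotsLine);
```
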
> If you generate static files instead with the `apostrophe-site-map:map` task, a physical `public/sitemap` folder is created. **IF YOU CHANGE YOUR MIND AND WISH TO LET THE MODULE SERVE SITEMAPS FOR YOU, REMOVE THIS FOLDER.** Otherwise the static files will always "win."

If the `perLocale` option is set to `true` for the module or the `--per-locale` command line parameter is passed, the `--file` command line parameter is ignored unless `--format=text` is also present. This allows you to still use the module for content strategy.

## Performance

If you have thousands of pieces, building the sitemap may take a long time. By default, this module processes 100 pieces at a time, to avoid using too much memory. You can adjust this by setting the `piecesPerBatch` option to a larger number. However, be aware that if you have many fields and joins, it is possible to use a great deal of memory this way.

```javascript
{
  modules: {
    'apostrophe-site-map': {
      piecesPerBatch: 500
    }
  }
}
```