A DITA WebHelp transformation scenario can be configured to produce a
sitemap.xml file that search engines use to aid crawling and
indexing. A sitemap lists all pages of a WebHelp system and allows
webmasters to provide additional information about each page, such as the date it was last
updated, how often it changes, and its importance relative to other pages in the
WebHelp deployment.
The structure of the sitemap.xml file looks like this:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/topics/introduction.html</loc>
    <lastmod>2014-10-24</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
  <url>
    <loc>http://www.example.com/topics/care.html#care</loc>
    <lastmod>2014-10-24</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.5</priority>
  </url>
  . . .
</urlset>
Each page has a <url> element that contains the following information:
- loc - the URL of the page. This URL must begin with the protocol (such
as http), if required by your web server. It is constructed from the
value of the webhelp.sitemap.base.url parameter from the transformation
scenario and the relative path to the page (collected from the href
attribute of a topicref element in the DITA map).
Note: The value must be shorter than 2,048 characters.
- lastmod - the date when the page was last modified. The date format is
YYYY-MM-DD.
- changefreq - indicates how frequently the page is likely to change.
This value provides general information to assist search engines, but may not correlate
exactly to how often they crawl the page. Valid values are: always, hourly, daily,
weekly, monthly, yearly, and never. The first time the sitemap.xml file is generated,
the value is set based upon the value of the webhelp.sitemap.change.frequency parameter
in the DITA WebHelp transformation scenario. You can change the value in each url
element by editing the sitemap.xml file.
Note: The always value should be used to describe documents that change each time
they are accessed. The never value should be used to describe archived URLs.
- priority - the priority of this page relative to other pages on your
site. Valid values range from 0.0 to 1.0. This value does not affect how your pages are
compared to pages on other sites. It only lets the search engines know which pages you
deem most important for the crawlers. The first time the sitemap.xml
file is generated, the value is set based upon the value of the
webhelp.sitemap.priority parameter in the DITA WebHelp transformation
scenario. You can change the value in each url element by editing the
sitemap.xml file.
Note: lastmod, changefreq, and priority are optional elements.
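
The rules above for loc, changefreq, and priority can be sketched as a small check. This is a minimal illustration in Python, not the code that the WebHelp transformation runs. The function names build_loc and check_entry are hypothetical, and the sketch assumes that the relative path passed in is already the path of the generated output page.

```python
from urllib.parse import urljoin

# Values listed in this topic as valid for <changefreq>.
VALID_CHANGEFREQ = {"always", "hourly", "daily", "weekly",
                    "monthly", "yearly", "never"}
MAX_LOC_LENGTH = 2048  # a loc value must be shorter than this


def build_loc(base_url: str, relative_path: str) -> str:
    """Join a base URL (the webhelp.sitemap.base.url value) with the
    relative path to a page. Hypothetical helper for illustration."""
    # A trailing slash makes urljoin keep the last path segment of the base.
    if not base_url.endswith("/"):
        base_url += "/"
    loc = urljoin(base_url, relative_path)
    if len(loc) >= MAX_LOC_LENGTH:
        raise ValueError(
            f"loc has {len(loc)} characters; it must be shorter than "
            f"{MAX_LOC_LENGTH}")
    return loc


def check_entry(changefreq: str, priority: float) -> None:
    """Raise ValueError if changefreq or priority is outside the valid values."""
    if changefreq not in VALID_CHANGEFREQ:
        raise ValueError(f"invalid changefreq: {changefreq!r}")
    if not 0.0 <= priority <= 1.0:
        raise ValueError(f"priority must be between 0.0 and 1.0: {priority}")


print(build_loc("http://www.example.com", "topics/introduction.html"))
check_entry("weekly", 0.5)  # passes silently
```

The length check and the value checks are kept separate because lastmod, changefreq, and priority are optional, while loc is present in every entry.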
Creating and Editing the sitemap.xml File
Follow these steps to produce a sitemap.xml file for your WebHelp system, which can
then be edited to fine-tune search engine optimization:
- Edit the transformation scenario you currently use to obtain your WebHelp output.
This opens the Edit DITA Scenario dialog.
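- Open the Parameters tab and set a value for the following parameters:
  - webhelp.sitemap.base.url - the base URL used to construct the loc value of each page.
  - webhelp.sitemap.change.frequency - the initial changefreq value of each page.
  - webhelp.sitemap.priority - the initial priority value of each page.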
- Execute the transformation scenario.
- Look for the sitemap.xml file in the transformation's output
folder. Edit the file to fine-tune the parameters of each page, according to your
needs.
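
If many pages need the same adjustment, the last step can be scripted instead of edited by hand. The following is a minimal sketch in Python that uses the standard xml.etree.ElementTree module. The function name set_priority and the matching rule (a text fragment contained in the loc value) are assumptions made for illustration, and the file path in the final comment is hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def set_priority(sitemap_xml: str, loc_fragment: str, priority: float) -> str:
    """Return the sitemap XML with <priority> set for every page whose
    <loc> contains loc_fragment. Hypothetical helper for illustration."""
    if not 0.0 <= priority <= 1.0:
        raise ValueError("priority must be between 0.0 and 1.0")
    # Serialize the sitemap namespace as the default namespace (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    ns = {"sm": SITEMAP_NS}
    root = ET.fromstring(sitemap_xml)
    for url in root.findall("sm:url", ns):
        loc = url.findtext("sm:loc", default="", namespaces=ns)
        if loc_fragment not in loc:
            continue
        node = url.find("sm:priority", ns)
        if node is None:
            # priority is optional, so create the element when it is missing.
            node = ET.SubElement(url, f"{{{SITEMAP_NS}}}priority")
        node.text = f"{priority:.1f}"
    return ET.tostring(root, encoding="unicode")


# Example use against a file in the output folder (path is hypothetical):
#   with open("out/webhelp/sitemap.xml", encoding="utf-8") as f:
#       updated = set_priority(f.read(), "topics/introduction.html", 0.9)
```

Working on a string rather than directly on the file keeps the sketch easy to try on a copy before the generated sitemap.xml is overwritten.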