DITA Best Practices : Content Reuse

‘Content reuse’ is often seen as one of the key benefits to be gained by implementing a content management system (CMS).

It is certainly true that there are considerable advantages in being able to use a single page (or piece of content) in multiple locations on a single site, across multiple sites, or in different published formats.

What is Content Reuse?

‘Content reuse’ refers to any situation where a single piece of source content is written once, and then used in multiple locations or contexts.

It is a term that appears widely in tenders, vendor marketing materials and industry reports. There is not, however, a consistent understanding of what content reuse means in practice, and the term is used to mean many things, each of which may be met by different technology solutions.

DITA and DITA Open Toolkit support three kinds of reuse:

  • Content reuse, in which a source topic or part of a topic is written once and used in multiple locations. For example, you might reference the same concept topic (say, processing DITA files) in both the processing and the troubleshooting maps. Another example might be to use the DITA content reference (conref) mechanism to reuse content once (say, using the text of a controlled vocabulary topic in an "about" file) or many times (say, repeating a short warning statement about the proper use of a hardware unit).
  • Information design reuse (specialization), in which you extend the definition of an existing DITA element to be used in a special way. Specialization makes use of the fact that DITA is based on the principle of inheritance.
  • Processing reuse, in which you override stylesheet processing to customize your output.
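The first kind of reuse, referencing one topic from several maps, can be sketched as follows. This is a minimal illustration only: the file names (processing.ditamap, troubleshooting.ditamap, concepts/processing_dita_files.dita) are hypothetical, and the DOCTYPE declarations are omitted for brevity.

```xml
<!-- processing.ditamap (hypothetical file name) -->
<map>
  <title>Processing</title>
  <!-- The concept topic is written once and stored in one place -->
  <topicref href="concepts/processing_dita_files.dita" type="concept"/>
</map>
```

```xml
<!-- troubleshooting.ditamap (hypothetical file name) -->
<map>
  <title>Troubleshooting</title>
  <!-- The same source topic is referenced again; nothing is copied -->
  <topicref href="concepts/processing_dita_files.dita" type="concept"/>
</map>
```

Because both maps point at the same source file, a correction to the concept topic shows up in both deliverables the next time they are generated.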

Metadata as conditional processing

Use metadata for conditional processing. By applying conditional values to content both within topics and in maps, you can generate output that contains only the specified content and links. The first step is to identify the values on which to process, then to create a DITA processing (.ditaval) file that lists the properties and all of their possible values. By default, DITA includes properties for conditionally processing by audience, platform and product. You can easily add additional properties or customize the default properties with your own values.

Next, the authors must apply the appropriate values to elements within topics, topics within maps, or maps within maps. This means the author must know all of the appropriate values and when to apply them. The key to this exercise is not having too many values. If there are too many values or the value structure is too complex, meaning that multiple values need to be applied to the same elements or topics, authors will not apply the values correctly.

After the values are applied, you can generate the output using the appropriate value for each deliverable. If the values are not correctly or consistently applied, or you do not specify the correct values during generation, you will not get the correct output. You need to have a rigorous verification regime to make sure that the deliverables contain the anticipated content.
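As a rough sketch of the workflow described above, a .ditaval file for one deliverable might look like the following. The file name and the values (administrator, novice, linux, windows) are assumptions chosen for illustration; your own value list will differ.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- admin-linux.ditaval (hypothetical): filter settings for one deliverable -->
<val>
  <!-- Keep content written for administrators, drop content for novices -->
  <prop att="audience" val="administrator" action="include"/>
  <prop att="audience" val="novice" action="exclude"/>
  <!-- Keep Linux-specific content, drop Windows-specific content -->
  <prop att="platform" val="linux" action="include"/>
  <prop att="platform" val="windows" action="exclude"/>
</val>
```

Authors then apply the same values to elements in topics (or to topicref elements in maps). The topic below is likewise hypothetical, with the DOCTYPE omitted:

```xml
<topic id="install">
  <title>Installing the product</title>
  <body>
    <!-- No conditional attribute: appears in every deliverable -->
    <p>Download the installation package.</p>
    <!-- Kept when the filter above is applied -->
    <p platform="linux">Run the installer script from a terminal.</p>
    <!-- Removed when the filter above is applied -->
    <p platform="windows">Double-click the installer package.</p>
    <!-- Kept when the filter above is applied -->
    <p audience="administrator">Configure the service account before first use.</p>
  </body>
</topic>
```

With DITA Open Toolkit, a filter file of this kind is typically supplied at build time, for example through the args.filter parameter. Keeping the value list this short is what makes it realistic for authors to apply the values consistently.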

Conditional Text

Understand the dependencies for repurposing content. Enterprise content is often applicable to more than one purpose or deliverable in a company. The first challenge to repurposing content is identifying the content other authors want to access and reuse. This requires communication and coordination between all the stakeholders. Start small and work with those teams with whom you have the best communication and/or share the most content.

The next challenge is storing the content in a manner and location so that the other users can access it. This is particularly difficult when each department uses a different repository for content storage and will require the support of IT to address. In many cases, most of the users who want your content will not have access to the repositories in which you store it.

Lastly, you need to understand the way these authors want to consume the content. If they are not authoring in XML, they will need generated output. This means you need to know into which format you must generate the output and how often you need to regenerate it. If they are also authoring in XML, they will want to reuse the XML source. To avoid unintentionally changing reused content and unknowingly propagating changes throughout the enterprise, you must have a reuse strategy in place to know who is reusing or consuming what content.

Controlling content through content references (conrefs/xrefs)

Reuse controlled content using content references. Controlled content is content that cannot be changed without authorization. The first consideration is identifying who owns the content – who has authorization to make changes. The second consideration is training authors to reuse it. In order to reuse controlled content, create a master list of phrases or other content in one or more files, then reference the content into the applicable location in individual topics. For example, if your content contains regulated warnings, create a topic with the legally approved warning and reference or pull the content into each topic rather than rewriting the warning every time. This provides a single version of the truth and saves authors time. Another good example of controlled vocabulary is an acronym list that provides the full explanation of the acronym for authors to reference rather than retyping. The keys to successfully reusing controlled content are that the content must be easy to find and the authors must reference all instances of content. Authors will not reuse content that they cannot easily find, and if they do not insert the proper reference elements, the content cannot be updated globally.
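The regulated-warning example above might be set up as follows. This is a sketch under stated assumptions: the file name warnings.dita, the IDs, and the warning text are all invented for illustration, and the DOCTYPE declarations are omitted.

```xml
<!-- warnings.dita (hypothetical): the single, owner-controlled source -->
<topic id="warnings">
  <title>Approved warnings</title>
  <body>
    <!-- The id is what other topics use to pull this element in -->
    <note id="unplug_before_service" type="warning">Disconnect the power
      cord before servicing the unit.</note>
  </body>
</topic>
```

```xml
<!-- Any topic that needs the warning references it instead of retyping it -->
<topic id="replace_fan">
  <title>Replacing the fan</title>
  <body>
    <!-- conref value: file name, then topic id, then element id -->
    <note conref="warnings.dita#warnings/unplug_before_service"/>
    <p>Remove the rear panel to reach the fan.</p>
  </body>
</topic>
```

The referencing element must be the same element type as its target (here, note), and the target needs a stable id. If the owner later revises the approved wording in warnings.dita, every topic that uses the conref picks up the change at the next build.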

Update dynamic content using content references. Dynamic content is content that can be updated at any time. Unlike controlled content, the ability to update dynamic content is not limited to a specific owner. This means that you must communicate when content is reused and when it is updated. You can do this through either technology or process. If you store the content in a content management system (CMS), the workflow support can manage this communication. Authors can easily see if someone else is reusing their content, and the CMS provides a mechanism to inform authors when content they are reusing has changed.

Common Files

A common file library has been built within Vasont; it is available within Common Components in the Collection Explorer.  

Locating files in Vasont

One of the primary questions users have is: how do I know if equivalent or similar content has already been written? Use the Nav/Query feature to locate common files.