DITAToo v1.6 Enhances Word-to-DITA Conversion

by Wim on January 27th, 2012

By Alex Masycheff

We are just a few days away from the official release of DITAToo version 1.6. While we complete our QA checks and update the user guide, I’d like to tell you about another feature that you’ll see in this version.

As you probably know, previous versions of DITAToo provided the ability to convert unstyled Microsoft Word documents to DITA. You could give DITAToo a Word document full of local and inconsistent formatting, and DITAToo would convert it to a set of valid DITA topics. The problem was that DITAToo converted everything to concepts. Even if you had a procedure in the original Word document, the output was always a concept topic. Users have been asking us to extend the conversion to different information types. In version 1.6, we’ve taken a very important step in this direction.

Now DITAToo automatically identifies information types in a Word document based on the visual representation of the content. For example, if DITAToo finds a numbered list, it assumes the content is a procedure and converts it to a DITA task. Similarly, if DITAToo finds only paragraphs and unnumbered lists, it assumes the content is background information and converts it to a DITA concept. Just as important, DITAToo doesn’t rely on styles in this analysis. Everything in your Word document can be formatted using the Normal style and manual formatting, and DITAToo still identifies most of the content correctly.
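To make the rule above concrete, here is a minimal sketch in Python. It is not DITAToo's actual code: the Block type, its kind labels, and the classify_topic function are hypothetical names invented for illustration, and the sketch assumes each topic has already been reduced to a list of blocks.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """Hypothetical model of one piece of content inside a topic."""
    kind: str  # assumed labels: "paragraph", "bullet_list", "numbered_list"
    text: str


def classify_topic(blocks: List[Block]) -> str:
    """Guess the DITA information type from the visual shape of the content.

    A numbered list is treated as a procedure, so the topic becomes a task.
    Anything else (paragraphs, unnumbered lists) becomes a concept.
    """
    if any(block.kind == "numbered_list" for block in blocks):
        return "task"
    return "concept"


# Usage: a topic with a numbered list is classified as a task.
topic = [
    Block("paragraph", "Before you begin, close all open documents."),
    Block("numbered_list", "1. Open the file. 2. Click Convert."),
]
print(classify_topic(topic))
```

Note that the sketch looks only at block kinds, never at style names, which mirrors the point that the analysis does not depend on Word styles.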

We’ve tested this approach on many Word documents, each running to hundreds of pages, and found that it works most of the time.

Overall, the conversion process now looks as follows:

1. DITAToo splits a Word document into topics based on headings. In parallel, DITAToo extracts all embedded graphics and stores them as separate files.
2. Then DITAToo identifies the information type of each topic.
3. Based on the results of the analysis, DITAToo generates a series of concepts and tasks. DITAToo also re-establishes links to the relevant images.
4. In the repository, DITAToo creates separate folders for concepts, tasks, and graphics.
5. Then DITAToo uploads the converted topics to the appropriate folders in the repository.
6. Finally, DITAToo creates a DITA map that replicates the structure of the original Word document and uploads this map to the repository too.
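
The last step, a map that mirrors the heading structure, can be sketched roughly as follows. This is an illustration under stated assumptions, not DITAToo's implementation: the build_map and slugify functions, the "concepts" and "tasks" folder names, and the file naming scheme are all hypothetical. The input is assumed to be a list of (heading level, title, information type) tuples in document order.

```python
import re
import xml.etree.ElementTree as ET
from typing import List, Tuple


def slugify(title: str) -> str:
    """Turn a heading into a file-name-safe string (hypothetical scheme)."""
    return re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")


def build_map(map_title: str, topics: List[Tuple[int, str, str]]) -> ET.Element:
    """Nest topicref elements so that they follow the heading levels."""
    root = ET.Element("map")
    ET.SubElement(root, "title").text = map_title
    # Stack of (heading level, element); the map itself sits at level 0.
    stack = [(0, root)]
    for level, title, info_type in topics:
        # Climb back up until the top of the stack is a shallower heading.
        while len(stack) > 1 and stack[-1][0] >= level:
            stack.pop()
        parent = stack[-1][1]
        # Folder names are assumptions made for this sketch.
        folder = "tasks" if info_type == "task" else "concepts"
        ref = ET.SubElement(
            parent,
            "topicref",
            href=f"{folder}/{slugify(title)}.dita",
            type=info_type,
        )
        stack.append((level, ref))
    return root


# Usage: two top-level headings, the first with a nested procedure.
ditamap = build_map("User Guide", [
    (1, "Overview", "concept"),
    (2, "Installing the Software", "task"),
    (1, "Reference Notes", "concept"),
])
print(ET.tostring(ditamap, encoding="unicode"))
```

A stack keeps the sketch short: each new heading is attached to the nearest preceding heading of a shallower level, which is what reproduces the outline of the original document.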

If you find that some topics were converted to the wrong type, you can use another new feature in DITAToo 1.6 to convert a concept to a task, or vice versa, in a few clicks (for more information, see http://lnkd.in/3ASUSg).

Stay tuned and you’ll be able to try DITAToo 1.6’s new features in just a few days!
