Language Scribing

Language Scribing Overview

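Conformance with Web Content Accessibility Guidelines (WCAG) Level AA requires granular language tagging: any passage in a language other than the document's default must be identified (Success Criterion 3.1.2, Language of Parts).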

The Well-Formed Document Workflow provides a manual method for creating alternate language styles based on ISO language codes (for example, es for Spanish). In the Word file, create new language styles using the following name pattern.

Pattern in Word: [scmlstyle]@lang=[langcode]

The metadata and lang styles can be added in a Word document, a sam file, an ScML file, or an InDesign document. If added to sam, ScML, or InDesign, this metadata will round-trip through the Well-Formed Document Workflow. (Extracting this information from InDesign requires Scribe Tools 4.0 or later.)

This example shows how the metadata for Spanish-language italic text could be identified in Word and carried through to sam, ScML, and InDesign. Note: In each environment, the formatting of the style name is slightly different.

In Word: lang-i@lang=es
In sam/ScML: <lang-i lang="es">
In InDesign: lang-i-language-es
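
The three forms above follow a regular mapping, which can be sketched in Python. This is a minimal illustration only: the function names are hypothetical and not part of Scribe Tools, and the language-code check is an assumption (two or three letters, optionally followed by hyphenated subtags) rather than a rule stated by the workflow.

```python
import re

# Hypothetical helpers for illustration; these are not Scribe Tools functions.
# Assumption: a language code is 2-3 letters, optionally with hyphenated subtags.
WORD_PATTERN = re.compile(
    r"^(?P<style>[^@]+)@lang=(?P<code>[A-Za-z]{2,3}(?:-[A-Za-z0-9]+)*)$"
)


def parse_word_style(name: str) -> tuple[str, str]:
    """Split a Word style name such as 'lang-i@lang=es' into (style, langcode)."""
    match = WORD_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a language style name: {name!r}")
    return match.group("style"), match.group("code")


def to_sam_tag(style: str, code: str) -> str:
    """Build the opening sam/ScML tag, e.g. <lang-i lang="es">."""
    return f'<{style} lang="{code}">'


def to_indesign_name(style: str, code: str) -> str:
    """Build the InDesign style name, e.g. lang-i-language-es."""
    return f"{style}-language-{code}"


style, code = parse_word_style("lang-i@lang=es")
print(to_sam_tag(style, code))
print(to_indesign_name(style, code))
```

Run against the Spanish italic example, the sketch reproduces the sam/ScML and InDesign forms listed above.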

Scribe recommends that authors scribe the language information in Word when possible to avoid missing instances of different languages.

Procedure for Language Scribing in Word

Create the language styles needed for the project in Word.
Toward the end of the scribing process, review all italic text for phrases that need language metadata. Apply the language styles you created.
If any nonitalic text needs language metadata, apply the appropriate language style to it as well.

Note: The lang-i style in bibliographies can prevent the bibliography tools from working as expected. If language scribing is needed in the bibliography, Scribe recommends handling it after copyediting rather than before.

Procedure for Language Scribing in sam

Review Special Characters

Review the special characters list in the Digital Hub for languages that fall outside the Latin alphabet. These can be searched for within Sublime.
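
As a complement to the special characters list, a short script can flag letters from non-Latin scripts in exported text. This is a rough sketch, not a Scribe feature: the function name is hypothetical, and it assumes that a letter whose Unicode name does not start with "LATIN" belongs to another script.

```python
import unicodedata


def non_latin_letters(text: str) -> dict[str, str]:
    """Map each non-Latin letter found in text to its Unicode name.

    Assumption: a letter is treated as Latin when its Unicode name
    begins with "LATIN" (this keeps accented letters such as e-acute).
    """
    found: dict[str, str] = {}
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if not name.startswith("LATIN"):
                found.setdefault(ch, name)
    return found


sample = "The word \u03bb\u03cc\u03b3\u03bf\u03c2 appears in a caf\u00e9 menu."
for ch, name in non_latin_letters(sample).items():
    print(ch, name)
```

The characters reported this way can then be searched for in Sublime to locate the passages that need language metadata.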

Review Character Styles

Review italic terms, various "-i" styles, and various lang terms in a new Sublime file.

Find: <i>[^<]+</i>|<[^>]+-b?i>[^<]+</[^>]+-b?i>|<lang[^>]*>[^<]+</lang[^>]*>|<[^>]*lang[^>]*>[^<]+</[^>]*>

Copy the matches into a new file, reduce them to unique lines (Edit > Permute Lines > Unique in Sublime), remove English text, and filter out proper names. Then add lang attributes as needed in the original file.
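
The same collection step can be approximated outside Sublime. The sketch below applies the Find pattern above with Python's re module and keeps each distinct match once, in order of first appearance. The function name and the sample text are illustrative assumptions, and removing English text and proper names still needs a manual pass.

```python
import re

# The Find pattern from this procedure, split by alternative for readability.
FIND = re.compile(
    r"<i>[^<]+</i>"                      # plain italic
    r"|<[^>]+-b?i>[^<]+</[^>]+-b?i>"     # styles ending in -i or -bi
    r"|<lang[^>]*>[^<]+</lang[^>]*>"     # styles starting with lang
    r"|<[^>]*lang[^>]*>[^<]+</[^>]*>"    # any tag carrying a lang attribute
)


def unique_matches(text: str) -> list[str]:
    """Return each distinct match once, preserving first-seen order."""
    return list(dict.fromkeys(m.group(0) for m in FIND.finditer(text)))


sample = (
    "<p>The <i>Weltanschauung</i> of "
    '<lang-i lang="es">hola</lang-i> and <i>Weltanschauung</i> again.</p>'
)
for hit in unique_matches(sample):
    print(hit)
```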

Review Paragraph Styles

Review block quote (bq) and senseline (sl) paragraphs in case there are full paragraphs in another language.

Find: <[^>"]*(bq|sl)[^>]*>[^\n]+

Copy into a new file, turn off word wrap, and skim for non-English text.
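
For a scripted version of this review, the paragraph pattern above can be run with Python's re module to list one candidate paragraph per entry. This is a sketch with a hypothetical function name and made-up sample text. Note that the pattern matches any tag whose name contains bq or sl, so the output may include unrelated styles.

```python
import re

# The Find pattern from this procedure: a bq or sl tag plus the rest of its line.
PARA = re.compile(r'<[^>"]*(bq|sl)[^>]*>[^\n]+')


def flagged_paragraphs(text: str) -> list[str]:
    """Return every block quote or senseline paragraph, one per list entry."""
    return [m.group(0) for m in PARA.finditer(text)]


sample = (
    "<p>Plain paragraph.</p>\n"
    "<bq>Dans le doute, abstiens-toi.</bq>\n"
    "<sl>First sense line</sl>\n"
)
for para in flagged_paragraphs(sample):
    print(para)
```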

Review Book-Specific Styles

Certain books, such as Bibles or language books, may have additional paragraph or character styles that identify languages. Review this additional content for languages based on the type of publication.