Vetting Guide

Vetting is the act of assessing materials to determine the state of the content and the actions to be performed.

Consider the following:

  • What is the scope of the project?
  • Have all accessibility requirements been addressed?
  • Will the project involve editorial tasks?
  • Will it be e-book only?
  • Will it be typeset only?
  • In what order will the final products be produced?

Word Scribing Vet

Check a Word document for errors, problems, and issues that will affect the scribing of the file, and determine how long it will take to scribe the file.

1. Files

Are all files present? (Check against the table of contents.)

Do all files open?

Do the files contain the content that the file names indicate?

2. Styles

Does the document use ScML styles?

If so:

  • Are ScML styles applied correctly?
  • Are all ScML styles current?

If not, does the file use styles or other markers to indicate structure?

3. Structure

Does the document have a clear structure?

Is the structure consistent throughout?

4. Letter Casing

Do elements use consistent letter casing (title case, sentence case, all-caps) throughout?

5. Elements

What kinds of specialized elements are present throughout the document (sidebars, figures, tables, equations, etc.)?

How complex are these elements?

6. Character Styles

Is there any specialized use of character styles (e.g., bold, italic, underline, superscript, subscript, small caps)?

Are any characters rendered with a style such as bold or small caps that need to be set apart to convey meaning (e.g., dispk for dialogue speaker or gt for glossary term)?

7. Special Characters

Do all special characters render within the document?

Are all characters unique Unicode entities, or are they rendered with a legacy font?

Are all special characters used properly?

Use the Digital Hub to obtain a list of special characters used in the document:

  1. Upload your file to the Digital Hub.
  2. In the Word section under the Assets tab, click the Stats icon to see a list of all non-ASCII characters.
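For a quick check outside the Digital Hub, a similar list can be approximated with a short script. The sketch below is only an illustration and is not how the Digital Hub works: it assumes the document text has already been exported to a plain Python string, and the function name list_non_ascii is hypothetical. It reports each distinct non-ASCII character with its code point and Unicode name, and flags Private Use Area code points, which can indicate characters rendered with a legacy or symbol font.

```python
import unicodedata
from collections import Counter


def list_non_ascii(text):
    """Return one row per distinct non-ASCII character, sorted by code point.

    Each row is (code point, Unicode name, count, in Private Use Area).
    """
    counts = Counter(ch for ch in text if ord(ch) > 127)
    rows = []
    for ch, count in sorted(counts.items()):
        # Private Use Area characters have no Unicode name.
        name = unicodedata.name(ch, "<unnamed>")
        # General category "Co" marks Private Use Area code points.
        private_use = unicodedata.category(ch) == "Co"
        rows.append((f"U+{ord(ch):04X}", name, count, private_use))
    return rows


# Hypothetical sample text; the last character is a Private Use Area code point.
sample = "caf\u00e9 \u2013 na\u00efve \uf0b7"
for row in list_non_ascii(sample):
    print(row)
```

Any row flagged as Private Use Area is worth checking against the font used in the source document.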

8. Graphics

Does the document contain embedded images?

If so, refer to the Images Vet documentation.

9. Hidden Text

Does the document contain hidden text?

Use the SAI’s Cleanup tool to mark potentially hidden text.

10. Line Breaks and Tabs

Are line breaks and tabs used to indicate spacing or to delineate special elements?

11. Notes

Are footnotes and endnotes embedded?

Is there an equal number of fnnum, fnref, and fn?

Is there an equal number of ennum, enref, and en?
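The two count checks above can be sketched in a few lines. This is a hedged illustration only: it assumes the style names applied in the document have already been collected into a list (how that list is obtained is outside this sketch), and the function name check_note_counts is hypothetical.

```python
from collections import Counter

# Style groups named in the checklist: footnotes and endnotes.
NOTE_GROUPS = (("fnnum", "fnref", "fn"), ("ennum", "enref", "en"))


def check_note_counts(style_names, groups=NOTE_GROUPS):
    """Compare how often each style in a group occurs.

    Returns a dict mapping each group to (per-style counts, counts all equal).
    """
    counts = Counter(style_names)
    report = {}
    for group in groups:
        group_counts = {name: counts[name] for name in group}
        balanced = len(set(group_counts.values())) == 1
        report[group] = (group_counts, balanced)
    return report


# Hypothetical input: two complete footnotes and one endnote missing its "en" paragraph.
styles = ["fnref", "fnnum", "fn", "fnref", "fnnum", "fn", "enref", "ennum"]
for group, (group_counts, balanced) in check_note_counts(styles).items():
    print(group, group_counts, "balanced" if balanced else "MISMATCH")
```

A mismatch only shows that the counts differ; locating the missing or extra element still requires checking the document.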

Do the footnote and endnote windows contain any content other than notes, such as heads? If not, do these heads need to be added?

Are there endnotes in sections that will fall outside of the InDesign text flow (sidebars, tables, boxes, etc.)?

  • If so, these may need to be changed to footnotes or table notes.
  • Add a query note if it is unclear whether an element will be outside of the text flow.

12. Metadata and Alt Text

Has alt text been included?

Have long descriptions been supplied, as needed?

Have other metadata been supplied?

13. Text Checks

Will full text checks be run at the scribing stage, or will limited text checks be run (with full text checks done as part of a copyediting task)?

Copyediting Vet

Check a Word document for errors, problems, and issues that will affect the copyediting of the file, and determine how long it will take to copyedit the file.

1. Files

Are all files present?

Do all files contain complete content?

Do all files open?

Do the files contain the content that the file names indicate?

Are all elements consistent throughout?

Are any unusual elements present?

2. File Statistics

Obtain statistics for these:

  • Character count
  • Word count
  • Number of non-ASCII special characters

Use the Digital Hub to obtain a list of file statistics for the document:

  1. Upload your file to the Digital Hub.
  2. In the Word section under the Assets tab, click the Stats icon to see the file statistics.
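As a rough cross-check of these statistics, the three figures can be computed from exported plain text. The sketch below is an assumption-laden illustration, not the Digital Hub's method: it treats any whitespace-separated run as a word, so its counts may differ from those reported by Word or the Digital Hub, and the function name file_statistics is hypothetical.

```python
def file_statistics(text):
    """Return character count, word count, and non-ASCII character count."""
    return {
        "characters": len(text),           # includes spaces and punctuation
        "words": len(text.split()),        # whitespace-separated runs
        "non_ascii": sum(1 for ch in text if ord(ch) > 127),
    }


# Hypothetical sample text containing two non-ASCII characters.
print(file_statistics("Caf\u00e9 au lait \u2013 two"))
```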

3. Specifications

What level of editing is required?

Will any editorial tasks for this project deviate from the standard procedure?

Will any other materials be forthcoming that will contain text not included in this copyedit (e.g., a praise page, copyright page, or a cover or jacket)?

4. Notes

Does this project contain footnotes or endnotes?

If so:

  • How many notes are there (of each type)? Is there an equal number of fnnum, fnref, and fn? Is there an equal number of ennum, enref, and en?
  • What style guide is to be followed for the notes (CMS, APA, MLA, etc.)?
  • Does the current formatting of the notes match the required final formatting?
  • Are the bibliographic details complete in the notes?
  • Do the notes’ bibliographic details match the bibliography?
  • If there are blind notes, have the key phrases been identified, and do they match the phrases used in the note paragraphs?

5. Parenthetical Citations

Does this project contain parenthetical citations?

If so:

  • What style guide is to be followed for the parenthetical citations (CMS, APA, MLA, etc.)?
  • Does the current formatting of the citations match the required final formatting?
  • Do the citations match the reference list?

6. Bibliography and Reference List

How many entries are there?

Are all references complete?

Are there any missing entries?

What style guide is to be followed for the references (CMS, APA, MLA, etc.)?

Does the current formatting of the references match the required final formatting?

Are the references consistently formatted?

Does the bibliography need to be converted to a reference list (or vice versa)?

Are there aspects that will require a specialist’s input or expertise?

What decisions can be made by the person scribing, and what decisions will need specific instruction?

7. Quotations

Do all quotations include attributions?

What style guide is to be followed for the quotations (CMS, APA, MLA, etc.)?

Does their current formatting match the required final formatting?

When checking the accuracy of quotations, what are the expected requirements (particularly for Bible quotations)?

8. Figures and Tables

Are there figures or tables?

If so:

  • How many figures or tables are there?
  • What are the formatting requirements?

9. Editing Level

1. Language

Was the content created in the author’s native language?

Is there any non-English language material?

If so:

  • How much material is non-English language?
  • Is the non-English language material set off from the main text or mixed in with English material?
  • Is the non-English material translated into English?
  • When checking the non-English language material, what are the requirements?
  • Are there aspects that will require a specialist’s input or expertise?
  • What decisions can be made by the person scribing, and what decisions will need specific instruction?

2. Author Style

Is punctuation used consistently (correctly or incorrectly)?

To what degree does the unedited manuscript conform to the style guide?

To what degree should changes be made to conform to a style versus the author’s style?

Does the writing contain any nuances that affect the readability of the manuscript?

3. Audience

What is the intended audience?

Are any industry-specific terms used?

If so:

  • Will the terms need to be spelled out in the text, or are they understood as a default by the intended audience?
  • Will one need to give more or less consideration to a specific audience?

10. Metadata and Alt Text

Has alt text been included?

Have long descriptions been supplied, as needed?

Have other metadata been supplied?

Proofreading Vet

Check a PDF document or Word file for errors, problems, and issues that will affect the proofreading of the file, and determine how long it will take to proofread the file.

1. Files

Are all materials present?

What is the character count or page count of the files to be proofread?

2. Tolerances

What are the editorial tolerances at this stage?

How many rounds of proofreading are expected?

3. Format

How will the proofread be performed (on paper, Word document, or PDF)?

Will changes be provided using comments in Word or Acrobat, or will changes be provided in a separate Word document?

4. Previous Edit

Have the files been edited previously?

If so:

  • Is there an established stylesheet?
  • Are there any unresolved editorial notes or queries?

5. Review

Will an author/editor review the file before, during, or after the proofread?

If there is a conflict between different sets of feedback, which should take precedence?

6. Metadata and Alt Text

Has alt text been included and handled properly?

Have long descriptions been supplied and handled properly?

Have other metadata been supplied?

Design and Typesetting Vet

Check the source files, which can include images, Word documents, fonts, and so on, for errors, problems, and issues that will affect the design and typesetting of the project, and determine how long both components will take.

1. Files

Are all materials present?

Are any materials to be supplied at a later date? If so, should placeholder content be added to account for this?

2. Design Origin

Is a new design being created?

Is the design based on an existing design?

If so:

  • In what format does the original design exist (hard copy, PDF, InDesign, or a different program)?
  • Should the original design be matched exactly?
  • Is this book part of a series?

If not:

  • What tone should be conveyed in the design?
  • Are there similar books to be partially matched or reviewed for inspiration?
  • Is there a cover image or other reference file available to guide the design?
  • Are there any design aspects specifically to be avoided?

3. Specifications

What are the design specifications?

  • Trim size
  • Output requirements (printer requirements; crop marks; color or grayscale; bleed)
  • Typographical requirements

4. Software

Are there file version requirements (e.g., InDesign 2023, InDesign 2024)?

Is the use of any program other than InDesign required?

Is the use of any specialized plug-in required?

5. Content

Does the content or structure of the document raise any questions that would impact design considerations?

Are there any elements the typesetter should be aware of (e.g., special fonts, equations)? If so, list them.

6. Equations

If there are any equations, in what format are the equations (MathType, Equation Editor, etc.)?

Note the number of equations.

7. Tables

If there are any tables, how complex are the tables?

Note the number of tables.

8. Figures

If there are any figures, do the figures require special treatment?

Note the number of images.

9. Non-English Characters

Are there any non-English characters?

If so, what non-English characters are present?

Does any text read right-to-left, like Hebrew?

Does any non-English text require a special font?

Are all special characters Unicode?

10. Fonts

Are all necessary fonts present?

Can the base fonts render every character present in the book?

11. Formatting Instructions

Are there any specialized formatting instructions? If so, list these instructions.

12. Import/Export Considerations

Will this book require special attention to accommodate the importing or exporting of XML content?

Are elements present that will require manual overrides to content or “-alt” styles while typesetting?

13. Possible Problems

Are there any factors present in the document that will increase typesetting time?

Is there anything unusual in the materials?

14. Metadata and Alt Text

Has alt text been included?

Have other metadata been supplied?

Extraction Vet

Check the source files, which can include InDesign, Quark, PDF, web, and FrameMaker files, for errors, problems, and issues that will affect the extraction of the files, and determine how long the extraction will take.

General

1. Source Files

What format are the source files?

Do all files open?

Were the source files produced using the WFDW?

Is there a reference PDF? If so, do the source files match the print version or reference PDF? If not, can an accurate PDF be generated from the source files?

2. Extraction Output

To what format will the extracted files be exported (IDTT, XML, XTG, etc.)?

3. Fonts

Are the fonts used in the source files available?

4. Inconsistencies

Are there any inconsistencies in style usage?

Are there any inconsistencies in the text used (e.g., between text appearing on book covers or images and the book’s interior)?

5. Text Flow

Does the content flow correctly? Is there any reflow due to font, computer, or program version issues?

Are all text boxes/stories properly linked? Is everything in one text flow?

If images or other boxes outside of the main text flow need to be placed manually, will that affect the overall time estimate in a significant way?

6. Order

Is the content in the intended order?

7. List Items

Are numbers and bullets in lists automatically generated by the typesetting program?

8. Notes

Are there footnotes or endnotes?

9. Images

Are all images present?

Are any images masked/cropped in the typesetting program?

Are images named in a regular/sequential way?

Refer to the Images Vet documentation.

10. Directional Language

Is any directional language being used (e.g., “Figure 1 (left)”)?

Will it be necessary to break up captions or alter directional language when converting to a reflowable ePub?

11. Special Characters

Are Unicode characters used, or are special characters rendered by legacy fonts?

12. Plug-Ins

Are any specialized plug-ins being used?

13. Returns

Have hard and soft returns been used properly?

The following GREP search in InDesign will find soft returns that are not preceded by a space. The replacement expression will add a space.

Find: ([^ ])\n
Replace with: $1 \n

Note: Do not replace all if this will affect URLs or other soft returns that should not have a space in front of them.
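The same find-and-replace can be tried on extracted plain text before committing to it in InDesign. The sketch below is a hedged equivalent using Python's re module, not an InDesign feature: it assumes each soft return is represented in the text as "\n", and both function names are hypothetical. Listing the matches first makes it possible to skip URLs and other soft returns that should not gain a space.

```python
import re

# Same pattern as the GREP search: any non-space character followed by a
# soft return, assumed here to be represented as "\n".
SOFT_RETURN_WITHOUT_SPACE = re.compile(r"([^ ])\n")


def find_candidates(text):
    """Return the start offset of each match so it can be reviewed first."""
    return [m.start() for m in SOFT_RETURN_WITHOUT_SPACE.finditer(text)]


def add_space_before_soft_returns(text):
    """Insert a space before every soft return not preceded by a space."""
    return SOFT_RETURN_WITHOUT_SPACE.sub(r"\1 \n", text)


# Hypothetical sample: the first soft return lacks a space, the second has one.
sample = "first line\nsecond \nthird"
print(find_candidates(sample))
print(repr(add_space_before_soft_returns(sample)))
```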

14. Spaces

Have typesetter spaces (hair spaces, thin spaces) been applied properly? (Typically these are removed from extracted text so that they don’t become regular spaces. Nonbreaking spaces are usually kept.)

15. Metadata and Alt Text

Has alt text been included?

Have long descriptions been supplied, as needed?

Have other metadata been supplied?

InDesign Source

1. Style Usage

Are styles used properly and consistently?

Are ScML styles present?

Have GREP and nested styles been applied correctly? If not, text may get identified incorrectly in the exported XML.

If the source files were not produced using the WFDW, check for modified styles. List how the styles should be mapped to ScML.

2. Master Pages

Do master pages contain content that needs to be extracted?

3. Notes

Are InDesign notes present (Window > Editorial > Notes)? If so, the extracted IDTT will contain the text with no indicator, resulting in the notes being mixed in with live text.

4. Sample

Does an extraction sample reveal any problems? (If working with files produced with the WFDW, extract the files to XML and use the Digital Hub to convert them to .sam.)

Do the images or their captions get anchored when running Scribe tools?

5. Special Conditions

Are layers being used?

Is there anything in the structure pane that should not be included in the XML output (e.g., pre-tagged images)?

6. Paragraph Marks

Have any pilcrows, or paragraph marks, been identified with the tocnum or tso style? If so, these will be deleted during export, and the paragraphs will be combined.

Quark Source

1. Extension

Do the source files have extensions? (Zip files for transmission, as files without extensions tend to become corrupted.)

2. Style Usage

Are styles properly used in the Quark files?

Are ScML styles present?

If the source files were not produced using the WFDW, check for modified styles. List how the styles should be mapped to ScML.

PDF Source

1. OCR

Is the text selectable, or is the source PDF an image-only PDF?

Will OCR be required to extract content from the PDF?

How will the OCR output be verified?

2. Output

Can the PDF be saved as a Word document, or will content need to be copied and pasted?

Are there bad line breaks, combined words, separated words, and so on in the output?

3. Images

Are all image files present?

If not, will the images be extracted from the PDF?

Refer to the Images Vet documentation.

Web Source

1. Output

Can the web page be printed to text?

When printing to text, are any styles lost, such as bold or italics?

FrameMaker Source

1. Conversion

Will the FrameMaker files be saved to any other format before extracting?

If so, save the files down to MIF for ease of conversion to InDesign or other formats.

Images Vet

Check the image files (.jpg, .tiff, .png, and so on) for errors, problems, and issues that will affect how images are handled, and determine how long the image work will take.

1. Files

Are all materials present?

Are any materials to be supplied at a later date? If so, should placeholder content be added to account for this?

If images are embedded within a Word document, follow these steps to extract the images:

  • PC: Using a program such as 7-Zip, extract content from the .docx file.
  • Mac: Change the .docx extension to .zip and unzip the file to extract content.
  • Navigate to the media folder inside the word folder.
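Because a .docx file is a zip archive, the same extraction can be scripted on either platform. The following is a minimal sketch under stated assumptions, not a required tool: the function name extract_docx_images and the output folder are hypothetical, and the script relies only on the word/media location described in the steps above.

```python
import zipfile
from pathlib import Path


def extract_docx_images(docx_path, output_dir):
    """Copy every file under word/media/ in a .docx into output_dir.

    Returns the list of paths written.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(docx_path) as archive:
        for member in archive.namelist():
            # Skip directory entries; keep only files in the media folder.
            if member.startswith("word/media/") and not member.endswith("/"):
                target = output_dir / Path(member).name
                target.write_bytes(archive.read(member))
                written.append(target)
    return written
```

For example, extract_docx_images("manuscript.docx", "extracted_images") would write the embedded images to a folder named extracted_images; both names here are placeholders.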

2. Number of Images

How many images are there, and how many of each type (e.g., charts, equations, photographs)?

3. Captions

Are captions present for all images that require them?

4. Image Resolution/DPI

What is the DPI of each image?

Does the current DPI of each image match the requirements of the final output?
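The resolution an image will have in the final output depends on the size at which it is placed: effective DPI equals the pixel dimension divided by the printed dimension in inches. The sketch below works through that arithmetic. The function names are hypothetical, and the 300 DPI figure in the example is only an illustrative requirement, not a value taken from this guide.

```python
def effective_dpi(pixel_width, print_width_inches):
    """Resolution an image will have when placed at the given print width."""
    return pixel_width / print_width_inches


def max_print_width_inches(pixel_width, required_dpi):
    """Largest width at which the image still meets the required DPI."""
    return pixel_width / required_dpi


def meets_requirement(pixel_width, print_width_inches, required_dpi):
    """True if the image reaches the required DPI at the given print width."""
    return effective_dpi(pixel_width, print_width_inches) >= required_dpi


# Illustrative values: a 1500-pixel-wide image and an assumed 300 DPI requirement.
print(effective_dpi(1500, 5))            # resolution at 5 inches wide
print(meets_requirement(1500, 6, 300))   # placed at 6 inches wide
print(max_print_width_inches(1500, 300))
```

An image that falls short at the intended size will need to be placed smaller, resupplied, or recreated.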

5. Work Required

What type of image work will be required? Will images need to be recreated or resized?

Do all images meet expectations?

Is any content cut off?

Does the content need to be cropped?

6. Format

What format are the images?

Are the images currently in the format to be used in the final output?

Are images presented in a way that matches the descriptions in their captions (e.g., side-by-side images whose caption refers to the left and right portions)?

7. Text Editing

Will images with text require copyediting or proofreading?

Are there any spelling errors?

Can text changes be applied to the images?

8. Permissions

Have permissions been obtained?

9. Accessibility

Will the use of color affect color-blind readers in a way that obscures the intended purposes of the images?

10. Metadata and Alt Text

Has alt text been included?

Have long descriptions been supplied, as needed?

Indexing Vet

Check the source files, which can include Word and PDF files, for errors, problems, and issues that will affect the indexing of the files, and determine how long the indexing will take.

1. Files

Are all materials present?

Are any materials to be supplied at a later date? If so, should placeholder content be added to account for this?

2. Index Specifications

Does this index have any specific needs (e.g., focus on sports teams, specific individuals, or locations)?

What style guide will the index follow (CMS, APA, MLA, etc.)?

Will the index be run-in?

How many sublevels will be included?

Does the index have a required length?

3. Multiple Indexes

Will multiple indexes be needed?

4. Type

What type of index(es) will be needed (subject, author names, Bible citations)?

5. Unlinked Index Cross-References

Will all cross-references link when processed to ePub through the Digital Hub, or will any aspect of the cross-references require manual linking?

6. Index Generation

Will the index be generated as an embedded index in Word, or will it be based on a typeset file?

E-books Vet

Check the source files, which can include InDesign, Quark, and PDF files, for errors, problems, and issues that will affect the conversion of the files, and determine how long the conversion will take.

1. Files

Are all materials present?

Are any materials to be supplied at a later date? If so, should placeholder content be added to account for this?

Are the reference files present (e.g., PDF)?

What source files will be used for e-book conversion?

How will the e-book be checked? Do all involved have access to the same programs to use when checking the e-book?

See the Extraction Vet documentation.

2. Metadata and Alt Text

Is the metadata present?

This includes the following:

  • Creator information
  • BISAC category
  • eISBN
  • Accessibility Features
  • Access Modes
  • Sufficient Access Modes
  • Accessibility Summary
  • Conformance
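When an EPUB package document (OPF) is already available, the presence of the accessibility items in this list can be spot-checked with a script. The sketch below is a hedged illustration: the mapping from checklist items to schema.org and Dublin Core property names reflects how these values are commonly expressed in EPUB 3 metadata and should be confirmed against the project's own specifications. The function name missing_accessibility_metadata is hypothetical. The sketch looks only at meta elements, so it does not check creator information, BISAC category, or eISBN, and it will not see conformance expressed with a link element.

```python
import xml.etree.ElementTree as ET

OPF_NS = "{http://www.idpf.org/2007/opf}"

# Checklist item -> property name assumed to carry it in the package metadata.
ACCESSIBILITY_PROPERTIES = {
    "Accessibility Features": "schema:accessibilityFeature",
    "Access Modes": "schema:accessMode",
    "Sufficient Access Modes": "schema:accessModeSufficient",
    "Accessibility Summary": "schema:accessibilitySummary",
    "Conformance": "dcterms:conformsTo",
}


def missing_accessibility_metadata(opf_xml):
    """Return checklist items with no non-empty <meta property="..."> element."""
    root = ET.fromstring(opf_xml)
    present = {
        meta.get("property")
        for meta in root.iter(f"{OPF_NS}meta")
        if (meta.text or "").strip()
    }
    return [
        label
        for label, prop in ACCESSIBILITY_PROPERTIES.items()
        if prop not in present
    ]


# Hypothetical package document with only two of the five items supplied.
sample_opf = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata>
    <meta property="schema:accessMode">textual</meta>
    <meta property="schema:accessibilityFeature">alternativeText</meta>
  </metadata>
</package>"""
print(missing_accessibility_metadata(sample_opf))
```

The check confirms only that a value is present, not that the value is accurate for the book.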

Has alt text been included?

Have long descriptions been supplied?

3. Special Characters

Do all special characters render within the document?

Are all characters unique Unicode entities, or are they rendered with a legacy font?

Are all special characters used properly?

Use the Digital Hub to obtain a list of special characters used in the document:

  1. Upload your file to the Digital Hub.
  2. Click the Stats icon to see a list of all non-ASCII characters.

4. Character Styles and Decorative Elements

Will small caps and dropcaps be retained?

What elements from the reference file should be included (e.g., images used for section breaks, decorative ornaments)?

5. Internal Linking

Will the e-book need internal linking?

6. Tables

Are there any tables present in the source files?

Are they complex?

7. Figures

Are there any complex figures?

Are there any text-heavy images or images that will not render well on e-readers?

Are the images present?

Will images (including cover images) need to be cropped, pulled from the PDF, or edited in any other way?

Do ornaments need to be retained?

8. Content Outside of the Text Flow

Are there any sidebars or boxes that fall outside of the main text flow?

9. Concrete Poetry

Will any poetry, or other specially formatted content, require an image and live text in order to present the visual aspect while still functioning with read-aloud technology?

See the Concrete Poetry page for how to handle this.

10. CSS

Is there an existing CSS file, or will one need to be created?

To what degree does the e-book have to match a reference PDF?

11. Unlinked Index Cross-References

Do all cross-references link when processed through the Digital Hub?

If a large number of cross-references do not link automatically, how will they be handled (link manually or remove the cross-references)?

12. Special Text Formatting

Will classes that deviate from ScML need to be added to the HTML file for rendering purposes (i.e., alt styles)?

13. Web PDF

Will a Web PDF be needed?

14. Updating Backlist E-books to Current Standards

Was the original e-book created using the WFDW?

If so, is the ScML file that was used to create the original e-book available, and does it contain any text changes that were made to the e-book’s HTML?

If not, are the source mechanics (e.g., packaged InDesign files) available for a new extraction?

Short Description (Alt Text) Vet

Check the source files for errors, problems, and issues that will affect the creation of short description text. See Short Description Text for more information about how to write alt text.

1. Files

Are all materials present (images, full manuscript)?

Will the alt text be included in the main manuscript or maintained separately? If maintained separately, will it be added to the manuscript before production stages, or will it be added to InDesign or ScML files?

What categories of images are present?

  • Figures/Illustrations
  • Covers
  • Logos
  • Author or editor photos
  • QR codes
  • Decorative images
  • Advertisements
  • Images of text
  • Equations

Are any images split up for a print or Web PDF version that need to be recombined for an e-book version? If so, the alt text must be coordinated to be appropriate in each output.

2. Boilerplate Text

Do any images, such as logos, have established alt text to be used?

3. Context

Is any information already included in the body of the book or in the captions, so that it can be excluded from the alt text?

4. Long Descriptions

Do any images require a long description in addition to the short description? This will likely be the case for complex charts or graphs.