Documentation

Regular Expressions Resource

Regular expressions from the checklists on scribenet.com are presented here with minimal context. These can be run on .sam and .scml files. For more details on how and when to use these searches, see the Regular Expressions Resource Supplement.

Quotation Marks, Parentheses, and Brackets

Find: ^[^“]*\”|\“[^”]*\“|\”[^“]*\”|\“[^”]*$

Find: “—|”—|—“|—”

Find: ^[^(]*\)|\([^)]*\(|\)[^(]*\)|\([^)]*$

Find: ^[^[]*\]|\][^[]*\]|\[[^]]*$|\[[^]]*\[

Find: (^[^<{\n]*"|>[^<{\n]*"|}[^<{\n]*")

Find: ”([^ \)\<\]\:;\?\&\/—])|([^ \(\>\[—])“

Find: <i>\((.*)</i>\)|<i>“(.*)</i>”|<i>\[(.*)</i>\]|\(<i>(.*)\)</i>|“<i>(.*)”</i>|\[<i>(.*)\]</i>

Punctuation

Find: ([\!:;,\.‘’“”\?\–])\1

Find: ([\!:;,\.\?\–…])([’”])([A-z0-9])

Find: </i>([\!:;,\.‘’“”\?\–])”

Find: ([ \x{a0}])\.([ \x{a0}])\.([ \x{a0}])\.([ \x{a0}])\.

Find: ([\[\(“‘])( )|( )([\.,:;\?!\)\]’”])([^\x{a0}])|([A-z0-9]+) \. \. ([A-z0-9]+)

Find: \.([\x{00A0} ])\.([\x{00A0} ])\.’([A-zÀ-ÿ]+)|([A-z0-9>]+)\.([\x{00A0} ])\.([\x{00A0} ])\.([\x{00A0} ])([\!:;,\?\–])

Find: ([^A-Z][\.\?\!”])([A-Z])|([,:;\)])([A-Za-z“])|([”])([a-z])|([a-z\>])\(|,”,|Scribe, Inc|<crt(.*)(<url>www\.zondervan\.com</url>\. The)

Find: ([a-zÀ-ÿ0-9])</(p|pf|psec|paft|pcon|rf|rf1|rf2|rff)>\n|</([ib])></(rf|rf1|rf2|rff)>\n|([;,\–\-\—])</([a-z]+)h

Find: <([^>/]*)>[^A-Za-z0-9\n]?</\1>

Italic Commas and Periods

Find: ,</i>

Find: </i>,

Find: \.</i>

Find: </i>\.

Find: Jr</i>\.|Dr</i>\.|<i>St</i>\.|<i>Ms</i>\.|<i>Mr</i>\.|<i>Mrs</i>\.|<i>([Ee])t al</i>\.|U\.S</i>\.|U\.S\.A</i>\.|C\.E</i>\.|B\.C\.E</i>\.|B\.C</i>\.|A\.D</i>\.|([Ii])bid</i>\.|([Ii])\.e</i>\.|([Ee])\.g</i>\.|([Oo])p\. cit</i>\.|([Ee])tc</i>\.|Inc</i>\.|Bros</i>\.

Unexpected Character Patterns

Find: --|–-|—-|-–|-—|([a-z0-9]+)\||- -|'|“ | ”|\),[0-9]|([A-z0-9]+)−([A-z0-9]+)|“’

Find: [A-Za-z]<i>[A-Za-z]|[A-Za-z]</i>[A-Za-z]|\.([A-z])\.([A-z])</([a-z]+)>\.

Find: </(ct|ctfm|ctbm|cs|cn|pn|pt|ps|ut|un|us|ept|au|au1)>\n([ ]*)<ah[^a]|<structure>([^{])

Find: ([ \t])([0-9]{1,3})</toc([^e])|([ \t])([0-9ivxl]+)</tocfm|([ \t])([0-9]{1,3})</tocill|([ \t])([ivxl]+)</tocill

Find: <dropcap>([A-z])([A-z])|<dropcap>([\[\(“‘\¿\¡])([A-z])([A-z])

Find: ([ ])–([0-9A-z])|>–([0-9A-z])|([A-z0-9])—\. ([A-z0-9]+)

Find: </i>: <i>

Find: ([\d][\d][\d][\d])-([^0-9])|([\d][\d][\d][\d])-([\d][\d][\d][\d])

Find: n´t|´s|s´

Find: </fnnum>([\t \.]+)([a-z])

Spaces

Find: ( )(\x{a0})|(\x{a0})( )|(\x{a0})<([\/a-z0-9\-]+)>( )|( )<([\/a-z0-9\-]+)>(\x{a0})

Find: ([ \x{a0}])(\t)|(\t)([ \x{a0}])|\x{00A0} | \x{00A0}|([\x{00A0}])([\x{00A0} ])|([\x{00A0} ])([\x{00A0}])

Find: ( )<([ef]nref)([^>]*>[^<]*</\2>)

Find: ^( *)(<[^>]*>)( )|( )$

Find: ([^ ]\||\|[^ ])

Find: ^( *)(<[^\n]*?)( ){2,}

Bible References

Find: ([0-9]+): ([0-9]+)

Incorrect Line Breaks

Find: ^[ ]*<[^>]*>[a-z]

URLs

Find: (<url( href[^>]*)?>[^<]*)([\x{2013}\x{2014} ])

Find: <url>([ \.\(\[])|([ ,\.\)\]])</url>

Find: ([A-Za-z0-9\.\-:/]+\.(?!jpg|tif|eps|png|svg|jpeg)[A-Za-z]{2,})([^ <"\n]*[^ ><"”'’\)\],;:\.–\n—\?])?

Find: ([^ \<\"\>])http|>([Aa])mazon\.com</url

Find: ([ ><"“'‘\(\[–\n—])(@[a-zA-Z0-9_]{1,15})

Process the file to ePub 3 in the Digital Hub.

Open the e-book in Kindle Previewer.

Go to File > Run Quality Checks

Click Open Report.

ISBNs and Zip Codes

Angle Brackets

Find: &#x3e;|&#x3c;|&#62;|&#60;|&gt;|&lt;|<<|>>

Typesetter Spaces

Find: &#173;|&#819[2-9];|&#820[0-4];|&#8239;

Find: &#x00AD;|&#x200[0-9A-C];|&#x202F;

Find: [\x{ad}\x{2000}-\x{2009}\x{200a}-\x{200c}\x{202f}]

Find: [^\.](&#160;|&#x00A0;|\x{a0})[^\.]|(&#8205;|&#x200D;|\x{200d})

Hyphen Spacing

Find: - | -

Potentially Incorrect Hyphenation

Find: [A-zÀ-ÿ]+-[A-zÀ-ÿ]+

Missing Spaces around Tags and Commas

Find: (</[^>]+>)([A-Za-zÀ-ÿ]+)|([A-Za-zÀ-ÿ]+)<(?![eft]nref|page)([^/][^>]*)>|([A-z0-9>]+)\.([A-z0-9]+):

Find: ,([a-zÀ-ÿ0-9]+)([^ \n]*)

Find: (<in[12f]*>)(.*),[a-zÀ-ÿ0-9]

Scribing/Articulation

Find: <b([fl1]+)>(\t){0,1}([A-z0-9]+|<symb>[A-z0-9]+</symb>)|<b([fl1]+)><page id="p([0-9A-z]+)"/>(\t){0,1}([A-z0-9]+|<symb>[A-z0-9]+</symb>)

Find: <u([fl1]+)>(\t){0,1}([0-9]+)([\.\)])|<u([fl1]+)><page id="p([0-9A-z]+)"/>(\t){0,1}([0-9\.]+)([\.\)])

Small Caps

Find: <[^>]*sm[^>]*>[^<]*</[^>]*sm[^>]*>

Tetragrammaton

Find: <[^>]*tetr[^>]*>[^<]*</[^>]*tetr[^>]*>

Alt Text

Find: <img([^<]*)/>

Find: alt="(Presentation|presentation)"

Italic Terms, Phrases, and Titles

Find: <(i|tnw|cite|em)>([^<]*)</\1>

Self-Closing Note Reference Tags

Find: <([fe])nref/>|<([fe])nnum/>

Self-Closing and Unnecessary Tags (.sam/.scml)

Find: <(?!cell|img|page)[^<]*/>|</([^>]*)>[^A-Za-z0-9\n]?<\1>|<([^>/]*)>[^A-Za-z0-9\n]?</\2>

Index Section (.scml files)

Position of Tags and Spaces (.sam/.scml)

Find: ^( *)(<[^\n]*?)( )(</[^>]*>)|(<[^/|^>]*>)( )

Page IDs (.sam/.scml)

Find: (<xref.*?>.*?)(<page id=".*?"/>)(.*?</xref>)|(</url>)(<page id=".*?"/>)(<url>)

Find: <([^/>]*)>(<page[^>]*>)</\1>

Find: [a-z]+<page id="([^<]*)"/>[a-z]+

Find: <url>([^<]*)</url><page id="p([0-9A-z]+)"/><url>([^<]*)</url>
Replace with: <url href="\1\3">\1</url><page id="p\2"/><url href="\1\3">\3</url>

Page references (.scml)

Find: [^</][Pp]age

Single-Chapter Bible Books (.scml)

Find: (<xbr t=")(Ob|Phm|2Jn|3Jn|Jud|Pr Az|Bel|Sus|Pr Man|LJe)( )([2-9]|[0-9]{2,})(:)([0-9-]+")

Blind Notes Pairs

DTD Validation Troubleshooting (.sam/.scml)