javascript - Regex to capture sentences with quotes


I am having trouble putting together a regex to match quotes and sentences. Here are the (simplified) specs I am trying to meet:

  • A sentence is a chain of characters followed by a punctuation mark (a dot, to keep things simple) or a newline.

  • A quote is a chain of characters between two ".

  • Each sentence should be a new match.

  • A sentence can contain quotes, and quotes can contain sentences. The last sentence in a quote should end the capture.

So far I have come up with this: \s*((?:("[^"]*")|[^.\n])*\.+"?)\s*

Test case: regex101

As you can see, I can't separate quotes from sentences. For example:

§2: "your lordship," mya informed lord robert, "lady waynwood’s banners have been seen hour down road. here soon, cousin harry. want greet them" should be one full match, but the regex gives me 3 matches and also captures the next paragraph.

§3: "they invited," said uncertainly, "for tourney. don’t..." should end the full match, but the regex goes on to capture "alayne closed book" as well.

I can't figure out where I am going wrong; any help would be appreciated.
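The over-capture can be reproduced outside regex101 with a short script. This is only a sketch: the sample text is made up for illustration (it is not the test case from the question), and the variable names are hypothetical.

```javascript
// The attempt from the question, written as a JavaScript regex literal.
const attempt = /\s*((?:("[^"]*")|[^.\n])*\.+"?)\s*/;

// Made-up sample: a quoted passage that ends a sentence, followed by another sentence.
const sample = '"Stay," she said, "it is late. Go home." Then silence.';

// [^.\n] also accepts spaces and letters after the closing quote, so the loop
// keeps consuming until the next full stop outside a quote.
const firstMatch = attempt.exec(sample)[1];

console.log(firstMatch === sample); // true: both sentences end up in a single match
```

The quoted text ends with a full stop, but nothing in the pattern treats that as the end of the sentence, so the following sentence is swallowed as well.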

Edit: desired output

regex101

((?![.\n\s])[^.\n"]*(?:"[^\n"]*[^\n".]"[^.\n"]*)*(?:"[^"\n]+\."|\.|(?=\n))) 

Splitting it up:

  • (?![.\n\s]) - first check that we are starting on a valid character (not whitespace or an end of sentence).
  • [^.\n"]* - match text that is not surrounded by quotes and does not contain a sentence terminator.
  • (?:"[^\n"]*[^\n".]"[^.\n"]*) - match (in a non-capturing group) a quote that contains at least 1 character, does not contain a newline, and does not end with a sentence terminator before the closing quote - followed by zero or more characters that are not in a quote and do not contain a sentence terminator.
  • * - the previous non-capturing group can be repeated zero (so there can be sentences without quotes) or more times.
  • (?:"[^"\n]+\."|\.|(?=\n)) - finally, include either a quote that terminates with a full stop, or a full stop at the end of the sentence, or check for an ending newline.
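As a rough illustration of the breakdown above, the pattern can be applied with the global flag to collect one match per sentence. The helper name splitSentences and the sample text are assumptions made for this sketch, not part of the original answer.

```javascript
// The pattern from the answer, with the g flag so every sentence is found.
const sentenceRegex =
  /((?![.\n\s])[^.\n"]*(?:"[^\n"]*[^\n".]"[^.\n"]*)*(?:"[^"\n]+\."|\.|(?=\n)))/g;

// Hypothetical helper: returns the first capture group of each match.
function splitSentences(text) {
  return Array.from(text.matchAll(sentenceRegex), (m) => m[1]);
}

// Made-up sample text; the last line ends with a newline instead of a full stop.
const text =
  'He left. "Stay," she said, "it is late. Go home." Then silence.\nLast line\n';

const sentences = splitSentences(text);
console.log(sentences);
// [
//   'He left.',
//   '"Stay," she said, "it is late. Go home."',
//   'Then silence.',
//   'Last line'
// ]
```

Note that a final line with neither a full stop nor a trailing newline is not matched, because the last alternative only accepts a full stop (inside or outside a quote) or a lookahead for \n.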
