textbox - Inconsistencies in caret position, string length and matches index in C#
I am trying to get the selected word in a Scintilla textbox using a regex, and I am noticing inconsistencies between the reported string length, the match indexes, and the caret position or start of selection:
    private KeyValuePair<int, string> get_current_word()
    {
        int cur_pos = scin_txt.Selection.Start;
        KeyValuePair<int, string> kvp_word = new KeyValuePair<int, string>(0, "");
        // Split the text into words; match indexes are character indexes into the string.
        MatchCollection words = Regex.Matches(scin_txt.Text, @"\b(?<word>\w+)\b");
        foreach (Match word in words)
        {
            int start = word.Index;
            int end = start + word.Length;
            // Keep the first word whose span contains the caret position.
            if (start <= cur_pos && cur_pos <= end)
            {
                kvp_word = new KeyValuePair<int, string>(start, word.Value);
                break;
            }
        }
        return kvp_word;
    }
In short, I am splitting the string into words and using the match indexes to see whether the caret is contained within a word.
Unfortunately, the numbers don't seem to match properly:
scin_txt contains the string:
"Le clic droit a été désactivé pour cette image. j"
This string is 49 characters long, but the TextLength property returns 53 and the Selection.Start (or Caret.Position, same result) property returns 52. The caret is at the last position in the string, and there are (to my knowledge) no spaces or invisible characters after the letter "j".
Meanwhile, the regex match indexes and lengths seem correct.
Is this a bug, or is there something I don't understand about how the lengths and selection indexes are computed? Is there a workaround to find the word containing the caret?
The Scintilla APIs are badly named. The Text property returns bytes rather than text, and TextLength gives the number of bytes, not the number of characters.
Presumably, since you are using UTF-8 mode, the "text" is actually:
Le clic droit a \xc3\xa9t\xc3\xa9 d\xc3\xa9sactiv\xc3\xa9 pour cette image. j
which is 53 bytes long.
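The difference between the two counts can be reproduced without Scintilla at all. The following is a minimal sketch, assuming the control stores its text as UTF-8; the class name LengthDemo and its methods are hypothetical helpers for illustration, not part of any Scintilla binding. The sample sentence is written so that it is 49 characters long, matching the count in the question.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of Scintilla or any binding.
public static class LengthDemo
{
    // Number of UTF-16 code units: what string.Length and Regex match indexes count.
    public static int CharCount(string s)
    {
        return s.Length;
    }

    // Number of bytes the same text occupies once encoded as UTF-8.
    public static int Utf8ByteCount(string s)
    {
        return Encoding.UTF8.GetByteCount(s);
    }
}
```

For the sentence "Le clic droit a été désactivé pour cette image. j", CharCount gives 49 and Utf8ByteCount gives 53: each of the four "é" characters takes two bytes in UTF-8, which accounts for the four extra units.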
Edit:
If you want to find the position of the start/end of a word, there are the SCI_WORDSTARTPOSITION / SCI_WORDENDPOSITION messages. For caret positioning, there are the SCI_POSITIONBEFORE / SCI_POSITIONAFTER messages, which take the current code page into account. (Presumably these messages have functional equivalents in the API of the particular Scintilla binding you are using, or perhaps there is a generic SendMessage function for accessing them.)
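If the regex approach from the question is kept, another possible workaround is to convert the byte position reported by the control into a character index before comparing it with the match indexes. This is only a sketch under the assumption that the reported position is a UTF-8 byte offset that falls on a character boundary; CaretWord, ByteOffsetToCharIndex and WordAt are hypothetical names and do not call Scintilla.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Scintilla or any binding.
public static class CaretWord
{
    // Converts a UTF-8 byte offset into a character index in the same text.
    // Assumes bytePos falls on a character boundary; out-of-range values are clamped.
    public static int ByteOffsetToCharIndex(string text, int bytePos)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(text);
        bytePos = Math.Max(0, Math.Min(bytePos, bytes.Length));
        return Encoding.UTF8.GetCharCount(bytes, 0, bytePos);
    }

    // Returns (start index, word) for the first word whose span contains the caret,
    // or (0, "") when no word contains it, mirroring the question's method.
    public static KeyValuePair<int, string> WordAt(string text, int caretBytePos)
    {
        int caret = ByteOffsetToCharIndex(text, caretBytePos);
        foreach (Match word in Regex.Matches(text, @"\w+"))
        {
            int start = word.Index;
            int end = start + word.Length;
            if (start <= caret && caret <= end)
            {
                return new KeyValuePair<int, string>(start, word.Value);
            }
        }
        return new KeyValuePair<int, string>(0, "");
    }
}
```

In the question's method this would amount to passing scin_txt.Text and scin_txt.Selection.Start to WordAt instead of comparing the raw position with the match indexes. Re-encoding the whole text on every call is wasteful for large documents, so the word-position messages mentioned above are likely the better route where the binding exposes them.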