pt_kwic is a function that extracts the words placed before and after a keyword. It is similar to kwic from the quanteda package, with two important differences: it is designed to work only with Portuguese texts, matching in a way that ignores diacritics and case; and it can return each word in a separate column (see the unite argument).
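To make the idea of diacritic- and case-insensitive matching concrete, here is a minimal base R sketch. It is not the package's implementation: normalize_pt is a hypothetical helper introduced only for illustration, and it covers just the most common Portuguese accented letters.

```r
# Hypothetical helper (not part of the package): strip common Portuguese
# diacritics and fold case so that "DECISÃO" and "decisao" compare equal.
normalize_pt <- function(x) {
  x <- chartr(
    "áàâãéêíóôõúüçÁÀÂÃÉÊÍÓÔÕÚÜÇ",
    "aaaaeeiooouucaaaaeeiooouuc",
    x
  )
  tolower(x)
}

texts <- c("A DECISÃO foi mantida", "a decisao foi reformada")

# Both texts match the unaccented, lower-case pattern after normalization.
matches <- grepl("decisao", normalize_pt(texts))
```

Normalizing the text before matching means the keyword can be written without accents and in any case.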
pt_kwic(string, id_decision = NULL, keyword = NULL, before = 5, after = 5, unite = TRUE)
Argument | Description |
---|---|
string | A vector of texts in which to search for the keyword. |
id_decision | A vector of decision ids. If omitted, it defaults to text1, text2, text3, ... |
keyword | A regular expression or a vector of regular expressions. |
before | Number of words before the keyword. Default is 5. |
after | Number of words after the keyword. Default is 5. |
unite | If FALSE, places each preceding and following word in a separate column. Default is TRUE. |
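The role of before and after can be sketched in base R as follows. This is an illustration under simplifying assumptions (whitespace tokenization, first match only), not the package's code; context_window is a hypothetical name.

```r
# Hypothetical helper, not the package's code: return up to `before` words
# to the left and `after` words to the right of the first token that
# matches `keyword`.
context_window <- function(text, keyword, before = 5, after = 5) {
  words <- strsplit(text, "\\s+")[[1]]
  hit <- grep(keyword, words, ignore.case = TRUE)[1]
  if (is.na(hit)) return(NULL)  # keyword not found
  left  <- words[seq_len(hit - 1)]
  right <- words[-seq_len(hit)]
  list(
    pre     = paste(tail(left, before), collapse = " "),
    keyword = words[hit],
    post    = paste(head(right, after), collapse = " ")
  )
}

ctx <- context_window(
  "o tribunal negou provimento ao recurso do autor",
  keyword = "provimento", before = 2, after = 2
)
# ctx$pre is "tribunal negou" and ctx$post is "ao recurso"
```

When fewer words are available on one side than requested, the sketch simply returns what is there.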
A tbl with id_decision, the keyword location (start and end), the keyword, the preceding words, and the following words.
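A call using the documented arguments might look like the sketch below. The sample texts and ids are invented for illustration, and the snippet assumes the package that provides pt_kwic is already attached; the guard skips the call otherwise.

```r
# Invented sample texts, for illustration only.
textos <- c(
  "O tribunal negou provimento ao recurso interposto pelo réu.",
  "A câmara deu PROVIMENTO parcial à apelação da autora."
)

# Guard: only run when pt_kwic is available in the session
# (the package that provides it is assumed to be attached).
if (exists("pt_kwic")) {
  resultado <- pt_kwic(
    string      = textos,
    id_decision = c("dec_01", "dec_02"),  # hypothetical ids
    keyword     = "provimento",
    before      = 3,
    after       = 3,
    unite       = TRUE
  )
  print(resultado)
}
```

Setting unite = FALSE instead should place each context word in its own column, as described for the unite argument.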