pycantonese.corpus.CantoneseCHATReader.search
CantoneseCHATReader.search(*, onset=None, nucleus=None, coda=None, tone=None, initial=None, final=None, jyutping=None, character=None, pos=None, word_range=(0, 0), sent_range=(0, 0), tagged=True, sents=False, participant=None, exclude=None, by_files=False)
Search the data for the given criteria.
For examples, please see https://pycantonese.org/searches.html.
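As a quick orientation, the sketch below wraps a few calls to `search`. Because of the leading `*` in the signature, every criterion is passed by keyword. The sketch assumes `corpus` is a `CantoneseCHATReader`, for instance the object returned by `pycantonese.hkcancor()`; the helper name `run_sample_searches` and the search terms are illustrative only and not part of the library.

```python
def run_sample_searches(corpus):
    """Run a few illustrative searches on a CantoneseCHATReader-like object.

    `corpus` is assumed to expose the `search` method documented here,
    e.g. `corpus = pycantonese.hkcancor()`. This helper is hypothetical.
    """
    return {
        # Search by the Jyutping romanization of a single character.
        "by_jyutping": corpus.search(jyutping="keoi5"),
        # Search by character and ask for plain word strings (no tags).
        "by_character": corpus.search(character="佢", tagged=False),
        # Pass several sound criteria in one call; regexes are allowed.
        "by_sound": corpus.search(onset="[gk]", tone="1"),
        # Ask for the sentences containing the matches, not just the words.
        "as_sentences": corpus.search(jyutping="keoi5", sents=True),
    }
```

Each value in the returned dictionary is whatever `search` returns for that call, which is a list according to the Returns section.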
- Parameters
- onset : str, optional
Onset to search for. A regex is supported.
- nucleus : str, optional
Nucleus to search for. A regex is supported.
- coda : str, optional
Coda to search for. A regex is supported.
- tone : str, optional
Tone to search for. A regex is supported.
- initial : str, optional
Initial to search for. A regex is supported. An initial, a term more prevalent in traditional Chinese phonology, is the equivalent of an onset.
- final : str, optional
Final to search for. A final, a term more prevalent in traditional Chinese phonology, is the equivalent of a nucleus plus a coda.
- jyutping : str, optional
Jyutping romanization of one Cantonese character to search for. If the romanization covers more than one character, a ValueError is raised.
- character : str, optional
One or more Cantonese characters (within a segmented word) to search for.
- pos : str, optional
A part-of-speech tag to search for. A regex is supported.
- word_range : tuple[int, int], optional
Span of words to the left and right of a matching word to include in the output. The default, (0, 0), disables the range. If sent_range is used, word_range is ignored.
- sent_range : tuple[int, int], optional
Span of sentences before and after a sentence containing a matching word to include in the output. The default, (0, 0), disables the range. If sent_range is used, word_range is ignored.
- tagged : bool, optional
If True (the default), words in the output are in the tagged form. Otherwise, only word token strings are returned.
- sents : bool, optional
If True (the default is False), sentences containing matching words are returned. Otherwise, only matching words are returned.
- participant : str or iterable[str], optional
One or more participants to include in the search. If unspecified, all participants are included.
- exclude : str or iterable[str], optional
One or more participants to exclude from the search. If unspecified, no participants are excluded.
- by_files : bool, optional
If True (default: False), return data organized by the individual file paths.
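The relationship between the sound parameters (onset, nucleus, coda, tone, initial, final) can be illustrated with a small standalone sketch. It is not pycantonese's implementation: the simplified syllable pattern, the helper names `split_syllable` and `syllable_matches`, and the choice of full-match regex semantics are all assumptions made for illustration, and syllabic nasals such as "m4" are deliberately left out.

```python
import re

# Simplified pattern for ONE Jyutping syllable (an assumption for this sketch):
# optional onset, nucleus, optional coda, tone digit.
# Syllabic nasals (e.g. "m4", "ng5") are not covered.
_SYLLABLE = re.compile(
    r"(gw|kw|ng|[bpmfdtnlgkhwzcsj])?"  # onset
    r"(aa|oe|eo|yu|[aeiou])"           # nucleus
    r"(ng|[ptkmniu])?"                 # coda
    r"([1-6])"                         # tone
)


def split_syllable(jyutping):
    """Split one Jyutping syllable into its parts (hypothetical helper)."""
    m = _SYLLABLE.fullmatch(jyutping)
    if m is None:
        # Mirrors the documented idea that only a single syllable is accepted.
        raise ValueError(f"not a single Jyutping syllable: {jyutping!r}")
    onset, nucleus, coda, tone = (g or "" for g in m.groups())
    return {
        "onset": onset,
        "nucleus": nucleus,
        "coda": coda,
        "tone": tone,
        "initial": onset,           # initial is the equivalent of the onset
        "final": nucleus + coda,    # final is nucleus plus coda
    }


def syllable_matches(jyutping, **criteria):
    """Check every given criterion as a full-match regex (an assumption)."""
    parts = split_syllable(jyutping)
    return all(
        re.fullmatch(pattern, parts[name]) is not None
        for name, pattern in criteria.items()
    )
```

For instance, `split_syllable("gwong2")` yields onset "gw", nucleus "o", coda "ng", tone "2", and therefore final "ong"; `syllable_matches("sik6", coda="p|t|k")` is true because the coda "k" matches the regex.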
- Returns
- list
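To make the `word_range` description concrete, here is a minimal standalone sketch of how a window of (left, right) words around a match could be cut from a sentence. It does not reproduce the library's code: the function name `word_window`, the sample sentence, and the clamping at sentence boundaries are assumptions for illustration.

```python
def word_window(words, match_index, word_range=(0, 0)):
    """Return the matching word plus its neighbours (hypothetical helper).

    `word_range` is (words to the left, words to the right). The window is
    clamped to the sentence boundaries, which is an assumption of this sketch.
    """
    left, right = word_range
    start = max(0, match_index - left)
    end = min(len(words), match_index + right + 1)
    return words[start:end]


sentence = ["我", "今日", "去", "圖書館", "睇", "書"]

# Default (0, 0): only the matching word itself.
only_match = word_window(sentence, 3)

# One word to the left and two to the right of the match.
with_context = word_window(sentence, 3, word_range=(1, 2))
```

With the sample sentence, `only_match` contains just "圖書館", while `with_context` spans from "去" to "書".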