pycantonese.word_segmentation.Segmenter

class pycantonese.word_segmentation.Segmenter(*, max_word_length=5, allow=None, disallow=None)

A customizable word segmentation model.

New in version 3.0.0.

Methods

fit(sents)

Train the model with the input segmented sentences.

predict(sent_strs)

Segment the given unsegmented sentences.

__init__(*, max_word_length=5, allow=None, disallow=None)

Initialize a Segmenter object.

Parameters
max_word_length : int, optional

Maximum word length this model allows.

allow : iterable[str], optional

Words to allow in word segmentation.

disallow : iterable[str], optional

Words to disallow in word segmentation.
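
Example

The following is a minimal, self-contained sketch of how the three parameters above could interact with fit() and predict(). It is not pycantonese's implementation: the class name ToySegmenter, the greedy longest-match strategy, and the rule that disallow takes precedence over allow are all assumptions made for illustration. Only the constructor signature and the fit/predict method names are taken from this page.

```python
class ToySegmenter:
    """Hypothetical stand-in for Segmenter (assumed greedy longest match)."""

    def __init__(self, *, max_word_length=5, allow=None, disallow=None):
        self.max_word_length = max_word_length
        self._allow = set(allow or ())
        self._disallow = set(disallow or ())
        self._words = set()

    def fit(self, sents):
        """Collect a vocabulary from already-segmented sentences."""
        seen = {word for sent in sents for word in sent}
        # Assumption: allowed words are added first, then disallowed
        # words are removed, so disallow wins if a word is in both.
        vocab = (seen | self._allow) - self._disallow
        self._words = {w for w in vocab if len(w) <= self.max_word_length}
        return self

    def predict(self, sent_strs):
        """Segment each unsegmented string; returns a list of word lists."""
        return [self._segment(s) for s in sent_strs]

    def _segment(self, sent_str):
        words, i = [], 0
        while i < len(sent_str):
            longest = min(self.max_word_length, len(sent_str) - i)
            # Try the longest candidate first; a single character is
            # always accepted so the loop is guaranteed to advance.
            for length in range(longest, 0, -1):
                candidate = sent_str[i:i + length]
                if length == 1 or candidate in self._words:
                    words.append(candidate)
                    i += length
                    break
        return words


training = [["廣東話", "好", "易", "學"], ["廣東人", "講", "廣東話"]]

default_result = ToySegmenter().fit(training).predict(["廣東人學廣東話"])
# default_result == [["廣東人", "學", "廣東話"]]

strict = ToySegmenter(max_word_length=3, disallow={"廣東人"}).fit(training)
strict_result = strict.predict(["廣東人學廣東話"])
# strict_result == [["廣", "東", "人", "學", "廣東話"]]
```

In this sketch, disallowing a word forces the matcher to fall back to shorter pieces, and max_word_length caps how far ahead it looks. The real Segmenter may resolve these cases differently.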
