PyCantonese
3.4.0
  • Quickstart
  • Corpus Data
    • CHAT Format
    • Built-in Data
    • CHILDES and TalkBank Data
    • Custom Data
  • Corpus Reader Methods
    • Headers
    • Transcriptions and Annotations
      • Jyutping Romanization
      • Chinese Characters
    • Word Frequencies and Ngrams
  • Corpus Search Queries
    • Searching by a Jyutping Element
    • Searching by a Chinese Character
    • Searching by a Part-of-speech Tag
    • Searching by a Word or Utterance Range
    • Searching by Multiple Criteria
    • Output Format of Search Results
    • Complex Searches
  • Parsing Cantonese Text
    • Input 1: A Plain String
    • Input 2: A List of Strings
    • Input 3: A List of Tuples of Strings
    • Customizing Word Segmentation
    • Customizing Part-of-Speech Tagging
    • Outputting CHAT Data
    • More Customization
  • Jyutping Romanization
    • Characters-to-Jyutping Conversion
    • Parsing Jyutping Strings
    • Jyutping-to-Yale Conversion
    • Jyutping-to-TIPA Conversion
  • Stop Words
  • Word Segmentation
    • Customizing Segmentation
  • Part-of-Speech Tagging
  • API Reference
    • Corpus Data
      • pycantonese.read_chat
      • pycantonese.hkcancor
      • pycantonese.CHATReader
        • pycantonese.CHATReader.search
    • Jyutping Romanization
      • pycantonese.characters_to_jyutping
      • pycantonese.parse_jyutping
      • pycantonese.jyutping_to_yale
      • pycantonese.jyutping_to_tipa
    • Natural Language Processing
      • pycantonese.stop_words
      • pycantonese.parse_text
      • pycantonese.segment
      • pycantonese.word_segmentation.Segmenter
      • pycantonese.pos_tag
      • pycantonese.pos_tagging.hkcancor_to_ud
    • CHATReader
    • Token
    • Jyutping
  • Changelog
    • [Unreleased]
      • Added
      • Changed
      • Deprecated
      • Removed
      • Fixed
      • Security
    • [3.4.0] - 2021-12-28
      • Added
      • Changed
      • Removed
      • Security
    • [3.3.1] - 2021-05-14
      • Fixed
    • [3.3.0] - 2021-05-14
      • Changed
      • Fixed
    • [3.2.4] - 2021-05-07
      • Fixed
    • [3.2.3] - 2021-04-12
      • Fixed
    • [3.2.2] - 2021-03-23
      • Fixed
    • [3.2.1] - 2021-03-21
      • Fixed
    • [3.2.0] - 2021-03-20
      • Added
      • Changed
      • Deprecated
      • Fixed
    • [3.1.1] - 2021-03-18
      • Fixed
    • [3.1.0] - 2021-02-21
      • Added
      • Fixed
    • [3.0.0] - 2020-10-25
      • Added
      • Changed
        • API-breaking Changes
        • Non-API-breaking Changes
      • Deprecated
      • Security
    • [2.4.1] - 2020-10-10
      • Fixed
    • [2.4.0] - 2020-10-10
      • Added
    • [2.3.0] - 2020-07-24
      • Added
      • Removed
    • [2.2.0] - 2018-06-30
      • Added
    • [2.1.0] - 2018-06-11
      • Added
      • Fixed
    • [2.0.0] - 2016-02-06
    • [1.0] - 2015-09-06
    • [1.0dev] - 2015-09-02
    • [0.2.1] - 2015-01-25
    • [0.2] - 2015-01-22
    • [0.1] - 2014-12-17
  • Archives
    • Tutorials
    • Research Outputs


© Copyright 2014-2022, Jackson L. Lee | Documentation last updated on June 06, 2022.
