Categorical Vocabulary



polyaxon.processing.categorical.CategoricalVocabulary(unknown_token='<UNK>', support_reverse=True)

Categorical variables vocabulary class.

Accumulates categories and provides a mapping from each category to an integer index. It can also be used for words.


freeze(self, freeze=True)

Freezes the vocabulary. Once frozen, looking up a new word returns the unknown token's id.

  • Args:
    • freeze: True to freeze, False to unfreeze.


get(self, category)

Returns the category's id in the vocabulary.

If the category is new and the vocabulary is not frozen, creates a new id for it.

  • Args:

    • category: string or integer to lookup in vocabulary.
  • Returns: integer, id in the vocabulary.
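
The interaction between freeze and get can be sketched with a small stand-in class. This is not the library's implementation: `TinyVocabulary` is a hypothetical name, and the choice of id 0 for the unknown token is an assumption that the text above does not confirm.

```python
class TinyVocabulary:
    """Hypothetical stand-in that mimics the documented get/freeze behavior."""

    def __init__(self, unknown_token="<UNK>"):
        # Assumption: the unknown token is stored first and receives id 0.
        self._mapping = {unknown_token: 0}
        self._frozen = False

    def freeze(self, freeze=True):
        # True freezes the vocabulary, False unfreezes it.
        self._frozen = freeze

    def get(self, category):
        if category not in self._mapping:
            if self._frozen:
                # Frozen: new categories map to the unknown token's id.
                return 0
            # Not frozen: assign the next free id to the new category.
            self._mapping[category] = len(self._mapping)
        return self._mapping[category]


vocab = TinyVocabulary()
ids = [vocab.get(word) for word in ["cat", "dog", "cat"]]

vocab.freeze()
unseen_id = vocab.get("bird")   # vocabulary is frozen, no new id is created

vocab.freeze(False)
new_id = vocab.get("bird")      # unfrozen again, "bird" gets its own id
```

In this sketch a repeated category always returns the id it was first given, and a lookup made while frozen leaves the mapping unchanged.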


add(self, category, count=1)

Adds count of the category to the frequency table.

  • Args:
    • category: string or integer, category to add frequency to.
    • count: optional integer, how many to add.


trim(self, min_frequency, max_frequency=-1)

Trims the vocabulary by minimum frequency.

Remaps ids to 1..n in sorted frequency order, where n is the number of elements left.

  • Args:
    • min_frequency: minimum frequency to keep.
    • max_frequency: optional, maximum frequency to keep. Useful to remove very frequent categories (like stop words).
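
A rough sketch of how add and trim fit together, written as a standalone function over a frequency table. Several details are assumptions rather than documented behavior: `trim_mapping` is a hypothetical helper, ids are handed out from most to least frequent, ties are broken by name, both frequency bounds are treated as exclusive, and the unknown token keeps id 0.

```python
from collections import Counter


def trim_mapping(freq, min_frequency, max_frequency=-1, unknown_token="<UNK>"):
    """Hypothetical helper: rebuild an id mapping from a frequency table."""
    # Assumption: the unknown token always keeps id 0.
    mapping = {unknown_token: 0}
    # Assumption: most frequent first, ties broken by category name.
    ordered = sorted(freq.items(), key=lambda item: (-item[1], str(item[0])))
    for category, count in ordered:
        # Assumption: the upper bound is exclusive and only active when > 0.
        if max_frequency > 0 and count >= max_frequency:
            continue
        # Assumption: the lower bound is exclusive; since counts are sorted
        # in descending order, everything after this point is too rare.
        if count <= min_frequency:
            break
        mapping[category] = len(mapping)
    return mapping


# Accumulate counts the way add(category, count) is described to do.
freq = Counter()
for category, count in [("the", 10), ("cat", 4), ("dog", 3), ("emu", 1)]:
    freq[category] += count

trimmed = trim_mapping(freq, min_frequency=2, max_frequency=8)
```

With these counts, "the" is dropped as too frequent and "emu" as too rare, so only "cat" and "dog" are remapped to ids 1 and 2. The example counts avoid the boundary values on purpose, since the text does not say whether the bounds are inclusive.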


reverse(self, class_id)

Maps a class id back to its original class name.

  • Args:

    • class_id: Id of the class.
  • Returns: Class name.

  • Raises:

    • ValueError: if this vocabulary wasn't initialized with support_reverse.
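
The reverse lookup and its error case can be illustrated with another small stand-in. `ReversibleVocabulary` is a hypothetical name, and keeping the reverse mapping as a list indexed by id is an assumption made for this sketch, not a statement about the library's internals.

```python
class ReversibleVocabulary:
    """Hypothetical stand-in that mimics the documented reverse behavior."""

    def __init__(self, unknown_token="<UNK>", support_reverse=True):
        self._mapping = {unknown_token: 0}
        self._support_reverse = support_reverse
        # Assumption: the reverse mapping is a list where index == class id.
        self._reverse_mapping = [unknown_token] if support_reverse else None

    def get(self, category):
        if category not in self._mapping:
            self._mapping[category] = len(self._mapping)
            if self._support_reverse:
                self._reverse_mapping.append(category)
        return self._mapping[category]

    def reverse(self, class_id):
        if not self._support_reverse:
            raise ValueError(
                "This vocabulary wasn't initialized with support_reverse."
            )
        return self._reverse_mapping[class_id]


vocab = ReversibleVocabulary()
cat_id = vocab.get("cat")
name = vocab.reverse(cat_id)    # maps the id back to "cat"

no_reverse = ReversibleVocabulary(support_reverse=False)
error_message = None
try:
    no_reverse.reverse(0)
except ValueError as exc:
    error_message = str(exc)    # reverse() is unavailable without support
```

Skipping the reverse mapping avoids storing every category twice. That saving is the likely motivation for making support_reverse optional, though the text does not state it.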