object NGrams extends SettingsBuilder with Serializable

Transforms a collection of sentences, where each row is a Seq[String] of words or tokens, into a collection containing all the n-grams that can be constructed from each row. The feature representation is an n-hot encoding (see NHotEncoder) over an expanded vocabulary of all the generated n-grams.

N-grams are generated based on a specified range of low to high (inclusive) and are joined by the given sep (default is " "). For example, with low = 2, high = 3 and sep = "", row ["a", "b", "c", "d", "e"] would produce ["ab", "bc", "cd", "de", "abc", "bcd", "cde"].
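
The generation rule described above can be sketched in plain Scala. This is a minimal illustration, not the library's implementation: the object name NGramSketch and the helper ngrams are hypothetical, and the output order (all n-grams of one size before the next size) is assumed from the example above.

```scala
object NGramSketch {
  // Hypothetical helper illustrating the documented low/high/sep behaviour.
  // It is not the code used by NGrams itself.
  def ngrams(
    tokens: Seq[String],
    low: Int = 1,
    high: Int = -1,
    sep: String = " "
  ): Seq[String] = {
    // A high of -1 means "up to the full length of the input row".
    val max = if (high == -1) tokens.length else high
    (low to max).flatMap { n =>
      tokens
        .sliding(n)           // consecutive windows of n tokens
        .filter(_.size == n)  // drop the short window produced when the row has fewer than n tokens
        .map(_.mkString(sep)) // join the tokens of each window with the separator
    }
  }
}
```

Calling NGramSketch.ngrams(Seq("a", "b", "c", "d", "e"), low = 2, high = 3, sep = "") reproduces the row from the example above.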

As with NHotEncoder, missing values are transformed to [0.0, 0.0, ...].
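
To make the n-hot representation and the handling of missing values concrete, here is a small standalone sketch. The names NHotSketch, vocabulary and encode are hypothetical, and two details are assumptions rather than documented behaviour: vocabulary indices follow the sorted order of the n-grams, and n-grams absent from the vocabulary are ignored.

```scala
import scala.collection.immutable.SortedMap

object NHotSketch {
  // Hypothetical helper: build a sorted vocabulary mapping each distinct n-gram to a column index.
  def vocabulary(rows: Seq[Set[String]]): SortedMap[String, Int] =
    SortedMap(rows.flatten.distinct.sorted.zipWithIndex: _*)

  // Hypothetical helper: encode one row as an n-hot vector.
  // None models a missing value and yields a vector of zeros.
  def encode(row: Option[Set[String]], vocab: SortedMap[String, Int]): Array[Double] = {
    val out = Array.fill(vocab.size)(0.0)
    for {
      grams <- row
      gram <- grams
      index <- vocab.get(gram) // assumption: n-grams outside the vocabulary are skipped
    } out(index) = 1.0
    out
  }
}
```

With a vocabulary built from Set("ab", "bc") and Set("bc", "cd"), a row containing only "bc" sets a single position to 1.0, while a missing row leaves every position at 0.0.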

Source
NGrams.scala
Linear Supertypes
Serializable, Serializable, SettingsBuilder, AnyRef, Any

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. def apply(name: String, low: Int = 1, high: Int = -1, sep: String = " "): Transformer[Seq[String], Set[String], SortedMap[String, Int]]

    Create a new NGrams instance.

    low: the smallest size of the generated n-grams
    high: the largest size of the generated n-grams, or -1 for the full length of the input Seq[String]
    sep: a string separator used to join individual tokens

  5. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  6. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  7. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  8. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  9. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  10. def fromSettings(setting: Settings): Transformer[Seq[String], Set[String], SortedMap[String, Int]]

    Create a new NGrams from a settings object

    setting: Settings object

    Definition Classes
    NGrams → SettingsBuilder
  11. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  12. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  13. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  14. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  15. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  16. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  17. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  18. def toString(): String
    Definition Classes
    AnyRef → Any
  19. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  20. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  21. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
