object HeavyHitters extends SettingsBuilder with Serializable
Transform a collection of categorical features to 2 columns, one for rank and one for count. Only the top heavyHittersCount items are tracked: the most frequent item gets rank 1.0, the second most frequent gets rank 2.0, and so on. All other items are transformed to [0.0, 0.0].
Ranks and frequencies are estimated with Algebird's SketchMap data structure. With probability at least 1 - delta, this estimate is within eps * N of the true frequency (i.e., true frequency <= estimate <= true frequency + eps * N), where N is the total size of the input collection.
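As a worked illustration of the bound above, the inequality can be rearranged to estimate - eps * N <= true frequency <= estimate. The sketch below only does that arithmetic; the object and method names are hypothetical and are not part of HeavyHitters or Algebird.

```scala
object FrequencyBound {
  // Returns (lower, upper) bounds on the true frequency implied by
  //   true <= estimate <= true + eps * n,
  // which the documentation says holds with probability at least 1 - delta.
  def trueFrequencyRange(estimate: Long, eps: Double, n: Long): (Double, Double) = {
    val slack = eps * n // largest overestimate the bound allows
    (math.max(0.0, estimate - slack), estimate.toDouble)
  }
}
```

For example, with the default eps of 0.001 and an input of 1,000,000 items, an estimate of 5,000 means the true frequency lies between roughly 4,000 and 5,000, subject to the 1 - delta probability.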
Missing values are transformed to [0.0, 0.0].
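The output encoding described above can be sketched with plain Scala collections. This is a minimal illustration that swaps in exact counting for the SketchMap estimates, so it shows only the shape of the result, not the approximate algorithm. The object, the method names, and the alphabetical tie-breaking are all assumptions made for the example.

```scala
object HeavyHittersSketch {
  // Hypothetical helper: exact counts stand in for SketchMap estimates.
  // Returns item -> (rank, count) for the top heavyHittersCount items.
  def rankAndCount(items: Seq[String], heavyHittersCount: Int): Map[String, (Int, Long)] =
    items
      .groupBy(identity)
      .map { case (item, occurrences) => (item, occurrences.size.toLong) }
      .toSeq
      .sortBy { case (item, count) => (-count, item) } // ties broken alphabetically (an assumption)
      .take(heavyHittersCount)
      .zipWithIndex
      .map { case ((item, count), index) => (item, (index + 1, count)) }
      .toMap

  // Encode one optional value as the two output columns [rank, count].
  // Untracked items and missing values both become [0.0, 0.0].
  def encode(value: Option[String], table: Map[String, (Int, Long)]): Seq[Double] =
    value.flatMap(table.get) match {
      case Some((rank, count)) => Seq(rank.toDouble, count.toDouble)
      case None                => Seq(0.0, 0.0)
    }
}
```

With the input Seq("a", "b", "a", "c", "a", "b") and heavyHittersCount = 2, "a" is encoded as [1.0, 3.0] and "b" as [2.0, 2.0], while "c" and a missing value are both encoded as [0.0, 0.0].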
- Source
- HeavyHitters.scala
Value Members
- final def !=(arg0: Any): Boolean
  Definition Classes: AnyRef → Any
- final def ##(): Int
  Definition Classes: AnyRef → Any
- final def ==(arg0: Any): Boolean
  Definition Classes: AnyRef → Any
- def apply(name: String, heavyHittersCount: Int, eps: Double = 0.001, delta: Double = 0.001, seed: Int = Random.nextInt): Transformer[String, SketchMap[String, Long], Map[String, (Int, Long)]]
  Create a new HeavyHitters instance.
  - heavyHittersCount: number of heavy hitters to keep track of
  - eps: one-sided bound on the error of each point query, i.e. each frequency estimate
  - delta: a bound on the probability that a query estimate does not lie within some small interval (an interval that depends on eps) around the truth
  - seed: a seed to initialize the random number generator used to create the pairwise independent hash functions
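A call to apply might look like the sketch below. It assumes the featran core library is on the classpath and that the transformer is plugged into a FeatureSpec in the usual way; the Visit record type, the object name, and the fixed seed are hypothetical choices for illustration.

```scala
import com.spotify.featran._
import com.spotify.featran.transformers._

object HeavyHittersUsage {
  // Hypothetical record type for illustration.
  case class Visit(country: String)

  // Track the 2 most frequent countries. eps and delta keep their defaults;
  // the seed is fixed only so the hash functions are reproducible.
  val spec: FeatureSpec[Visit] = FeatureSpec
    .of[Visit]
    .required(_.country)(HeavyHitters("country", heavyHittersCount = 2, seed = 42))

  // Each input record yields two columns: [rank, count].
  def features(countries: Seq[String]): Seq[Seq[Double]] =
    spec.extract(countries.map(Visit(_))).featureValues[Seq[Double]]
}
```

Because seed defaults to Random.nextInt, passing an explicit seed is one way to make repeated runs use the same hash functions.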
- final def asInstanceOf[T0]: T0
  Definition Classes: Any
- def clone(): AnyRef
  Attributes: protected[lang]
  Definition Classes: AnyRef
  Annotations: @throws( ... ) @native()
- final def eq(arg0: AnyRef): Boolean
  Definition Classes: AnyRef
- def equals(arg0: Any): Boolean
  Definition Classes: AnyRef → Any
- def finalize(): Unit
  Attributes: protected[lang]
  Definition Classes: AnyRef
  Annotations: @throws( classOf[java.lang.Throwable] )
- def fromSettings(setting: Settings): Transformer[String, SketchMap[String, Long], Map[String, (Int, Long)]]
  Create a new HeavyHitters from a settings object.
  - setting: Settings object
  Definition Classes: HeavyHitters → SettingsBuilder
- final def getClass(): Class[_]
  Definition Classes: AnyRef → Any
  Annotations: @native()
- def hashCode(): Int
  Definition Classes: AnyRef → Any
  Annotations: @native()
- final def isInstanceOf[T0]: Boolean
  Definition Classes: Any
- final def ne(arg0: AnyRef): Boolean
  Definition Classes: AnyRef
- final def notify(): Unit
  Definition Classes: AnyRef
  Annotations: @native()
- final def notifyAll(): Unit
  Definition Classes: AnyRef
  Annotations: @native()
- final def synchronized[T0](arg0: ⇒ T0): T0
  Definition Classes: AnyRef
- def toString(): String
  Definition Classes: AnyRef → Any
- final def wait(): Unit
  Definition Classes: AnyRef
  Annotations: @throws( ... )
- final def wait(arg0: Long, arg1: Int): Unit
  Definition Classes: AnyRef
  Annotations: @throws( ... )
- final def wait(arg0: Long): Unit
  Definition Classes: AnyRef
  Annotations: @throws( ... ) @native()