case class ZetaSketchHllPlusPlus[T](p: Int = HllCount.DEFAULT_PRECISION)(implicit zs: ZetaSketchable[T]) extends ApproxDistinctCounter[T] with Product with Serializable
com.spotify.scio.estimators.ApproxDistinctCounter implementation for org.apache.beam.sdk.extensions.zetasketch.HllCount. HllCount estimates the distinct count of a data stream using HyperLogLog++ (HLL++) sketches, based on the ZetaSketch implementation.
The HyperLogLog++ (HLL++) algorithm estimates the number of distinct values in a data stream. HLL++ is based on HyperLogLog; HLL++ more accurately estimates the number of distinct values in very large and small data streams.
- p
Precision, which controls the accuracy of the estimate. It determines the number of buckets used to store information about the distinct elements. Should be in the range [10, 24]; the default precision value is 15.
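To make the effect of `p` concrete, here is a minimal sketch in plain Scala. It assumes the usual HyperLogLog relationship of 2^p buckets for precision `p`, and uses the textbook HyperLogLog standard-error approximation of 1.04 / sqrt(2^p); neither figure is stated on this page, so treat the numbers as rough guidance rather than guarantees of this class. The object and method names are hypothetical.

```scala
object HllPrecisionSketch {
  // Range and default value as quoted in the parameter description.
  val MinPrecision = 10
  val MaxPrecision = 24
  val DefaultPrecision = 15

  /** Assumed bucket count for a given precision: 2^p. */
  def buckets(p: Int): Long = {
    require(
      p >= MinPrecision && p <= MaxPrecision,
      s"precision $p is outside [$MinPrecision, $MaxPrecision]"
    )
    1L << p
  }

  /** Textbook HyperLogLog relative standard error, roughly 1.04 / sqrt(2^p). */
  def approxStdError(p: Int): Double =
    1.04 / math.sqrt(buckets(p).toDouble)

  def main(args: Array[String]): Unit =
    Seq(MinPrecision, DefaultPrecision, MaxPrecision).foreach { p =>
      println(f"p=$p%2d buckets=${buckets(p)}%d stdError=${approxStdError(p) * 100}%.3f%%")
    }
}
```

Under these assumptions, each extra unit of precision doubles the number of buckets, while the expected error shrinks only by a factor of about sqrt(2), so higher precision buys accuracy at a steep memory cost.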
Instance Constructors
- new ZetaSketchHllPlusPlus(p: Int = HllCount.DEFAULT_PRECISION)(implicit zs: ZetaSketchable[T])
- p
Precision, which controls the accuracy of the estimate. It determines the number of buckets used to store information about the distinct elements. Should be in the range [10, 24]; the default precision value is 15.
Value Members
- final def !=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def ##: Int
- Definition Classes
- AnyRef → Any
- final def ==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
- final def asInstanceOf[T0]: T0
- Definition Classes
- Any
- def clone(): AnyRef
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.CloneNotSupportedException]) @native()
- final def eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- def estimateDistinctCount(in: SCollection[T]): SCollection[Long]
Returns an SCollection containing a single Long value: the estimated distinct count of the elements in the given SCollection of type T.
- Definition Classes
- ZetaSketchHllPlusPlus → ApproxDistinctCounter
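A minimal usage sketch for `estimateDistinctCount`, under a few assumptions not stated on this page: that the class is imported from `com.spotify.scio.extra.hll.zetasketch` (the scio-extra module), that an implicit `ZetaSketchable[String]` instance is available without extra imports, and that the pipeline is built with `ContextAndArgs`. The object name and sample data are hypothetical.

```scala
import com.spotify.scio.ContextAndArgs
import com.spotify.scio.values.SCollection
// Assumed package; adjust to wherever the class lives in your Scio version.
import com.spotify.scio.extra.hll.zetasketch.ZetaSketchHllPlusPlus

object DistinctUsersExample {
  def main(cmdlineArgs: Array[String]): Unit = {
    val (sc, _) = ContextAndArgs(cmdlineArgs)

    // Hypothetical input with repeated values.
    val userIds: SCollection[String] =
      sc.parallelize(Seq("alice", "bob", "alice", "carol", "bob"))

    // Uses the default precision (HllCount.DEFAULT_PRECISION).
    val counter = ZetaSketchHllPlusPlus[String]()

    // The result holds a single Long: the estimated distinct count.
    val estimate: SCollection[Long] = counter.estimateDistinctCount(userIds)

    estimate.debug() // print the estimate when the pipeline runs
    sc.run().waitUntilFinish()
  }
}
```

Because the result is itself an SCollection, it can be written to a sink or combined with other collections like any other pipeline output.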
- def estimateDistinctCountPerKey[K](in: SCollection[(K, T)]): SCollection[(K, Long)]
Approximates the number of distinct elements for each key in the given key-value SCollection. The output contains one estimated distinct count per unique key.
- Definition Classes
- ZetaSketchHllPlusPlus → ApproxDistinctCounter
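The per-key variant can be sketched in the same way. The assumptions are the same as for any Scio usage of this class: the import path `com.spotify.scio.extra.hll.zetasketch` and an implicit `ZetaSketchable[Long]` being in scope are not confirmed by this page, and the object name, page names, and visitor IDs are hypothetical.

```scala
import com.spotify.scio.ContextAndArgs
import com.spotify.scio.values.SCollection
// Assumed package; adjust to wherever the class lives in your Scio version.
import com.spotify.scio.extra.hll.zetasketch.ZetaSketchHllPlusPlus

object DistinctVisitorsPerPageExample {
  def main(cmdlineArgs: Array[String]): Unit = {
    val (sc, _) = ContextAndArgs(cmdlineArgs)

    // Hypothetical (page, visitorId) pairs with repeated visitors.
    val visits: SCollection[(String, Long)] = sc.parallelize(
      Seq(("home", 1L), ("home", 2L), ("home", 1L), ("about", 3L), ("about", 3L))
    )

    // Explicit precision, chosen from the documented range [10, 24].
    val counter = ZetaSketchHllPlusPlus[Long](p = 12)

    // One (key, estimated distinct count) pair per unique key.
    val visitorsPerPage: SCollection[(String, Long)] =
      counter.estimateDistinctCountPerKey(visits)

    visitorsPerPage.debug() // print the per-key estimates when the pipeline runs
    sc.run().waitUntilFinish()
  }
}
```

A lower precision such as 12 keeps each per-key sketch smaller, which may matter when the number of keys is large; the trade-off is a less accurate estimate per key.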
- def finalize(): Unit
- Attributes
- protected[lang]
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.Throwable])
- final def getClass(): Class[_ <: AnyRef]
- Definition Classes
- AnyRef → Any
- Annotations
- @native()
- final def isInstanceOf[T0]: Boolean
- Definition Classes
- Any
- final def ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
- final def notify(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- final def notifyAll(): Unit
- Definition Classes
- AnyRef
- Annotations
- @native()
- val p: Int
- def productElementNames: Iterator[String]
- Definition Classes
- Product
- final def synchronized[T0](arg0: => T0): T0
- Definition Classes
- AnyRef
- final def wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException])
- final def wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws(classOf[java.lang.InterruptedException]) @native()