Bigtable Input and Output

This example depends on APIs from the scio-bigtable module and on imports from com.spotify.scio.bigtable._.
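If you are adding this to your own build, a minimal sketch of the dependency might look like the following; the artifact names are the real Scio modules, but the version is a placeholder you should pin to your own Scio release:

// build.sbt (sketch): scio-bigtable is a separate module from scio-core.
val scioVersion = "0.14.0" // placeholder version, use your project's Scio version

libraryDependencies ++= Seq(
  "com.spotify" %% "scio-core" % scioVersion,
  "com.spotify" %% "scio-bigtable" % scioVersion
)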
Convert a key-value pair to a Bigtable Mutation for writing
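A minimal sketch of such a helper, building the protobuf Mutation directly; the "count" column family and "long" qualifier are illustrative assumptions, not names mandated by the example:

import com.google.bigtable.v2.Mutation
import com.google.protobuf.ByteString

// Sketch: turn a (word, count) pair into the (row key, mutations) shape
// used for Bigtable writes. Family and qualifier names are assumptions.
def toMutation(key: String, value: Long): (ByteString, Iterable[Mutation]) = {
  val setCell = Mutation.SetCell
    .newBuilder()
    .setFamilyName("count")
    .setColumnQualifier(ByteString.copyFromUtf8("long"))
    .setValue(ByteString.copyFromUtf8(value.toString))
    .setTimestampMicros(0L)
  val mutation = Mutation.newBuilder().setSetCell(setCell).build()
  (ByteString.copyFromUtf8(key), Iterable(mutation))
}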
Convert a Bigtable Row read back from the table into a formatted key-value string
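A sketch of the reverse direction, walking the Row protobuf and formatting its first cell; it assumes each row holds a single family, column and cell, as written by the helper above:

import com.google.bigtable.v2.Row

// Sketch: format the first cell of a read Row as "key: value".
def fromRow(r: Row): String = {
  val value = r.getFamilies(0).getColumns(0).getCells(0).getValue
  s"${r.getKey.toStringUtf8}: ${value.toStringUtf8}"
}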
Count words and save the result to Bigtable
Usage:
sbt "runMain com.spotify.scio.examples.extra.BigtableWriteExample
--project=[PROJECT] --runner=DataflowRunner --region=[REGION NAME]
--input=gs://apache-beam-samples/shakespeare/kinglear.txt
--bigtableProjectId=[BIG_TABLE_PROJECT_ID]
--bigtableInstanceId=[BIG_TABLE_INSTANCE_ID]
--bigtableTableId=[BIG_TABLE_TABLE_ID]"
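The core of the write pipeline might look roughly like the sketch below; it assumes the toMutation helper sketched earlier is in scope, and it omits the cluster scaling and table setup described next:

import com.spotify.scio._
import com.spotify.scio.bigtable._

object BigtableWriteExample {
  def main(cmdlineArgs: Array[String]): Unit = {
    val (sc, args) = ContextAndArgs(cmdlineArgs)
    val btProjectId = args("bigtableProjectId")
    val btInstanceId = args("bigtableInstanceId")
    val btTableId = args("bigtableTableId")

    sc.textFile(args("input"))
      .flatMap(_.split("[^a-zA-Z']+").filter(_.nonEmpty))     // split lines into words
      .countByValue                                           // (word, count) pairs
      .map { case (word, count) => toMutation(word, count) }  // helper sketched above
      .saveAsBigtable(btProjectId, btInstanceId, btTableId)

    sc.run().waitUntilFinish()
  }
}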
Bump up the number of Bigtable nodes before writing so that the extra traffic does not affect the production service. A sleep period is inserted to ensure all new nodes are online before the ingestion starts.
Ensure that destination tables and column families exist
Bring down the number of nodes after the job ends to save cost. There is no need to wait after bumping the nodes down.
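Within the same main method as the pipeline sketch above (with sc, btProjectId, btInstanceId and btTableId in scope), these three steps might translate to something like the fragment below; the node counts, the "count" family name and the exact helper signatures are assumptions for illustration:

import com.spotify.scio.bigtable._
import org.joda.time.Duration

// Scale up before the heavy write; this call sleeps so that the new
// nodes are online before ingestion starts.
sc.updateNumberOfBigtableNodes(btProjectId, btInstanceId, 15)

// Make sure the destination table and its column family exist.
sc.ensureTables(btProjectId, btInstanceId, Map(btTableId -> List("count")))

// ... word-count pipeline from the sketch above ...
sc.run().waitUntilFinish()

// Scale back down once the job is done; Duration.ZERO skips the wait.
sc.updateNumberOfBigtableNodes(btProjectId, btInstanceId, 3, Duration.ZERO)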
Read word count result back from Bigtable
Usage:
sbt "runMain com.spotify.scio.examples.extra.BigtableReadExample
--project=[PROJECT] --runner=DataflowRunner --region=[REGION NAME]
--bigtableProjectId=[BIG_TABLE_PROJECT_ID]
--bigtableInstanceId=[BIG_TABLE_INSTANCE_ID]
--bigtableTableId=[BIG_TABLE_TABLE_ID]
--output=gs://[BUCKET]/[PATH]/wordcount"
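The read side might look roughly like the sketch below; it assumes the fromRow helper sketched earlier is in scope, and that sc.bigtable from scio-bigtable reads the table as an SCollection of Row:

import com.spotify.scio._
import com.spotify.scio.bigtable._

object BigtableReadExample {
  def main(cmdlineArgs: Array[String]): Unit = {
    val (sc, args) = ContextAndArgs(cmdlineArgs)

    // Read rows back from Bigtable, format each one with the fromRow helper
    // sketched above, and write the results as text files.
    sc.bigtable(args("bigtableProjectId"), args("bigtableInstanceId"), args("bigtableTableId"))
      .map(fromRow)
      .saveAsTextFile(args("output"))

    sc.run().waitUntilFinish()
  }
}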