2020-05-19

b04-Bigtable

Cloud Bigtable

High performance, massively scalable NoSQL
Ideal for large analytical workloads

Bigtable infrastructure

A front-end server pool routes client requests to the nodes.
Compute and storage are separate: no data is stored on a node except the metadata needed to direct requests to the correct tablet.
Tables are sharded into tablets, which are stored on Colossus, Google's file system.
Because storage is separate from the compute nodes, replication and recovery of node data are very fast: only metadata/pointers need to be updated.
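
To make the "only pointers move" idea concrete, here is a minimal sketch in plain shell. It is not how Bigtable is implemented; the tablet names, node names and key boundaries are made up for illustration. Each metadata line maps the first row key of a tablet to the node currently serving it, and moving a tablet to another node rewrites that one line without touching any row data.

```shell
# Hypothetical metadata: "<first row key in tablet> <tablet id> <serving node>".
# The tablet data itself would live in shared storage; nodes hold only these pointers.
metadata='a tablet-1 node-1
h tablet-2 node-2
p tablet-3 node-3'

# Find the node serving a row key: the last tablet whose start key is <= the row key.
route() {
  printf '%s\n' "$metadata" | awk -v key="$1" '$1 <= key { node = $3 } END { print node }'
}

# Move a tablet to another node by rewriting only its pointer; no row data is copied.
reassign() {
  metadata=$(printf '%s\n' "$metadata" | awk -v t="$1" -v n="$2" '$2 == t { $3 = n } { print }')
}

route mango                 # prints node-2
reassign tablet-2 node-3    # e.g. node-2 went away
route mango                 # prints node-3
```

The same lookup works before and after the reassignment because the rows never moved, only the entry that says who serves them.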

instances

The entire Bigtable deployment is called an 'instance'.

cluster

Nodes are grouped into clusters.

One or more clusters per instance.

instance type

Development - low cost, single node, no replication
Production - 3+ nodes per cluster, replication available

Schema Design

Per table, the row key is the only indexed item.
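
Because the row key is the only index, reads are easiest when the question can be phrased as "all rows whose key starts with X". The sketch below emulates that with ordinary shell tools. It assumes row keys of the form `<device>#<timestamp>`, which is a made-up example layout rather than anything prescribed by Bigtable, and `prefix_scan` is a hypothetical helper, not a cbt command.

```shell
# Hypothetical row keys of the form <device>#<timestamp>.
rows='sensor-a#20200519T1000
sensor-b#20200519T1000
sensor-a#20200519T1100
sensor-c#20200519T0900'

# Emulate a prefix scan: sort keys bytewise, then keep those starting with the prefix.
prefix_scan() {
  printf '%s\n' "$rows" | LC_ALL=C sort | awk -v p="$1" 'index($0, p) == 1'
}

prefix_scan 'sensor-a#'
# prints:
# sensor-a#20200519T1000
# sensor-a#20200519T1100
```

Putting the device first keeps all readings of one device next to each other in key order. A query on any other attribute would have no index to use and would need a full scan.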

Hands on

install cbt in Google Cloud SDK

gcloud components update
gcloud components install cbt

set default project and instance in ~/.cbtrc

echo -e "project=[PROJECT_ID]\ninstance=[INSTANCE_ID]" > ~/.cbtrc
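
`echo -e` is not portable across shells (some print the `-e` literally), so a `printf` variant can be safer. The sketch below wraps it in a small helper. The function name `write_cbtrc`, its optional file argument and the values `my-project` / `my-instance` are all illustrative assumptions, not part of cbt.

```shell
# write_cbtrc PROJECT INSTANCE [FILE]
# Writes the two settings from the step above; FILE defaults to ~/.cbtrc.
write_cbtrc() {
  printf 'project=%s\ninstance=%s\n' "$1" "$2" > "${3:-$HOME/.cbtrc}"
}

# Try it against a scratch file first; the IDs here are placeholders.
scratch=$(mktemp)
write_cbtrc my-project my-instance "$scratch"
cat "$scratch"
rm -f "$scratch"
```

Once the output looks right, calling `write_cbtrc` with your real project and instance IDs and no third argument writes `~/.cbtrc` itself.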

create table

cbt createtable my-table

list tables

cbt ls

add column family

cbt createfamily my-table cf1

list column families

cbt ls my-table

add value to row r1, column family cf1, column qualifier c1

cbt set my-table r1 cf1:c1=testvalue

read table

cbt read my-table

delete table

cbt deletetable my-table
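
The steps above can be strung together into one script. This is a sketch only: the `run` and `walkthrough` helpers and the `DRY_RUN` variable are made up for this example. By default it only prints the cbt commands so the sequence can be checked without touching a real instance; setting `DRY_RUN=0` would execute them, which assumes cbt is installed and `~/.cbtrc` is configured.

```shell
# Print each command by default; set DRY_RUN=0 to really run it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Same order as the hands-on steps: create, list, add family, write, read, delete.
walkthrough() {
  table="${1:-my-table}"
  run cbt createtable "$table"
  run cbt ls
  run cbt createfamily "$table" cf1
  run cbt ls "$table"
  run cbt set "$table" r1 cf1:c1=testvalue
  run cbt read "$table"
  run cbt deletetable "$table"
}

walkthrough my-table
```

Note that the last step deletes the table, so point the script at a throwaway table name when running it for real.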

MA Jian's Blog

Enthusiasm in developing
