Developing a High-Performance Data Model for Cassandra

DataStax is working on building a high-performance data model for Apache Cassandra. Artyom Chebotko, a solutions architect at DataStax, explained what this work involves and how to do it properly at the Cassandra Day Russia 2021 conference.







Apache Cassandra. DataStax. use cases, . .

. , Cassandra , , . . 3 , . , , .







Cassandra



Cassandra , , KEYSPACE — . . , replication strategy, - replication factors .







DC-WEST — - replication factor 3. DC-EAST replication factor 5. KEYSPACE. , KEYSPACE, replication strategy.







KEYSPACE . Create Table — .







. SQL: 4 , 4 . primary key — — , , 2 . — year. , partition key, . — name. clustering key, , .







Partition key YEAR , . . YEAR partition key. partition. , 2015 partition, 2015 partition. - .
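
As a rough illustration of the idea above, the following Python sketch models how rows that share a partition key value end up in the same partition, and how rows inside a partition are ordered by the clustering key. Only the roles of `year` (partition key) and `name` (clustering key) come from the text; the sample rows and the `build_partitions` helper are hypothetical, and real Cassandra storage is considerably more involved than this.

```python
from collections import defaultdict

def build_partitions(rows, partition_key, clustering_key):
    """Group rows by partition key, then sort each group by clustering key."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_key]].append(row)
    # Inside a partition, rows are kept ordered by the clustering key.
    return {
        key: sorted(group, key=lambda r: r[clustering_key])
        for key, group in partitions.items()
    }

# Hypothetical sample rows, not taken from the talk.
rows = [
    {"year": 2015, "name": "Bob"},
    {"year": 2016, "name": "Carol"},
    {"year": 2015, "name": "Alice"},
]

partitions = build_partitions(rows, "year", "name")
print(sorted(partitions))                      # [2015, 2016]
print([r["name"] for r in partitions[2015]])   # ['Alice', 'Bob']
```

Both 2015 rows land in one partition, which is why a query restricted to a single `year` value only needs to read one partition.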







— Cassandra , , , replication factor. , partition — - 3 , - 5 . 1- partition 3 . partition key Cassandra , , , .
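
The placement idea can be sketched in Python under heavy simplification. The data center names and the replication factors of 3 and 5 come from the text; the node names, node counts, the `replicas_for` helper, and the use of MD5 are assumptions made for illustration (Cassandra's default partitioner uses Murmur3 and token ranges rather than this simple modulo ring).

```python
import hashlib

def replicas_for(partition_key, nodes, replication_factor):
    """Pick `replication_factor` consecutive nodes on a ring, starting at
    the position derived from a hash of the partition key."""
    # md5 is a stand-in here; Cassandra's default partitioner uses Murmur3.
    digest = hashlib.md5(str(partition_key).encode("utf-8")).digest()
    start = int.from_bytes(digest, "big") % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replication_factor)]

# Hypothetical cluster layout: node names and node counts are made up.
datacenters = {
    "DC-WEST": {"nodes": [f"west-{i}" for i in range(6)], "rf": 3},
    "DC-EAST": {"nodes": [f"east-{i}" for i in range(8)], "rf": 5},
}

# Where would the partition with key 2015 be stored in each data center?
placement = {
    dc: replicas_for(2015, cfg["nodes"], cfg["rf"])
    for dc, cfg in datacenters.items()
}
for dc, replicas in placement.items():
    print(dc, replicas)
```

The point of the sketch is only that the partition key value alone determines which nodes hold the data, so a request that supplies the partition key can be routed directly to those replicas.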







KEYSPACE, — Cassandra Query Language, Structured Query Language, SQL.







, Create Table, .







partition key, , primary key partition key , , clustering key. , clustering key.







, . , . , , - , . partition, partition.







clustering order by — , partition, . , , clustering key. Cassandra , . , , , .







, partitions. , primary key. primary key ID, partition key. partition . . « , » — Single-Row Partitions. , Cassandra. partitions , 1. Multi-Row Partitions.







, partition key, clustering key, Cassandra, . . . . 10 , . partition partition - .







partition key. Venue year — «» «». DataStax Accelerate. partition key . , — - . title, — . .







Country , partition, . , .







. . ? , 5 , , K — partition key, — clustering key, — ascending descending, , — . S — .







, . , CQL. SQL: select, from, where, group by, order by, limit. allow filtering — .







Select — , from — . Cassandra . . , join — , union — , intersection — . , 2 , . , , , join, , join.







where — , primary key. partition key — . — — clustering key, , /. . use cases, , .







Group by primary key , .

Order by — . Cassandra , . , . , . . .







Limit — .







Allow filtering — , . , . , , , , .







, artefacts_by_venue.







artefacts, venue - , year - , partition key. partition key clustering key — . clustering key. : partition key clustering key.







, .







, venue. partition key, Cassandra , , . partition key, clustering key.







venue, year — partition key, title , primary key, . Country. . , , .







Primary key , . -, , partition key, partition , , partition. .







clustering key ( ). , join, - , , . , , , . .









— . , , . .







— . . — . , — , . , . — , ( ). , .







, , . , — access patterns . . , , , , . . , , .







- — — .







, Cassandra , (consistency) , , . — join . , .







, — , , , . , .







4 :







  1. .
  2. , , .
  3. , .
  4. .


:







  1. Conceptual Data Model.
  2. Application Workflow Model.
  3. Logical Data Model.
  4. Physical Data Model.


- : Entity-Relationship Diagram (-), Application Workflow Diagram ( ), Chebotko Diagram Chebotko Diagram&CQL.







. — .







, : « — Conceptual Data Model Application Workflow Model»? . , , . , . , , .







: ? consistency level , ?



: , . . , . ? partition key, Cassandra- , . 100 , replication factor 3, partition key , 3 — . secondary index partition key, 100 , .



?

  1. partition key
  2. . , OLTP-, , . Cassandra, -. . - Cassandra — Spark, - . - -, , , .




consistency level . , . .



, , .




DataStax Academy , 2. , . , : , .









— Internet of Things . ? - , , . , , , , . - , , , . .







.







, . , ?







, - . - — .







, , . , . , , - .







, . — , . — , . , . ID — . , — . — — , , : , , . , .







, , , — . , . ID timestamp - . — timestamp — .







, Entity-Relationship (-), . , . , .







Application Workflow Model — . : , .







Application Workflow . . . : - — , . , - , . . - data access pattern. , , batch.







4 , 4 4 . , , 1 — . ?







  1. .
  2. . ? . , . . .
  3. : .
  4. : .


. , . ? : . — : /. clustering key, partition key. , . , , ID .







, . , , Application Workflow. — . — , . , , DataStax Academy.







sensors_by_network — . Network — partition key, partition. temperatures_by_sensor — , timestamp. , + . timestamp clustering key, . , . .







, ? , . — . 3 . — . bucket — partition key, name — clustering key. partition . partition. Bucket — , , partition.







: networks — . , partition.







? week — . . partition key. . partition , partition . ? — , . , , . , .
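
To make the role of `week` in the partition key more concrete, here is a minimal Python sketch of time bucketing. It assumes the partition key combines the network name with a week value and that a week is identified by the date of its Monday; both the convention and the helper names `week_bucket` and `partition_key` are illustrative assumptions, not something the text specifies.

```python
from datetime import date, datetime, timedelta

def week_bucket(ts: datetime) -> date:
    """Return the Monday of the week containing `ts` (assumed convention)."""
    day = ts.date()
    return day - timedelta(days=day.weekday())

def partition_key(network: str, ts: datetime) -> tuple:
    """Composite partition key: (network, week)."""
    return (network, week_bucket(ts))

# Two readings from the same week share a partition; the next week starts a new one.
print(partition_key("forest-net", datetime(2021, 3, 3, 14, 30)))
print(partition_key("forest-net", datetime(2021, 3, 7, 23, 59)))
print(partition_key("forest-net", datetime(2021, 3, 8, 0, 0)))
```

With such a bucket in the partition key, a partition stops growing once its week is over, which keeps partition sizes bounded regardless of how long the system runs.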







, , 100 000 100 . . , 5 , - 100 . 100 000 - — 10 . - 100 000 — 1 . .







, ? , , — 24 . , . 1 000 — 24 * 1 000 = 24 000 . , , . , . . .
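
The arithmetic above can be written down as a small Python sketch. The figures 24 and 1,000 and their product come from the text; reading the 24 as hourly measurements over one day is an assumption, and the weekly figure is a hypothetical extension added only to show how the estimate scales with the bucket size.

```python
def rows_per_partition(readings_per_sensor: int, sensors: int) -> int:
    """Estimate the number of rows that land in one partition."""
    return readings_per_sensor * sensors

# 24 readings per sensor (assumed: hourly, for one day) and 1,000 sensors.
daily = rows_per_partition(24, 1_000)
print(daily)  # 24000

# Hypothetical extension: the same partition holding a full week of data.
weekly = 7 * daily
print(weekly)  # 168000
```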







— . — . timestamp — .







: , like - , ?



secondary indexes, , , secondary indexes . , , Cassandra . , , , . , — Solr indexes, Cassandra, .

, — . , CQL. . . , KEYSPACE, . , , , , , partition key, clustering key — . — CQL , , Stargate API — .







2 : , . , , . , partition, .. bucket = all. , , , partition.







. forest-net, , . : network = forest-net, -. - . . .







, , ? ? 2 partition, 2 . , . 2 : , . . , in, . in, , 2 . , .







, , , .









. , . .







, . — . , . , — «» «». - . , mutual funds ( ), ETF (Exchange-traded fund). . , .







. keys, username, , , — . . , . , . -, , : , . , .







Workflow — 3 . . , , . — . , . . 5 . , 5 , . , . — . — : . — + + + . — + + . .







, ?







4 3- . 3.1 3.2. , , , . Trade_id — id . , : . partition — , trade_id.







, . ? . — . — . , .







, trades_by_a_d ? ? , — . , . , , 100 000 — . — — . , , , 100 000 .







, — trade_id . Trade_id — TIMEUUID. UUID — . timestamp, . , .







, - . .







? , TIMEUUID? TIMEUUID timestamp .
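
The relationship between a TIMEUUID and its timestamp can be illustrated with Python's standard `uuid` module, since a CQL TIMEUUID is a version 1 (time-based) UUID. This is a sketch under that assumption: the `timeuuid_to_datetime` helper is hypothetical, and `uuid.uuid1()` merely stands in for however the application generates `trade_id` values.

```python
import uuid
from datetime import datetime, timezone

# Number of 100-ns intervals between 1582-10-15 (UUID epoch) and 1970-01-01.
UUID_EPOCH_OFFSET = 0x01B21DD213814000

def timeuuid_to_datetime(u: uuid.UUID) -> datetime:
    """Extract the embedded timestamp from a version 1 (time-based) UUID."""
    if u.version != 1:
        raise ValueError("not a time-based (version 1) UUID")
    unix_seconds = (u.time - UUID_EPOCH_OFFSET) / 1e7
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)

trade_id = uuid.uuid1()  # stands in for a TIMEUUID generated by the application
created_at = timeuuid_to_datetime(trade_id)
print(trade_id, created_at.isoformat())
```

Because the timestamp is recoverable from the identifier, a single TIMEUUID column can serve both as a unique id and as the time of the event, without storing a separate timestamp column.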







, , , . TIMEUUID — , .







, — TIMEUUID, . trade_id > maxTIMEUUID — , , . , timestamp. timestamp . .







: . ?



: ? — update insert . , . : trades — 4 , , -. -. ? batches, . batches , , batches, partition, . .



partition , . insert application retry, - . - — - , - , . Spark , , . join Spark, .


