Replication & Materialization

CData Virtuality Server supports several types of replication that cover a wide range of use cases. This section explains the algorithmic concepts behind each type and shows how each can be used.

Materialization

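A materialization is a stored copy of data that already exists in a data source or the logical layer: for example, a copy of a data source table, a copy of the contents of a view, or a copy of the result of a join or an aggregation. The materialization is stored in a table in the Analytical Storage, but this table is not meant to be used directly. Instead, it is used when a special rule, called a recommended optimization, instructs the CData Virtuality Server query engine to read the materialized data rather than the data from the sources. The idea behind Materialization is to let users concentrate fully on the logic of the data without having to consider where and how the data is stored; CData Virtuality Server handles that.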
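The redirect performed by a recommended optimization can be pictured with a toy model. The sketch below is only an illustration of the idea, not how the query engine is implemented; the object names, the table name `dwh.mat_table_1`, and the `resolve` helper are all hypothetical.

```python
# Toy model of a recommended optimization: a lookup that redirects a query
# from a source object to its materialized table when one is available.
# All names here are hypothetical and only illustrate the idea.

# Maps a logical object to the Analytical Storage table holding its copy.
optimizations = {
    "crm.customers": "dwh.mat_table_1",
}


def resolve(object_name: str, enabled: bool = True) -> str:
    """Return the table a query on `object_name` would actually read."""
    if enabled and object_name in optimizations:
        return optimizations[object_name]  # read the materialized copy
    return object_name                     # fall back to the source


print(resolve("crm.customers"))  # dwh.mat_table_1
print(resolve("crm.orders"))     # crm.orders
```

The caller always names the logical object; whether the data comes from the source or from the stored copy is decided by the lookup, which is why the materialized table never has to be queried directly.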

There are two types of Materialization: Complete Materialization and Incremental Materialization.

Replication

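Replication is typically used when the data to be copied has no 1:1 correspondence to anything in a data source or the logical layer and is created or modified by an automated process or manually. CData Virtuality Server includes several such automated processes: slowly changing dimension type 2 (History Update), Upsert, BatchUpdate, and others. You can also write a custom SQL process to replicate data.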

There are several types of replication; each is described in detail on its own subpage.

Choosing Between Materialization and Replication

In general, the best practice is to use Materialization by default and to use Replication only when you need to control and manipulate the data more directly.

The rule of thumb for choosing between Materialization and Replication is as follows:

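If you are not changing the logic of the data by storing it but are creating a 1:1 copy of existing logic purely for performance reasons, materialization is the best choice. A materialization can usually be created, dropped, or completely reloaded at any time without changing the logic of the data.
If storing the data changes the logic of the data and means the data cannot easily be reloaded at any time, replication is the right tool.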

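An important point to note is that every type other than Complete Materialization can produce duplicate data if the settings are configured incorrectly or if the ID fields are not chosen correctly according to the source schema. With complete replication, duplicates appear only if they already exist in the source.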
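The duplicate-data risk described above can be reproduced with a small simulation. This is a minimal sketch under stated assumptions: `batch_load` and its `identity` parameter are hypothetical stand-ins for a load that appends rows with or without an identity requirement, not the server's actual procedure.

```python
def batch_load(target, batch, identity=None):
    """Append `batch` to `target`; with `identity`, replace matching rows first.

    Hypothetical model of an append-style load, for illustration only.
    """
    if identity is not None:
        # Identity values arriving in this batch.
        incoming = {tuple(row[c] for c in identity) for row in batch}
        # Drop stored rows that are about to be re-delivered.
        target = [r for r in target
                  if tuple(r[c] for c in identity) not in incoming]
    return target + [dict(row) for row in batch]


existing = [{"id": 1, "city": "Berlin"}]
batch = [{"id": 1, "city": "Leipzig"}, {"id": 2, "city": "Munich"}]

without_key = batch_load(existing, batch)
with_key = batch_load(existing, batch, identity=["id"])

print(len(without_key))  # 3: the old and the new version of id 1 both remain
print(len(with_key))     # 2: id 1 was replaced, id 2 was added
```

Choosing an identity column that is not unique in the source would have the same effect as choosing none for the affected rows, which is why the ID fields have to follow the source schema.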

Comparison of Replication Types Based on Operations on Source Data


Type	Process	Description
Materialization
Complete	`INSERT` `UPDATE` `DELETE`	Row added/updated/deleted in materialized table
Incremental	`INSERT`	Row added to a materialized table if its Row check field fulfils the `WHERE` requirement (Subject to Delete old data setting)
Incremental	`UPDATE`	Updated row is inserted into a materialized table if its Row check field fulfils the `WHERE` requirement (Subject to Delete old data setting). If no identity requirement is set, the existing row will remain in the materialized table
Incremental	`DELETE`	Row remains in the materialized table
Replication
Batch	`INSERT`	Row added to a materialized table. If no identity requirement is set, additional duplicates of existing rows may come with the replication
Batch	`UPDATE`	Updated row is inserted into a materialized table. If no identity requirement is set, the existing row will remain in the materialized table and additional duplicates of existing rows may come with the replication
Batch	`DELETE`	Row remains in the materialized table. If no identity requirement is set, additional duplicates of existing rows may come with the replication
History Update	`INSERT`	Row added to a materialized table
History Update	`UPDATE`	Row added to a materialized table and existing row gets an update on totimestamp. Only performed when the update happened on one of the fields selected as Columns to check
History Update	`DELETE`	Existing row gets an update on totimestamp
Copy Over	`INSERT` `UPDATE` `DELETE`	Row added/updated/deleted in materialized table
Upsert Update	`INSERT`	Row added in a materialized table if specified via `keyColumnsArray` and `updateColumns`, otherwise no action
Upsert Update	`UPDATE`	Row updated in a materialized table if specified via `keyColumnsArray` and `updateColumns`, otherwise no action
Upsert Update	`DELETE`	No action
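The History Update rows of the table can be traced with a short simulation. This is a sketch under assumptions, not the server's implementation: the `history_update` function, the `fromtimestamp` column, the use of `None` for a still-valid version, and integer timestamps are all hypothetical; only `totimestamp` and the Columns to check setting come from the table.

```python
def history_update(history, source, key, check_columns, now):
    """Apply one History Update run and return the new history rows."""
    source_by_key = {row[key]: row for row in source}
    result = []
    open_keys = set()  # keys whose current version stays valid
    for h in history:
        h = dict(h)
        if h["totimestamp"] is None:  # currently valid version
            src = source_by_key.get(h[key])
            changed = src is not None and any(
                src[c] != h[c] for c in check_columns)
            if src is None or changed:
                h["totimestamp"] = now   # close deleted or changed version
            else:
                open_keys.add(h[key])    # no change in a checked column
        result.append(h)
    for k, src in source_by_key.items():
        if k not in open_keys:           # inserted row or new version
            result.append({**src, "fromtimestamp": now, "totimestamp": None})
    return result


history = [
    {"id": 1, "city": "Berlin", "note": "a", "fromtimestamp": 1, "totimestamp": None},
    {"id": 2, "city": "Munich", "note": "b", "fromtimestamp": 1, "totimestamp": None},
    {"id": 3, "city": "Hamburg", "note": "c", "fromtimestamp": 1, "totimestamp": None},
]
source = [
    {"id": 1, "city": "Leipzig", "note": "a"},        # UPDATE on a checked column
    {"id": 3, "city": "Hamburg", "note": "changed"},  # UPDATE on an unchecked column
    {"id": 4, "city": "Bonn", "note": "d"},           # INSERT
]                                                     # id 2 was deleted

result = history_update(history, source, key="id", check_columns=["city"], now=2)
for row in result:
    print(row)
```

In this run, id 1 ends up with a closed old version and a new open one, id 2 is only closed, id 3 is left untouched because `note` is not a checked column, and id 4 is added as a new open row.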

Comparison of Replication Types Based on Transparency and Flexibility