The official Citus documentation is comprehensive and includes a tutorial on multi-tenant applications, so it is easy to learn from.
Environment preparation
Run Citus with docker-compose, and add the Hasura GraphQL engine to the same file for convenience.
version: '2.1'
services:
  graphql-engine:
    image: hasura/graphql-engine:v1.0.0-alpha26
    ports:
      - "8080:8080"
    command: >
      /bin/sh -c "
      graphql-engine --database-url postgres://[email protected]/postgres serve --enable-console;
      "
  master:
    container_name: "${COMPOSE_PROJECT_NAME:-citus}_master"
    image: 'citusdata/citus:7.5.1'
    ports: ["${MASTER_EXTERNAL_PORT:-5432}:5432"]
    labels: ['com.citusdata.role=Master']
  worker:
    image: 'citusdata/citus:7.5.1'
    labels: ['com.citusdata.role=Worker']
    depends_on: { manager: { condition: service_healthy } }
  manager:
    container_name: "${COMPOSE_PROJECT_NAME:-citus}_manager"
    image: 'citusdata/membership-manager:0.2.0'
    volumes: ['/var/run/docker.sock:/var/run/docker.sock']
    depends_on: { master: { condition: service_healthy } }
curl https://examples.citusdata.com/tutorial/companies.csv > companies.csv
curl https://examples.citusdata.com/tutorial/campaigns.csv > campaigns.csv
curl https://examples.citusdata.com/tutorial/ads.csv > ads.csv
CREATE TABLE companies (
    id bigint NOT NULL,
    name text NOT NULL,
    image_url text,
    created_at timestamp without time zone NOT NULL,
    updated_at timestamp without time zone NOT NULL
);

CREATE TABLE campaigns (
    id bigint NOT NULL,
    company_id bigint NOT NULL,
    name text NOT NULL,
    cost_model text NOT NULL,
    state text NOT NULL,
    monthly_budget bigint,
    blacklisted_site_urls text[],
    created_at timestamp without time zone NOT NULL,
    updated_at timestamp without time zone NOT NULL
);

CREATE TABLE ads (
    id bigint NOT NULL,
    company_id bigint NOT NULL,
    campaign_id bigint NOT NULL,
    name text NOT NULL,
    image_url text,
    target_url text,
    impressions_count bigint DEFAULT 0,
    clicks_count bigint DEFAULT 0,
    created_at timestamp without time zone NOT NULL,
    updated_at timestamp without time zone NOT NULL
);
ALTER TABLE companies ADD PRIMARY KEY (id);
ALTER TABLE campaigns ADD PRIMARY KEY (id, company_id);
ALTER TABLE ads ADD PRIMARY KEY (id, company_id);
Citus Distributed Processing
- Add a distributed table
This step is convenient: distributing a table only takes a SELECT statement that calls the create_distributed_table function with the table name and the distribution column.
SELECT create_distributed_table('companies', 'id');
SELECT create_distributed_table('campaigns', 'company_id');
SELECT create_distributed_table('ads', 'company_id');
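To check that the three tables were actually distributed, one option is to look at the Citus metadata tables on the master node. The queries below are a minimal sketch and are not part of the original walkthrough. They assume the default co-location behaviour, under which tables whose distribution columns share a type (here bigint) are placed in the same co-location group.

```sql
-- One row per distributed table; partmethod 'h' means hash distribution.
-- Tables that share a colocationid keep matching shards on the same worker.
SELECT logicalrelid AS table_name,
       partmethod,
       colocationid
FROM pg_dist_partition
ORDER BY logicalrelid::text;

-- Number of shards created for each distributed table.
SELECT logicalrelid AS table_name,
       count(*) AS shard_count
FROM pg_dist_shard
GROUP BY logicalrelid
ORDER BY logicalrelid::text;
```

If companies, campaigns and ads report the same colocationid, rows with the same company ID should live on the same worker, which is what lets joins on company_id run locally.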
- Import Data
Once the Citus environment is up, import the downloaded CSV files into the distributed tables.
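A minimal sketch of the import, assuming it is run from a psql session connected to the master node and that the three CSV files sit in the directory psql was started from (when psql runs inside the master container, the files have to be copied into the container first). It follows the approach of the official tutorial, which loads the files with psql's \copy meta-command.

```sql
-- \copy is a psql meta-command: it reads the file on the client side
-- and streams the rows to the server, so the file paths are client paths.
\copy companies from 'companies.csv' with csv
\copy campaigns from 'campaigns.csv' with csv
\copy ads from 'ads.csv' with csv
```

Because the tables are already distributed, the master routes each incoming row to the shard that owns its company ID.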
- Effect
SELECT campaigns.id, campaigns.name, campaigns.monthly_budget,
       sum(impressions_count) as total_impressions,
       sum(clicks_count) as total_clicks
FROM ads, campaigns
WHERE ads.company_id = campaigns.company_id
AND campaigns.company_id = 5
AND campaigns.state = 'running'
GROUP BY campaigns.id, campaigns.name, campaigns.monthly_budget
ORDER BY total_impressions, total_clicks;
Data Model Description
The core of the steps above is creating the distributed tables with create_distributed_table and choosing company_id as the tenant ID that isolates each tenant's data (for the companies table itself, that ID is the id column).
Everything after that is ordinary SQL. Good practices for developing multi-tenant applications on Citus will be introduced later.
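As an illustration of what "ordinary SQL" looks like against this model, here is a short sketch modelled on the official multi-tenant tutorial. The tenant ID 5 and campaign ID 46 are only illustrative values. The point to notice is that every statement filters on company_id, which should let Citus route it to the single worker holding that tenant's data.

```sql
-- Double the monthly budget of every campaign owned by tenant 5.
UPDATE campaigns
SET monthly_budget = monthly_budget * 2
WHERE company_id = 5;

-- Delete one campaign together with its ads in a single transaction.
-- Both statements carry the same company_id filter.
BEGIN;
DELETE FROM campaigns WHERE id = 46 AND company_id = 5;
DELETE FROM ads WHERE campaign_id = 46 AND company_id = 5;
COMMIT;
```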
References
https://docs.citusdata.com/en/v7.5/get_started/tutorial_multi_tenant.html
https://docs.citusdata.com/en/v7.5/sharding/data_modeling.html#distributing-by-tenant-id
https://github.com/hasura/graphql-on-various-pg
https://github.com/rongfengliang/citus-hasuar-graphql
Citus multi-tenant application development (Official Document)