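Introduction
GitHub: https://github.com/cloudberrydb/cloudberrydb
Official website: https://cloudberrydb.org/
Official documentation: https://cloudberrydb.org/zh/docs/
Cloudberry Database (also called "CBDB" or "CloudberryDB") is a next-generation unified open-source database built for analytics and AI workloads. It runs on a PostgreSQL 14.4 kernel, is compatible with the PostgreSQL and Greenplum Database ecosystems, and is released under the Apache License 2.0. It is developed by HashData (北京酷克数据科技有限公司), and the source code is now public.
For more background, see: "Greenplum gone closed-source? Take a look at the home-grown CBDB (Cloudberry Database)"
Quick start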
docker pull registry.cn-hangzhou.aliyuncs.com/lhrbest/cbdb:1.5.4
docker tag registry.cn-hangzhou.aliyuncs.com/lhrbest/cbdb:1.5.4 lhrbest/cbdb:1.5.4

docker rm -f cbdb
docker run -d --name cbdb -h cbdb \
  -p 24432:5432 -p 2422:22 \
  -v /sys/fs/cgroup:/sys/fs/cgroup \
  --privileged=true lhrbest/cbdb:1.5.4 \
  /usr/sbin/init

docker exec -it cbdb bash

su - gpadmin
gpstart -a
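
The `-p 24432:5432` mapping in the `docker run` command publishes the coordinator port on the Docker host. Below is a minimal sketch of connecting from the host once the cluster is started. It assumes `psql` is installed on the host and that the image's `pg_hba.conf` accepts `gpadmin` logins from outside the container; neither is confirmed by this article. The helper names `cbdb_uri` and `cbdb_sql` are made up for illustration.

```shell
#!/bin/sh
# Sketch only: connection settings for reaching the container from the host.
# Assumption: gpadmin may log in remotely; adjust or add a password if not.
CBDB_HOST="${CBDB_HOST:-127.0.0.1}"
CBDB_PORT="${CBDB_PORT:-24432}"   # host side of "-p 24432:5432"
CBDB_USER="${CBDB_USER:-gpadmin}"
CBDB_DB="${CBDB_DB:-postgres}"

# Print a libpq connection URI built from the settings above.
cbdb_uri() {
  printf 'postgresql://%s@%s:%s/%s\n' \
    "$CBDB_USER" "$CBDB_HOST" "$CBDB_PORT" "$CBDB_DB"
}

# Run a single SQL statement through psql (unaligned, tuples only).
cbdb_sql() {
  psql "$(cbdb_uri)" -At -c "$1"
}
```

For example, `cbdb_sql 'select version();'` should print the server version string if the connection is allowed.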
Log output:
[root@alldb ~]# docker rm -f cbdb
Error response from daemon: No such container: cbdb
[root@alldb ~]# docker run -d --name cbdb -h cbdb \
> -p 24432:5432 -p 2422:22 \
> -v /sys/fs/cgroup:/sys/fs/cgroup \
> --privileged=true lhrbest/cbdb:1.5.4 \
> /usr/sbin/init
99b7f7f12d8e547451d27e8e6ed9d96f1888c52963f98ee9374020c825ae2fdf
[root@alldb ~]#
[root@alldb ~]#
[root@alldb ~]# docker exec -it cbdb bash
[root@cbdb /]# su - gpadmin
Last login: Mon Jul 8 13:34:40 CST 2024 on pts/0
[gpadmin@cbdb ~]$ psql
psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No such file or directory
        Is the server running locally and accepting connections on that socket?
[gpadmin@cbdb ~]$ gpstart -a
20240708:13:55:06:000410 gpstart:cbdb:gpadmin-[INFO]:-Starting gpstart with args: -a
20240708:13:55:06:000410 gpstart:cbdb:gpadmin-[INFO]:-Gathering information and validating the environment...
20240708:13:55:06:000410 gpstart:cbdb:gpadmin-[INFO]:-CloudberryDB Binary Version: 'postgres (Cloudberry Database) 1.0.0+ build dev'
20240708:13:55:06:000410 gpstart:cbdb:gpadmin-[INFO]:-CloudberryDB Catalog Version: '302402231'
20240708:13:55:06:000410 gpstart:cbdb:gpadmin-[WARNING]:-postmaster.pid file exists on Coordinator, checking if recovery startup required
20240708:13:55:06:000410 gpstart:cbdb:gpadmin-[INFO]:-Commencing recovery startup checks
20240708:13:55:06:000410 gpstart:cbdb:gpadmin-[INFO]:-No socket connection or lock file in /tmp found for port=5432
20240708:13:55:06:000410 gpstart:cbdb:gpadmin-[INFO]:-No Coordinator instance process, entering recovery startup mode
20240708:13:55:06:000410 gpstart:cbdb:gpadmin-[INFO]:-Clearing Coordinator instance pid file
20240708:13:55:06:000410 gpstart:cbdb:gpadmin-[INFO]:-Starting Coordinator instance in admin mode
20240708:13:55:06:000410 gpstart:cbdb:gpadmin-[INFO]:-CoordinatorStart pg_ctl cmd is env GPSESSID=0000000000 GPERA=None $GPHOME/bin/pg_ctl -D /opt/cloudberrydb/data/coordinator/gpseg-1 -l /opt/cloudberrydb/data/coordinator/gpseg-1/log/startup.log -w -t 600 -o " -p 5432 -c gp_role=utility " start
20240708:13:55:07:000410 gpstart:cbdb:gpadmin-[INFO]:-Obtaining CloudberryDB Coordinator catalog information
20240708:13:55:07:000410 gpstart:cbdb:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240708:13:55:07:000410 gpstart:cbdb:gpadmin-[INFO]:-Setting new coordinator era
20240708:13:55:07:000410 gpstart:cbdb:gpadmin-[INFO]:-Commencing forced instance shutdown
20240708:13:55:08:000410 gpstart:cbdb:gpadmin-[INFO]:-Starting Coordinator instance in admin mode
20240708:13:55:08:000410 gpstart:cbdb:gpadmin-[INFO]:-CoordinatorStart pg_ctl cmd is env GPSESSID=0000000000 GPERA=a4c6f619cebc193a_240708135507 $GPHOME/bin/pg_ctl -D /opt/cloudberrydb/data/coordinator/gpseg-1 -l /opt/cloudberrydb/data/coordinator/gpseg-1/log/startup.log -w -t 600 -o " -p 5432 -c gp_role=utility " start
20240708:13:55:08:000410 gpstart:cbdb:gpadmin-[INFO]:-Obtaining CloudberryDB Coordinator catalog information
20240708:13:55:08:000410 gpstart:cbdb:gpadmin-[INFO]:-Obtaining Segment details from coordinator...
20240708:13:55:08:000410 gpstart:cbdb:gpadmin-[INFO]:-Setting new coordinator era
20240708:13:55:08:000410 gpstart:cbdb:gpadmin-[INFO]:-Coordinator Started...
20240708:13:55:08:000410 gpstart:cbdb:gpadmin-[INFO]:-Shutting down coordinator
20240708:13:55:08:000410 gpstart:cbdb:gpadmin-[INFO]:-Commencing parallel primary and mirror segment instance startup, please wait...
.
20240708:13:55:10:000410 gpstart:cbdb:gpadmin-[INFO]:-Process results...
20240708:13:55:10:000410 gpstart:cbdb:gpadmin-[INFO]:-
20240708:13:55:10:000410 gpstart:cbdb:gpadmin-[INFO]:-----------------------------------------------------
20240708:13:55:10:000410 gpstart:cbdb:gpadmin-[INFO]:- Successful segment starts = 4
20240708:13:55:10:000410 gpstart:cbdb:gpadmin-[INFO]:- Failed segment starts = 0
20240708:13:55:10:000410 gpstart:cbdb:gpadmin-[INFO]:- Skipped segment starts (segments are marked down in configuration) = 0
20240708:13:55:10:000410 gpstart:cbdb:gpadmin-[INFO]:-----------------------------------------------------
20240708:13:55:10:000410 gpstart:cbdb:gpadmin-[INFO]:-Successfully started 4 of 4 segment instances
20240708:13:55:10:000410 gpstart:cbdb:gpadmin-[INFO]:-----------------------------------------------------
20240708:13:55:10:000410 gpstart:cbdb:gpadmin-[INFO]:-Starting Coordinator instance cbdb directory /opt/cloudberrydb/data/coordinator/gpseg-1
20240708:13:55:10:000410 gpstart:cbdb:gpadmin-[INFO]:-CoordinatorStart pg_ctl cmd is env GPSESSID=0000000000 GPERA=a4c6f619cebc193a_240708135508 $GPHOME/bin/pg_ctl -D /opt/cloudberrydb/data/coordinator/gpseg-1 -l /opt/cloudberrydb/data/coordinator/gpseg-1/log/startup.log -w -t 600 -o " -p 5432 -c gp_role=dispatch " start
20240708:13:55:10:000410 gpstart:cbdb:gpadmin-[INFO]:-Command pg_ctl reports Coordinator cbdb instance active
20240708:13:55:10:000410 gpstart:cbdb:gpadmin-[INFO]:-Connecting to db template1 on host localhost
20240708:13:55:10:000410 gpstart:cbdb:gpadmin-[INFO]:-Starting standby coordinator
20240708:13:55:10:000410 gpstart:cbdb:gpadmin-[INFO]:-Checking if standby coordinator is running on host: cbdb in directory: /opt/cloudberrydb/data/coordinator_standby/gpseg-1
20240708:13:55:10:000410 gpstart:cbdb:gpadmin-[INFO]:-CoordinatorStart pg_ctl cmd is env GPSESSID=0000000000 GPERA=a4c6f619cebc193a_240708135508 $GPHOME/bin/pg_ctl -D /opt/cloudberrydb/data/coordinator_standby/gpseg-1 -l /opt/cloudberrydb/data/coordinator_standby/gpseg-1/log/startup.log -t 600 -o " -p 5433 -c gp_role=dispatch " start
20240708:13:55:11:000410 gpstart:cbdb:gpadmin-[INFO]:-Database successfully started
[gpadmin@cbdb ~]$ psql
psql (14.4, server 14.4)
Type "help" for help.

postgres=# select * from gp_segment_configuration ;
 dbid | content | role | preferred_role | mode | status | port | hostname | address |                      datadir                       | warehouseid
------+---------+------+----------------+------+--------+------+----------+---------+----------------------------------------------------+-------------
    1 |      -1 | p    | p              | n    | u      | 5432 | cbdb     | cbdb    | /opt/cloudberrydb/data/coordinator/gpseg-1         |           0
    6 |      -1 | m    | m              | s    | u      | 5433 | cbdb     | cbdb    | /opt/cloudberrydb/data/coordinator_standby/gpseg-1 |           0
    2 |       0 | p    | p              | s    | u      | 6000 | cbdb     | cbdb    | /opt/cloudberrydb/data/primary/gpseg0              |           0
    4 |       0 | m    | m              | s    | u      | 7000 | cbdb     | cbdb    | /opt/cloudberrydb/data/mirror/gpseg0               |           0
    3 |       1 | p    | p              | s    | u      | 6001 | cbdb     | cbdb    | /opt/cloudberrydb/data/primary/gpseg1              |           0
    5 |       1 | m    | m              | s    | u      | 7001 | cbdb     | cbdb    | /opt/cloudberrydb/data/mirror/gpseg1               |           0
(6 rows)

postgres=#
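
In the `gp_segment_configuration` output, every instance has `status` = `u` (up). As a rough sketch, a small filter can flag any instance that is not up. The helper name `down_segments` is hypothetical, and it assumes its input is unaligned `content|role|status` rows as produced by `psql -At`.

```shell
#!/bin/sh
# Sketch only: read "content|role|status" rows on stdin, print the rows whose
# status is not "u" (up), and exit with status 1 if any such row was seen.
down_segments() {
  awk -F'|' 'NF >= 3 && $3 != "u" { print; bad = 1 } END { exit bad + 0 }'
}
```

A possible invocation inside the container, as `gpadmin`: `psql -At -c "select content, role, status from gp_segment_configuration" | down_segments`. An empty result with exit status 0 means all instances are reported up.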
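Feature comparison with Greenplum
For the full comparison between CBDB and Greenplum, see: https://cloudberrydb.org/zh/docs/cbdb-vs-gp-features
According to the official documentation, Cloudberry Database is 100% compatible with Greenplum and provides all the Greenplum features you need.
Beyond that, Cloudberry Database offers some features that Greenplum currently lacks or does not support, as detailed below.
General feature comparison
Note
In the tables below, ✅ means supported and ❌ means not supported.
The comparisons in the tables below are based on Greenplum 7.0 Beta.3.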
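Feature | Cloudberry Database | Greenplum |
---|---|---|
WAL usage information in EXPLAIN output | ✅ | ❌ |
Multirange types | ✅ | ❌ |
B-tree bottom-up index deletion | ✅ | ❌ |
Covering indexes for GiST (INCLUDE) | ✅ | ✅ (upcoming) |
range_agg range type aggregate function | ✅ | ❌ |
CREATE ACCESS METHOD | ✅ | ✅ (upcoming) |
LZ4 compression for TOAST tables | ✅ | ❌ |
JSONB subscripting | ✅ | ❌ |
Configurable maximum WAL retention for replication slots | ✅ | ❌ |
Backup integrity verification (pg_verifybackup) | ✅ | ❌ |
Clients can require SCRAM channel binding | ✅ | ❌ |
Vacuum "emergency mode" | ✅ | ❌ |
Certificate authentication with postgres_fdw | ✅ | ❌ |
UPSERT | ✅ | ✅ (upcoming) |
COPY FROM WHERE | ✅ | ❌ |
VACUUM / ANALYZE skip locked tables | ✅ | ❌ |
HASH partitioned tables | ✅ | ❌ |
CTE (SEARCH and CYCLE) | ✅ | ❌ |
OUT parameters for stored procedures | ✅ | ❌ |
Foreign key constraints on foreign-key tables (外键表的外键约束) | ✅ | ❌ |
Timeout parameter for pg_terminate_backend | ✅ | ❌ |
Automatic coordinator failover | ✅ | ❌ |
Deployment on Kubernetes | ✅ | ❌ |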
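
Several rows in this table correspond to standard PostgreSQL 14 syntax and are easy to try by hand. The sketch below prints a few smoke-test statements that can be piped into `psql`. The function name `feature_smoke_sql` is hypothetical, and whether each statement succeeds on a given CBDB build is something to verify rather than assume.

```shell
#!/bin/sh
# Sketch only: emit a few PostgreSQL 14 statements covering rows of the table.
# Nothing here is CBDB-specific; the statements use stock PostgreSQL syntax.
feature_smoke_sql() {
  cat <<'SQL'
-- Multirange type
SELECT '{[1,3), [5,7)}'::int4multirange;
-- range_agg aggregate over overlapping ranges
SELECT range_agg(r) FROM (VALUES (int4range(1, 5)), (int4range(3, 8))) AS t(r);
-- JSONB subscripting
SELECT ('{"a": {"b": 1}}'::jsonb)['a']['b'];
SQL
}
```

Inside the container, `feature_smoke_sql | psql -d postgres` would run the statements in order; an error on any of them points at a feature worth checking against the official comparison page.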
Performance feature comparison
Feature | Cloudberry Database | Greenplum |
---|---|---|
REINDEX CONCURRENTLY | ✅ | ❌ |
Aggregation pushdown | ✅ | ❌ |
CREATE STATISTICS - OR and IN/ANY statistics | ✅ | ❌ |
Incremental sort | ✅ | ❌ |
Incremental sort for window functions | ✅ | ❌ |
Query pipelining | ✅ | ❌ |
BRIN indexes (multi-minmax, bloom) | ✅ | ❌ |
Query parallelism | ✅ | ❌ |
Sorting with abbreviated keys | ✅ | ❌ |
WAL support for hash indexes | ✅ | ❌ |
postgres_fdw aggregate pushdown | ✅ | ❌ |
Adding a column without rewriting the whole table | ✅ | ❌ |
Runtime filter for table joins | ✅ | ❌ |
Index scan on AppendOnly tables | ✅ | ❌ |
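Security feature comparison
Feature | Cloudberry Database | Greenplum |
---|---|---|
Transparent data encryption (TDE) | ✅ | ❌ |
Trusted extensions | ✅ | ❌ |
SCRAM-SHA-256 | ✅ | ❌ |
Encrypted TCP/IP connections when using GSSAPI | ✅ | ❌ |
Row-level security policies | ✅ | ❌ |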
Index Only Scan test
db1=# create table t66(id int PRIMARY key, c1 text, crt_time timestamp);
CREATE TABLE
Time: 10.453 ms
db1=#
db1=# insert into t66
db1-# SELECT id, md5(id::text),now()
db1-# FROM generate_series(1, 2000000) AS id;
INSERT 0 2000000
Time: 2131.541 ms (00:02.132)
db1=#
db1=# create index idx_t66_1 on t66 (id) include(c1);
Time: 478.527 ms
db1=#
db1=# explain select id,c1 from t66 where id =100;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Gather Motion 1:1  (slice1; segments: 1)  (cost=0.00..6.00 rows=1 width=12)
   ->  Index Scan using t66_pkey on t66  (cost=0.00..6.00 rows=1 width=12)
         Index Cond: (id = 100)
 Optimizer: Pivotal Optimizer (GPORCA)
(4 rows)

Time: 3.523 ms
db1=# set optimizer=0;
SET
Time: 1.048 ms
db1=# explain select id,c1 from t66 where id =100;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 Gather Motion 1:1  (slice1; segments: 1)  (cost=0.18..8.21 rows=1 width=37)
   ->  Index Only Scan using idx_t66_1 on t66  (cost=0.18..8.20 rows=1 width=37)
         Index Cond: (id = 100)
 Optimizer: Postgres query optimizer
(4 rows)

Time: 1.636 ms
db1=#
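
In this session, GPORCA chose an Index Scan on the primary key `t66_pkey`, while the Postgres planner (after `set optimizer=0`) chose an Index Only Scan on the covering index `idx_t66_1`. To repeat that comparison without reading plans by eye, something like the sketch below could be used. The helper names `scan_type` and `explain_with` are hypothetical, and the default database `db1` is taken from the prompt shown in the session.

```shell
#!/bin/sh
# Sketch only: print the first scan node type found in EXPLAIN text on stdin,
# e.g. "Index Scan" or "Index Only Scan". Lines without "->" are ignored.
scan_type() {
  sed -n 's/.*-> *\([A-Za-z ]* Scan\) .*/\1/p' | head -n 1
}

# Run EXPLAIN for a query with the optimizer GUC set to $1 (on/off).
# Both -c commands run in the same psql session, so the SET applies.
explain_with() {
  psql -At -d "${CBDB_DB:-db1}" -c "set optimizer=$1;" -c "explain $2"
}
```

For example, comparing `explain_with on 'select id,c1 from t66 where id = 100' | scan_type` with the same command using `off` shows which scan node each optimizer picks.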
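Other features
That is all I have tested for now. The remaining features are very similar to Greenplum's, though quite a few new ones have been added. A monitoring tool similar to gpcc is also planned; once it is officially released, we will try it out.
References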
https://cloudberrydb.org/zh/docs/cbdb-linux-compile
https://cloudberrydb.org/zh/docs/deploy-cbdb-with-single-node
https://dbaup.com/cloudberry-databasecbdbjieshao.html
https://dbaup.com/cbdbdanjianzhuang.html