我是【Sean】, 微信: _iihero, 欢迎大家长按关注并加星公众号:数据库杂记。有好资源相送,同时为你提供及时更新。已关注的朋友,发送0、1到7,都有好资源相送。
为方便交流,建了一个数据库相关的群: 数据库Hacker,
现在还可以扫码入群。欢迎对这个领域有兴趣的朋友扫码入群。
前言
首先,咱们给HyBench做一个简单的介绍。内容来自于HyBench仓库本身。
Hybench是一款由中国软件评测中心、清华大学联合牵头,北京奥星贝斯科技有限公司、武汉达梦数据库股份有限公司、华为技术有限公司、腾讯云计算有限公司、阿里云计算有限公司共同研发的HTAP数据库基准测试工具。
Hybench针对HTAP数据库技术特点,参考实际典型应用场景进行设计,数据模型采用在线金融交易分析场景,提供OLTP、OLAP、OLXP三类典型HTAP负载,支持不同规模的数据集,可以计算出TPS、QPS、XPS、新鲜度等不同维度的评价指标,最终给出统一评价指标H-Score。为数据库厂商和第三方评测机构提供HTAP数据库基准性能的评价方法及工具,引导HTAP数据库的技术研究方向,帮助用户进行HTAP数据库选型。
其架构看起来是这样的:
因为它是集三种负载类型,所以,看起来似乎比BenchmarkSQL这种纯OLTP的要牛气!
这里就拿它默认支持的MySQL/PostgreSQL,我们就挑一个,拿PostgreSQL做一个实操实验吧。
实验验证
环境准备
1、clone HyBench仓库:
cd /iihero/source
git clone https://gitee.com/cstc2023/hybench
2、准备jdk17
https://jdk.java.net/java-se-ri/17-MR1 下载:jdk17
wget https://download.java.net/openjdk/jdk17.0.0.1/ri/openjdk-17.0.0.1+2_linux-x64_bin.tar.gz
注意,装的是JDK17不是JRE17,因为mvn要调用它来进行源码编译。
3、安装mvn
wget https://dlcdn.apache.org/maven/maven-3/3.9.9/binaries/apache-maven-3.9.9-bin.zip
一句题外话,对于HyBench这个工程的文档,实在有些不敢恭维。源码如何进行编译,通篇都不提。这还是在笔者反复试了多次以后,才知道必须要用jdk17(也许稍低版本的jdk也可以,但是运行起来不行)。
准备了上述内容之后,配置合适的环境变量,如下:
export JAVA_HOME=/iihero/tools/jdk-17.0.0.1
export MAVEN_HOME=/iihero/tools/apache-maven-3.9.9
export PATH=$JAVA_HOME/bin:$MAVEN_HOME/bin:$PATH
这里,假定mvn和jdk17都解压到目录:/iihero/tools下边。
[20:22:27-postgres@centos1:/var/lib/pgsql]$ java -version
openjdk version "17.0.0.1" 2024-07-02
OpenJDK Runtime Environment (build 17.0.0.1+2-3)
OpenJDK 64-Bit Server VM (build 17.0.0.1+2-3, mixed mode, sharing)
[20:22:31-postgres@centos1:/var/lib/pgsql]$ mvn -version
Apache Maven 3.9.9 (8e8579a9e76f7d015ee5ec7bfcdc97d260186937)
Maven home: /iihero/tools/apache-maven-3.9.9
Java version: 17.0.0.1, vendor: Oracle Corporation, runtime: /iihero/tools/jdk-17.0.0.1
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "3.10.0-1160.el7.x86_64", arch: "amd64", family: "unix"
4、编译HyBench
cd /iihero/source/hybench
mvn clean package
经过几分钟的资源下载及编译,它会为我们生成如下结构:
/iihero/source/hybench/target/HyBench-1.0-SNAPSHOT-jar-with-dependencies.jar
[20:25:13-postgres@centos1:/iihero/source/hybench]$ ls -la lib
total 13476
drwxr-xr-x 2 postgres postgres 4096 Oct 7 09:41 .
drwxr-xr-x 8 postgres postgres 210 Oct 8 20:24 ..
-rw-r--r-- 1 postgres postgres 214381 Oct 7 09:41 checker-qual-3.5.0.jar
-rw-r--r-- 1 postgres postgres 53820 Oct 7 09:41 commons-cli-1.4.jar
-rw-r--r-- 1 postgres postgres 2213560 Oct 7 09:41 commons-math3-3.6.1.jar
-rw-r--r-- 1 postgres postgres 232620 Oct 7 09:41 gson-2.8.1.jar
-rw-r--r-- 1 postgres postgres 2256213 Oct 7 09:41 guava-18.0.jar
-rw-r--r-- 1 postgres postgres 120028 Oct 7 09:41 HyBench-1.0-SNAPSHOT.jar
-rw-r--r-- 1 postgres postgres 317566 Oct 7 09:41 log4j-api-2.19.0.jar
-rw-r--r-- 1 postgres postgres 1864386 Oct 7 09:41 log4j-core-2.19.0.jar
-rw-r--r-- 1 postgres postgres 990924 Oct 7 09:41 mysql-connector-java-5.1.40.jar
-rw-r--r-- 1 postgres postgres 4398602 Oct 7 09:41 ojdbc8-19.7.0.0.jar
-rw-r--r-- 1 postgres postgres 1046274 Oct 7 09:41 postgresql-42.5.0.jar
-rw-r--r-- 1 postgres postgres 62621 Oct 7 09:41 toml4j-0.7.2.jar
而后边整个应用的运行,就要依赖lib目录下边的jar (HyBench-1.0-SNAPSHOT.jar)以及 conf目录下边的配置文件了。
相关配置
配置文件, 找到conf/db.prop, 修改添加如下内容
username=postgres
password=XXXXXXXX
url=jdbc:postgresql://192.168.0.6:5555/mydb
url_ap=jdbc:postgresql://192.168.0.6:5555/mydb
classname_ap=org.postgresql.Driver
username_ap=postgres
password_ap=XXXXXXXX
## support 1x, 10x
sf=1x
......
看到上边说明吗?只支持1倍、10倍规模。
运行过程
运行之前,调整一下hybench,它本身是一个shell文件。
# 依赖java 17,请更新jdk17的path
export JAVA_HOME=`your java PATH`
export PATH=$JAVA_HOME/bin:$PATH
java -cp ./lib/*:. com.hybench.HyBench $*
这是源文件,写的相当不正规。文件权限,最好给成755之类的吧。这里却是600。
将它改一下,并给权限755。
#!/bin/bash
# 依赖java 17,请更新jdk17的path
export JAVA_HOME=/iihero/tools/jdk-17.0.0.1
export PATH=$JAVA_HOME/bin:$PATH
java -cp ./lib/*:. com.hybench.HyBench $*
1、生成数据
[10:20:27-postgres@centos1:/iihero/source/hybench]$ ./hybench -t gendata -c conf/db.prop
2024-10-07 10:20:33 [main] INFO HyBench:324 - Hi~Bench, HyBench
2024-10-07 10:20:33 [main] INFO ConfigLoader:57 - ===============configuration==================
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - xapclient = 1
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - at2_percent = 25
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - at1_percent = 35
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - xpRunMins = 1
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - at4_percent = 15
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - at3_percent = 15
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - tpclient = 1
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - password = test123
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - sf = 1x
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - username_ap = mydb
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - xtpclient = 1
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - apRunMins = 1
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - at5_percent = 7
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - url_ap = jdbc:postgresql://192.168.0.6:5555/mydb
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - at6_percent = 3
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - classname_ap = org.postgresql.Driver
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - apclient = 1
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - apround = 1
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - url = jdbc:postgresql://192.168.0.6:5555/mydb
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - tpRunMins = 1
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - fresh_interval = 20
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - classname = org.postgresql.Driver
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - password_ap = test123
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - db = postgresql
2024-10-07 10:20:33 [main] INFO ConfigLoader:59 - username = mydb
2024-10-07 10:20:33 [main] INFO ConfigLoader:61 - ===============configuration==================
2024-10-07 10:20:33 [main] INFO ConfigLoader:62 -
This is a data generator of HyBench, Version 0.1
----------------
----------------
----------------
Data is generating...
----------------
----------------
----------------
WARNING: dir:/iihero/source/hybench/Data_1x not exists! will created.
Data generate not skipped!
Data is ready under the Data folder!
----------------
----------------
----------------
Data generation took 29764 ms
No autoloader, do nothing!
看看生成的数据的大小:
[10:21:13-postgres@centos1:/iihero/source/hybench]$ ls -lrth Data_1x
total 518M
-rw-r--r-- 1 postgres postgres 36M Oct 7 10:20 customer.csv
-rw-r--r-- 1 postgres postgres 14M Oct 7 10:20 savingAccount.csv
-rw-r--r-- 1 postgres postgres 313K Oct 7 10:20 company.csv
-rw-r--r-- 1 postgres postgres 14M Oct 7 10:20 checkingAccount.csv
-rw-r--r-- 1 postgres postgres 343M Oct 7 10:20 transfer.csv
-rw-r--r-- 1 postgres postgres 34M Oct 7 10:20 checking.csv
-rw-r--r-- 1 postgres postgres 44M Oct 7 10:21 loanTrans.csv
-rw-r--r-- 1 postgres postgres 34M Oct 7 10:21 loanApps.csv
2、生成相关表结构
[10:28:37-postgres@centos1:/iihero/source/hybench]$ ./hybench -t sql -f conf/ddl_pg.sql -c conf/db.prop
2024-10-07 10:28:48 [main] INFO HyBench:324 - Hi~Bench, HyBench
2024-10-07 10:28:48 [main] INFO ConfigLoader:57 - ===============configuration==================
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - xapclient = 1
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - at2_percent = 25
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - at1_percent = 35
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - xpRunMins = 1
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - at4_percent = 15
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - at3_percent = 15
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - tpclient = 1
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - password = test123
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - sf = 1x
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - username_ap = mydb
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - xtpclient = 1
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - apRunMins = 1
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - at5_percent = 7
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - url_ap = jdbc:postgresql://192.168.0.6:5555/mydb
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - at6_percent = 3
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - classname_ap = org.postgresql.Driver
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - apclient = 1
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - apround = 1
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - url = jdbc:postgresql://192.168.0.6:5555/mydb
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - tpRunMins = 1
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - fresh_interval = 20
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - classname = org.postgresql.Driver
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - password_ap = test123
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - db = postgresql
2024-10-07 10:28:48 [main] INFO ConfigLoader:59 - username = mydb
2024-10-07 10:28:48 [main] INFO ConfigLoader:61 - ===============configuration==================
2024-10-07 10:28:48 [main] INFO ConfigLoader:62 -
2024-10-07 10:28:48 [main] INFO ExecSQL:58 - execute query:CREATE TABLE customer (
custID int PRIMARY KEY,
companyID int,
gender char(6),
name char(15),
age int,
phone char(11),
province char(15),
city char(15),
loan_balance real,
saving_credit int,
checking_credit int,
loan_credit int,
Isblocked int,
created_date Date,
last_update_timestamp timestamp
);
2024-10-07 10:28:48 [main] INFO ExecSQL:58 - execute query:CREATE TABLE company (
companyID int PRIMARY KEY,
name varchar,
category varchar,
staff_size int,
loan_balance real,
phone char(11),
province char(15),
city char(15),
saving_credit int,
checking_credit int,
loan_credit int,
Isblocked int,
created_date Date,
last_update_timestamp timestamp
);
2024-10-07 10:28:48 [main] INFO ExecSQL:58 - execute query:CREATE TABLE savingAccount (
accountID int PRIMARY KEY,
userID int,
balance real,
Isblocked int,
timestamp timestamp
);
2024-10-07 10:28:48 [main] INFO ExecSQL:58 - execute query:CREATE TABLE checkingAccount (
accountID int PRIMARY KEY,
userID int,
balance real,
Isblocked int,
timestamp timestamp
);
2024-10-07 10:28:48 [main] INFO ExecSQL:58 - execute query:CREATE TABLE transfer (
id int PRIMARY KEY,
sourceID int,
targetID int,
amount real,
type char(10),
timestamp timestamp,
fresh_ts timestamp default current_timestamp
);
2024-10-07 10:28:48 [main] INFO ExecSQL:58 - execute query:CREATE TABLE checking (
id int PRIMARY KEY,
sourceID int,
targetID int,
amount real,
type char(10),
timestamp timestamp
);
2024-10-07 10:28:48 [main] INFO ExecSQL:58 - execute query:CREATE TABLE loanapps (
id int PRIMARY KEY,
applicantID int,
amount real,
duration int,
status char(12),
timestamp timestamp
);
2024-10-07 10:28:48 [main] INFO ExecSQL:58 - execute query:CREATE TABLE loantrans (
id int PRIMARY KEY,
applicantID int,
appID int,
amount real,
status char(12),
timestamp timestamp,
duration int,
contract_timestamp timestamp,
delinquency int
);
3、装载数据
./hybench -t sql -f conf/load_data_pg.sql -c conf/db.prop
......
出错:
2024-10-07 10:30:22 [main] INFO ExecSQL:58 - execute query:\copy customer from 'Data_1x/customer.csv' CSV DELIMITER ',' ;
org.postgresql.util.PSQLException: ERROR: syntax error at or near "\"
得用 psql命令行加载。这个就有点诡异了。
[10:38:40-postgres@centos1:/iihero/source/hybench]$ psql -h localhost -U mydb -d mydb -f conf/load_data_pg.sql
COPY 300000
COPY 2000
COPY 6000000
COPY 600000
COPY 302000
COPY 302000
COPY 600000
COPY 600000
4、建索引
又碰到了一些错误。必须使用postgres的role。
[10:39:45-postgres@centos1:/iihero/source/hybench]$ ./hybench -t sql -f conf/create_index_pg.sql -c conf/db.prop
2024-10-07 10:40:28 [main] INFO HyBench:324 - Hi~Bench, HyBench
2024-10-07 10:40:28 [main] INFO ConfigLoader:57 - ===============configuration==================
22024-10-07 10:40:28 [main] INFO ConfigLoader:59 - classname = org.postgresql.Driver
2024-10-07 10:40:28 [main] INFO ConfigLoader:61 - ===============configuration==================
2024-10-07 10:40:28 [main] INFO ConfigLoader:62 -
2024-10-07 10:40:28 [main] INFO ExecSQL:58 - execute query:create index idx_loanapps_1 on loanapps ( applicantid );
2024-10-07 10:40:29 [main] INFO ExecSQL:58 - execute query:create index idx_loanapps_2 on loanapps ( timestamp );
2024-10-07 10:40:29 [main] INFO ExecSQL:58 - execute query:create index idx_loantrans_1 on loantrans ( applicantid );
2024-10-07 10:40:30 [main] INFO ExecSQL:58 - execute query:create index idx_loantrans_2 on loantrans ( timestamp );
2024-10-07 10:40:30 [main] INFO ExecSQL:58 - execute query:create index idx_loantrans_3 on loantrans ( status);
2024-10-07 10:40:31 [main] INFO ExecSQL:58 - execute query:create index idx_transfer_1 on transfer ( sourceid );
2024-10-07 10:40:35 [main] INFO ExecSQL:58 - execute query:create index idx_transfer_2 on transfer ( targetid );
2024-10-07 10:40:40 [main] INFO ExecSQL:58 - execute query:create index idx_transfer_3 on transfer ( type );
2024-10-07 10:40:47 [main] INFO ExecSQL:58 - execute query:create index idx_checking_1 on checking ( sourceid );
2024-10-07 10:40:48 [main] INFO ExecSQL:58 - execute query:create index idx_checking_2 on checking ( targetid );
2024-10-07 10:40:48 [main] INFO ExecSQL:58 - execute query:create index idx_customer_1 on customer ( companyid );
2024-10-07 10:40:48 [main] INFO ExecSQL:58 - execute query:create index idx_company_1 on company ( category );
2024-10-07 10:40:48 [main] INFO ExecSQL:58 - execute query:create index idx_sa_1 on savingaccount ( userid );
2024-10-07 10:40:49 [main] INFO ExecSQL:58 - execute query:create index idx_ca_1 on checkingaccount ( userid );
2024-10-07 10:40:49 [main] INFO ExecSQL:58 - execute query:CREATE SEQUENCE IF NOT EXISTS transfer_id_seq;
2024-10-07 10:40:49 [main] INFO ExecSQL:58 - execute query:SELECT SETVAL('transfer_id_seq', (SELECT max(id) FROM transfer));
2024-10-07 10:40:49 [main] INFO ExecSQL:58 - execute query:ALTER TABLE transfer ALTER COLUMN id SET DEFAULT nextval('transfer_id_seq'::regclass);
2024-10-07 10:40:49 [main] INFO ExecSQL:58 - execute query:ALTER sequence transfer_id_seq owner to postgres;
org.postgresql.util.PSQLException: ERROR: must be member of role "postgres"
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2676)
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2366)
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:356)
at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:496)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:413)
at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:333)
at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:319)
at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:295)
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:290)
at com.hybench.load.ExecSQL.execute(ExecSQL.java:59)
at com.hybench.HyBench.main(HyBench.java:350)
跑到末尾发现,必须是超级用户才能调用成功。
好吧,我们把上边的db.prop里头的用户改成postgres (测试而已,上边已改。)
重头来过。
清理表数据:
./hybench -t sql -f conf/dropTables.sql -c conf/db.prop
然后再把2和3步执行一遍。得以成功。
5、运行:
bash hybench -t runall -c conf/db.prop -f conf/stmt_postgres.toml
-----------HTAP-Summary--------------------
-----------AP-Part--------------------
QPS : 6.90
-----------TP-Part--------------------
TPS : 107.88
-----------XP-Part--------------------
XP-QPS : 7.92
XP-TPS : 13.42
-----------Avg-Freshness-Score--------------------
Freshness(ms) : 42.39
-----------HTAP-Score--------------------
Geometric Mean : 24.12
====================Thank you!========================
当然,你也可以只运行其中的TP部分:
bash hybench -t runtp -c conf/db.prop -f conf/stmt_postgres.toml
总结
HyBench, 从框架上来看,还是不错的,虽然文档写的有点怪怪的。pom.xml中更是有些拼凑的感觉。这里就不过多点评了。作为压测套件的一个补充,应该是很好的。大致看了下代码结构,想在它的基础上扩展一下,提供其它数据库的支持,也应该不难。
当然,我以前也基于BenchmarkSQL做了若干扩展,提供了几个其它数据库的支持:(比如:Sybase ASE, HANA, DB2, MSSQL等)
详见:https://github.com/iihero/benchmarksql-v5/
关于HyBench工具的基本模型的思路及设计,可以参考:HTAP 测试工具-HyBench 初探
个人微信:_iihero
CSDN: iihero
墨天轮:https://www.modb.pro/u/16258 (Sean)
pgfans: iihero
往期导读:
1. PostgreSQL中配置单双向SSL连接详解
2. 提升PSQL使用技巧:PostgreSQL中PSQL使用技巧汇集(1)
3. 提升PSQL使用技巧:PostgreSQL中PSQL使用技巧汇集(2)
4. PostgreSQL SQL的基础使用及技巧
5. PostgreSQL开发技术基础:过程与函数
6. PostgreSQL中vacuum 物理文件truncate发生的条件
7. PostgreSQL中表的年龄与Vacuum的实验探索:Vacuum有大用
8. PostgreSQL利用分区表来弥补AutoVacuum的不足
9. 也聊聊PostgreSQL中的空间膨胀与AutoVacuum
10. 正确理解SAP BTP中hyperscaler PG中的IOPS (AWS篇)
11. 由浅入深高可用(HA)之: HAProxy