前言
今天同事发了一条SQL语句,让我优化,乍一看SQL中带有nolock,于是就让他把SQL Server的连接环境发我,结果同事说,这是PostgreSQL环境,惊呆了。。。,
SQL如下:
1SELECT
2 A.boumber,
3 COUNT ( DISTINCT seqnum ) bgs,
4 ( SELECT MIN ( b.ttime )
5 FROM tb_aaaaa b ( nolock )
6 WHERE A.boumber = b. boumber ) sjsy
7FROM
8 tb_aaaaa A ( nolock )
9WHERE
10 ( testabbreviation ='T4')
11 AND A.ttime between '2024-05-01'::TIMESTAMP and '2024-05-31 23:59:59' ::TIMESTAMP
12GROUP BY
13 A.boumber
14ORDER BY
15 sjsy
于是我登录PG环境,运行了一下,果然不报错。
SQL优化
先说这条SQL的优化,这个SQL优化很简单,创建一个include索引让其走Index Only Scan
即可:
1create index idx_tb_aaaaa_all on tb_aaaaa(ttime,boumber)
2 include(seqnum,testabbreviation );
运行了1下,大概2秒出结果,不再详细分析了。
优化前后的执行计划:
1-- 优化前
2Sort (cost=475894253.13..475894263.24 rows=4043 width=21)
3 Sort Key: ((SubPlan 1))
4 -> GroupAggregate (cost=88616.60..475894010.93 rows=4043 width=21)
5 Group Key: a.boumber
6 -> Gather Merge (cost=88616.60..89224.74 rows=5079 width=15)
7 Workers Planned: 4
8 -> Sort (cost=87616.55..87619.72 rows=1270 width=15)
9 Sort Key: a.boumber
10 -> Parallel Seq Scan on idx_tb_aaaaa_all a (cost=0.00..87551.07 rows=1270 width=15)
11 Filter: ((ttime >= '2024-05-01 00:00:00'::timestamp without time zone) AND (ttime <= '2024-05-31 23:59:59'::timestamp without time zone) AND ((testviation)::text = 'T4'::text))
12 SubPlan 1
13 -> Aggregate (cost=117686.04..117686.05 rows=1 width=8)
14 -> Seq Scan on idx_tb_aaaaa_all b (cost=0.00..117685.21 rows=333 width=8)
15 Filter: ((a.boumber)::text = (boumber)::text)
16
17
18
19-- 优化后
20Sort (cost=1648745.98..1648756.09 rows=4043 width=21)
21 Sort Key: ((SubPlan 2))
22 -> GroupAggregate (cost=78267.03..1648503.78 rows=4043 width=21)
23 Group Key: a.boumber
24 -> Gather Merge (cost=78267.03..78875.16 rows=5079 width=15)
25 Workers Planned: 4
26 -> Sort (cost=77266.97..77270.14 rows=1270 width=15)
27 Sort Key: a.boumber
28 -> Parallel Bitmap Heap Scan on idx_tb_aaaaa_all a (cost=5215.82..77201.50 rows=1270 width=15)
29 Recheck Cond: ((ttime >= '2024-05-01 00:00:00'::timestamp without time zone) AND (ttime <= '2024-05-31 23:59:59'::timestamp without time zone))
30 Filter: ((testviation)::text = 'T4'::text)
31 -> Bitmap Index Scan on idx_tb_aaaaa_all (cost=0.00..5214.55 rows=151012 width=0)
32 Index Cond: ((ttime >= '2024-05-01 00:00:00'::timestamp without time zone) AND (ttime <= '2024-05-31 23:59:59'::timestamp without time zone))
33 SubPlan 2
34 -> Result (cost=388.21..388.22 rows=1 width=8)
35 InitPlan 1 (returns $1)
36 -> Limit (cost=0.43..388.21 rows=1 width=8)
37 -> Index Only Scan using idx_tb_aaaaa_all on idx_tb_aaaaa_all b (cost=0.43..129130.29 rows=333 width=8)
38 Index Cond: ((ttime IS NOT NULL) AND (boumber = (a.boumber)::text))
PG中可以使用nolock吗
先测试一下:
1[root@alldb ~]# docker exec -it lhrpg12 psql -U postgres -d postgres
2psql (12.18 (Debian 12.18-1.pgdg120+2))
3Type "help" for help.
4
5postgres=# select version();
6 version
7-----------------------------------------------------------------------------------------------------------------------
8 PostgreSQL 12.18 (Debian 12.18-1.pgdg120+2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
9(1 row)
10
11postgres=# select count(*) from pg_class a (nolock) ;
12 count
13-------
14 395
15(1 row)
16
17postgres=# \q
18You have mail in /var/spool/mail/root
19[root@alldb ~]# docker exec -it lhrpg16 psql -U postgres -d postgres
20psql (16.2 (Debian 16.2-1.pgdg120+2))
21Type "help" for help.
22
23postgres=# select version();
24 version
25---------------------------------------------------------------------------------------------------------------------
26 PostgreSQL 16.2 (Debian 16.2-1.pgdg120+2) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
27(1 row)
28
29postgres=# select count(*) from pg_class a (nolock) ;
30 count
31-------
32 415
33(1 row)
34
35postgres=#
36
可以正常运行,在Navicat里运行一下:
这里第1列的oid的别名变为了nolock,是个bug,可以参考:https://www.postgresql.org/message-id/5832.1232117138@sss.pgh.pa.us
可以看到,都可以正常运行,那nolock改为其它的呢?
1postgres=# select count(*) from pg_class a (nolock) ;
2 count
3-------
4 415
5(1 row)
6
7postgres=# select count(*) from pg_class a (nolock111) ;
8 count
9-------
10 415
11(1 row)
12
13postgres=# select count(*) from pg_class a (xxxxxxxx111) ;
14 count
15-------
16 415
17(1 row)
18
19postgres=# select * from pg_class a (xxxxxxxx111) limit 2;
20 xxxxxxxx111 | relname | relnamespace | reltype | reloftype | relowner | relam | relfilenode | reltablespace | relpages | reltuples | relallvisible | reltoastrelid | relhasindex | relisshared | relpersistence | relkind | relnatts | relchecks | relhasrules | relhastriggers | relhassubclass | relrowsecurity | relforcerowsecurity | relispopulated | relreplident | relispartition | relrewrite | relfrozenxid | relminmxid | relacl | reloptions | relpartbound
21-------------+-------------------------+--------------+---------+-----------+----------+-------+-------------+---------------+----------+-----------+---------------+---------------+-------------+-------------+----------------+---------+----------+-----------+-------------+----------------+----------------+----------------+---------------------+----------------+--------------+----------------+------------+--------------+------------+-----------------------------------------+------------+--------------
22 16403 | pg_stat_statements_info | 2200 | 16405 | 0 | 10 | 0 | 0 | 0 | 0 | -1 | 0 | 0 | f | f | p | v | 2 | 0 | t | f | f | f | f | t | n | f | 0 | 0 | 0 | {postgres=arwdDxt/postgres,=r/postgres} | |
23 16414 | pg_stat_statements | 2200 | 16416 | 0 | 10 | 0 | 0 | 0 | 0 | -1 | 0 | 0 | f | f | p | v | 43 | 0 | t | f | f | f | f | t | n | f | 0 | 0 | 0 | {postgres=arwdDxt/postgres,=r/postgres} | |
24(2 rows)
25
26postgres=#
27
28postgres=# explain (analyze,verbose,timing,costs,buffers) select * from pg_class a (nolock) ;
29 QUERY PLAN
30---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
31 Seq Scan on pg_catalog.pg_class a (cost=0.00..18.13 rows=413 width=273) (actual time=0.016..0.089 rows=415 loops=1)
32 Output: nolock, relname, relnamespace, reltype, reloftype, relowner, relam, relfilenode, reltablespace, relpages, reltuples, relallvisible, reltoastrelid, relhasindex, relisshared, relpersistence, relkind, relnatts, relchecks, relhasrules, relhastriggers, relhassubclass, relrowsecurity, relforcerowsecurity, relispopulated, relreplident, relispartition, relrewrite, relfrozenxid, relminmxid, relacl, reloptions, relpartbound
33 Buffers: shared hit=14
34 Query Identifier: -2946891049789830332
35 Planning Time: 0.094 ms
36 Execution Time: 0.135 ms
37(6 rows)
38
39postgres=# explain (analyze,verbose,timing,costs,buffers) select * from pg_class a (xxxx) ;
40 QUERY PLAN
41-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
42 Seq Scan on pg_catalog.pg_class a (cost=0.00..18.13 rows=413 width=273) (actual time=0.014..0.054 rows=415 loops=1)
43 Output: xxxx, relname, relnamespace, reltype, reloftype, relowner, relam, relfilenode, reltablespace, relpages, reltuples, relallvisible, reltoastrelid, relhasindex, relisshared, relpersistence, relkind, relnatts, relchecks, relhasrules, relhastriggers, relhassubclass, relrowsecurity, relforcerowsecurity, relispopulated, relreplident, relispartition, relrewrite, relfrozenxid, relminmxid, relacl, reloptions, relpartbound
44 Buffers: shared hit=14
45 Query Identifier: -2946891049789830332
46 Planning Time: 0.075 ms
47 Execution Time: 0.121 ms
48(6 rows)
49
50postgres=# explain (analyze,verbose,timing,costs,buffers) select count(*) from pg_class a (xxxxxxxx111) ;
51 QUERY PLAN
52-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
53 Aggregate (cost=15.38..15.39 rows=1 width=8) (actual time=0.099..0.100 rows=1 loops=1)
54 Output: count(*)
55 Buffers: shared hit=3
56 -> Index Only Scan using pg_class_tblspc_relfilenode_index on pg_catalog.pg_class a (cost=0.15..14.34 rows=413 width=0) (actual time=0.027..0.072 rows=415 loops=1)
57 Output: reltablespace, relfilenode
58 Heap Fetches: 7
59 Buffers: shared hit=3
60 Query Identifier: 6993869046713889110
61 Planning Time: 0.095 ms
62 Execution Time: 0.125 ms
63(10 rows)
64
65postgres=#
可以看到,也可以正常运行。
应该是PostgreSQL忽略了括号中的内容,将其视为注释或无效字符,因此不会报错。
具体深层次原因就需要去分析源代码了,这里就不去挖掘了。
结论
1、PG中使用nolock不会报错,nolock可以替换为任意字符串,但不建议写这个
2、这个问题在GP中同样存在