ablog

不器用で落着きのない技術者のメモ

Redshift の STL テーブルと SYS ビュー

Redshift の STL テーブルのクエリID(query列)と SYS ビューのクエリID(query_id)は別の値のため、クエリIDで結合することはできない。トランザクションID(STLとSYSのトランザクションIDが 1:1 の場合)で結合することができる。

dev=# \timing on
dev=# select count(distinct(lo_orderkey)) from lineorder;
   count
-----------
 150000000
(1 row)

Time: 13282.862 ms (00:13.283)
dev=# select pg_last_query_id();
 pg_last_query_id
------------------
         33269125
(1 row)

Time: 3.632 ms
dev=# \gset
Time: 3.703 ms
dev=# \x
Expanded display is on.
dev=# select * from sys_query_history where transaction_id = (select xid from stl_query where query = :pg_last_query_id);
-[ RECORD 1 ]----+----------------------------------------------------
user_id          | 100
query_id         | 33269123
query_label      | default
transaction_id   | 93596628
session_id       | 1073922183
database_name    | dev
query_type       | SELECT
status           | success
result_cache_hit | f
start_time       | 2024-09-05 04:25:48.409507
end_time         | 2024-09-05 04:26:01.689166
elapsed_time     | 13279659
queue_time       | 0
execution_time   | 13263519
error_message    |
returned_rows    | 1
returned_bytes   | 15
query_text       | select count(distinct(lo_orderkey)) from lineorder;
redshift_version | 1.0.73348
usage_limit      |
compute_type     | primary
compile_time     | 155
planning_time    | 6009
lock_wait_time   | 22

Time: 378.407 ms

参考

SYS monitoring views such as such as SYS_QUERY_HISTORY and SYS_QUERY_DETAIL contain the query_id column, which holds the identifier for users’ queries. Similarly, provisioned-only views such as STL_QUERY and SVL_QLOG contain the query column, which also holds the query identifiers. However, the query identifiers recorded in the SYS system views are different from those recorded in the provisioned-only views.

The difference between the SYS views’ query_id column values and the provisioned-only views’ query column values is as follows:

In SYS views, the query_id column records user-submitted queries in their original form. The Amazon Redshift optimizer might break them down into child queries for improved performance, but a single query you run will still only have a single row in SYS_QUERY_HISTORY. If you want to see the individual child queries, you can find them in SYS_QUERY_DETAIL.

In provisioned-only views, the query column records queries at the child query level. If the Amazon Redshift optimizer rewrites your original query into multiple child queries, there will be multiple rows in STL_QUERY with differing query identifier values for a single query you run.

When you migrate your monitoring and diagnostic queries from provisioned-only views to SYS views, consider this difference and edit your queries accordingly. For more information on how Amazon Redshift processes queries, see Query planning and execution workflow.

System tables and views reference - Amazon Redshift