中間結果書き出しが多いクエリを調べる(実行中のクエリ)
- クエリ
select query, sum(blocks_to_disk) as mbytes, max(max_blocks_to_disk) as max_mbytes, min(starttime) as query_start_time from STV_QUERY_METRICS where segment = -1 -- セグメントレベルの行を除外 and step_type = -1 -- ステップレベルの行を除外 and blocks_to_disk > 0 group by query order by sum(blocks_to_disk) desc;
- 実行結果
query | mbytes | max_mbytes | query_start_time ---------+---------+------------+---------------------------- 1258778 | 5650328 | 5641168 | 2022-09-07 05:10:49.017628 1258966 | 5377945 | 5377945 | 2022-09-07 05:24:59.447892 1259140 | 4745302 | 4745302 | 2022-09-07 05:38:32.176172 (3 rows)
中間結果書き出しが多いクエリを調べる(完了したクエリ)
- クエリ
select query, sum(blocks_to_disk) as mbytes, max(max_blocks_to_disk) as max_mbytes, min(starttime) as query_start_time from STL_QUERY_METRICS where segment = -1 -- セグメントレベルの行を除外 and step_type = -1 -- ステップレベルの行を除外 and blocks_to_disk > 0 group by query order by sum(blocks_to_disk) desc;
- 実行結果
query | mbytes | max_mbytes | query_start_time ---------+---------+------------+---------------------------- 1258778 | 5650328 | 5641168 | 2022-09-07 05:10:49.017628 1258966 | 5463719 | 5463719 | 2022-09-07 05:24:59.447892 1259140 | 4830242 | 4830242 | 2022-09-07 05:38:32.176172 (3 rows)
どのオペレーションで中間書き出しが多いか調べる
- クエリ
select query, stm, seg, step, maxtime, avgtime, rows, bytes, lpad(' ',stm+seg+step) || label as label, is_diskbased, workmem, is_rrscan, is_delayed_scan, rows_pre_filter from svl_query_summary where query = 1258778 order by query, stm, seg, step;
- 実行結果
- is_diskbased=t: 中間結果をディスクに書き出している
query | stm | seg | step | maxtime | avgtime | rows | bytes | label | is_diskbased | workmem | is_rrscan | is_delayed_scan | rows_pre_filter ---------+-----+-----+------+------------------+------------------+-------------+---------------+-----------------------------------------------+--------------+-------------+-----------+-----------------+----------------- 1258778 | 0 | 0 | 0 | 2565 | 1539 | 248320 | 1986560 | scan tbl=133863 name=lineorder | f | 0 | f | f | 248320 1258778 | 0 | 0 | 1 | 2565 | 1539 | 248320 | 0 | project | f | 0 | f | f | 0 1258778 | 0 | 0 | 2 | 2565 | 1539 | 248320 | 993280 | bcast | f | 0 | f | f | 0 1258778 | 0 | 1 | 0 | 4024 | 3949 | 496640 | 1986560 | scan tbl=100237 name=Internal Worktable | f | 0 | f | f | 0 1258778 | 0 | 1 | 1 | 4024 | 3949 | 496640 | 1986560 | save tbl=1011 | f | 13366198272 | f | f | 0 1258778 | 1 | 2 | 0 | 15633521587 | 15537247379 | 248320 | 22073077 | scan tbl=133863 name=lineorder | f | 0 | f | f | 248320 1258778 | 1 | 2 | 1 | 15633521587 | 15537247379 | 248320 | 0 | project | f | 0 | f | f | 0 1258778 | 1 | 2 | 2 | 15633521587 | 15537247379 | 61662822400 | 0 | nloop tbl=1011 | f | 0 | f | f | 0 1258778 | 1 | 2 | 3 | 15633521587 | 15537247379 | 61662822400 | 0 | project | f | 0 | f | f | 0 1258778 | 1 | 2 | 4 | 15633521587 | 15537247379 | 61662822400 | 0 | project | f | 0 | f | f | 0 1258778 | 1 | 2 | 7 | 15633521587 | 15537247379 | 61662822400 | 5919913041920 | dist | f | 0 | f | f | 0 1258778 | 1 | 3 | 0 | 15653443281 | 15651931806 | 61662822400 | 5919913041920 | scan tbl=100239 name=Internal Worktable | f | 0 | f | f | 0 1258778 | 1 | 3 | 2 | 15653443281 | 15651931806 | 61662822400 | 5919913041920 | sort tbl=1022 | t | 13364625408 | f | f | 0 1258778 | 2 | 4 | 0 | -715858302482022 | -715858302483356 | 3916119040 | 375987109152 | scan tbl=1022 name=Internal Worktable | f | 0 | f | f | 0 1258778 | 2 | 4 | 1 | -715858302482022 | -715858302483356 | 3916119040 | 0 | project | f | 0 | f | f | 0 1258778 | 2 | 4 | 2 | -715858302482022 | -715858302483356 | 3916119981 | 0 | insert tbl=284818 | f | 0 | f | f | 0 1258778 | 2 | 4 | 3 | -715858302482022 | -715858302483356 | 0 | 0 | aggr tbl=1010 | f | 0 | f | f | 0 (17 rows)
1TB超の中間結果を書き出しているクエリを調べる例
select query, stm, seg, step, maxtime, avgtime, rows, bytes, lpad(' ',stm+seg+step) || label as label, is_diskbased, workmem, is_rrscan, is_delayed_scan, rows_pre_filter from svl_query_summary where query in (select query from STL_QUERY_METRICS where segment = -1 and step_type = -1 and blocks_to_disk > 1048576 group by query) order by query, stm, seg, step;
参考
aws-misc/redshift.md at trunk · kdgregory/aws-misc · GitHubDisk space used for temporary data
Indicates queries that might have cartesian joins or large redistributions.
STV_QUERY_METRICS contains data for in-flight queries; STL_QUERY_METRICS contains recent historical data. Both tables also provide statistics for blocks scanned, rows output, queue time, and execution time.
Stats are aggregated at three different levels: query, step, and segment. These queries look at the query-level stats by only selecting rows where segment and step_type are -1.
Version 1: in-flight queries, by query
select query, sum(blocks_to_disk) as mbytes, max(max_blocks_to_disk) as max_mbytes from STV_QUERY_METRICS where segment = -1 and step_type = -1 and blocks_to_disk > 0 group by query;Version 2: in-flight queries, sum by user. First cut at identifying users who are running excessive queries.
select trim(u.usename) as user, coalesce(sum(blocks_to_disk), 0) as mbytes from STV_QUERY_METRICS m join PG_USER u on u.usesysid = m.userid where m.segment = -1 and m.step_type = -1 and m.blocks_to_disk > 0 group by u.usename;Version 3: queries in the last day, by user and hour. This also includes the highest single-segment usage within the hour, which can be used to identify the query (as well as indicate peak disk pressure).
select date_trunc('hour', m.starttime), trim(u.usename) as user, coalesce(sum(blocks_to_disk), 0) as total_mbytes, coalesce(max(max_blocks_to_disk), 0) as max_segment_mbytes from STL_QUERY_METRICS m join PG_USER u on u.usesysid = m.userid where m.starttime > dateadd('hour', -24, sysdate) and m.segment = -1 and m.step_type = -1 and m.blocks_to_disk > 0 group by 1, 2 order by 1, 2;