*24.07.02 노션 복구본입니다.
파티셔닝 인덱스 효율을 타지 못했던 내용을 정리했던 기록이다.
상황 : 인덱스 효율 0
전체 컬럼은 약 3000만개 가량, 12GB 정도 되는 용량의 테이블을 기준으로 한다.
프로토타입 조회 버전. (지금 보니 repository 분리 안한게 눈에 띈다..)
특정 테이블 기준이지만, 현재 수집된 데이터는 1차적으로 ‘시간값(Created_At)’ 파티션 테이블 사이의 값, 그리고 +a의 추가적인 조건을 가지고 있다.
이것은 ID(UUID)일 수도, Long(특정 id)일 수도, 최악의 경우 Varchar일 수도 있다.
여기서는 UUID를 기준으로 Explain을 통해 검색 성능의 차이를 확인해 볼 것이다.
tomahawk=# explain analyze select * from tbl_apm_trace_span where created_at between '2024-07-06 14:30:00' and '2024-07-08 15:30:00' and trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------
Gather (cost=1000.00..1437367.23 rows=8479 width=277) (actual time=82.545..4150.425 rows=4 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Append (cost=0.00..1435519.33 rows=3540 width=277) (actual time=2719.087..4040.432 rows=1 loops=3)
-> Parallel Seq Scan on tbl_apm_trace_span_2024070810 tbl_apm_trace_span_19 (cost=0.00..252562.58 rows=578 width=273) (actual time=2334.585..2334.586 rows=
0 loops=1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 5436744
-> Parallel Seq Scan on tbl_apm_trace_span_2024070809 tbl_apm_trace_span_18 (cost=0.00..251058.90 rows=572 width=272) (actual time=2422.520..2422.521 rows=
0 loops=1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 5412775
-> Parallel Seq Scan on tbl_apm_trace_span_2024070813 tbl_apm_trace_span_22 (cost=0.00..244249.11 rows=540 width=274) (actual time=772.936..772.937 rows=0
loops=3)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 1722784
-> Parallel Seq Scan on tbl_apm_trace_span_2024070812 tbl_apm_trace_span_21 (cost=0.00..217825.37 rows=538 width=275) (actual time=1055.548..1055.548 rows=
0 loops=2)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 2305552
-> Parallel Seq Scan on tbl_apm_trace_span_2024070811 tbl_apm_trace_span_20 (cost=0.00..212655.13 rows=521 width=274) (actual time=1480.020..1480.021 rows=
0 loops=1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 4560835
-> Parallel Seq Scan on tbl_apm_trace_span_2024070814 tbl_apm_trace_span_23 (cost=0.00..168879.58 rows=493 width=279) (actual time=1009.880..1009.881 rows=
0 loops=1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 3538139
-> Parallel Seq Scan on tbl_apm_trace_span_2024070808 tbl_apm_trace_span_17 (cost=0.00..42393.04 rows=190 width=280) (actual time=224.318..224.318 rows=0 l
oops=1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 892100
-> Parallel Seq Scan on tbl_apm_trace_span_2024070815 tbl_apm_trace_span_24 (cost=0.00..8798.40 rows=60 width=281) (actual time=50.265..50.266 rows=0 loops
=1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 183003
-> Parallel Seq Scan on tbl_apm_trace_span_2024070800 tbl_apm_trace_span_9 (cost=0.00..2338.79 rows=4 width=528) (actual time=9.860..9.860 rows=0 loops=1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 28668
-> Parallel Seq Scan on tbl_apm_trace_span_2024070804 tbl_apm_trace_span_13 (cost=0.00..2334.56 rows=4 width=526) (actual time=11.359..11.360 rows=0 loops=
1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 28694
-> Parallel Seq Scan on tbl_apm_trace_span_2024070802 tbl_apm_trace_span_11 (cost=0.00..2334.49 rows=4 width=527) (actual time=11.492..11.492 rows=0 loops=
1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 28676
-> Parallel Seq Scan on tbl_apm_trace_span_2024070801 tbl_apm_trace_span_10 (cost=0.00..2331.88 rows=4 width=527) (actual time=11.411..11.412 rows=0 loops=
1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 28721
-> Parallel Seq Scan on tbl_apm_trace_span_2024070803 tbl_apm_trace_span_12 (cost=0.00..2331.41 rows=4 width=527) (actual time=11.323..11.323 rows=0 loops=
1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 28676
-> Parallel Seq Scan on tbl_apm_trace_span_2024070807 tbl_apm_trace_span_16 (cost=0.00..2328.20 rows=4 width=521) (actual time=10.979..10.979 rows=0 loops=
1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 28871
-> Parallel Seq Scan on tbl_apm_trace_span_2024070805 tbl_apm_trace_span_14 (cost=0.00..2326.54 rows=4 width=526) (actual time=10.657..10.658 rows=0 loops=
1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 28700
-> Parallel Seq Scan on tbl_apm_trace_span_2024070716 tbl_apm_trace_span_1 (cost=0.00..2326.40 rows=4 width=519) (actual time=0.003..11.443 rows=4 loops=1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 28873
-> Parallel Seq Scan on tbl_apm_trace_span_2024070806 tbl_apm_trace_span_15 (cost=0.00..2325.98 rows=4 width=524) (actual time=10.455..10.457 rows=0 loops=
1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 28684
-> Parallel Seq Scan on tbl_apm_trace_span_2024070722 tbl_apm_trace_span_7 (cost=0.00..2314.92 rows=4 width=525) (actual time=11.417..11.417 rows=0 loops=1
)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 28631
-> Parallel Seq Scan on tbl_apm_trace_span_2024070717 tbl_apm_trace_span_2 (cost=0.00..2302.80 rows=4 width=526) (actual time=10.456..10.457 rows=0 loops=1
)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 28366
-> Parallel Seq Scan on tbl_apm_trace_span_2024070723 tbl_apm_trace_span_8 (cost=0.00..2302.50 rows=4 width=525) (actual time=8.961..8.961 rows=0 loops=1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 28402
-> Parallel Seq Scan on tbl_apm_trace_span_2024070718 tbl_apm_trace_span_3 (cost=0.00..2300.91 rows=4 width=526) (actual time=9.024..9.025 rows=0 loops=1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 28336
-> Parallel Seq Scan on tbl_apm_trace_span_2024070719 tbl_apm_trace_span_4 (cost=0.00..2297.47 rows=4 width=525) (actual time=9.721..9.722 rows=0 loops=1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 28331
-> Parallel Seq Scan on tbl_apm_trace_span_2024070721 tbl_apm_trace_span_6 (cost=0.00..2297.17 rows=4 width=524) (actual time=9.742..9.743 rows=0 loops=1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 28362
-> Parallel Seq Scan on tbl_apm_trace_span_2024070720 tbl_apm_trace_span_5 (cost=0.00..2285.51 rows=4 width=524) (actual time=11.234..11.235 rows=0 loops=1
)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35'::uuid))
Rows Removed by Filter: 28208
Planning Time: 1.460 ms
Execution Time: 4150.593 ms
(78 rows)
SELECT * FROM tbl_apm_trace_span
WHERE created_at BETWEEN '2024-07-06 14:30:00' AND '2024-07-08 15:30:00'
AND trace_id = 'd641125b-8e71-7885-c1a2-c8328eeeba35';
분석
- Parallel Seq Scan:
- 각 파티션에서 병렬 순차 스캔을 수행하고 있다. 이는 인덱스를 전혀 타지 않으며 각 파티션을 순차적으로 스캔하여 조건에 맞는 데이터를 찾고 있다는 의미이다.
- 각 파티션에서 상당히 많은 Rows Removed by Filter가 있다. 즉, 많은 데이터를 스캔하지만 실제로 조건에 맞는 데이터는 거의 없다.
- Gather:
- 각 파티션에서의 결과를 모으는 단계이다. Workers Planned는 2이고, Workers Launched도 2이다. 즉, 두 개의 워커가 병렬로 작업하고 있다.
- 전체 성능:
- 전체 쿼리 실행 시간은 4150.593 ms (약 4.15초)이다.
- 병렬 처리를 통해 어느 정도 성능 향상을 기대할 수 있지만, 여전히 많은 데이터를 스캔해야 하기 때문에 시간이 소요된다.
위의 실행 로그를 보면 간단한 결론이 나온다.
결국 파티션 자체를 전부 하나하나 순차 스캔하고 있다는 뜻이다.
복합 인덱스 추가 시도와 Analyze
테이블을 파티셔닝 해놨지만, 결국 하나하나 모든 테이블을 끄집어서 보고 있다는 것을 알 수 있었다.
이는 결국 Full Scan과 동일하다는 뜻이며 파티셔닝의 효율을 전혀 사용할 수 없었다.
효율적인 조회 성능을 위해 테이블 구조를 확인해보자면 다음과 같다.
tomahawk=# \d+ tbl_apm_trace_span
Partitioned table "public.tbl_apm_trace_span"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
----------------------+-----------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
id | bigint | | not null | | plain | | |
scope_name | character varying | | | | extended | | |
trace_id | uuid | | | | plain | | |
span_id | bytea | | | | extended | | |
parent_span_id | bytea | | | | extended | | |
flag | integer | | | | plain | | |
name | character varying | | | | extended | | |
kind | integer | | | | plain | | |
start_time_unix_nano | bigint | | | | plain | | |
end_time_unix_nano | bigint | | | | plain | | |
status_is_error | boolean | | | | plain | | |
attributes | jsonb | | | | extended | | |
has_event | boolean | | | | plain | | |
agent_id | bigint | | | | plain | | |
created_at | timestamp without time zone | | not null | | plain | | |
Partition key: RANGE (created_at)
Indexes:
"tbl_apm_trace_span_pkey" PRIMARY KEY, btree (id, created_at)
Partitions: tbl_apm_trace_span_2024070716 FOR VALUES FROM ('2024-07-07 16:00:00') TO ('2024-07-07 17:00:00'),
tbl_apm_trace_span_2024070717 FOR VALUES FROM ('2024-07-07 17:00:00') TO ('2024-07-07 18:00:00'),
tbl_apm_trace_span_2024070718 FOR VALUES FROM ('2024-07-07 18:00:00') TO ('2024-07-07 19:00:00'),
tbl_apm_trace_span_2024070719 FOR VALUES FROM ('2024-07-07 19:00:00') TO ('2024-07-07 20:00:00'),
tbl_apm_trace_span_2024070720 FOR VALUES FROM ('2024-07-07 20:00:00') TO ('2024-07-07 21:00:00'),
tbl_apm_trace_span_2024070721 FOR VALUES FROM ('2024-07-07 21:00:00') TO ('2024-07-07 22:00:00'),
tbl_apm_trace_span_2024070722 FOR VALUES FROM ('2024-07-07 22:00:00') TO ('2024-07-07 23:00:00'),
tbl_apm_trace_span_2024070723 FOR VALUES FROM ('2024-07-07 23:00:00') TO ('2024-07-08 00:00:00'),
tbl_apm_trace_span_2024070800 FOR VALUES FROM ('2024-07-08 00:00:00') TO ('2024-07-08 01:00:00'),
tbl_apm_trace_span_2024070801 FOR VALUES FROM ('2024-07-08 01:00:00') TO ('2024-07-08 02:00:00'),
tbl_apm_trace_span_2024070802 FOR VALUES FROM ('2024-07-08 02:00:00') TO ('2024-07-08 03:00:00'),
tbl_apm_trace_span_2024070803 FOR VALUES FROM ('2024-07-08 03:00:00') TO ('2024-07-08 04:00:00'),
tbl_apm_trace_span_2024070804 FOR VALUES FROM ('2024-07-08 04:00:00') TO ('2024-07-08 05:00:00'),
tbl_apm_trace_span_2024070805 FOR VALUES FROM ('2024-07-08 05:00:00') TO ('2024-07-08 06:00:00'),
tbl_apm_trace_span_2024070806 FOR VALUES FROM ('2024-07-08 06:00:00') TO ('2024-07-08 07:00:00'),
tbl_apm_trace_span_2024070807 FOR VALUES FROM ('2024-07-08 07:00:00') TO ('2024-07-08 08:00:00'),
tbl_apm_trace_span_2024070808 FOR VALUES FROM ('2024-07-08 08:00:00') TO ('2024-07-08 09:00:00'),
tbl_apm_trace_span_2024070809 FOR VALUES FROM ('2024-07-08 09:00:00') TO ('2024-07-08 10:00:00'),
tbl_apm_trace_span_2024070810 FOR VALUES FROM ('2024-07-08 10:00:00') TO ('2024-07-08 11:00:00'),
tbl_apm_trace_span_2024070811 FOR VALUES FROM ('2024-07-08 11:00:00') TO ('2024-07-08 12:00:00'),
tbl_apm_trace_span_2024070812 FOR VALUES FROM ('2024-07-08 12:00:00') TO ('2024-07-08 13:00:00'),
tbl_apm_trace_span_2024070813 FOR VALUES FROM ('2024-07-08 13:00:00') TO ('2024-07-08 14:00:00'),
tbl_apm_trace_span_2024070814 FOR VALUES FROM ('2024-07-08 14:00:00') TO ('2024-07-08 15:00:00'),
tbl_apm_trace_span_2024070815 FOR VALUES FROM ('2024-07-08 15:00:00') TO ('2024-07-08 16:00:00'),
tbl_apm_trace_span_2024070816 FOR VALUES FROM ('2024-07-08 16:00:00') TO ('2024-07-08 17:00:00')
PK(id)와 created_at(시간)을 유니크하게 설정한 뒤, 파티셔닝 키를 시간(created_at) 단위로 설정하였다.
만약 이 복합키를 기준으로 인덱스를 걸고 조회를 할 경우 다음과 같은 결과를 확인할 수 있다.
-- Embedded PK (id, created_at) 사용
explain analyze SELECT * FROM tbl_apm_trace_span
WHERE id = '33560130120000002' AND created_at BETWEEN '2024-07-06 14:30:00' AND '2024-07-08 15:30:00';
-- 부가적으로 아래와 같이 쿼리를 시도해보았다.
-- 1. Between 조건 배제 : 여전히 Parallel Seq Scan
explain analyze SELECT * FROM tbl_apm_trace_span
WHERE id = '33560130120000002' AND created_at > '2024-07-06 14:30:00';
-- 2. 인덱스 순서 변경 : 여전히 Parallel Seq Scan
explain analyze SELECT * FROM tbl_apm_trace_span
WHERE created_at > '2024-07-06 14:30:00' AND id = '33560130120000002';
-- 3. 통계 정보 업데이트 후 재시행시에도 결과는 동일하다.
ANALYZE tbl_apm_trace_span;
tomahawk=# explain analyze SELECT * FROM tbl_apm_trace_span
WHERE id = '33560130120000002' AND created_at BETWEEN '2024-07-06 14:30:00' AND '2024-07-08 15:30:00' ;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------
Gather (cost=1000.00..1436504.15 rows=24 width=441) (actual time=211.878..5746.947 rows=1 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Parallel Append (cost=0.00..1435501.75 rows=24 width=441) (actual time=3816.646..5622.706 rows=0 loops=3)
-> Parallel Seq Scan on tbl_apm_trace_span_2024070810 tbl_apm_trace_span_19 (cost=0.00..252562.58 rows=1 width=273) (actual time=2977.644..2977.646 rows=0
loops=1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(id = '33560130120000002'::bigint))
Rows Removed by Filter: 5436744
-> Parallel Seq Scan on tbl_apm_trace_span_2024070809 tbl_apm_trace_span_18 (cost=0.00..251058.90 rows=1 width=272) (actual time=2775.838..2775.843 rows=0
loops=1)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(id = '33560130120000002'::bigint))
Rows Removed by Filter: 5412775
-> Parallel Seq Scan on tbl_apm_trace_span_2024070813 tbl_apm_trace_span_22 (cost=0.00..244249.11 rows=1 width=274) (actual time=1003.250..1003.251 rows=0
loops=3)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone) AND
(id = '33560130120000002'::bigint))
Rows Removed by Filter: 1722784
-> Parallel Seq Scan on tbl_apm_trace_span_2024070812 tbl_apm_trace_span_21 (cost=0.00..217825.37 rows=1 width=275) (actual time=2114.824..2114.825 rows=0
loops=1)
... (생략)
각 파티션 내에서 id 조건을 만족하는 데이터를 찾기 위해 순차적으로 스캔(Parallel Scan)하고 있다.
순차 스캔이라는 의미 자체가.. 사실 인덱스를 통해 랜덤 접근을 하지 못한다는 것을 의미한다.
즉, 이는 PRIMARY KEY인 id, created_at 조합이 각 파티션 내에서 인덱스로 사용되지 않고 있다는 것을 의미한다.
실제로 아래와 같이, 단일 인덱스를 추가하는 순간 Index Scan을 사용하는 것을 확인할 수 있다.
-- 테스트 검증용
CREATE INDEX idx_id ON tbl_apm_trace_span (id); -- 임시 단일 테이블 추가 .. 많은 시간..
tomahawk=# EXPLAIN ANALYZE SELECT * FROM tbl_apm_trace_span
tomahawk-# WHERE id = '33560130120000002'
tomahawk-# AND created_at BETWEEN '2024-07-06 14:30:00' AND '2024-07-08 15:30:00';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------
Append (cost=0.29..61.50 rows=24 width=441) (actual time=0.029..0.392 rows=1 loops=1)
-> Index Scan using tbl_apm_trace_span_2024070716_id_idx on tbl_apm_trace_span_2024070716 tbl_apm_trace_span_1 (cost=0.29..2.51 rows=1 width=519) (actual time=0.
028..0.029 rows=1 loops=1)
Index Cond: (id = '33560130120000002'::bigint)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone))
-> Index Scan using tbl_apm_trace_span_2024070717_id_idx on tbl_apm_trace_span_2024070717 tbl_apm_trace_span_2 (cost=0.29..2.51 rows=1 width=526) (actual time=0.
017..0.017 rows=0 loops=1)
Index Cond: (id = '33560130120000002'::bigint)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone))
-> Index Scan using tbl_apm_trace_span_2024070718_id_idx on tbl_apm_trace_span_2024070718 tbl_apm_trace_span_3 (cost=0.29..2.51 rows=1 width=526) (actual time=0.
013..0.013 rows=0 loops=1)
Index Cond: (id = '33560130120000002'::bigint)
Filter: ((created_at >= '2024-07-06 14:30:00'::timestamp without time zone) AND (created_at <= '2024-07-08 15:30:00'::timestamp without time zone))
-> Index Scan using tbl_apm_trace_span_2024070719_id_idx on tbl_apm_trace_span_2024070719 tbl_apm_trace_span_4 (cost=0.29..2.51 rows=1 width=525) (actual time=0.
013..0.013 rows=0 loops=1)
... (생략)
찾아볼 수 있었던 결론
파티셔닝 테이블에 맞게 인덱스를 적용하지 않았을 경우, 실제 쿼리의 동작 과정은 비효율적(정확히 말하자면 인덱스 자체를 전혀 타지 못한다.)으로 동작한다.
이는 당시에는 인지하지 못했던 점이지만, 지금 와서 복기하였을 때 다음과 같은 내용이 정확하다.
파티셔닝의 기준값(created_at)을 복합 인덱스의 선두 prefix로 두지 않았기에, id 기준으로 먼저 탐색했었다.
이는 시간 단위의 파티셔닝의 효율을 전혀 타지 못했고 결국 모든 테이블을 스캔하게 된다.