ablog

Notes from a clumsy, restless engineer

Timeouts when accessing a cluster-mode-enabled ElastiCache (Redis) from Laravel

Symptom

Solution

  • Configure Laravel's database.php as follows
    'redis' => [

        'client' => env('REDIS_CLIENT', 'predis'),

        'options' => [
            'cluster' => env('REDIS_CLUSTER', 'redis'),
            'prefix' => env('REDIS_PREFIX', Str::slug(env('APP_NAME', 'laravel'), '_').'_database_'),
            'parameters' => [
                'scheme'   => 'tls', # ★ set this under options, not under default
                'password' => env('REDIS_PASSWORD', null),
            ],
        ],

        'clusters' => [
            'default' => [
                [
                    'scheme' => 'tls', # ★ specify tls
                    'host' => env('REDIS_HOST', '127.0.0.1'),
                    'password' => env('REDIS_PASSWORD', null),
                    'port' => env('REDIS_PORT', 6379),
                    'database' => env('REDIS_DB', '0'),
                ],
                'options' => [
                    'cluster' => env('REDIS_CLUSTER', 'redis'),
                ],
            ],
            'cache' => [
                [
                    'scheme' => 'tls', # ★ specify tls
                    'url' => env('REDIS_URL'),
                    'host' => env('REDIS_HOST', '127.0.0.1'),
                    'password' => env('REDIS_PASSWORD', null),
                    'port' => env('REDIS_PORT', 6379),
                    'database' => env('REDIS_CACHE_DB', '0'),
                ],
                'options' => [
                    'cluster' => env('REDIS_CLUSTER', 'redis'),
                ],
            ],
        ],
    ],
];

Environment

  • Laravel 8.38
  • ElastiCache (Redis)
    • Version: 6.0.5
    • Cluster mode: enabled
    • Multi-AZ: enabled
    • Number of shards: 1
    • Number of nodes: 2
    • Encryption in transit: enabled

References

I was able to get it to work!

You need to move 'scheme' from 'options' to 'default':

My working config:

'redis' => [
    'client' => 'predis',
    'cluster' => env('REDIS_CLUSTER', false),

    'default' => [
        'scheme' => 'tls',
        'host' => env('REDIS_HOST', 'localhost'),
        'password' => env('REDIS_PASSWORD', null),
        'port' => env('REDIS_PORT', 6379),
        'database' => 0,
    ],

    'options' => [
        'parameters' => ['password' => env('REDIS_PASSWORD', null)],
    ],
],

Note: I had also removed the 'cluster' option from 'options', but I don't suspect this to be the make-or-break with this problem.

In my final-final config, I changed it to: 'scheme' => env('REDIS_SCHEME', 'tcp'), and then defined REDIS_SCHEME=tls in my env file instead.

Tested with AWS ElastiCache with TLS enabled.

Edit: The above config only works with single-node redis. If you happen to enable clustering and TLS then you'll need a different config entirely.

'redis' => [
    'client' => 'predis',
    'cluster' => env('REDIS_CLUSTER', false),

    // Note! for single redis nodes, the default is defined here.
    // keeping it here for clusters will actually prevent the cluster config
    // from being used, it'll assume single node only.
    //'default' => [
    //    ...
    //],

    // #pro-tip, you can use the Cluster config even for single instances!
    'clusters' => [
        'default' => [
            [
                'scheme'   => env('REDIS_SCHEME', 'tcp'),
                'host'     => env('REDIS_HOST', 'localhost'),
                'password' => env('REDIS_PASSWORD', null),
                'port'     => env('REDIS_PORT', 6379),
                'database' => env('REDIS_DATABASE', 0),
            ],
        ],
        'options' => [ // Clustering specific options
            'cluster' => 'redis', // This tells Redis Client lib to follow redirects (from cluster)
        ]
    ],
    'options' => [
        'parameters' => [ // Parameters provide defaults for the Connection Factory
            'password' => env('REDIS_PASSWORD', null), // Redirects need PW for the other nodes
            'scheme'   => env('REDIS_SCHEME', 'tcp'),  // Redirects also must match scheme
        ],
    ]
]

php - Laravel + Redis Cache via SSL? - Stack Overflow

Explaining the above:

  • 'client' => 'predis': This specifies the PHP Library Redis driver to use (predis).
  • 'cluster' => 'redis': This tells Predis to assume server-side clustering. Which just means "follow redirects" (e.g. -MOVED responses). When running with a cluster, a node will respond with a -MOVED to the node that you must ask for a specific key.
    • If you don't have this enabled with Redis Clusters, Laravel will throw a -MOVED exception whenever the key lives on another node. With n nodes it only gets lucky and asks the right node roughly 1/n of the time.
  • 'clusters' => [...] : Specifies a list of nodes, but setting just a 'default' and pointing it to the AWS 'Configuration endpoint' will let it find any/all other nodes dynamically (recommended for ElastiCache, because you don't know when nodes are coming or going).
  • 'options': For Laravel, can be specified at the top level, the cluster level, and the node level. (They get combined in Illuminate before being passed off to Predis.)
  • 'parameters': These 'override' the default connection settings/assumptions that Predis uses for new connections. Since we set them explicitly for the 'default' connection, these aren't used. But for a cluster setup, they are critical. A 'master' node may send back a redirect (-MOVED) and unless the parameters are set for password and scheme it'll assume defaults, and that new connection to the new node will fail.
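
To make the redirect concrete: per the Redis Cluster specification, a redirect error has the form "MOVED <slot> <host>:<port>". It carries no scheme and no password, which is why the client has to take those from 'parameters' when it opens the new connection. The sketch below only parses such a reply; parse_moved is a hypothetical helper name and the address is a made-up sample.

```shell
# Parse a Redis Cluster redirect such as "MOVED 3999 10.0.0.2:6379".
# Prints "slot host port" and returns 0 for a MOVED reply, 1 otherwise.
parse_moved() {
    reply=${1#-}                 # tolerate the leading "-" of a RESP error
    set -- $reply                # split into words: MOVED <slot> <host:port>
    [ "$#" -eq 3 ] && [ "$1" = "MOVED" ] || return 1
    # Everything the redirect tells us; TLS and auth are not part of it.
    echo "$2 ${3%:*} ${3##*:}"
}
```

Usage: parse_moved '-MOVED 3999 10.0.0.2:6379' prints the slot, host and port of the node that owns the key.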

Deleting a Reader instance while issuing concurrent queries to Aurora PostgreSQL through RDS Proxy

I checked whether queries fail when a Reader instance is deleted while many concurrent queries are being issued to Aurora PostgreSQL through RDS Proxy.

To Do

  • Try pgbench with the -d option.

Test procedure

  • Connect with psql
$ psql -h apg117-2-test-read-only.endpoint.proxy-********.ap-northeast-1.rds.amazonaws.com -p 5432 -d postgres -U awsuser
  • Create the tables and load data with pgbench
$ pgbench -i -s 1000 -U awsuser -h apg117-2-test.proxy-********.ap-northeast-1.rds.amazonaws.com -d postgres
  • Apply load with pgbench
$ nohup pgbench -Sn -c 300 -j 300 -t 10000 -U awsuser -h apg117-2-test-read-only.endpoint.proxy-********.ap-northeast-1.rds.amazonaws.com -d postgres -p 5432 > reader_delete_202109301213.log 2>&1 &

Results

  • reader_delete_202109301213.log
pghost: apg117-2-test-read-only.endpoint.proxy-********.ap-northeast-1.rds.amazonaws.com pgport: 5432 nclients: 300 nxacts: 10000 dbName: postgres
transaction type: SELECT only
scaling factor: 1000
query mode: simple
number of clients: 300
number of threads: 300
number of transactions per client: 10000
number of transactions actually processed: 2498607/3000000 ★
tps = 1171.672954 (including connections establishing)
tps = 1172.147579 (excluding connections establishing)
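
The ★ line shows that the run stopped short of the requested total. A minimal sketch for pulling that shortfall out of a pgbench log follows; pgbench_processed_summary is a hypothetical helper name, and it assumes the log contains the "number of transactions actually processed: DONE/TOTAL" line in the format shown above.

```shell
# Summarize how many pgbench transactions completed, based on the
# "number of transactions actually processed: DONE/TOTAL" line.
pgbench_processed_summary() {
    # $1: path to a pgbench log file
    awk -F': ' '/^number of transactions actually processed:/ {
        split($2, n, "/")        # n[1] = processed, n[2] = requested total
        printf "processed=%d total=%d missing=%d ratio=%.2f%%\n", n[1], n[2], n[2] - n[1], n[1] * 100 / n[2]
    }' "$1"
}
```

Usage: pgbench_processed_summary reader_delete_202109301213.log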

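Environment

Confirmed that concurrent queries issued to the Aurora Reader endpoint through RDS Proxy are distributed evenly

Following up on the earlier ablog post about cases where queries are not evenly distributed across the Aurora PostgreSQL Reader endpoint, I wondered how it would behave through RDS Proxy. When I tested it, the load was distributed roughly evenly.

Test procedure

  • Connect with psql
$ psql -h apg117-2-test-read-only.endpoint.proxy-********.ap-northeast-1.rds.amazonaws.com -p 5432 -d postgres -U awsuser
  • Create the tables and load data with pgbench
$ pgbench -i -s 1000 -U awsuser -h apg117-2-test.proxy-********.ap-northeast-1.rds.amazonaws.com -d postgres
  • Apply load with pgbench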
$ pgbench -Sn -c 300 -j 300 -t 100000 -U awsuser -h apg117-2-test-read-only.endpoint.proxy-********.ap-northeast-1.rds.amazonaws.com -d postgres -p 5432 > /dev/null 2>&1

Results

  • CPU utilization and DB connections are distributed roughly evenly.

[Figure: CPU utilization and DB connections per instance]

  • The RDS Proxy CloudWatch metrics looked like this.

[Figure: RDS Proxy CloudWatch metrics]

Environment

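Checking whether pcp_detach_node -g can detach a backend DB without affecting existing sessions

What I wanted to confirm

  • Whether pcp_detach_node -g can detach a backend DB without affecting existing sessions
    • After the detach command runs, queries are still routed to the node, and it is not detached until its sessions are gone
  • Plain pcp_detach_node detaches the node forcibly, but existing sessions are affected

Procedure

  • Repeatedly issue queries to PostgreSQL through Pgpool-II with pgbench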
while :
do
    pgbench -Sn -c 10 -j 10 -t 10000 -U awsuser -h localhost -d postgres -p 9999
done
$ psql "host=localhost dbname=postgres port=9999 user=awsuser"
postgres=> show pool_nodes;
 node_id |                                          hostname                                           | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+---------------------------------------------------------------------------------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | aurora-postgres124.cluster-********.ap-northeast-1.rds.amazonaws.com                    | 5432 | up     | 0.000000  | primary | 6          | false             | 0                 |                   |                        | 2021-09-29 04:35:21
 1       | aurora-postgres124-instance-1-ap-northeast-1a.********.ap-northeast-1.rds.amazonaws.com | 5432 | up     | 0.500000  | standby | 256822     | true              | 0                 |                   |                        | 2021-09-29 04:35:21
 2       | reader2.********.ap-northeast-1.rds.amazonaws.com                                       | 5432 | up     | 0.500000  | standby | 246942     | false             | 0                 |                   |                        | 2021-09-29 05:03:07
(3 rows)
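  • Detach a backend DB instance from Pgpool-II
    • Even after running this, "show pool_nodes" shows select_cnt still increasing
$ pcp_detach_node -U pgpool -h localhost -n 2 -g
  • Check how queries are being issued through Pgpool-II
    • The node is not detached from Pgpool-II while this connection stays open, so disconnect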
postgres=> show pool_nodes;
 node_id |                                          hostname                                           | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+---------------------------------------------------------------------------------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | aurora-postgres124.cluster-********.ap-northeast-1.rds.amazonaws.com                    | 5432 | up     | 0.000000  | primary | 8          | false             | 0                 |                   |                        | 2021-09-29 04:35:21
 1       | aurora-postgres124-instance-1-ap-northeast-1a.********.ap-northeast-1.rds.amazonaws.com | 5432 | up     | 0.500000  | standby | 321493     | true              | 0                 |                   |                        | 2021-09-29 04:35:21
 2       | reader2.********.ap-northeast-1.rds.amazonaws.com                                       | 5432 | up     | 0.500000  | standby | 311211     | false             | 0                 |                   |                        | 2021-09-29 05:03:07
(3 rows)

postgres=> \q
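
Comparing select_cnt between two wide "show pool_nodes" outputs by eye is tedious. A rough sketch that reduces the output to node_id, status and select_cnt follows. pool_nodes_summary is a hypothetical helper name, and it assumes the column order shown above (status is column 4, select_cnt is column 7), which may differ in other Pgpool-II versions.

```shell
# Reduce unaligned "show pool_nodes" output to: node_id status select_cnt.
# Input is expected on stdin with "|" as the field separator,
# which is what psql emits with -A (unaligned) and -t (tuples only).
pool_nodes_summary() {
    awk -F'|' 'NF >= 7 { print $1, $4, $7 }'
}

# Example against a live Pgpool-II (connection values taken from this post):
#   psql "host=localhost dbname=postgres port=9999 user=awsuser" -At -c "show pool_nodes" | pool_nodes_summary
```

Running it a few times after pcp_detach_node -g makes it easy to see whether select_cnt on the detached node is still growing.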

Reference

  • Command to attach a node
$ pcp_attach_node -U pgpool -h localhost -n 1

Detaching and attaching a PostgreSQL node in Pgpool-II

Notes from trying to detach and attach a PostgreSQL node in Pgpool-II.

  • Check the status
$ psql "host=localhost dbname=postgres port=9999 user=awsuser"
psql (9.2.24, server 12.4)
WARNING: psql version 9.2, server version 12.0.
         Some psql features might not work.
Type "help" for help.

postgres=> show pool_nodes;
 node_id |                                          hostname                                           | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+---------------------------------------------------------------------------------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | aurora-postgres124.cluster-********.ap-northeast-1.rds.amazonaws.com                    | 5432 | up     | 0.000000  | primary | 0          | false             | 0                 |                   |                        | 2021-09-15 21:30:55
 1       | aurora-postgres124-instance-1-ap-northeast-1a.********.ap-northeast-1.rds.amazonaws.com | 5432 | up★     | 0.500000  | standby | 0          | true              | 0                 |                   |                        | 2021-09-15 21:34:06
 2       | reader2.********.ap-northeast-1.rds.amazonaws.com                                       | 5432 | up     | 0.500000  | standby | 0          | false             | 0                 |                   |                        | 2021-09-15 22:39:21
(3 rows)

postgres=>
  • Detach the node
    • -U: PCP user
    • -h: Pgpool-II host
    • -n: ID of the backend database node in Pgpool-II
    • -g: wait until all clients have disconnected
$ pcp_detach_node -U pgpool -h localhost -n 1 -g
Password: pgpool # enter the PCP password
pcp_detach_node -- Command Successful
  • Attach the node
$ pcp_attach_node -U pgpool -h localhost -n 1
  • Check the status
    • With the "-g" option the node does not go down; needs investigation
postgres=> show pool_nodes;
 node_id |                                          hostname                                           | port | status | lb_weight |  role   | select_cnt | load_balance_node | replication_delay | replication_state | replication_sync_state | last_status_change
---------+---------------------------------------------------------------------------------------------+------+--------+-----------+---------+------------+-------------------+-------------------+-------------------+------------------------+---------------------
 0       | aurora-postgres124.cluster-********.ap-northeast-1.rds.amazonaws.com                    | 5432 | up     | 0.000000  | primary | 0          | false             | 0                 |                   |                        | 2021-09-15 21:30:55
 1       | aurora-postgres124-instance-1-ap-northeast-1a.*******.ap-northeast-1.rds.amazonaws.com | 5432 | down★   | 0.500000  | standby | 0          | false             | 0                 |                   |                        | 2021-09-15 22:54:08
 2       | reader2.*******.ap-northeast-1.rds.amazonaws.com                                       | 5432 | up     | 0.500000  | standby | 0          | true              | 0                 |                   |                        | 2021-09-15 22:39:21
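
Since -g waits for clients to disconnect, the status may not change right away. A small polling loop could help observe when (or whether) the node actually goes down. This is only a sketch: wait_for_status and node1_status are hypothetical helper names, node1_status relies on the column order shown above, and psql needs to be able to connect without prompting (for example via ~/.pgpass).

```shell
# Poll a status-producing command until it prints the wanted status.
# $1: wanted status (e.g. down), $2: max attempts, $3: seconds between attempts,
# remaining args: command that prints the node's current status.
wait_for_status() {
    want=$1; attempts=$2; interval=$3; shift 3
    i=0
    while [ "$i" -lt "$attempts" ]; do
        [ "$("$@")" = "$want" ] && return 0
        i=$((i + 1))
        sleep "$interval"
    done
    return 1   # status never matched within the allowed attempts
}

# Hypothetical helper: status of backend node 1 via "show pool_nodes".
node1_status() {
    psql "host=localhost dbname=postgres port=9999 user=awsuser" -At -c "show pool_nodes" |
        awk -F'|' '$1 == 1 { print $4 }'
}
```

Usage: wait_for_status down 60 5 node1_status checks every 5 seconds, up to 60 times, and returns non-zero if the node never reports down.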

Resetting a specific parameter value in an Aurora cluster parameter group

This can be done with the AWS CLI.
The following example resets "log_min_duration_statement" in the parameter group "aurora-postgresql11-custom-cluster".

$ aws rds reset-db-cluster-parameter-group --db-cluster-parameter-group-name aurora-postgresql11-custom-cluster --parameters "ParameterName=log_min_duration_statement,ApplyMethod=immediate"
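
To reset several parameters at once, the --parameters option accepts multiple space-separated entries in the same shorthand form. The sketch below only prints the command (a dry run) so it can be reviewed before running; build_reset_command is a hypothetical helper name. Note that ApplyMethod=immediate is only valid for dynamic parameters; static ones need pending-reboot.

```shell
# Print the reset command for one or more parameter names (dry run).
# $1: cluster parameter group name, remaining args: parameter names.
build_reset_command() {
    group=$1; shift
    cmd="aws rds reset-db-cluster-parameter-group --db-cluster-parameter-group-name $group --parameters"
    for name in "$@"; do
        # One shorthand entry per parameter to reset.
        cmd="$cmd \"ParameterName=$name,ApplyMethod=immediate\""
    done
    echo "$cmd"
}
```

Usage: build_reset_command aurora-postgresql11-custom-cluster log_min_duration_statement prints the same command as the example above.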