View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0000455 | Pgpool-II | Bug | public | 2019-01-10 11:17 | 2019-01-21 13:03 |
| Reporter | nagata | Assigned To | nagata | ||
| Priority | normal | Severity | minor | Reproducibility | sometimes |
| Status | closed | Resolution | fixed | ||
| Product Version | 4.0.2 | ||||
| Summary | 0000455: watchdog lifecheck process has segfault in query mode | ||||
| Description | When wd_lifecheck_method is 'query', the lifecheck process crashes with a segmentation fault. It does not happen every time, but I can reproduce it within about five minutes of starting the cluster with the configuration produced by watchdog_setup. In my analysis, this is caused by a new feature in 4.0, AES password support in wd_lifecheck_password. The password is obtained by get_pgpool_config_user_password(), which is not thread-safe because it uses pstrdup and pfree. Unfortunately, 'query' mode uses pthreads. pstrdup can return the same address in different threads, which leads to a double free when pfree is called. To fix this, we have to prohibit the use of get_pgpool_config_user_password() in query mode; that is, AES passwords cannot be supported in wd_lifecheck_password. Scanning pool_passwd when wd_lifecheck_password is empty cannot be supported either. A patch for wd_lifecheck.c is attached, but some documentation fixes will also be necessary. We might be able to make get_pgpool_config_user_password() thread-safe, or change 'query' mode not to use pthreads, but I don't think it is worth the effort, because 'query' mode is deprecated. (Maybe it should have been removed when 4.0 was released.) | ||||
| Steps To Reproduce | It does not happen every time, but I can reproduce it within about five minutes of starting the cluster with the configuration produced by watchdog_setup with wd_lifecheck_method = 'query'. | ||||
| Tags | No tags attached. | ||||
|
|
|
|
|
Thank you. We will confirm this issue and test your patch. |
|
|
A patch for this issue has been created by Usama. If possible, could you test it? For more details, see: https://www.pgpool.net/pipermail/pgpool-hackers/2019-January/003220.html |
|
|
I have tested this patch, but it was broken. pfree was called in the wrong place in check_pgpool_status_by_query, so the password was freed before all threads had exited. This caused another segmentation fault. There was also a memory leak in is_wd_lifecheck_ready, though it was not a big issue. Attached is the fixed patch. It works well in my environment and avoids the segfault. Could you please test it? |
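For readers following the thread, the ownership rule discussed above can be sketched roughly as follows: resolve the password once in the parent before any thread is started, let the worker threads only read it, and free it only after every thread has been joined. This is a minimal illustration, not the committed patch. The names `node_check_arg`, `check_node`, `run_checks`, and `MAX_NODES` are hypothetical, and plain `malloc`/`free` stand in for pstrdup/pfree.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NODES 8   /* hypothetical upper bound, for the sketch only */

/* Hypothetical per-thread argument; not a Pgpool-II structure. */
typedef struct
{
    const char *password;   /* borrowed, read-only; owned by the parent */
    int         ok;         /* result written by the worker thread */
} node_check_arg;

/* Worker: stands in for the per-node lifecheck query.
 * It only reads the shared password and never allocates or frees it. */
static void *check_node(void *p)
{
    node_check_arg *arg = p;

    arg->ok = (arg->password != NULL && strlen(arg->password) > 0);
    return NULL;
}

/* Returns the number of nodes whose check succeeded, or -1 if the
 * password copy could not be allocated. */
int run_checks(const char *configured_password, int n_nodes)
{
    pthread_t       threads[MAX_NODES];
    node_check_arg  args[MAX_NODES];
    int             started = 0;
    int             healthy = 0;
    size_t          len;
    char           *password;

    assert(configured_password != NULL);
    assert(n_nodes >= 0 && n_nodes <= MAX_NODES);

    /* Resolve the password once, in the parent, before any thread exists. */
    len = strlen(configured_password);
    password = malloc(len + 1);
    if (password == NULL)
        return -1;
    memcpy(password, configured_password, len + 1);

    for (int i = 0; i < n_nodes; i++)
    {
        args[i].password = password;
        args[i].ok = 0;
        if (pthread_create(&threads[i], NULL, check_node, &args[i]) != 0)
            break;              /* join only the threads that started */
        started++;
    }

    /* Wait for every started thread before touching the shared buffer. */
    for (int i = 0; i < started; i++)
    {
        pthread_join(threads[i], NULL);
        healthy += args[i].ok;
    }

    /* Safe to free only now: no thread can still be reading it. */
    free(password);
    return healthy;
}
```

Calling `run_checks("secret", 3)` starts three workers that share one read-only copy of the password. Moving the `free` call above the join loop would reproduce the kind of use-after-free described in this note.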
|
|
Thank you for testing and fixing the patch. The patch has been committed: https://git.postgresql.org/gitweb?p=pgpool2.git;a=commitdiff;h=3ae4acfa1037ae1ed4101465ea880909b59dd30e |
|
|
Thanks. The fixed version will be released on Feb. 21st, right? |
|
|
Yes, we plan to make the next minor release on Feb. 21st. |
|
|
Thank you! I'll close this. |
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2019-01-10 11:17 | nagata | New Issue | |
| 2019-01-10 11:17 | nagata | File Added: wd_lifecheck_segfault.patch | |
| 2019-01-10 11:19 | nagata | Description Updated | |
| 2019-01-10 12:11 | nagata | Steps to Reproduce Updated | |
| 2019-01-10 14:01 | pengbo | Note Added: 0002317 | |
| 2019-01-15 14:48 | pengbo | File Added: crash_fix.diff | |
| 2019-01-15 14:48 | pengbo | Note Added: 0002319 | |
| 2019-01-16 15:29 | nagata | File Added: fixed_crash_fix.diff | |
| 2019-01-16 15:29 | nagata | Note Added: 0002320 | |
| 2019-01-16 15:32 | nagata | Note Edited: 0002320 | |
| 2019-01-16 15:36 | nagata | Note Edited: 0002320 | |
| 2019-01-21 11:42 | pengbo | Note Added: 0002323 | |
| 2019-01-21 11:59 | nagata | Note Added: 0002324 | |
| 2019-01-21 12:40 | pengbo | Note Added: 0002325 | |
| 2019-01-21 12:41 | pengbo | Note Edited: 0002325 | |
| 2019-01-21 13:03 | nagata | Note Added: 0002326 | |
| 2019-01-21 13:03 | nagata | Assigned To | => nagata |
| 2019-01-21 13:03 | nagata | Status | new => closed |
| 2019-01-21 13:03 | nagata | Resolution | open => fixed |