ITPub博客

首页 > 数据库 > PostgreSQL > postgresql物理备份工具pg_rman的使用详解

postgresql物理备份工具pg_rman的使用详解

原创 PostgreSQL 作者:lusklusklusk 时间:2020-01-17 22:53:12 0 删除 编辑

pg_rman是一个开源的PostgreSQL备份软件,使用的是pg_start_backup(), copy, pg_stop_backup()的备份模式,因为是文件拷贝模式,所以pg_rman必须在数据库节点上运行

pg_rman的使用方法说明
http://ossc-db.github.io/pg_rman/index.html

pg_rman的下载地址
https://github.com/ossc-db/pg_rman/releases

OS是CentOS Linux release 7.6,postgresql版本是PostgreSQL 11.3,则pg_rman的下载版本可以1.3.8或1.3.9
https://github.com/ossc-db/pg_rman/releases/download/V1.3.8/pg_rman-1.3.8-1.pg11.rhel7.x86_64.rpm
https://github.com/ossc-db/pg_rman/releases/download/V1.3.9/pg_rman-1.3.9-1.pg11.rhel7.x86_64.rpm

pg_rman的安装
[root@FRSPGSQLDEV2 ~]# rpm -ivh pg_rman-1.3.9-1.pg11.rhel7.x86_64.rpm
安装好后,pg_rman会生成到/usr/pgsql-11/bin/目录下,我们直接配置/etc/profile就可以使用pg_rman了




pg_rman的使用方法
1、初始化,实际上就是需要一个目录,这个目录将用于存放备份的文件
-bash-4.2$ pg_rman init -B /pgdata/backup
WARNING: ARCLOG_PATH is not set yet
DETAIL: The archive_command is not set in postgresql.conf.
HINT: Please set ARCLOG_PATH in pg_rman.ini or environmental variable.
INFO: SRVLOG_PATH is set to '/pgdata/log'

配置归档,主要是修改postgresql.conf中的以下三个参数,以下/XX就是归档目录
wal_level=replica
archive_mode =on
archive_command ='cp %p /XX/%f'
配置ARCLOG_PATH,就是修改备份目的地目录/pgdata/backup中的pg_rman.ini文件的ARCLOG_PATH参数,这个值就是归档目录/XX
ARCLOG_PATH=/XX

2、全量备份(会备份数据文件和归档的日志文件)
-bash-4.2$ pg_rman backup -b full -B /pgdata/backup

3、验证备份
-bash-4.2$ pg_rman validate -B /pgdata/backup
备注:如果好几个备份都没有执行验证的话,执行一次以上命令就会都验证这些没有验证的备份
      没有验证的话,pg_rman show也能看到这个没有验证的备份,但是状态显示DONE而非OK

4、查看备份
-bash-4.2$ pg_rman show -B /pgdata/backup

5、增量备份(会备份数据文件和归档的日志文件)
-bash-4.2$ pg_rman backup -b incremental  -B /pgdata/backup

6、备份归档日志
-bash-4.2$ pg_rman backup -b archive  -B /pgdata/backup

7、删除备份,后面必须接具体的时间,否则会报错ERROR: delete range option not specified  HINT: Please run with 'pg_rman delete DATE'.
-bash-4.2$ pg_rman delete -B /pgdata/backup "2020-01-05 23:13:19"
--例如我只需要我的备份集能恢复到"2020-01-05 23:13:19",只会保留一份在"2020-01-05 23:13:19"之前的最新备份,其他的备份都会删除。
--以上删除后pg_rman show看不到这些备份,每次备份的物理目录比如/pgdata/backup/20200115还是存在的,但是/pgdata/backup/20200115下面的子目录arclog、database、srvlog被物理删除了
--当然也可以采用备份策略来自动删除,这个自动删除的动作发生在每次备份的时候,以下配置可以写在pg_rman.ini文件中
KEEP_DATA_GENERATIONS = 3   -- 备份集冗余度是3,比如做了3次全备份,做第4次全备份的时候会自动把第1次的备份从catalog里面删除,第四次备份的时候会出现这样的信息INFO: delete the backup with start time:"第1次备份的时间点"
KEEP_DATA_DAYS = 10         -- 备份集保留日期10天

8、删除每次物理备份的目录,基于上面7的基础,即只有执行了delete的才会被purge掉
-bash-4.2$ pg_rman purge -B /pgdata/backup
--delete的时候保留了每次备份的物理目录,执行purge后会把delete残留下来的物理目录比如/pgdata/backup/20200115删除了
--没有执行步骤7的delete,直接执行步骤8的purge不会删除任何东西

9、恢复
-bash-4.2$ pg_rman restore -B /pgdata/backup --recovery-target-time "2020-01-15 22:11:39" --hard-copy
--如果不指定recovery-target-time,则恢复到最新时间
--如果不指定hard-copy,则归档日志目录里的归档日志是使用的硬连接指向备份目录中的归档日志,加了这个参数的话,则是直接把备份目录中的归档日志拷贝到归档日志目录

恢复的一些备注:
9.1、恢复好后,如果遇到数据库启动后变成了只读模式ERROR:  cannot execute XX in a read-only transaction,请检查数据目录下的recovery.conf文件里的信息,把它改名或删除再重新启动
9.2、正常情况数据目录没有recovery.conf文件的,备份目录里也没有recovery.conf文件的,这个文件是每次执行恢复后自动生成的,存放在数据目录中,可以删除或改名。
9.3、恢复时,可以选择原地恢复(覆盖式),也可以使用新的$PGDATA作为恢复目标。原地恢复(覆盖式)时pg_rman会覆盖原有的数据文件,arch, pg_wal目录中的文件,所以,如果你要保留原数据,建议先将原数据目录重命名。
9.4、恢复除了recovery-target-time和hard-copy两个参数外,还有recovery-target-timeline和recovery-target-xid两个参数
--recovery-target-timeline TIMELINE  
如果不指定时间线,则使用$PGDATA/global/pg_control,如果没有$PGDATA/global/pg_control,则使用最新的全量备份集的时间线。
Specifies recovering into a particular timeline. If not specified, the current timeline from ($PGDATA/global/pg_control) is used.  
-bash-4.2$ pg_rman restore -B /home/backup
WARNING: pg_controldata file "/pgdata/global/pg_control" does not exist
INFO: the recovery target timeline ID is not given
INFO: use timeline ID of latest full backup as recovery target: 1

--recovery-target-xid XID
如果不指定,则恢复到最新xid
This parameter specifies the transaction ID up to which recovery will proceed. If not specified, continue recovery to the latest xid.  



实践发现的问题
备份目录一定不能是数据目录的子目录,否则会发现备份越来越大,incremental都比full备份大,原因就是因为数据目录是/pgdata,而备份目录是/pgdata/backup,如此一来,每次备份都会把/pgdata下面的目录都备份一遍,没错,也就是说每次备份都会把上次的备份再备份一遍






备份恢复案例
数据目录是/pgdata,备份目录是/home/backup

1、全备份(时间2020-01-15 21:53:48,database有test1,schema有s1,表有public.table1)
postgres=# create database test1;
postgres=# \c test1
test1=# create schema s1;
test1=# create table table1 (hid int);
test1=# insert into table1 values(1);

-bash-4.2$ pg_rman backup -b full -B /home/backup
-bash-4.2$ pg_rman validate -B /home/backup
INFO: validate: "2020-01-15 21:53:48" backup and archive log files by CRC
INFO: backup "2020-01-15 21:53:48" is valid

2、增量备份(时间2020-01-15 22:11:39,database有test1,schema有s2,表有s2.t1)
postgres=# \c test1
test1=# create table table2(hid int);
test1=# create schema s2;
test1=# create table s2.t1(hid int);

-bash-4.2$ pg_rman backup -b incremental -B /home/backup
-bash-4.2$ pg_rman validate -B /home/backup
INFO: validate: "2020-01-15 22:11:39" backup and archive log files by CRC
INFO: backup "2020-01-15 22:11:39" is valid

3、全量备份(时间2020-01-15 22:21:48,database有test2,schema有s3,表有public.t3和s3.s3)
postgres=# create database test2;
postgres=# \c test2
test2=# create schema s3;
test2=# create table t3(hid int);
test2=# create table s3.s3(hid int);
test2=# insert into t3 values(1);
test2=# insert into s3.s3 values(1);

-bash-4.2$ pg_rman backup -b full -B /home/backup
-bash-4.2$ pg_rman validate -B /home/backup
INFO: validate: "2020-01-15 22:21:48" backup and archive log files by CRC
INFO: backup "2020-01-15 22:21:48" is valid


4、停止服务,删除目录/pgdata
[root@FRSPGSQLDEV2 ~]pkill -9 postgres

5、恢复,不加任何参数
-bash-4.2$ pg_rman restore -B /home/backup
WARNING: pg_controldata file "/pgdata/global/pg_control" does not exist
INFO: the recovery target timeline ID is not given
INFO: use timeline ID of latest full backup as recovery target: 1
INFO: calculating timeline branches to be used to recovery target point
INFO: searching latest full backup which can be used as restore start point
INFO: found the full backup can be used as base in recovery: "2020-01-15 22:21:48"
INFO: copying online WAL files and server log files
INFO: clearing restore destination
INFO: validate: "2020-01-15 22:21:48" backup and archive log files by SIZE
INFO: backup "2020-01-15 22:21:48" is valid
INFO: restoring database files from the full mode backup "2020-01-15 22:21:48"
INFO: searching incremental backup to be restored
INFO: searching backup which contained archived WAL files to be restored
INFO: backup "2020-01-15 22:21:48" is valid
INFO: restoring WAL files from backup "2020-01-15 22:21:48"
INFO: restoring online WAL files and server log files
INFO: generating recovery.conf
INFO: restore complete
HINT: Recovery will start automatically when the PostgreSQL server is started.
--经检测,恢复到最近的数据

6、停止服务,删除目录/pgdata,恢复到2020-01-15 22:11:39
-bash-4.2$ pg_rman restore -B /home/backup --recovery-target-time "2020-01-15 22:11:39"
WARNING: pg_controldata file "/pgdata/global/pg_control" does not exist
INFO: the recovery target timeline ID is not given
INFO: use timeline ID of latest full backup as recovery target: 1
INFO: calculating timeline branches to be used to recovery target point
INFO: searching latest full backup which can be used as restore start point
INFO: found the full backup can be used as base in recovery: "2020-01-15 21:53:48"
INFO: copying online WAL files and server log files
INFO: clearing restore destination
INFO: validate: "2020-01-15 21:53:48" backup and archive log files by SIZE
INFO: backup "2020-01-15 21:53:48" is valid
INFO: restoring database files from the full mode backup "2020-01-15 21:53:48"
INFO: searching incremental backup to be restored
INFO: searching backup which contained archived WAL files to be restored
INFO: backup "2020-01-15 21:53:48" is valid
INFO: restoring WAL files from backup "2020-01-15 21:53:48"
INFO: backup "2020-01-15 22:11:39" is valid
INFO: restoring WAL files from backup "2020-01-15 22:11:39"
INFO: backup "2020-01-15 22:21:48" is valid
INFO: restoring WAL files from backup "2020-01-15 22:21:48"
INFO: restoring online WAL files and server log files
INFO: generating recovery.conf
INFO: restore complete
HINT: Recovery will start automatically when the PostgreSQL server is started.
--经检测,恢复到了2020-01-15 22:11:39,即增量备份的时刻点,有test1库,没有test2库,test1库有s1和s2两个schema,s2.t1存在


7、停止服务,删除目录/pgdata,恢复到2020-01-15 22:11:39
-bash-4.2$ pg_rman restore -B /home/backup --recovery-target-time "2020-01-15 21:53:48"
WARNING: pg_controldata file "/pgdata/global/pg_control" does not exist
INFO: the recovery target timeline ID is not given
INFO: use timeline ID of latest full backup as recovery target: 1
INFO: calculating timeline branches to be used to recovery target point
INFO: searching latest full backup which can be used as restore start point
INFO: found the full backup can be used as base in recovery: "2020-01-15 02:04:57"
INFO: copying online WAL files and server log files
INFO: clearing restore destination
INFO: validate: "2020-01-15 02:04:57" backup and archive log files by SIZE
INFO: backup "2020-01-15 02:04:57" is valid
INFO: restoring database files from the full mode backup "2020-01-15 02:04:57"
INFO: searching incremental backup to be restored
INFO: searching backup which contained archived WAL files to be restored
INFO: backup "2020-01-15 02:04:57" is valid
INFO: restoring WAL files from backup "2020-01-15 02:04:57"
INFO: backup "2020-01-15 21:53:48" is valid
INFO: restoring WAL files from backup "2020-01-15 21:53:48"
INFO: backup "2020-01-15 22:11:39" is valid
INFO: restoring WAL files from backup "2020-01-15 22:11:39"
INFO: backup "2020-01-15 22:21:48" is valid
INFO: restoring WAL files from backup "2020-01-15 22:21:48"
INFO: restoring online WAL files and server log files
INFO: generating recovery.conf
INFO: restore complete
HINT: Recovery will start automatically when the PostgreSQL server is started.
--经检测,恢复到了2020-01-15 22:11:39,即第一次备份时刻点,有test1库,没有test2库,test1库有s1但是没有s2这个schema

来自 “ ITPUB博客 ” ,链接:/30126024/viewspace-2673866/,如需转载,请注明出处,否则将追究法律责任。

请登录后发表评论 登录
全部评论
Welcome to Lukes DB HOME。 10G OCP、11G OCM, 8年以上DBA工作经验,博客仅记录自己的一个学习过程,不代表完全准确,如有需要,欢迎转载。

注册时间:2015-02-02

  • 博文量
    396
  • 访问量
    582795