Summary: Integrated Data Protection Appliance (IDPA) model DP4400 requires a specific shutdown procedure to ensure that all of the software components are shut down gracefully. Failure to follow this procedure may result in problems when restarting the appliance.
Article Content
Instructions
Applies to:
- DP4400
Preamble:
DP4400 version 2.2 currently has a detailed shutdown procedure to ensure that all components of the appliance are shut down gracefully.
Please note that the shutdown procedure is lengthy and can take up to 45 minutes to complete.
Skipping any of the steps could leave the DP4400 Backup Server or Storage components unable to restart.
Enhancements to reduce the complexity and time required to shut down a DP4400 are forthcoming in future releases.
Documentation:
The full (8-page) shutdown procedure, along with shutdown troubleshooting steps, is detailed in the DP4400 Field Service Guide:
https://support.emc.com/docu89622_DP4400_Field_Service_Guide.pdf?language=en_US
CAUTION==================================================================================================================
If you use the power button icon in the ACM to shut down the IDPA DP4400, you must ensure you have a way to power it back on.
There are only two ways to power it back on after a shutdown request from the ACM:
1. Physically press the power button at the top left of the appliance.
2. Use the iDRAC (Integrated Dell Remote Access Controller) "Power-on" function. This requires an RJ45 Ethernet cable connected to the iDRAC port, and the port must have a network-accessible IP address.
============================================================================================================================
Procedure:
Pre-shutdown activity and verifying the status of Backup Server and Storage
1. Use SSH to log in to the AVE IP address shown on the ACM dashboard. Use "admin" as the user and the common password for the appliance.
2. From the root login, run the /usr/local/avamar/bin/avinstaller.pl --checkProcessingPackage command to check whether a package installation is in progress on AVE. If one is, wait for it to complete.
3. Run the dpnctl status all command.
Examine the output and ensure that all important Backup Server services are up and running, as shown in the following example output. If they are not, contact support.
admin@xxxxxxx~/>: dpnctl status all
Identity added: /home/admin/.ssh/admin_key (/home/admin/.ssh/admin_key)
dpnctl: INFO: gsan status: up
dpnctl: INFO: MCS status: up
dpnctl: INFO: emt status: up
dpnctl: INFO: Backup scheduler status: up
dpnctl: INFO: Maintenance windows scheduler status: enabled
dpnctl: INFO: Unattended startup status: disabled
dpnctl: INFO: avinstaller status: up
dpnctl: INFO: ConnectEMC status: up
dpnctl: INFO: ddrmaint-service status: up
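The check in step 3 can also be done programmatically on saved output. The following is a minimal Python sketch, assuming the output has the "dpnctl: INFO: <name> status: <state>" line format shown above. The REQUIRED_UP list and the function names are illustrative assumptions drawn from the example output, not an official checklist.

```python
import re

# Services this sketch expects to report "up". The list mirrors the example
# output in this article; it is an assumption, not an official checklist.
REQUIRED_UP = (
    "gsan", "MCS", "emt", "Backup scheduler",
    "avinstaller", "ConnectEMC", "ddrmaint-service",
)

# Matches lines such as "dpnctl: INFO: gsan status: up".
STATUS_LINE = re.compile(r"^dpnctl: INFO: (.+?) status: (\S+)$")


def parse_dpnctl_status(text):
    """Return {service name: state} parsed from `dpnctl status all` output."""
    states = {}
    for line in text.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            states[match.group(1)] = match.group(2)
    return states


def services_not_up(text):
    """List the required services whose state is missing or not 'up'."""
    states = parse_dpnctl_status(text)
    return [name for name in REQUIRED_UP if states.get(name) != "up"]


SAMPLE = """\
Identity added: /home/admin/.ssh/admin_key (/home/admin/.ssh/admin_key)
dpnctl: INFO: gsan status: up
dpnctl: INFO: MCS status: up
dpnctl: INFO: emt status: up
dpnctl: INFO: Backup scheduler status: up
dpnctl: INFO: Maintenance windows scheduler status: enabled
dpnctl: INFO: Unattended startup status: disabled
dpnctl: INFO: avinstaller status: up
dpnctl: INFO: ConnectEMC status: up
dpnctl: INFO: ddrmaint-service status: up
"""

if __name__ == "__main__":
    # An empty list means every required service reported "up".
    print(services_not_up(SAMPLE))
```

An empty result means every listed service reported "up"; any name returned should be investigated before continuing.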
4. Run the mccli checkpoint show command to check all the checkpoints available on the Avamar system.
Tip: Take a screenshot of the output of this command; it will be helpful in the later stages of the shutdown procedure.
admin@xxxxxxx:/home/admin/>: mccli checkpoint show
0,23000,CLI command completed successfully
Tag               Time                    Validated Deletable
cp.20180523033106 2018-05-23 09:01:06 IST Validated No
cp.20180523033444 2018-05-23 09:04:44 IST           No
cp.20180523054859 2018-05-23 11:18:59 IST           No
5. Run the mccli checkpoint create --override_maintenance_scheduler command to create a checkpoint on AVE.
admin@xxxxxxx:/home/admin/>: mccli checkpoint create --override_maintenance_scheduler
0,22624, Starting to create a server checkpoint.
6. After the previous command completes, run the mccli checkpoint show command on the AVE again to find the tag assigned to the checkpoint you just created. The entry may take some time to appear (you may need to repeat the command two or three times). Identify the new checkpoint by the timestamp of its entry. In the following example output, cp.20180523033444 is the tag of the newly created checkpoint.
admin@xxxxxxx:/home/admin/>: mccli checkpoint show
0,23000,CLI command completed successfully
Tag               Time                    Validated Deletable
cp.20180523033106 2018-05-23 09:01:06 IST Validated No
cp.20180523033444 2018-05-23 09:04:44 IST           Yes
cp.20180523054859 2018-05-23 11:18:59 IST           No
cp.20180523055705 2018-05-23 11:27:05 IST           No
7. Run the mccli checkpoint validate --cptag=<cp_tag_of_new_checkpoint> --override_maintenance_scheduler command to validate the checkpoint.
admin@xxxxxxx:/home/admin/>: mccli checkpoint validate --cptag=cp.20180523033444 --override_maintenance_scheduler
0,22612,Starting to validate a server checkpoint
Attribute Value
tag cp.20180523033444
type Full
8. Run the mccli checkpoint show command to check the status of the validation process of the checkpoint.
The screen will display In Progress for an extended period of time.
Wait until the screen displays a Validated status for the checkpoint tag.
admin@xxxxxxx:/home/admin/>: mccli checkpoint show
0,23000,CLI command completed successfully
Tag               Time                    Validated   Deletable
cp.20180523033106 2018-05-23 09:01:06 IST Validated   No
cp.20180523033444 2018-05-23 09:04:44 IST In Progress Yes
cp.20180523054859 2018-05-23 11:18:59 IST             No
cp.20180523055705 2018-05-23 11:27:05 IST             No
admin@xxxxxxx:/home/admin/>: mccli checkpoint show
0,23000,CLI command completed successfully
Tag               Time                    Validated   Deletable
cp.20180523033106 2018-05-23 09:01:06 IST Validated   No
cp.20180523033444 2018-05-23 09:04:44 IST Validated   Yes
cp.20180523054859 2018-05-23 11:18:59 IST             No
cp.20180523055705 2018-05-23 11:27:05 IST             No
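When step 8 has to be repeated several times, it can help to read the validation state of one tag out of saved mccli checkpoint show output. The following Python sketch assumes the row layout shown above (tag, date, time, a one-word time zone, an optional Validated value, then Deletable). The function names are illustrative and are not part of mccli.

```python
def parse_checkpoints(text):
    """Return {tag: {"time", "validated", "deletable"}} from the listing."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows start with a tag such as cp.20180523033444 and carry at
        # least: tag, date, time, time zone, Deletable.
        if len(parts) < 5 or not parts[0].startswith("cp."):
            continue
        rows[parts[0]] = {
            "time": " ".join(parts[1:4]),
            # Whatever sits between the time zone and the last column is the
            # Validated value; it is empty for unvalidated checkpoints.
            "validated": " ".join(parts[4:-1]),
            "deletable": parts[-1],
        }
    return rows


def is_validated(text, tag):
    """True only when the given tag shows 'Validated' in the listing."""
    return parse_checkpoints(text).get(tag, {}).get("validated") == "Validated"


SAMPLE = """\
0,23000,CLI command completed successfully
Tag Time Validated Deletable
cp.20180523033106 2018-05-23 09:01:06 IST Validated No
cp.20180523033444 2018-05-23 09:04:44 IST In Progress Yes
cp.20180523054859 2018-05-23 11:18:59 IST No
"""

if __name__ == "__main__":
    for tag, row in parse_checkpoints(SAMPLE).items():
        print(tag, repr(row["validated"]))
```

A tag that is still "In Progress", or has an empty Validated column, is reported as not validated, so the shutdown should wait.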
9. From the root login, run the avmaint hfscheckstatus <checkpoint_tag> --ava command to check the status of the hfscheck job.
If necessary, run the avmaint hfscheck --checkpoint=<checkpoint_tag> --ava command to perform an hfscheck on the checkpoint.
Wait until the hfscheckstatus output reports 100.00 percent completion.
root@xxxxxxx:/home/admin/#:avmaint hfscheckstatus cp.20180524033103 --ava
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<hfscheckstatus
nodes-queried="1"
nodes-replied="1"
nodes-total="1"
checkpoint="cp.20180524033103"
status="waitcomplete"
type="full"
checks="full"
elapsed-time="114"
start-time="1527154524"
end-time="0"
check-start-time="1527154524"
check-end-time="1527154562"
generation-time="1527154565"
stripes-checking="31"
stripes-completed="31"
offline-stripes="0"
minutes-to-completion="100.00">
<hfscheckerrors/>
</hfscheckstatus>
RECHECK
root@xxxxxxx:/home/admin/#: avmaint hfscheck --checkpoint=cp.20180524033103 --ava
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<hfscheck
checkpoint="cp.20180524033103"
status="waitcgsan"
type="full"
checks="full"
elapsed-time="73"
start-time="1527154451"
end-time="0"
check-start-time="0"
check-end-time="0"
generation-time="1527154524"
percent-complete="0.00">
<hfscheckerrors/>
</hfscheck>
RECHECK
root@xxxxxxx:/home/admin/#:avmaint hfscheckstatus cp.20180524033103 --ava
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<hfscheckstatus
nodes-queried="1"
nodes-replied="1"
nodes-total="1"
checkpoint="cp.20180524033103"
status="completed"
result="OK"
type="full"
checks="full"
elapsed-time="103"
start-time="1527154451"
end-time="1527154554"
check-start-time="1527154524"
check-end-time="1527154554"
generation-time="1527154651"
stripes-checking="31"
stripes-completed="31"
offline-stripes="0"
percent-complete="100.00">
<hfscheckerrors/>
</hfscheckstatus>
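The completion check in step 9 can be sketched in Python against saved avmaint hfscheckstatus output. This sketch assumes the output is well-formed XML with the status and result attributes shown in the final example above. The function name is illustrative.

```python
import xml.etree.ElementTree as ET


def hfscheck_passed(xml_text):
    """True when the status document reports status=completed and result=OK."""
    # Strip first: the XML declaration must be at the very start of the text.
    root = ET.fromstring(xml_text.strip())
    return root.get("status") == "completed" and root.get("result") == "OK"


SAMPLE = """
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<hfscheckstatus
 nodes-queried="1"
 nodes-replied="1"
 nodes-total="1"
 checkpoint="cp.20180524033103"
 status="completed"
 result="OK"
 type="full"
 checks="full"
 stripes-checking="31"
 stripes-completed="31"
 offline-stripes="0"
 percent-complete="100.00">
 <hfscheckerrors/>
</hfscheckstatus>
"""

if __name__ == "__main__":
    print(hfscheck_passed(SAMPLE))
```

A document whose status is still "waitcomplete", or that has no result attribute yet, is treated as not finished.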
10. Run the dpnctl stop sched command to stop AVE from starting any further scheduled backup jobs (jobs that are already running continue to run).
admin@xxxxxxx:~/>: dpnctl stop sched
Identity added: /home/admin/.ssh/admin_key (/home/admin/.ssh/admin_key)
dpnctl: INFO: Suspending backup scheduler...
dpnctl: INFO: Backup scheduler suspended.
11. Run the dpnctl stop maint command to stop maintenance services running on Avamar.
admin@xxxxxxx:~/>: dpnctl stop maint
Identity added: /home/admin/.ssh/admin_key (/home/admin/.ssh/admin_key)
dpnctl: INFO: Suspending maintenance windows scheduler...
dpnctl: INFO: Maintenance windows scheduler suspended.
12. From the root login, run the cplist command and verify the following:
a. An hfschecked checkpoint is present from within the last 36 hours.
b. At least one checkpoint created within the last 36 hours has an hfs entry.
root@xxxxxxx:/usr/local/avamar/bin/#: cplist
cp.20180524033103 Thu May 24 09:01:03 2018 valid hfs --- nodes 1/1 stripes 32
cp.20180524033441 Thu May 24 09:04:03 2018 valid hfs --- nodes 1/1 stripes 32
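The 36-hour check in step 12 can be approximated from saved cplist output. This Python sketch assumes that the digits in a checkpoint tag (cp.YYYYMMDDHHMMSS) are the creation time in UTC, which is consistent with the precheck output later in this article, and that an hfschecked checkpoint shows both "valid" and "hfs" on its line. The function names are illustrative.

```python
from datetime import datetime, timedelta, timezone


def tag_time(tag):
    """Decode cp.YYYYMMDDHHMMSS into an aware UTC datetime (assumed UTC)."""
    return datetime.strptime(tag[3:], "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def recent_hfs_checkpoints(text, now, max_age=timedelta(hours=36)):
    """Return tags of valid, hfschecked checkpoints no older than max_age."""
    recent = []
    for line in text.splitlines():
        parts = line.split()
        if not parts or not parts[0].startswith("cp."):
            continue
        flags = parts[1:]
        if "valid" in flags and "hfs" in flags:
            age = now - tag_time(parts[0])
            if timedelta(0) <= age <= max_age:
                recent.append(parts[0])
    return recent


SAMPLE = """\
cp.20180524033103 Thu May 24 09:01:03 2018 valid hfs --- nodes 1/1 stripes 32
cp.20180524033441 Thu May 24 09:04:03 2018 valid hfs --- nodes 1/1 stripes 32
"""

if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(recent_hfs_checkpoints(SAMPLE, now))
```

An empty list suggests that no hfschecked checkpoint falls inside the window, in which case steps 5 through 9 should be revisited.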
13. Run the avmaint sessions command on AVE to list all the sessions currently running.
To stop a session, note its sessionid and run the avmaint kill <sessionid> command.
Repeat this for every session until no session entries remain on AVE.
admin@xxxxxxx://>: avmaint sessions
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<nodesessionlist count="1">
<sessionlist
id="0.0"
count="1">
<session
numthreads="1"
type="avtarbackup"
ndispatchers="1"
expires="1532240626"
domain=""
workorderid="MOD-1527056601340"
pidnum="1001"
numconns="1"
path="/clients/acmpun059.lss.emc.com"
starttime="1527056651"
encrypt="tls-sa"
dispatcher0="xxxxxxxxxxxxxx"
sessionid="9152705660134709"
root="/"
pluginid="Unix"
encrypt-strength="high"
clientid="86752318de80049804395b0756fde3fa034a9846"
user=""
clientip="xxxxxxxxxxxxx">
<host
numprocs="4"
speed="16777200"
osuser="root"
name="xxxxxxx"
memory="32175">
<build
msgversion="13-10"
time="06:46:59"
appname="avtar"
zlibversion="1.2.8"
lzoversion="1.08 Jul 12 2002"
date="Mar 22 2018"
appversion="7.5.101-101_HF294929"
processortype="x86_64"
osversion="SLES-64"
sslversion="TLSv1 OpenSSL 1.0.2a-fips 19 Mar 2015"
osname="Linux"/>
admin@xxxxxxx://>: avmaint kill 9152705692533109
kill: killed 9152705692533109
RECHECK
admin@xxxxxxx:/home/admin/>: avmaint sessions
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<nodesessionlist count="1">
<sessionlist
id="0.0"
count="0"/>
</nodesessionlist>
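To avoid copying session IDs by hand in step 13, the sessionid values can be pulled out of saved avmaint sessions output. This Python sketch assumes the output is well-formed XML in which each session element carries a sessionid attribute, as in the example above. The sample is trimmed to a few attributes, and the function name is illustrative.

```python
import xml.etree.ElementTree as ET


def session_ids(xml_text):
    """Return the sessionid of every <session> element in the listing."""
    # Strip first: the XML declaration must be at the very start of the text.
    root = ET.fromstring(xml_text.strip())
    return [session.get("sessionid") for session in root.iter("session")]


ACTIVE = """
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<nodesessionlist count="1">
 <sessionlist id="0.0" count="1">
  <session type="avtarbackup" sessionid="9152705660134709" root="/"/>
 </sessionlist>
</nodesessionlist>
"""

IDLE = """
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<nodesessionlist count="1">
 <sessionlist id="0.0" count="0"/>
</nodesessionlist>
"""

if __name__ == "__main__":
    for sid in session_ids(ACTIVE):
        # Each ID printed here would be passed to: avmaint kill <sessionid>
        print(sid)
```

An empty list corresponds to the count="0" listing shown in the RECHECK output, meaning no sessions remain.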
14. From the Avamar root login, run the avmaint cpstatus command to verify that no checkpoint is in progress.
Verify that all the checkpoints listed are in a completed state. Wait for checkpoints to complete if they are running.
root@xxxxxxx://#: avmaint cpstatus
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<cpstatus
generation-time="1527099528"
tag="cp.20180523055705"
status="completed"
stripes-completed="32"
stripes-total="32"
start-time="1527055025"
end-time="1527055044"
result="OK"
refcount="1"/>
15. Run the avmgr getb --path=/MC_BACKUPS --mr=1 --format=xml command to verify that the MCS has been flushed within the last 12 hours.
To convert the created value from the output into a readable time, run t.pl <time_tag> (execute it in the /usr/local/avamar/bin directory).
If the MCS has not been flushed in the last 12 hours, run the mcserver.sh --flush command to flush the MCS on AVE.
admin@xxxxxxx:~/>: avmgr getb --path=/MC_BACKUPS --mr=1 --format=xml
1 Request succeeded
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<backuplist version="3.0">
<backuplistrec flags="32768001" labelnum="418" label=""
created="157174902"
roothash="587c90ceea90e7523366025b3955a8ed142170f"
totalbytes="48514156.00"
ispresentbytes="0.00" pidnum="1001" percentnew="0" expires="0"
created_prectime="0x1d3f371fd3ffb5a" partial="0"
retentiontype="daily,weekly,monthly,yearly"
backuptype="full" ddrindex="0" locked="1" direct_restore="1"
tier="0"
appconsistent="not_available"/>
</backuplist>
admin@xxxxxxx:~/>: mcserver.sh --flush
=== BEGIN === check.mcs (preflush)
check.mcs passed
=== PASS === check.mcs PASSED OVERALL (preflush)
Flushing Administrator Server...
Administrator Server Flushed.
RECHECK after manual flush
admin@xxxxxxx:~/>: avmgr getb --path=/MC_BACKUPS --mr=1 --format=xml
1 Request succeeded
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<backuplist version="3.0">
<backuplistrec flags="32768001" labelnum="419" label=""
created="1527178362"
roothash="5876c90ceea90e7523366025b3955a8ed1422170f"
totalbytes="48514156.00"
ispresentbytes="0.00" pidnum="100" percentnew="0" expires="0"
created_prectime="0x1d3f371fd3ffb5a" partial="0"
retentiontype="daily,weekly,monthly,yearly"
backuptype="full" ddrindex="0" locked="1" direct_restore="1"
tier="0"
appconsistent="not_available"/>
</backuplist>
Verify last flush time
admin@xxxxxxx:/usr/local/avamar/bin/>: t.pl 1527178362
local: Thu May 24 21:42:42 2018 gmt: Thu May 24 16:12:42 2018
admin@xxxxxxx:~/>: mcserver.sh --flush
=== BEGIN === check.mcs (preflush)
check.mcs passed
=== PASS === check.mcs PASSED OVERALL (preflush)
Flushing Administrator Server...
Administrator Server Flushed
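The 12-hour flush check in step 15 amounts to converting the created value to a time and comparing it with the current time. The following Python sketch assumes that created holds Unix epoch seconds, which is consistent with the t.pl conversion shown above. The function names are illustrative and this is not a replacement for t.pl.

```python
import re
from datetime import datetime, timedelta, timezone


def last_flush_time(avmgr_output):
    """Return the created time of the first backup record as UTC, or None."""
    # The ="..." directly after "created" keeps created_prectime from matching.
    match = re.search(r'\bcreated="(\d+)"', avmgr_output)
    if match is None:
        return None
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)


def flushed_recently(avmgr_output, now, max_age=timedelta(hours=12)):
    """True when the last MCS flush is no older than max_age."""
    flushed = last_flush_time(avmgr_output)
    return flushed is not None and timedelta(0) <= now - flushed <= max_age


SAMPLE = """\
1 Request succeeded
<backuplist version="3.0">
<backuplistrec flags="32768001" labelnum="419" label=""
created="1527178362"
created_prectime="0x1d3f371fd3ffb5a" partial="0"/>
</backuplist>
"""

if __name__ == "__main__":
    print(last_flush_time(SAMPLE))
    print(flushed_recently(SAMPLE, datetime.now(timezone.utc)))
```

For the sample value 1527178362, the converted time is Thu May 24 16:12:42 2018 GMT, matching the t.pl output above.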
16. In the /usr/local/avamar/bin directory, run the hfscheck_kill command to kill any hfscheck jobs that are still running.
admin@xxxxxxx:/usr/local/avamar/bin/#: hfscheck_kill
Using /usr/local/avamar/var/probe.xml
17. Run the avmaint gckill --ava command to kill all garbage collector jobs.
admin@xxxxxxx:/usr/local/avamar/bin/#: avmaint gckill --ava
18. Run the dpnctl shutdown --precheck command to check whether all the shutdown requirements are satisfied.
admin@xxxxxxx:~/>: dpnctl shutdown --precheck
Identity added: /home/admin/.ssh/admin_key (/home/admin/.ssh/admin_key)
dpnctl: INFO: Checking for validated checkpoint
dpnctl: INFO: found the most recently validated checkpoint: cp.20180523033444 at 'Wed May 23 03:34:44 2018 UTC'
dpnctl: INFO: VALIDATED CHECKPOINT PASSED
dpnctl: INFO:
[##############------------------------------------20%]
dpnctl: INFO: Starting MCS flush check
dpnctl: INFO: Last MCS flush at 'Wed May 23 15:45:02 2018'
dpnctl: INFO: LAST MCS PASSED
dpnctl: INFO:
[##############------------------------------------30%]
dpnctl: INFO: Checking for file system and gsan percentage
dpnctl: INFO: FS/GSAN PERCENTAGE PASSED
dpnctl: INFO:
[##############------------------------------------50%]
dpnctl: INFO: GSAN tasks: idle
dpnctl: INFO: Checking for hfscheck.
dpnctl: INFO: No hfscheck maintenance task is running.
dpnctl: INFO:
[##############------------------------------------70%]
dpnctl: INFO: Checking for GC.
dpnctl: INFO: No GC task is running.
dpnctl: INFO:
[##############------------------------------------80%]
dpnctl: INFO: Checking for active sessions (backup/restore).
dpnctl: INFO: No backup/restore is running.
dpnctl: INFO:
[##############------------------------------------90%]
dpnctl: INFO: Checking for active checkpoint.
dpnctl: INFO: No checkpoint task is running.
dpnctl: INFO:
[##############------------------------------------100%]
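The precheck output in step 18 is long, so it may be convenient to confirm that every expected success message is present in a saved copy. This Python sketch only checks for the success messages shown in the example above. This article does not show what a failed precheck prints, so the pattern list is an assumption and a missing message should be treated as a prompt to read the full output, not as a diagnosis.

```python
import re

# Success messages taken from the example precheck output in this article.
# The hfscheck pattern is loose on purpose so that minor spelling differences
# in the tool's message do not cause a false alarm.
EXPECTED_PATTERNS = (
    r"VALIDATED CHECKPOINT PASSED",
    r"LAST MCS PASSED",
    r"FS/GSAN PERCENTAGE PASSED",
    r"No hfs\w+ maintenance task is running",
    r"No GC task is running",
    r"No backup/restore is running",
    r"No checkpoint task is running",
)


def missing_precheck_results(output):
    """Return the expected patterns that do not appear in the output."""
    return [p for p in EXPECTED_PATTERNS if not re.search(p, output)]


SAMPLE = """\
dpnctl: INFO: VALIDATED CHECKPOINT PASSED
dpnctl: INFO: LAST MCS PASSED
dpnctl: INFO: FS/GSAN PERCENTAGE PASSED
dpnctl: INFO: No hfscheck maintenance task is running.
dpnctl: INFO: No GC task is running.
dpnctl: INFO: No backup/restore is running.
dpnctl: INFO: No checkpoint task is running.
"""

if __name__ == "__main__":
    # An empty list means all expected success messages were found.
    print(missing_precheck_results(SAMPLE))
```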
Additional Precautions during DP4400 shutdown
19. Verify the status of the Garbage Collection (cleaning) process on the Backup Storage (DD).
Delay the shutdown until cleaning is complete.
If shutdown is necessary while cleaning is running, restart cleaning manually after the DP4400 is powered back on by running the following command:
#filesys clean start
To list the last successful cleaning, or the progress of any ongoing cleaning operation, run the following command:
#filesys clean status
20. Verify passwords are synchronized.
Changing a password for a component causes the ACM UI to display the password out of sync error message.
Ensure that all passwords are synchronized by checking each panel in the dashboard.
If any password is not synchronized, the shutdown process cannot start.
To allow the ACM to gather health information for a component, you must ensure that passwords are synchronized across all the panels.
To update an unsynchronized password, click the error text that indicates an out-of-sync password and follow the prompts to resolve it before shutdown.
After the above procedure has been completed, use the shutdown button in the ACM to begin the shutdown:
21. On the dashboard Home tab, click the Shutdown Appliance icon.
22. Type the administrator password, click Authenticate, and then click Yes.
23. Click Logout.
CAUTION==================================================================================================================
Monitor the shutdown process from the ACM CLI through the logs in /usr/local/dataprotection/var/configmgr/server_data/logs (ShutdownActivity.log and server.log).
A long time (an estimated 45 minutes) will pass between the ACM ceasing to respond and the system physically powering off.
While the appliance is shutting down, the Login screen displays a message indicating shutdown is in progress.
To view the progress of the shutdown after the ACM is powered off, log in to ESX.
============================================================================================================================
Troubleshooting shutdown of the DP4400
If any part of the shutdown process fails to complete automatically, troubleshoot as follows.
1. Log in to the Avamar server with SSH by using the Avamar IP address.
2. Create a checkpoint by running the following command: mccli checkpoint create --override_maintenance_scheduler
3. Stop all Avamar services by running the following command: dpnctl stop all
4. Log in to the Data Domain with SSH using the Data Domain IP address.
5. Shut down the Data Domain system by running the following command: #system poweroff
6. Open the vCenter by typing the IP address in the browser.
7. Log in to vCenter by using the customer-specified username and password. If the ACM is down, connect to vCenter and ESX using the username "idpauser" and the appliance password.
8. Power off the Data Protection virtual application. All virtual machines and virtual applications under the Data Protection virtual application are automatically shut down.
9. Shut down the IDPA Virtual Machine guest operating system, and power off the virtual machine.
10. Log in to the ESXi server on which the vCenter resides.
11. Log in to each ESXi host.
12. Place each ESXi host into maintenance mode by running the following command on each host: esxcli system maintenanceMode set -e true -m noAction
13. Use the vSphere Client or the ESXi host to shut down all of the ESXi hosts.
Additional Information
Reference:
Article Properties
Affected Product
Integrated Data Protection Appliance Family
Product
PowerProtect DP4400, Integrated Data Protection Appliance Family
Last Published Date
20 Nov 2020
Version
2
Article Type
How To