Updated July 3, 2026

# ODB Snapshot Refresh Process


## Introduction

This is a newly developed process and breaking changes are likely. While the usage in AWX from
a user's perspective might not change much, many of the links below may be invalid. Contact
the development team for further assistance.

This guide outlines the automated database refresh process for Epic ODB instances (also known as Iris databases). The workflow uses Iris's envcopy tool to orchestrate the refresh, with a custom BACKUP_RESTORE hook script handling cloud disk management and data integrity validation. By coordinating a series of scripted and automated steps, the process lets database environments be refreshed quickly, reliably, and with minimal manual intervention.

The major stages of the process include preparing the source database for copying, invoking infrastructure automation through AWX and Ansible, managing Azure managed disk snapshots and attachments, performing file system integrity checks, and returning control to envcopy for finalization. This method supports safe migration and refresh of database environments, enabling consistent and predictable updates for Epic ODB instances.

## Getting Started

AWX provides a web-based user interface, REST API, and task engine built on top of Ansible. It is one of the upstream projects for Red Hat Ansible Automation Platform.

### AWX

The entire process can be run from AWX. You can reach AWX at the links below:

### Getting Access

To gain access to the snapshot refresh jobs, a user must belong to the following groups in Secure:

  • eoa_awx_users
  • eoa_awx_infra_ops

## Pre-Refresh Considerations

  • Review the required variables and ensure their correctness. See "Making Variable Changes" for more information on altering variables.
  • Ensure source and target environments are healthy and ready.
  • There could be Azure Resource Quota limits that prevent the hook script from creating enough Managed Disks. Work with cloud teams to address this.
  • The hook script uses Azure SPNs to authenticate. Work with cloud teams if authentication errors are observed.

## Scenarios

This solution supports two types of scenarios: Inter-VM (multiple VMs) and Intra-VM (same VM). The hook script detects the scenario type by comparing snap_refresh_source.hostname and snap_refresh_target.hostname. If they match, it performs the intra-VM steps. If they do not match, those steps are omitted.
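The detection rule above can be sketched in a few lines of Python. This is only an illustration of the comparison, not the hook script's actual code; the function name `detect_scenario` and the case-insensitive comparison are assumptions.

```python
def detect_scenario(source_hostname: str, target_hostname: str) -> str:
    """Return 'intra-vm' when both hostnames refer to the same VM, else 'inter-vm'."""
    # Assumption: hostnames are compared ignoring case and surrounding whitespace.
    same_vm = source_hostname.strip().lower() == target_hostname.strip().lower()
    return "intra-vm" if same_vm else "inter-vm"


if __name__ == "__main__":
    # Different hosts: the intra-VM steps are skipped.
    print(detect_scenario("zwplodbew302.ms.ds.uhc.com", "zwplodbew303.ms.ds.uhc.com"))
    # Same host: the intra-VM steps run.
    print(detect_scenario("zwplodbew302.ms.ds.uhc.com", "zwplodbew302.ms.ds.uhc.com"))
```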

During an intra-VM refresh, the source instance's logical volume (LV) and volume group (VG) are renamed and the source LV/VG UUIDs are rewritten. This is done to prevent name/UUID collisions when the new disks are attached.

| AWX Group | Scenario Name (if needed) | Type | Description | Source VM | Source Instance | Target VM | Target Instance |
| --- | --- | --- | --- | --- | --- | --- | --- |
| wsupmir_to_wsup2 | WSUPMIR_TO_WSUP2 | Inter-VM | WSUPMIR prod mirror to WSUP2 | zwplodbew302.ms.ds.uhc.com | WSUPMIR | zwplodbew303.ms.ds.uhc.com | WSUP2 |
| wsupclmir_to_wsup2cl | WSUPCLMIR_TO_WSUP2CL | Inter-VM | WSUPCLMIR prod mirror to WSUP2CL | zwplodbcl302.ms.ds.uhc.com | WSUPCLMIR | zwplodbcl303.ms.ds.uhc.com | WSUP2CL |
| wsupmir_to_prd | WSUPMIR_TO_PRD | Inter-VM | WSUPMIR prod mirror to PRD | zwplodbew302.ms.ds.uhc.com | WSUPMIR | zwplodbew501.ms.ds.uhc.com | PRD |
| wsupmir_to_dr | WSUPMIR_TO_DR | Inter-VM | WSUPMIR prod mirror to Prod DR | zwplodbew302.ms.ds.uhc.com | WSUPMIR | zcrplodbew601.ms.ds.uhc.com | PRD |
| wsupmir_to_rpt | WSUPMIR_TO_RPT | Inter-VM | WSUPMIR prod mirror to RPT | zwplodbew302.ms.ds.uhc.com | WSUPMIR | zwplodbew401.ms.ds.uhc.com | RPT |
| ex_source_to_target | EX_SOURCE_TO_TARGET | Intra-VM | Example group for intra-VM refreshes | zwplodbew302.ms.ds.uhc.com | SOURCEINST | zwplodbew302.ms.ds.uhc.com | TARGETINST |

## Running a Refresh

| Refresh Type | Inventory Group | AWX Link |
| --- | --- | --- |
| ODB Refresh w/ envcopy | west | ODB Refresh |
| ODB Refresh w/o envcopy | west | ODB Refresh Without Envcopy |

  1. To start the process, open the AWX Job Template linked above or find the desired template from the list by clicking "Templates" under "Resources" on the left sidebar.
  2. This opens the Launch modal. On the "Inventory" step, the appropriate inventory should already be selected by default. Click "Next".
  3. On the "Inventory Groups" step, ensure that the group matching the intended scenario is selected (see above). Click "Next".
  4. On the "Other prompts" step, ensure that the "Limit" field contains the correct AWX group name listed above. Increase "Verbosity" if desired. Leave "Job Tags" and "Skip Tags" blank.
  5. The "Preview" step is the final step before launch. Ensure the Iris environments are ready for refresh. Once confirmed, click "Launch".

## Making Variable Changes

  1. Open the AWX Job Template linked above or find the desired template from the list by clicking "Templates" under "Resources" on the left sidebar.
  2. On the "Details" tab, click the link found in the "Inventory" field. This takes you to the AWX Inventory configured for this Job Template.
  3. You can view/edit variable data in YAML or JSON. Use the toggle beside "Variables" to switch. Click "Edit".
  4. Make the necessary changes to the variables and click "Save". Refer to the variable explanations below for more information.

### Required Variables

Use caution when specifying `disk_pattern` variables. These rely on regular expressions (regex) to select which disks to manipulate during the process. Be sure to validate the supplied patterns against the existing list of Azure Managed Disk names before proceeding.

| Variable | Example | Description |
| --- | --- | --- |
| snap_refresh_project_name | | A unique identifier for this database refresh effort |
| snap_refresh_source | | Dictionary describing necessary inputs related to the source (see below) |
| snap_refresh_source.vm_name | zwplodbew001 | Host name containing source instance |
| snap_refresh_source.ip | | IP address for source VM |
| snap_refresh_source.rg | | Azure resource group containing source instance VM |
| snap_refresh_source.disk_pattern | `^.*supmir[0-9]{2}01.*$` | Regular expression that filters the list of disk names attached to the source to only those desired for refresh |
| snap_refresh_source.instance | supmir | Iris source instance name |
| snap_refresh_source.vg_name | supmir01vg | LVM volume group name of the source disks |
| snap_refresh_source.lv_name | supmir01lv | LVM logical volume name of the source disks |
| snap_refresh_target | | Dictionary describing necessary inputs related to the target (see below) |
| snap_refresh_target.vm_name | zwplodbew001 | Host name containing target instance |
| snap_refresh_target.ip | | IP address for target VM |
| snap_refresh_target.rg | | Azure resource group containing target instance VM |
| snap_refresh_target.disk_pattern | `^.*rpt[0-9]{2}01.*$` | Regular expression that filters the list of disk names attached to the target to only those desired for refresh. This is the list of disks that will be detached from the target |
| snap_refresh_target.instance | rpt | Iris target instance name |
| snap_refresh_target.vg_name | rpt01vg | LVM volume group name of the target disks |
| snap_refresh_target.lv_name | rpt01lv | LVM logical volume name of the target disks |
| snap_refresh_target.mount_point | /epic/rpt01 | Directory to which the target instance's data logical volume is mounted |
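One way to validate a `disk_pattern` before launching a refresh is to run it against the current disk names and review what it selects. The sketch below uses Python's `re` module with the example source pattern from the table. The disk names are hypothetical, and the use of `re.search` is an assumption about how the hook script applies the pattern.

```python
from __future__ import annotations

import re


def select_disks(disk_pattern: str, disk_names: list[str]) -> list[str]:
    """Return the disk names that the given regular expression selects."""
    compiled = re.compile(disk_pattern)
    return [name for name in disk_names if compiled.search(name)]


# Hypothetical managed disk names, for illustration only.
ATTACHED_DISKS = [
    "zwplodbew001-supmir0101-data",
    "zwplodbew001-supmir0201-data",
    "zwplodbew001-supmir0102-jrn",
    "zwplodbew001-osdisk",
]

if __name__ == "__main__":
    for name in select_disks(r"^.*supmir[0-9]{2}01.*$", ATTACHED_DISKS):
        print(name)
```

With these sample names, only the two disks whose names contain `supmir` followed by two digits and then `01` are selected. Any disk that appears unexpectedly, or is missing, is a sign the pattern needs adjusting.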

### Optional Variables

These variables are not required but can be used to alter the behavior of the refresh process:

| Variable | Default | Description |
| --- | --- | --- |
| snap_refresh_evt_id | `<undefined>` | Specific, pre-existing EVT record ID to use. Do not use with snap_refresh_evt_template |
| snap_refresh_logfile_path | /epic/logs/snap_refresh.log | Hook script logfile location |
| snap_refresh_bin_path | /usr/local/bin/snap_refresh.sh | Full path to the hook script on the target VM |
| snap_refresh_bin_perms.owner | epicadm | User ownership for hook script location and script |
| snap_refresh_bin_perms.group | epicsys | Group ownership for hook script location and script |
| snap_refresh_bin_perms.mode | 0770 | File permissions for the hook script location |
| snap_refresh_start_after_completion | false | Indicates whether to start the target database once the refresh is done |
| snap_refresh_default_disk_pattern | adhoc | Appended to disk name to name the snapshots |
| snap_refresh_polling_mins | 5 | Time in minutes to wait for disk/snapshot hydration |
| snap_refresh_polling_retries | 48 | Number of times to retry checks for disk/snapshot hydration. Note: snap_refresh_polling_mins * snap_refresh_polling_retries represents the maximum amount of time the process will wait for hydration |
| snap_refresh_freeze_source | false | Indicates whether to freeze the source instance before taking snapshots. envcopy typically handles this, but it can be toggled to ensure consistent snapshots when testing |
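To make the polling note concrete: with the defaults, the maximum wait is 5 * 48 = 240 minutes (4 hours). The following Python sketch shows one way such a polling loop could behave. The function names and the injected `is_hydrated` check are hypothetical and do not reflect the hook script's actual implementation.

```python
import time
from typing import Callable


def max_wait_minutes(polling_mins: int, polling_retries: int) -> int:
    """Upper bound on the time spent waiting for hydration, in minutes."""
    return polling_mins * polling_retries


def wait_for_hydration(
    is_hydrated: Callable[[], bool],
    polling_mins: int = 5,
    polling_retries: int = 48,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Check up to polling_retries times, sleeping polling_mins after each failed check."""
    for _ in range(polling_retries):
        if is_hydrated():
            return True
        sleep(polling_mins * 60)  # convert minutes to seconds
    return False


if __name__ == "__main__":
    print(max_wait_minutes(5, 48))  # 240 minutes with the default values
```

Passing `sleep` as a parameter keeps the loop testable without real delays.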

## Hook Script Config File

The hook script relies on a .cfg file for values such as Azure client secrets and the source/target configuration defined in the AWX groups for each scenario described above. The file is placed on the server for the duration of the scenario and is deleted afterward, whether the run succeeds or fails.

```text
# Refresh Project Info
SNAP_REFRESH_PROJECT_NAME={{ snap_refresh_project_name }}
SNAP_REFRESH_INTRA_VM={{ snap_refresh_intra_vm }}
# ODB Host Info
SRC_VM_HOSTNAME={{ snap_refresh_source.vm_name }}
SRC_VM_RG={{ snap_refresh_source.rg }}
SRC_VM_DATADISK_PATTERN={{ snap_refresh_source.disk_pattern }}
SRC_VM_INST_NAME={{ snap_refresh_source.instance }}
SRC_VM_LV_NAME={{ snap_refresh_source.lv_name }}
SRC_VM_VG_NAME={{ snap_refresh_source.vg_name }}
SRC_VM_MOUNT_POINT={{ snap_refresh_source.mount_point }}
TAR_VM_HOSTNAME={{ snap_refresh_target.vm_name }}
TAR_VM_RG={{ snap_refresh_target.rg }}
TAR_VM_DATADISK_PATTERN={{ snap_refresh_target.disk_pattern }}
TAR_VM_INST_NAME={{ snap_refresh_target.instance }}
TAR_VM_LV_NAME={{ snap_refresh_target.lv_name }}
TAR_VM_VG_NAME={{ snap_refresh_target.vg_name }}
TAR_VM_MOUNT_POINT={{ snap_refresh_target.mount_point }}
# Azure CLI Config
AZ_CLIENT_ID={{ az_auth_client_id }}
AZ_CLIENT_SECRET={{ az_auth_client_secret }}
AZ_TENANT_ID={{ az_auth_tenant_id }}
AZ_SUBSCRIPTION_ID={{ az_auth_subscription_id }}
# Script Defaults
POLLING_INTERVAL={{ snap_refresh_polling_interval }}
LOGFILE_PATH={{ snap_refresh_logfile_path }}
```

## Hook Script Error Codes

| No. | Name | Description |
| --- | ---- | ----------- |
| 4 | ERR_AZ_LOGIN | Error logging into Azure |
| 5 | ERR_AZ_LOGOUT | Error logging out of Azure |
| 6 | ERR_AZ_SNAP_CREATE | Error during snapshot creation |
| 7 | ERR_AZ_SRC_VM_DISK_INFO | Error looking up source VM disk info |
| 8 | ERR_AZ_TAR_VM_DISK_CREATE | Error creating disks |
| 10 | ERR_AZ_DISK_ATTACHMENT | Error attaching disks to target VM |
| 11 | ERR_AZ_DISK_MOUNT | Error mounting attached disks on the target VM OS |
| 12 | ERR_AZ_DISK_DETACHMENT | Error detaching "old" disks from target VM |
| 13 | ERR_AZ_DISK_MOUNT | Error mounting donor LVs to target mount point |
| 14 | ERR_DONOR_NOT_FOUND | Error finding donor LVs on target VM |
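When triaging a failed job, it can help to translate the hook script's exit code into the name and description from the table above. The Python sketch below is a hypothetical lookup helper, not part of the hook script. It copies the table as-is (including the name shared by codes 11 and 13) and assumes that exit code 0 means success.

```python
from __future__ import annotations

# Exit codes copied from the table above: code -> (name, description).
HOOK_ERRORS: dict[int, tuple[str, str]] = {
    4: ("ERR_AZ_LOGIN", "Error logging into Azure"),
    5: ("ERR_AZ_LOGOUT", "Error logging out of Azure"),
    6: ("ERR_AZ_SNAP_CREATE", "Error during snapshot creation"),
    7: ("ERR_AZ_SRC_VM_DISK_INFO", "Error looking up source VM disk info"),
    8: ("ERR_AZ_TAR_VM_DISK_CREATE", "Error creating disks"),
    10: ("ERR_AZ_DISK_ATTACHMENT", "Error attaching disks to target VM"),
    11: ("ERR_AZ_DISK_MOUNT", "Error mounting attached disks on the target VM OS"),
    12: ("ERR_AZ_DISK_DETACHMENT", 'Error detaching "old" disks from target VM'),
    13: ("ERR_AZ_DISK_MOUNT", "Error mounting donor LVs to target mount point"),
    14: ("ERR_DONOR_NOT_FOUND", "Error finding donor LVs on target VM"),
}


def describe_exit_code(code: int) -> str:
    """Return a one-line, human-readable summary of a hook script exit code."""
    if code == 0:
        return "0: success"  # assumption: 0 follows the usual shell convention
    name, description = HOOK_ERRORS.get(
        code, ("UNKNOWN", "Exit code not listed in the table")
    )
    return f"{code}: {name} ({description})"


if __name__ == "__main__":
    print(describe_exit_code(10))
```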

## High Level Flow

```mermaid
sequenceDiagram
    participant awx as AWX
    participant target as Target
    participant azure as Azure

    awx->>awx: 1. invoke start playbook
    awx->>target: 2. create envcopy.conf
    awx->>target: 3. run scenario
    target->>target: 4. freeze
    target->>target: 5. fileCopy/backupRestore phase
    target->>azure: 6. snapshot disks
    target->>azure: 7. create disks from snapshots
    target->>azure: 8. attach disks to target VM
    azure->>target: 9. validate donated disks
    target->>azure: 10. detach "old" disks
    azure->>target: 11. mount donated LV
    target->>target: 12. envcopy completion
```

1. A user launches an AWX Job Template to initiate the process.
2. A new envcopy scenario is created. A new envcopy.conf configuration file is generated from a template and playbook inputs/role defaults.
3. The new scenario is run.
4. As part of the envcopy process and scenario config, the source instance is frozen.
5. envcopy executes the custom data transfer method in the fileCopy/backupRestore phase. This is where the custom snapshot-refresh hook script is invoked.
6. The hook script creates snapshots from the source instance's data disks.
7. Via Azure CLI, the hook script creates new managed disks from the snapshots.
8. Via Azure CLI, the hook script attaches the new managed disks to the target VM.
9. On the target instance, the hook script checks the donated disks for valid LVM and filesystem configuration.
10. Via Azure CLI, the hook script detaches the "old" or "previous" data disks from the target VM.
11. On the target VM, the hook script mounts the donated logical volume (LV) to the mount point configured in playbook inputs.
12. On the target instance, envcopy resumes control and finishes its tasks.
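Steps 6 through 8 and step 10 above map onto a short sequence of Azure CLI calls. The Python sketch below only builds the argument lists and does not run anything. It is an illustration under several assumptions: the function name and the new-disk naming scheme are hypothetical, the snapshot name is formed by appending the default suffix to the disk name, and all resources are assumed to live in one resource group (for resources in another resource group, a full resource ID would be needed in place of a name for `--source`).

```python
from __future__ import annotations


def plan_az_commands(
    resource_group: str,
    target_vm: str,
    source_disks: list[str],
    old_target_disks: list[str],
    suffix: str = "adhoc",
) -> list[list[str]]:
    """Build the Azure CLI invocations for snapshot, create, attach, and detach."""
    snapshots = [f"{disk}-{suffix}" for disk in source_disks]
    new_disks = [f"{snap}-disk" for snap in snapshots]  # hypothetical naming scheme
    rg = ["--resource-group", resource_group]
    commands: list[list[str]] = []
    # Step 6: snapshot each source data disk.
    for disk, snap in zip(source_disks, snapshots):
        commands.append(["az", "snapshot", "create", *rg, "--name", snap, "--source", disk])
    # Step 7: create a new managed disk from each snapshot.
    for snap, new_disk in zip(snapshots, new_disks):
        commands.append(["az", "disk", "create", *rg, "--name", new_disk, "--source", snap])
    # Step 8: attach the new disks to the target VM.
    for new_disk in new_disks:
        commands.append(["az", "vm", "disk", "attach", *rg, "--vm-name", target_vm, "--name", new_disk])
    # Step 9 (LVM and filesystem validation) runs on the target OS, not through the CLI.
    # Step 10: detach the previous data disks from the target VM.
    for old_disk in old_target_disks:
        commands.append(["az", "vm", "disk", "detach", *rg, "--vm-name", target_vm, "--name", old_disk])
    return commands


if __name__ == "__main__":
    for command in plan_az_commands("example-rg", "example-vm", ["src-disk-01"], ["old-disk-01"]):
        print(" ".join(command))
```

Building the commands as lists before executing them makes the order easy to review, which matters here because the old disks are detached only after the new ones are attached and validated.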

### Intra-VM Refresh Branches

During an intra-VM refresh, a few extra steps are performed. Prior to snapshot creation, the source instance's logical volume (LV) and volume group (VG) are renamed and the source LV/VG UUIDs are rewritten. This is done to prevent name/UUID collisions when the new disks are attached. Following snapshot hydration, those LVs/VGs are renamed back, unmounted, given rewritten UUIDs, reactivated, and then remounted.

## `envcopy` Phases

| Stage | Action |
| ----- | ------ |
| preExecution | Execute PREEXECUTION hook scripts |
| saveConfig | Run `saveConfig^%ZeENV` |
| preCopy | Run `preCopy^%ZeENV` |
| preRefresh | Run PREREFRESH hook scripts |
| fileCopy/backupRestore | Copy database files, or use the BACKUP_RESTORE hook script if configured |
| postFileCopy | Execute POSTFILECOPY hook scripts |
| startDestination | Start the target environment |
| preRunTasks | Execute PRERUNTASKS hook scripts |
| runTasks | Run `runTasks^%ZeENV` |
| postRunTasks | Execute POSTRUNTASKS hook scripts |
| destFinalRunlevel | Bring the target environment to its final runlevel |
| sourceFinalRunlevel | Bring the source environment to its final runlevel |
| final | Execute the FINAL hook scripts |

## Azure Authentication

This process expects the existence of an Azure Service Principal Name (SPN) with the permissions below:

- Create snapshots
- Create managed disks (from snapshots)
- Attach/detach disks
- Show managed disk information
- Show VM information

It is recommended practice to pull these credentials from a secure source such as HCP Vault. See the example playbook below.

## More Information

1. [Non-Production ODB Migrations Strategy Guide (Galaxy)](https://galaxy.epic.com/?#Browse/page=1!68!50!100315688)