Updated July 3, 2026
# Infrastructure Automation Overview
Tags: infrastructure, automation, terraform, azure, epic, healthcare

Comprehensive overview of OHEMR Epic infrastructure automation, deployment standards, and operational procedures.
## Strategic Overview
OHEMR Epic infrastructure is built on Azure cloud platform using Infrastructure as Code (IaC) principles. This approach ensures consistent, secure, and compliant deployment of healthcare infrastructure supporting Epic clinical systems across multiple environments.
### **Architecture Principles**
- **Epic-First Design**: All infrastructure decisions prioritize Epic certification and clinical workflow requirements
- **HIPAA Compliance**: Built-in security controls and audit logging for PHI protection
- **Automation-Driven**: Terraform-based IaC eliminates manual deployment errors
- **Multi-Environment**: Consistent deployment across Production, Non-Production, and Training environments
---
## Infrastructure Components
### **Core Infrastructure Stack**
| Layer | Technology | Purpose | Epic Integration |
|---|---|---|---|
| Compute | Azure VMs, Scale Sets | Epic application hosting | Hyperspace, BCA, Training VDAs |
| Storage | Azure NetApp Files, Managed Disks | Epic data storage | Epic databases, NAS shares |
| Network | Virtual Networks, Load Balancers | Epic connectivity | Clinical workflow traffic |
| Security | NSGs, Key Vault, Private Endpoints | PHI protection | Epic authentication, encryption |
| Monitoring | Azure Monitor, Log Analytics | Epic system monitoring | Clinical system health |
### **Epic Environment Architecture**
```mermaid
graph TB
    subgraph "Epic Production"
        EpicProd[Epic Hyperspace VMs]
        EpicDB[(Epic Database)]
        EpicNAS[Epic NAS Storage]
    end
    subgraph "Epic Training"
        EpicTrain[Epic Training VMs]
        TrainDB[(Training Database)]
    end
    subgraph "Shared Infrastructure"
        Citrix[Citrix Infrastructure]
        AD[Active Directory]
        NetScale[NetScaler Load Balancer]
    end
    EpicProd --> EpicDB
    EpicProd --> EpicNAS
    EpicTrain --> TrainDB
    Citrix --> EpicProd
    Citrix --> EpicTrain
    AD --> Citrix
    NetScale --> Citrix
```
---
## Infrastructure Standards
### **Naming Conventions**
All infrastructure follows standardized naming patterns for operational efficiency:
- **Virtual Machines**: `[Platform][Region][Environment][OS][Purpose][Role][Series]`
- **Resource Groups**: `rg-[purpose]-[environment]-[region]`
- **Storage Accounts**: `st[purpose][environment][uniqueid]`
**Example**: `ZWPWEPSEP001` = Azure, West US, Production, Windows, Epic, Epic Prod VDA, Instance 001
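A validator for the three naming patterns above, added here as an example; it is not the OHEMR VM naming validator referenced later on this page. It assumes the segment widths implied by `ZWPWEPSEP001` (1+1+1+1+2+3+3 characters) and knows only the codes that example decodes, so the code tables are placeholders to be completed from the naming standard:

```python
import re
# Segment widths of [Platform][Region][Environment][OS][Purpose][Role][Series], taken
# from the page's worked example ZWPWEPSEP001 = Z W P W EP SEP 001 (12 characters).
VM_NAME_FIELDS = (("platform", 1), ("region", 1), ("environment", 1),
                  ("os", 1), ("purpose", 2), ("role", 3), ("series", 3))
# Code tables hold only the codes the overview decodes; extend them from the
# OHEMR naming standard (assumed here to be the complete set).
CODES = {"platform": {"Z": "Azure"}, "region": {"W": "West US"},
         "environment": {"P": "Production"}, "os": {"W": "Windows"},
         "purpose": {"EP": "Epic"}, "role": {"SEP": "Epic Prod VDA"}}
RG_PATTERN = re.compile(r"rg-[a-z0-9]+-[a-z0-9]+-[a-z0-9]+")
ST_PATTERN = re.compile(r"st[a-z0-9]{1,22}")  # 3-24 lowercase alphanumerics (Azure limit)


def parse_vm_name(name: str) -> dict:
    """Split a VM name into its fields and decode each; raise ValueError on violations."""
    expected_len = sum(width for _, width in VM_NAME_FIELDS)
    if len(name) != expected_len or not name.isalnum():
        raise ValueError(f"{name!r}: expected {expected_len} alphanumeric characters")
    fields, pos = {}, 0
    for field, width in VM_NAME_FIELDS:
        code = name[pos:pos + width]
        pos += width
        if field == "series":  # zero-padded instance number, never 000
            if not code.isdigit() or int(code) == 0:
                raise ValueError(f"{name!r}: series {code!r} must be 001 or higher")
            fields[field] = int(code)
        elif code in CODES[field]:
            fields[field] = CODES[field][code]
        else:
            raise ValueError(f"{name!r}: unknown {field} code {code!r}")
    return fields


def vm_name_violation(name: str):
    """Return the naming violation as text, or None when the name is compliant."""
    try:
        parse_vm_name(name)
    except ValueError as exc:
        return str(exc)
    return None


def is_valid_resource_group(name: str) -> bool:
    """rg-[purpose]-[environment]-[region], lowercase alphanumeric segments."""
    return RG_PATTERN.fullmatch(name) is not None


def is_valid_storage_account(name: str) -> bool:
    """st[purpose][environment][uniqueid], which must also satisfy Azure's limits."""
    return ST_PATTERN.fullmatch(name) is not None
```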
### **Tagging Strategy**
All resources include standardized tags for governance and cost allocation:
```hcl
# Standard tags for all Epic infrastructure
tags = {
  aide-id               = "AIDE_0085665"
  item-assignment-group = "EPIC NATIONAL INSTANCE - SPT"
  Division              = "Optum Health"
  Product               = "Epic EMR"
  environment           = "prd"
  DataClassification    = "PHI"
  epic-app              = "hyperspace"
  epic-stamp            = "production"
}
```
### **Security Baseline**
All infrastructure includes mandatory security controls:
- **Network Security Groups**: Micro-segmentation for Epic traffic
- **Private Endpoints**: No public internet access for Epic data
- **Encryption**: TLS 1.3 in transit, AES-256 at rest
- **Identity**: Managed identities for service authentication
- **Monitoring**: Centralized logging to Event Hubs
---
## Automation Framework
### **Terraform-Based Deployment**
Infrastructure deployments use standardized Terraform modules:
```text
terraform/
├── modules/
│   ├── vm-deployment/        # Standard VM deployment
│   ├── network-foundation/   # VNet, subnets, NSGs
│   ├── storage-services/     # ANF, managed disks
│   └── backup-recovery/      # Recovery Services Vault
├── environments/
│   ├── production/           # Epic production configs
│   ├── non-production/       # Epic NPD configs
│   └── training/             # Epic training configs
└── shared/
    ├── variables.tf          # Global variables
    └── providers.tf          # Azure provider config
```
### **Deployment Pipeline**
```mermaid
graph LR
A[Code Commit] --> B[Terraform Validate]
B --> C[Security Scan]
C --> D[Epic Compliance Check]
D --> E[Terraform Plan]
E --> F[Approval Gate]
F --> G[Terraform Apply]
G --> H[Health Validation]
H --> I[Epic Testing]
```
### **Configuration Management**
| **Environment** | **Terraform Workspace** | **Epic Scope** | **Change Control** |
|-----------------|-------------------------|----------------|-------------------|
| **Production** | `ohemr-epic-pro-001` | Live clinical systems | 5-day approval process |
| **Non-Production** | `ohemr-epic-npd-001` | Epic development/testing | 2-day approval process |
| **Training** | `ohemr-epic-train-001` | Clinical staff training | 1-day approval process |
---
## Epic Integration Standards
### **Performance Requirements**
Epic infrastructure must meet specific performance criteria:
| **Component** | **Requirement** | **Monitoring** |
|---------------|-----------------|----------------|
| **VM Performance** | <2ms disk latency | Azure Monitor metrics |
| **Network Latency** | <1ms between Epic tiers | Network Watcher |
| **Storage IOPS** | >50,000 4K random | ANF performance metrics |
| **Backup RTO** | <4 hours for Epic data | Recovery Services monitoring |
### **Availability Standards**
```yaml
epic_sla_requirements:
  production:
    uptime: "99.9%"
    rto: "4 hours"
    rpo: "1 hour"
    availability_zones: 3
  training:
    uptime: "99.5%"
    rto: "8 hours"
    rpo: "4 hours"
    availability_zones: 2
```
### **Clinical Workflow Considerations**
- **Maintenance Windows**: Epic downtime coordinated with clinical operations
- **Capacity Planning**: Surge capacity for training and go-live events
- **Disaster Recovery**: Cross-region replication for Epic data
- **Performance Testing**: Load testing validates Epic user experience
---
## Operational Procedures
### **Deployment Validation**
All infrastructure deployments include automated validation:
```bash
# Terraform deployment validation
terraform init
terraform validate
terraform plan -detailed-exitcode
# Epic-specific validation
./scripts/validate-epic-compliance.sh
./scripts/check-performance-requirements.sh
./scripts/verify-backup-configuration.sh
```
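A plan-time compliance check of the kind the pipeline's Epic Compliance Check stage runs before the approval gate, added here as an example; it is not the project's `validate-epic-compliance.sh`. It reads the JSON that `terraform show -json <planfile>` produces, enforces the required tags and the workspace-to-environment mapping (only `prd` for production is on this page; the `npd` and `train` values are assumptions), and flags PHI-tagged resources that allow public network access, contrary to the private-endpoint baseline. The plan document it runs on is constructed inline.

```python
import json

# Tags the tagging standard requires on every Epic resource.
REQUIRED_TAGS = ("aide-id", "item-assignment-group", "Division", "Product",
                 "environment", "DataClassification", "epic-app", "epic-stamp")
# Workspace -> expected environment tag. Only "prd" for production appears in the
# overview; the npd/train values are assumptions.
WORKSPACE_ENV = {"ohemr-epic-pro-001": "prd", "ohemr-epic-npd-001": "npd",
                 "ohemr-epic-train-001": "train"}


def check_plan(plan: dict, workspace: str) -> list:
    """Return compliance findings for a `terraform show -json <planfile>` document."""
    expected_env = WORKSPACE_ENV[workspace]
    findings = []
    for rc in plan.get("resource_changes", []):
        after = rc["change"].get("after")
        if after is None:  # resource is being destroyed; nothing to validate
            continue
        addr, tags = rc["address"], after.get("tags") or {}
        if "tags" in after:  # only resource types that carry tags
            missing = [t for t in REQUIRED_TAGS if t not in tags]
            if missing:
                findings.append(f"{addr}: missing required tags {missing}")
            env = tags.get("environment")
            if env is not None and env != expected_env:
                findings.append(f"{addr}: environment tag {env!r} does not match "
                                f"workspace {workspace} ({expected_env})")
        # Security baseline: PHI data stores sit behind private endpoints only.
        public = after.get("public_network_access_enabled") is True
        if tags.get("DataClassification") == "PHI" and public:
            findings.append(f"{addr}: PHI resource allows public network access")
    return findings


# Constructed sample plan: one compliant NAS account and one non-compliant account.
SAMPLE_PLAN = json.loads("""{"resource_changes": [
 {"address": "module.storage.azurerm_storage_account.epic_nas",
  "type": "azurerm_storage_account", "change": {"actions": ["create"], "after": {
   "name": "stepicprd001", "public_network_access_enabled": false,
   "tags": {"aide-id": "AIDE_0085665", "item-assignment-group": "EPIC NATIONAL INSTANCE - SPT",
            "Division": "Optum Health", "Product": "Epic EMR", "environment": "prd",
            "DataClassification": "PHI", "epic-app": "hyperspace", "epic-stamp": "production"}}}},
 {"address": "module.storage.azurerm_storage_account.scratch",
  "type": "azurerm_storage_account", "change": {"actions": ["create"], "after": {
   "name": "stscratchnpd001", "public_network_access_enabled": true,
   "tags": {"environment": "npd", "DataClassification": "PHI"}}}}]}""")
```

Run it in the pipeline as `check_plan(json.load(open("plan.json")), "ohemr-epic-pro-001")` and fail the stage when the returned list is non-empty; with the sample plan it reports three findings, all on the `scratch` account.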
### **Health Monitoring**
Infrastructure health monitoring includes:
- **Resource Health**: Azure Resource Health monitoring
- **Performance Metrics**: CPU, memory, disk, network utilization
- **Security Monitoring**: NSG flow logs, authentication events
- **Epic-Specific**: Application response times, database connectivity
### **Incident Response**
| **Severity** | **Response Time** | **Epic Impact** | **Escalation** |
|--------------|-------------------|-----------------|----------------|
| **P1 - Critical** | 15 minutes | Epic production down | Epic on-call + Infrastructure |
| **P2 - High** | 1 hour | Epic performance degraded | Infrastructure team |
| **P3 - Medium** | 4 hours | Epic non-critical impact | Standard support |
| **P4 - Low** | Next business day | No Epic impact | Standard support |
---
## Compliance & Governance
### **HIPAA Compliance**
Infrastructure compliance includes:
- **Access Controls**: Role-based access with MFA
- **Audit Logging**: All infrastructure changes logged
- **Data Encryption**: PHI encrypted in transit and at rest
- **Network Security**: Micro-segmentation and private connectivity
### **Azure Policy Enforcement**
Automated governance through Azure Policies:
- **Naming Convention**: VM names must follow OHEMR standards
- **Tagging Requirements**: All resources must include required tags
- **Security Baseline**: NSGs, encryption, backup policies enforced
- **Cost Controls**: Resource sizing and lifecycle management
### **Change Management**
```mermaid
graph TD
A[Infrastructure Change Request] --> B[Epic Impact Assessment]
B --> C[Security Review]
C --> D[Architecture Approval]
D --> E[Terraform Code Development]
E --> F[Testing in Non-Production]
F --> G[Production Deployment]
G --> H[Post-Deployment Validation]
```
---
## Troubleshooting & Support
### **Common Issues**
#### **VM Deployment Failures**
- **Naming Convention Violations**: Use OHEMR VM naming validator
- **Resource Quota Limits**: Check Azure subscription quotas
- **Network Connectivity**: Validate subnet and NSG configurations
#### **Epic Performance Issues**
- **Storage Latency**: Check ANF performance metrics and service levels
- **Network Latency**: Validate ExpressRoute connectivity and routing
- **Resource Sizing**: Verify VM sizes meet Epic requirements
#### **Backup and Recovery**
- **Backup Policy Failures**: Check Recovery Services Vault permissions
- **Data Restoration**: Validate backup retention and geo-replication
- **DR Testing**: Verify cross-region failover procedures
### **Support Escalation**
| **Issue Type** | **Primary Contact** | **Secondary Contact** |
|----------------|-------------------|-------------------|
| **Epic Infrastructure** | <[email protected]> | <[email protected]> |
| **Azure Platform** | <[email protected]> | <[email protected]> |
| **Network Connectivity** | <[email protected]> | <[email protected]> |
| **Security/Compliance** | <[email protected]> | <[email protected]> |
---
## Emergency Contacts
### **24/7 Support**
- **Epic Production Emergency**: <[email protected]>
- **Infrastructure Emergency**: <[email protected]>
- **Security Incident**: <[email protected]>
### **Business Hours Support**
- **Infrastructure Team**: <[email protected]>
- **Epic Integration Team**: <[email protected]>
- **Platform Engineering**: <[email protected]>
---
## Related Documentation
- [Epic Architecture Requirements](../ohemr-architecture-hub/standards/index.md): Epic-specific infrastructure standards
- [Operational Procedures](../operations/index.md): Day-to-day operations and maintenance
- [Monitoring Strategy](../monitoring/index.md): Infrastructure monitoring and alerting
- [Security Baseline](../security/index.md): Security controls and compliance
- [Operations Runbooks](../operations/index.md): Standard operating procedures
---
***Infrastructure Excellence**: Automated, secure, and compliant infrastructure foundation supporting Epic clinical systems and healthcare operations.*