# Azure Tagging Troubleshooting and Implementation Guide
Comprehensive guide for troubleshooting tag enforcement, implementing Epic standardized tagging, and resolving common tagging issues in Azure infrastructure.
## Overview
Azure resource tagging is critical for Epic infrastructure governance, cost allocation, and compliance. This guide provides troubleshooting techniques for tag enforcement policies, implementation patterns for Epic-specific tagging requirements, and solutions to common tagging issues.
### **Key Topics Covered**
- **Policy Troubleshooting**: Identify which policies are blocking or modifying tags
- **Epic Tagging Standards**: Implementation of HCP v1.8 and Epic-specific requirements
- **Terraform Integration**: Handling tag drift and lifecycle management
- **Common Issues**: Validation failures, policy conflicts, and remediation strategies
## Scope and Prerequisites
### **Azure Policy Effects**
Azure Policy enforcement occurs through deny, modify, and audit effects. Terraform deploys policy definitions, but enforcement happens at the Azure Resource Manager level.
#### **Common Policy Effects**
- **Deny**: Blocks resource creation/updates that don't meet tagging requirements
- **Modify**: Automatically applies or updates tags based on policy rules
- **Audit**: Records compliance state without blocking operations
## Prerequisites and Tools
### **Required Access**
- **Resource Access**: At minimum, Reader permissions on the target resource
- **Policy Access**: Reader permissions on Policy at the subscription or management group level
- **Azure CLI**: A current Azure CLI version with an active login session (`az login`)
## Policy Troubleshooting Workflow
### **Step 1: Check Effective Policy in Portal**
1. Open the Azure Portal and go to Policy.
2. Open Compliance.
3. Select the Resources tab.
4. Search for your resource and open it.
5. Review the effective assignments. Note the policy name, definition, and effect.

**Tip:** Also open Policy > Assignments at the subscription or management group scope and filter for "tag" to see the built-in and custom tag policies and initiatives.
### **Step 2: List Policy Assignments with CLI**
**Subscription scope:**
```bash
az policy assignment list \
  --query "[].{name:name,displayName:displayName,scope:scope}" \
  -o table
```
**Resource group scope** (filtered for tag policies):
```bash
az policy assignment list -g <rg> \
  --query "[?contains(displayName, 'tag') || contains(name, 'tag')]" \
  -o table
```
**Show assignment details** (parameters and effect):
```bash
az policy assignment show --name <assignmentNameOrId> -o jsonc
```
### **Step 3: Identify Policy Events on Resources**
**Recent policy events on a single resource:**
```bash
az policy event list --resource <resourceId> \
--select "timestamp,policyAssignmentName,policyDefinitionName,policyDefinitionAction" \
--order-by "timestamp desc" --top 20 -o table
```
**Current compliance states for the same resource:**
```bash
az policy state list --resource <resourceId> \
--select "timestamp,complianceState,policyAssignmentName,policyDefinitionName,policyDefinitionAction" \
--order-by "timestamp desc" --top 20 -o table
```
**If the initiative uses DeployIfNotExists or Modify, list remediations:**
```bash
# Lists remediations in the current subscription; add -g <rg> to narrow to a resource group
az policy remediation list -o table
```
### **Step 4: Trace Tag Changes and Attribution**
#### **Activity Log Analysis**
Look for tag writes. A modify effect usually shows the policy assignment managed identity as the caller:
```bash
az monitor activity-log list \
--resource-id <resourceId> --offset 7d \
--query "[?contains(operationName.value, 'Microsoft.Resources/tags/write') || contains(operationName.value, 'write')]" \
-o table
```
#### **Change Analysis in the Portal**
1. Open the resource.
2. Open Activity log, pick the relevant operation.
3. Open Change history.
You will see tag diffs, changedBy, client type, and operation.
#### **Change Analysis with Resource Graph CLI**
**Install the extension if needed:**
```bash
az extension add --name resource-graph
```
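**Query recent changes for one resource** (includes tag diffs when present). In the `resourcechanges` table the change details live under the `properties` column:
```bash
az graph query -q "
resourcechanges
| extend targetResourceId = tostring(properties.targetResourceId),
         changeTime = todatetime(properties.changeAttributes.timestamp)
| where targetResourceId =~ '<resourceId>'
| order by changeTime desc
| project changeTime,
          changedBy = tostring(properties.changeAttributes.changedBy),
          client = tostring(properties.changeAttributes.clientType),
          operation = tostring(properties.changeAttributes.operation),
          changes = properties.changes"
```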
### **Step 5: Interpret Common Policy Results**
1. **Effect: Deny** - Creation or update was blocked. The assignment and definition identify which policy enforced the rule.
2. **Effect: Modify** - A tag was appended or overwritten. Activity Log shows the managed identity for the assignment as caller. Change Analysis shows old and new values.
3. **Effect: Audit** - No enforcement. Only noncompliance recorded in policy state.
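
The interpretation rules above can be folded into a small triage helper. This is a minimal sketch, assuming the effect string is taken from the `policyDefinitionAction` column of `az policy event list`; the function name `explain_policy_effect` and the message wording are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical triage helper: turn a policy effect into a suggested next step.
# The messages paraphrase the interpretation list above; adjust to taste.
explain_policy_effect() {
  local effect
  # Normalize case so "Deny", "deny", and "DENY" all match.
  effect="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$effect" in
    deny)   echo "blocked: fix the tags on the request, then redeploy" ;;
    modify) echo "changed: compare old and new tag values in Change Analysis" ;;
    audit)  echo "recorded: nothing was enforced, review the compliance state" ;;
    *)      echo "unknown: inspect the policy definition for effect '$1'" ;;
  esac
}

explain_policy_effect "Modify"
```

Effects other than the three covered in this guide fall through to the `unknown` branch rather than being guessed at.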
### **Step 6: Handle Terraform Drift**
When Azure Policy modifies tags, Terraform may detect drift. Use lifecycle rules to prevent constant redeployment:
```hcl
resource "azurerm_resource_group" "example" {
name = "rg-example"
location = "West US 3"
tags = {
aide-id = "aide_0085665"
environment = "dev"
service-tier = "p2"
}
lifecycle {
ignore_changes = [tags]
}
}
```
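**Alternative approach** - Ignore only the tag keys that policy manages, so drift on every other tag is still detected. Note that `prevent_deletion_if_contains_resources` is configured in the `azurerm` provider `features` block for resource groups, not inside `lifecycle`:
```hcl
lifecycle {
  ignore_changes = [
    tags["CreatedBy"],
    tags["CreatedDate"],
    tags["ModifiedBy"],
    tags["ModifiedDate"]
  ]
}
```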
---
## Epic Component Tagging Examples
### Cogito Workspace Tags
Cogito workspaces typically use these tag configurations:
```json
{
"inputs": {
"tags": {
"aide-id": "aide_0085665",
"environment": "dev",
"service-tier": "p3",
"Component": "Epic Cogito",
"Division": "Optum Health",
"Product": "Epic EMR",
"dr-tier": "standby"
}
}
}
```
### NetApp Storage Tags
NetApp storage resources require specific tagging for Epic workloads:
```json
{
"inputs": {
"tags": {
"aide-id": "aide_0085665",
"environment": "prd",
"service-tier": "p1",
"Component": "Epic NetApp Storage",
"Division": "Optum Health",
"Product": "Epic EMR",
"dr-tier": "active",
"DataClassification": "PHI",
"platform-managed": "true"
}
}
}
```
### Citrix Infrastructure Tags
Citrix shared services use specific tagging patterns:
```json
{
"inputs": {
"tags": {
"aide-id": "aide_0085665",
"environment": "shared",
"service-tier": "p2",
"Component": "Citrix Shared Services",
"Division": "Optum Health",
"Product": "Epic EMR",
"SharedService": "true"
}
}
}
```
---
## Quick Reference
### Required Tags (HCP v1.8)
| Tag | Epic Format | Purpose | Example |
|-----|-------------|---------|---------|
| `aide-id` | `aide_0085665` | Project identifier | `aide_0085665` |
| `environment` | Lowercase | Environment type | `dev`, `tst`, `prd` |
| `service-tier` | `p1`/`p2`/`p3` | Service level | `p1` (production) |
| `platform-managed` | `true`/`false` | Platform team management | `true` |
| `workspace` | Full TFE workspace name | Workspace identifier | `aide-0085665-tfews-epic-cogito-westepic-npd-wus3-01` |
| `component` | Descriptive | Workload name | `Epic Cogito` |
| `workload` | Lowercase enum | Workload type | `epic`, `citrix`, `shared` |
| `region` | Azure region | Geographic location | `westus3` |
### Optional Tags (Epic Specific)
| Tag | Purpose | Examples |
|-----|---------|----------|
| `dr-tier` | Disaster recovery | `active`, `standby`, `restoration` |
| `DataClassification` | HIPAA data sensitivity (Azure Policy) | `PHI`, `NONPHI` |
| `data-classification` | Operational data classification (module tag, distinct from DataClassification) | `public`, `internal`, `confidential`, `restricted` |
| `backup-required` | Backup policy | `true`, `false` |
| `managed-by` | Management tool | `terraform` |
| `SharedService` | Service type | `true` (for shared resources) |
### Common Commands
**Check tag compliance:**
```bash
az graph query -q "Resources | where tags['aide-id'] == '' or isnull(tags['aide-id']) | project name, resourceGroup, type, location"
```
**List resources by environment:**
```bash
az graph query -q "Resources | where tags.environment == 'prd' | project name, type, tags"
```
**Find untagged resources:**
```bash
az graph query -q "Resources | where isnull(tags) or array_length(bag_keys(tags)) == 0 | project name, resourceGroup, type"
```
Policy can append or change tags after Terraform creates a resource. Terraform may see drift. Use the lifecycle ignore where appropriate.
```hcl
lifecycle {
ignore_changes = [
tags,
# or for specific keys
# tags["CostCenter"], tags["Environment"]
]
}
```
## Step 7: A Quick End-to-End Runbook
1. Get the resource ID and store it in `RID` for the next step.
```bash
RID=$(az resource show -g <rg> -n <name> --resource-type <type> --query id -o tsv)
```
2. Pull the last ten policy events for that ID.
```bash
az policy event list --resource "$RID" \
--select "timestamp,policyAssignmentName,policyDefinitionName,policyDefinitionAction" \
--order-by "timestamp desc" --top 10 -o table
```
3. Open the top assignment and check its parameters.
```bash
az policy assignment show --name <assignmentNameOrId> -o jsonc
```
4. Correlate the time window with Activity Log tag writes and Change Analysis.
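
The runbook steps can be chained into one function. This is a sketch under stated assumptions: the function name `tag_policy_runbook` is hypothetical, it reuses only the `az` commands already shown in this guide, and it leaves step 3 manual because the assignment to inspect depends on what step 2 returns.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around the runbook steps; requires a logged-in Azure CLI.
# usage: tag_policy_runbook <resource-group> <resource-name> <resource-type>
tag_policy_runbook() {
  if [ "$#" -ne 3 ]; then
    echo "usage: tag_policy_runbook <rg> <name> <type>" >&2
    return 2
  fi
  local rg="$1" name="$2" type="$3" rid

  # Step 1: resolve the resource ID (declared separately so the exit code survives).
  rid="$(az resource show -g "$rg" -n "$name" --resource-type "$type" --query id -o tsv)" || return 1
  [ -n "$rid" ] || { echo "resource not found" >&2; return 1; }
  echo "Resource ID: $rid"

  # Step 2: the last ten policy events for that ID.
  az policy event list --resource "$rid" \
    --select "timestamp,policyAssignmentName,policyDefinitionName,policyDefinitionAction" \
    --order-by "timestamp desc" --top 10 -o table

  # Step 3 is manual: pick an assignment name from the table above and run
  # az policy assignment show --name <assignmentNameOrId> -o jsonc

  # Step 4: write operations from the last 7 days, to correlate timestamps.
  az monitor activity-log list --resource-id "$rid" --offset 7d \
    --query "[?contains(operationName.value, 'write')].{time:eventTimestamp,caller:caller,operation:operationName.value}" \
    -o table
}
```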
---
## Epic Standardized Tagging Implementation
### **HCP v1.8 Compliance Requirements**
Epic infrastructure must implement Healthcare Cloud Platform (HCP) v1.8 tagging specifications for audit compliance and cost tracking.
#### **Required Tags (All Lowercase)**
| Tag Key | Variable Name | Description | Example | Validation |
|---------|---------------|-------------|---------|------------|
| `aide-id` | `aide_id` | AIDE identifier for resource ownership | `aide_0085665` | Regex: `^(aide_\d+\|uhgwm_[a-z]+)$` |
| `environment` | `environment` | Environment designation | `dev`, `tst`, `prd` | Must start with: dev, qa, int, stg, tst, prf, uat, dmo, prd |
| `service-tier` | `service_tier` | Service tier classification | `p1`, `p2`, `p3` | Enum: p1, p2, p3 |
| `platform-managed` | `platform_managed` | Platform team management flag | `true`, `false` | Enum: "true" or "false" |
| `workspace` | `workspace` | Workspace identifier for deployment | `odbwus3` | Regex: `^[a-z0-9][a-z0-9-]*[a-z0-9]$` |
!!! note "Tag Key Mapping"
    Variable names use underscores (`aide_id`) but map to hyphenated tag keys (`aide-id`) for HCP compliance.
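
The validation column of the table can be checked locally before a plan is run. This is a minimal sketch, not the module's implementation: the function names are hypothetical, and `\d` from the table is written as `[0-9]` because Bash uses POSIX extended regular expressions.

```bash
#!/usr/bin/env bash
# Sketch of the validation rules from the required-tags table.
# Function names are hypothetical; the module's real checks may differ.
export LC_ALL=C  # keep [a-z] ranges ASCII-only regardless of user locale

valid_aide_id() {
  local re='^(aide_[0-9]+|uhgwm_[a-z]+)$'
  [[ "$1" =~ $re ]]
}
valid_environment() {
  # Only the prefix is checked, so values such as "dev2" also pass.
  local re='^(dev|qa|int|stg|tst|prf|uat|dmo|prd)'
  [[ "$1" =~ $re ]]
}
valid_service_tier() { [[ "$1" == p1 || "$1" == p2 || "$1" == p3 ]]; }
valid_platform_managed() { [[ "$1" == true || "$1" == false ]]; }
valid_workspace() {
  local re='^[a-z0-9][a-z0-9-]*[a-z0-9]$'
  [[ "$1" =~ $re ]]
}

# Example: validate one set of inputs and report any failures.
valid_aide_id "aide_0085665"  || echo "invalid aide-id"
valid_environment "dev"       || echo "invalid environment"
valid_service_tier "p2"       || echo "invalid service-tier"
valid_platform_managed "true" || echo "invalid platform-managed"
valid_workspace "cogito-wus3" || echo "invalid workspace"
```

Each function returns a shell exit status, so the checks compose with `&&` and `||` in a pre-commit hook or pipeline step.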
#### **HCP Recommended Tags**
```hcl
# Optional but recommended for full compliance
tags = {
"itsm-assignment-group" = "EPIC NATIONAL INSTANCE - SPT"
"risk-record" = "RR-12345"
"source-code-repo" = "github.com/optum-tech-compute/ohemr-epic-pro-001"
"dr-tier" = "active"
}
```
### **Epic Operational Tags**
Epic workloads require additional operational tags for proper environment management:
```hcl
epic_operational_tags = {
"workload" = "epic"
"region" = "westus3"
"managed-by" = "terraform"
"data-classification" = "confidential"
"backup-required" = "true"
}
```
### **Organizational Tags**
OHEMR Epic infrastructure includes organizational metadata for cost allocation:
```hcl
organizational_tags = {
"Division" = "Optum Health"
"Product" = "Epic EMR"
"component" = "Cogito"
"ComponentVersion" = "1.0.0"
"gl-code" = "44770-01530-USASS800-169950" # Auto-resolved by module from workload x environment
"itsm-assignment-group" = "EPIC NATIONAL INSTANCE - SPT"
"managed-by" = "terraform"
}
```
---
## Implementation Patterns
### **Standardized Tagging Module**
The OHEMR Epic private registry provides centralized tagging through the `ohemr-epic-azurerm` module:
```hcl
module "epic_tagging" {
source = "app.terraform.io/optum-tech-compute/azurerm/ohemr-epic"
version = "~> 1.0"
# Enable standardized tagging
enable_standardized_tagging = true
create_example_resources = false
# HCP Required Tags v1.8
aide_id = "aide_0085665"
environment = "dev"
service_tier = "p2"
platform_managed = "true"
workspace = "cogito-wus3"
# Epic Configuration
component = "Cogito"
workload = "epic"
region = "westus3"
# Organizational Configuration
division = "Optum Health"
product = "Epic EMR"
component_version = "1.0.0"
# Epic Operational Configuration
data_classification = "confidential"
backup_required = "true"
managed_by = "terraform"
}
# Apply tags to resources
resource "azurerm_virtual_machine" "epic_vm" {
name = "vm-epic-cogito-wus3"
location = "West US 3"
resource_group_name = azurerm_resource_group.epic.name
# Use complete standardized tags
tags = module.epic_tagging.standardized_tags
}
```
### **Tag Categories and Usage**
The module provides different tag outputs for specific use cases:
#### **Complete Tag Set**
```hcl
# All tags for general resources
tags = module.epic_tagging.standardized_tags
```
#### **Business Tags (Cost Allocation)**
```hcl
# For cost center reporting
tags = module.epic_tagging.business_tags
# Includes: aide-id, Division, Product, Component, GLCode, workload
```
#### **Technical Tags (Operations)**
```hcl
# For operational automation
tags = module.epic_tagging.technical_tags
# Includes: environment, service-tier, region, managed-by, data-classification
```
#### **HCP Compliance Only**
```hcl
# Minimal HCP v1.8 compliance
tags = module.epic_tagging.hcp_tags
# Includes: aide-id, environment, service-tier + recommended tags
```
### **Workspace Integration Pattern**
Epic workspaces follow a consistent pattern for tag integration:
```hcl
# acn-main.tf - Standard Epic workspace pattern
module "standardized_tagging" {
source = "app.terraform.io/optum-tech-compute/azurerm/ohemr-epic"
enable_standardized_tagging = true
aide_id = "aide_0085665"
environment = "dev" # NPD maps to dev
service_tier = "p2"
platform_managed = "true"
workspace = "cogito-wus3"
component = "Cogito"
workload = "epic"
region = "westus3"
}
locals {
enhanced_inputs = merge(
var.inputs,
{
standardized_tags = module.standardized_tagging.standardized_tags
business_tags = module.standardized_tagging.business_tags
technical_tags = module.standardized_tagging.technical_tags
}
)
}
module "deploy" {
source = "../terraform-deploy"
inputs = local.enhanced_inputs
# ...other configuration
}
```
---
## Common Tagging Issues and Solutions
### **Issue 1: Tag Limit Exceeded**
**Error:**
```text
Error: Tag count exceeds Azure limit: 26/25 tags used
```
**Diagnosis:**
The module enforces a configurable tag limit (default 25, range 15-50). Azure has a standard 50-tag limit, but HCP v1.8 enforces 15 for limited services (Automation, CDN, DNS, Log Analytics). The `core_tags` output (14 tags) is available for these limited services.
```bash
# Check current tag usage
terraform output tag_validation
```
**Solution:**
```hcl
# Use custom_tags sparingly and exclude optional categories
module "epic_tagging" {
source = "app.terraform.io/optum-tech-compute/azurerm/ohemr-epic"
# Required configuration...
# Reduce tags if approaching limit
exclude_recommended_tags = true # Removes optional HCP tags
# Limit custom tags
custom_tags = {
"critical-tag-only" = "value"
}
}
```
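
A tag count can also be checked before Terraform runs. This is a minimal sketch, assuming tags are available as `key=value` arguments; the function name `check_tag_limit` is hypothetical, and the limit is passed in explicitly because it differs between the module default (25) and limited services (15).

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check: count tags and compare against a limit.
# usage: check_tag_limit <max_tags> key=value [key=value ...]
check_tag_limit() {
  local max="$1"
  shift
  local count="$#"   # one argument per tag
  if (( count > max )); then
    echo "FAIL: ${count}/${max} tags used"
    return 1
  fi
  echo "OK: ${count}/${max} tags used, $(( max - count )) remaining"
}

# Limited services are capped at 15 tags, so check against that limit.
check_tag_limit 15 "aide-id=aide_0085665" "environment=dev" "service-tier=p2"
```

The sketch counts arguments only; it does not detect duplicate keys, which the module's own validation would need to handle.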
### **Issue 2: Invalid AIDE ID Format**
**Error:**
```text
Error: AIDE ID must be in format 'aide_xxxxxxx' or 'uhgwm_xxxxxxx'
```
**Solution:**
```hcl
# Correct AIDE ID format (lowercase with underscore)
aide_id = "aide_0085665"   # Correct
# aide_id = "AIDE_0085665" # Wrong - must be lowercase when passed to module
# aide_id = "aide-0085665" # Wrong - hyphen instead of underscore
```
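
Where AIDE IDs arrive from an upstream system in mixed formats, a wrapper script can repair the two mistakes shown above before the value reaches the module. This is a sketch only: `normalize_aide_id` is a hypothetical helper, and whether silent normalization is acceptable (rather than failing fast) is a team decision.

```bash
#!/usr/bin/env bash
# Hypothetical helper: lowercase the value and replace a hyphen separator
# after the "aide" or "uhgwm" prefix with an underscore.
normalize_aide_id() {
  printf '%s\n' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/^(aide|uhgwm)-/\1_/'
}

normalize_aide_id "AIDE-0085665"
```

Values that are already correct pass through unchanged, so the helper is safe to apply unconditionally.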
### **Issue 3: HCP Policy Violations**
**Error:**
```text
Policy violation: Missing required tag 'service-tier'
```
**Diagnosis:**
```bash
# Check which policy is blocking deployment
az policy event list --resource "$RESOURCE_ID" \
--select "timestamp,policyAssignmentName,policyDefinitionName,policyDefinitionAction" \
--order-by "timestamp desc" --top 5 -o table
```
**Solution:**
Ensure all HCP required tags are present:
```hcl
module "epic_tagging" {
source = "app.terraform.io/optum-tech-compute/azurerm/ohemr-epic"
aide_id = "aide_0085665" # Required
environment = "dev" # Required
service_tier = "p2" # Required - must be p1, p2, or p3
platform_managed = "true" # Required - must be "true" or "false"
workspace = "cogito-wus3" # Required - lowercase alphanumeric with hyphens
}
```
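
Missing required tags can be caught before a deny policy rejects the deployment. This is a minimal sketch, assuming the five required keys listed in this guide and tags supplied as `key=value` arguments; `missing_required_tags` and `REQUIRED_TAG_KEYS` are hypothetical names.

```bash
#!/usr/bin/env bash
# Required tag keys as listed in this guide (HCP v1.8).
REQUIRED_TAG_KEYS=(aide-id environment service-tier platform-managed workspace)

# Prints the required keys that are absent and returns 1 if any are missing.
# usage: missing_required_tags key=value [key=value ...]
missing_required_tags() {
  local required pair found missing=""
  for required in "${REQUIRED_TAG_KEYS[@]}"; do
    found=0
    for pair in "$@"; do
      # Compare only the key part, i.e. everything before the first "=".
      if [ "${pair%%=*}" = "$required" ]; then
        found=1
        break
      fi
    done
    [ "$found" -eq 1 ] || missing="$missing $required"
  done
  missing="${missing# }"   # trim the leading space
  [ -z "$missing" ] && return 0
  echo "$missing"
  return 1
}

if ! result="$(missing_required_tags "aide-id=aide_0085665" "environment=dev")"; then
  echo "Missing required tags: $result"
fi
```

The check covers key presence only; value formats still need to be validated separately.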
### **Issue 4: Terraform Tag Drift**
**Problem:**
Azure Policy modifies tags after Terraform deployment, causing drift detection.
**Solution:**
Use lifecycle rules to ignore policy-managed tags:
```hcl
resource "azurerm_virtual_machine" "epic_vm" {
# ... configuration
tags = module.epic_tagging.standardized_tags
lifecycle {
ignore_changes = [
# Ignore tags that Azure Policy might modify
tags["last-policy-update"],
tags["policy-managed"],
# Or ignore all tag changes if policies heavily modify tags
# tags
]
}
}
```
### **Issue 5: Environment Mapping Errors**
**Problem:**
Epic NPD environments need to map to HCP-compliant values.
**Solution:**
```hcl
# NPD environment mapping
locals {
# Map Epic environment names to HCP compliant values
environment_mapping = {
"npd" = "dev" # Non-prod development
"test" = "tst" # Test environments (value must start with an approved prefix)
"prod" = "prd" # Production (value must start with an approved prefix)
}
}
module "epic_tagging" {
source = "app.terraform.io/optum-tech-compute/azurerm/ohemr-epic"
# Use mapped environment value
environment = local.environment_mapping[var.epic_environment]
}
```
---
## Tag Validation and Monitoring
### **Built-in Validation**
The Epic tagging module includes comprehensive validation:
```hcl
# Access validation metrics
output "tag_validation" {
value = module.epic_tagging.tag_validation
}
# Example output:
# {
# total_tags = 12
# hcp_required_count = 3
# tag_count_valid = true
# max_tags_allowed = 15
# tags_remaining = 3
# hcp_spec_version = "v1.8"
# }
```
### **Configuration Summary**
Monitor current tagging configuration:
```hcl
output "tagging_config" {
value = module.epic_tagging.tagging_config
}
```
### **Validation Commands**
Check tag compliance:
```bash
# Verify tag counts don't exceed limits (-json is needed so jq can parse the output)
terraform output -json tag_validation | jq '.tags_remaining'
# Check HCP compliance
terraform output -json tag_validation | jq '.hcp_spec_version'
# Validate AIDE ID format
terraform output -json tagging_config | jq '.aide_id'
```
---
## Epic Component Configurations
### **Cogito (Clinical Decision Support)**
```hcl
module "cogito_tagging" {
source = "app.terraform.io/optum-tech-compute/azurerm/ohemr-epic"
component = "Cogito"
service_tier = "p2" # Clinical decision support
custom_tags = {
"epic-service" = "clinical-decision-support"
"performance-tier" = "high"
}
}
```
### **Hyperspace (Web Interface)**
```hcl
module "hyperspace_tagging" {
source = "app.terraform.io/optum-tech-compute/azurerm/ohemr-epic"
component = "Hyperspace"
service_tier = "p1" # User-facing interface
custom_tags = {
"epic-service" = "web-interface"
"user-facing" = "true"
}
}
```
### **NetApp Storage**
```hcl
module "netapp_tagging" {
source = "app.terraform.io/optum-tech-compute/azurerm/ohemr-epic"
component = "NetApp Storage"
service_tier = "p1" # Critical storage
custom_tags = {
"epic-service" = "shared-storage"
"storage-tier" = "premium"
"backup-frequency" = "4-hour"
}
}
```
---
## Regional Deployment Considerations
### **Multi-Region Tagging**
Epic infrastructure spans multiple Azure regions with consistent tagging:
#### **West US 3 (Primary)**
```hcl
module "westus3_tagging" {
source = "app.terraform.io/optum-tech-compute/azurerm/ohemr-epic"
region = "westus3"
custom_tags = {
"deployment-region" = "primary"
"disaster-recovery" = "source"
}
}
```
#### **Central US (DR)**
```hcl
module "centralus_tagging" {
source = "app.terraform.io/optum-tech-compute/azurerm/ohemr-epic"
region = "centralus"
dr_tier = "standby" # dr-tier accepts active, standby, or restoration
custom_tags = {
"deployment-region" = "disaster-recovery"
"disaster-recovery" = "target"
}
}
```
---
## Where to Look in OHEMR Environment
### **Policy Enforcement Locations**
Public Cloud Governance enforces required tags across Azure subscriptions:
- **Management Group Level**: HCP tagging policies
- **Subscription Level**: Epic-specific requirements
- **Resource Group Level**: Inherited tagging policies
### **Epic-Specific Resources**
| **Repository** | **Scope** | **Tagging Module** |
|----------------|-----------|-------------------|
| `ohemr-epic-npd-001` | Non-production development | `testlogworkspace-wus3` (pilot) |
| `ohemr-epic-pro-001` | Production infrastructure | Migration planned |
| `ohemr-epic-shared-001` | Shared services (AWX, NetApp) | Migration planned |
| `ohemr-epic-test-001` | Test infrastructure | Migration planned |
### **Key Documentation**
- **Cloud Resource Tagging Overview**: `https://docs.hcp.uhg.com/public-cloud-governance/cloud-resource-tagging-overview`
- **PCG Roadmap**: `https://docs.hcp.uhg.com/public-cloud-governance/pcg-roadmap`
- **Epic Tagging Migration Guide**: Located in `docs/TAGGING_MIGRATION.md` in Epic repositories
---
## References and Resources
### **Azure Documentation**
- [Azure CLI Policy Events](https://learn.microsoft.com/azure/cli/azure/policy/event)
- [Azure CLI Policy States](https://learn.microsoft.com/azure/cli/azure/policy/state)
- [Azure Policy CLI Overview](https://learn.microsoft.com/azure/cli/azure/policy)
- [Change Analysis with Resource Graph](https://learn.microsoft.com/azure/governance/resource-graph/changes/resource-graph-changes)
- [View Resource Changes in Portal](https://learn.microsoft.com/azure/governance/resource-graph/changes/view-resource-changes)
### **OHEMR Epic Resources**
- **Private Registry Module**: `app.terraform.io/optum-tech-compute/azurerm/ohemr-epic`
- **Module Documentation**: Located in `ohemr-epic-private-registry-azurerm/docs/`
- **Migration Examples**: Available in Epic repository `docs/TAGGING_MIGRATION.md` files
- **Tag Reference**: Comprehensive tag structure in `docs/TAGGING_REFERENCE.md`
### **Compliance Specifications**
- **HCP v1.8**: Healthcare Cloud Platform tagging specification
- **Azure Tag Limits**: 50 tags per resource (Azure standard); 15 for limited services (Automation, CDN, DNS, Log Analytics)
- **Policy Enforcement**: June 26, 2025 (non-prod) / July 17, 2025 (prod)