    DNS Operations

    A practical operating model for DNS changes, migrations, monitoring, troubleshooting, incident response, and rollback.

    Answer snapshot

    DNS operations is the discipline of changing production DNS without surprises. Treat DNS like traffic infrastructure: assign owners, review high-risk changes, lower TTLs before migrations, verify against authoritative and recursive resolvers, monitor query behavior, and keep rollback ready before the change window starts.

    What you'll learn

    • Build a repeatable DNS change workflow
    • Identify high-risk records and review gates
    • Verify DNS changes from authoritative and recursive viewpoints
    • Plan migration, monitoring, incident response, and rollback

    DNS operations is how teams keep domains, email, certificates, and production traffic stable while records keep changing.

    The goal is not to make DNS complicated. The goal is to make every change predictable:

    owner -> plan -> review -> change -> verify -> monitor -> rollback if needed

    Use this page as the operating hub for DNScale DNS workflows. For automation patterns, see DNS Automation. For a symptom-first diagnostic tree, start with DNS Troubleshooting.

    Operating Principles

    Production DNS should have a small set of rules:

    • Every production zone has an owner.
    • High-risk records require review.
    • Planned cutovers include TTL preparation.
    • Verification checks both authoritative and recursive DNS.
    • Emergency changes are backfilled into code or documentation.
    • Rollback is written before the change begins.

    DNS is not only a routing system. It controls web traffic, email delivery, certificate issuance, service discovery, and third-party domain verification.

    Change Workflow

    Use a repeatable workflow for every meaningful production DNS change.

    Step | Purpose | Useful guide
    Classify | Decide whether the record is low, medium, or high risk | DNS record types
    Plan TTL | Avoid long stale-cache windows during cutover | DNS TTL best practices
    Review | Catch destructive changes before publish | DNS as code best practices
    Apply | Make the change through the dashboard, API, Terraform, or DNSControl | DNS Automation
    Verify | Query authoritative and recursive resolvers | dig tutorial
    Monitor | Watch traffic, errors, latency, and query patterns | DNS network performance monitoring
    Roll back | Restore the previous known-good state if criteria are met | DNS migration guide

    For small TXT verification records, this workflow can be lightweight. For MX, NS, DS, DNSKEY, CAA, apex, and migration changes, write the steps down before you start.

    High-Risk Records

    Some records deserve stricter handling because mistakes create outages outside the DNS dashboard.

    Record | What can break | Extra check
    A / AAAA / ALIAS at apex | Website and API traffic | Query authoritative DNS and production health checks
    CNAME | Host routing, SaaS delegation, verification | Confirm no conflicting A/AAAA/MX records exist at the same name
    MX | Email delivery | Query MX plus SPF, DKIM, and DMARC records
    TXT | SPF, DKIM, DMARC, ownership verification | Check quoting, length splitting, and stale verification values
    CAA | Certificate issuance | Confirm all issuing CAs are still authorized
    NS | Delegation | Verify parent-zone NS and child-zone authoritative answers
    DS / DNSKEY | DNSSEC validation | Use DNSSEC chain checks before and after

    If the record controls production traffic, email, certificates, or DNSSEC, treat it as high risk.

    Verification Commands

    Start with authoritative data. This bypasses resolver cache and proves what the provider is serving:

    dig @ns1.dnscale.eu example.com A +short
    dig @ns1.dnscale.eu example.com SOA +short

    Then compare recursive resolvers:

    dig @1.1.1.1 example.com A +short
    dig @8.8.8.8 example.com A +short
    dig @9.9.9.9 example.com A +short

    For delegation and DNSSEC path issues:

    dig +trace example.com
    dig +dnssec example.com A

    If authoritative DNS is correct but recursive resolvers differ, the issue is usually TTL/cache state. If authoritative DNS is wrong, fix the zone data first.
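The decision rule above can be sketched as a small shell helper. This is a minimal illustration, not a DNScale tool: `normalize_answers` and `classify_dns_state` are hypothetical names, and the helper only compares answer sets that you pass in (for example, the output of the `dig ... +short` commands shown earlier).

```shell
# Hypothetical helpers (not part of dig or DNScale): compare answer sets.

# Drop blank lines and sort, so record order does not affect the comparison.
normalize_answers() {
  printf '%s\n' "$1" | sed '/^[[:space:]]*$/d' | sort -u
}

# classify_dns_state EXPECTED AUTHORITATIVE RECURSIVE
# Prints one of: fix-zone-data, wait-for-cache, consistent
classify_dns_state() {
  local expected authoritative recursive
  expected=$(normalize_answers "$1")
  authoritative=$(normalize_answers "$2")
  recursive=$(normalize_answers "$3")

  if [ "$authoritative" != "$expected" ]; then
    # The provider is serving the wrong data: fix the zone first.
    echo "fix-zone-data"
  elif [ "$recursive" != "$authoritative" ]; then
    # Authoritative is correct, so the difference is most likely cache state.
    echo "wait-for-cache"
  else
    echo "consistent"
  fi
}

# Example usage (requires network access; 203.0.113.10 is a placeholder value):
# classify_dns_state "203.0.113.10" \
#   "$(dig @ns1.dnscale.eu example.com A +short)" \
#   "$(dig @1.1.1.1 example.com A +short)"
```

A "wait-for-cache" result is only a hint. If it persists well beyond the record's TTL, check delegation with `dig +trace` before assuming the cache is at fault.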

    Migration Runbook

    DNS migrations are where operations discipline matters most.

    Use this shape:

    1. Inventory every record, including hidden platform records.
    2. Lower TTLs before the change window.
    3. Import and validate the zone at the new provider.
    4. Run old and new providers in parallel where possible.
    5. Switch NS records at the registrar.
    6. Verify parent delegation, authoritative answers, recursive answers, DNSSEC, email, certificates, and application health.
    7. Keep the old provider active until query traffic has stopped and parent NS TTLs have expired.

    Read DNS Migration Guide before touching registrar NS records.
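One way to support steps 1 and 3 of the runbook is to compare a record inventory from the old provider with one from the new provider before switching NS records. The sketch below assumes a hypothetical file format of one record per line (`name type value`) and a hypothetical helper name, `compare_inventories`. How the two files are produced (zone export, API, or repeated `dig` queries) is left open.

```shell
# Hypothetical helper: list records that exist on only one side.
# compare_inventories OLD_FILE NEW_FILE
# Assumed input format, one record per line: "www.example.com. A 203.0.113.10"
# Output: "- record" for old-provider-only, "+ record" for new-provider-only.
compare_inventories() {
  local old_sorted new_sorted
  old_sorted=$(mktemp)
  new_sorted=$(mktemp)

  # comm needs sorted input; sort -u also removes duplicate lines.
  sort -u "$1" > "$old_sorted"
  sort -u "$2" > "$new_sorted"

  comm -23 "$old_sorted" "$new_sorted" | sed 's/^/- /'
  comm -13 "$old_sorted" "$new_sorted" | sed 's/^/+ /'

  rm -f "$old_sorted" "$new_sorted"
}

# Example usage (file names are placeholders):
# compare_inventories old-provider.txt new-provider.txt
```

Empty output means the two inventories match line for line. Any `-` line is a record that would disappear after the NS switch, which is the case worth reviewing first.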

    Troubleshooting Paths

    Use symptoms to choose the right path:

    Symptom | Start here
    Website does not load | DNS Troubleshooting
    Name returns NXDOMAIN | NXDOMAIN Errors
    Resolver returns SERVFAIL | SERVFAIL Errors
    SERVFAIL appears after DNSSEC, DS, or key changes | DNSSEC Failure Troubleshooting
    Certificate issuance or renewal fails on CAA | CAA Debugging Guide
    Some users see old answers | DNS Propagation Explained
    Local machine is stale | How to Flush DNS Cache
    Lookup is slow | How to Fix Slow DNS Lookup

    Avoid changing records until you know which layer is wrong.

    Monitoring

    DNS monitoring should answer four questions:

    • Are authoritative servers reachable?
    • Are answers correct from major resolver networks?
    • Are latency and error rates changing?
    • Are old providers still receiving queries during a migration?

    For network-level measurement, see DNS Network Performance Monitoring. For redundancy design, read Primary DNS vs Secondary DNS and DNS Failover Design Patterns.

    Rollback Checklist

    Before a high-risk change, write down:

    • previous record values
    • previous TTLs
    • commands that prove the previous state is restored
    • rollback owner
    • rollback deadline
    • health checks that trigger rollback
    • known cache behavior after rollback

    Rollback is not always instant. Resolver caches may hold the changed value until TTL expiry. This is why planned TTL reduction matters.
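The timing can be worked out with simple arithmetic. The sketch below uses a hypothetical helper, `seconds_until_ttl_effective`, and assumes resolvers honor the published TTL; resolvers that clamp or extend TTLs will behave differently.

```shell
# Hypothetical helper: seconds left until every cache that was filled before
# the TTL was lowered has expired (assumes resolvers honor the published TTL).
# Arguments: epoch when the lower TTL was published, previous TTL in seconds,
# current epoch.
seconds_until_ttl_effective() {
  local lowered_at=$1 old_ttl=$2 now=$3
  local remaining=$(( lowered_at + old_ttl - now ))
  if [ "$remaining" -lt 0 ]; then
    remaining=0
  fi
  echo "$remaining"
}

# Example usage (the first two arguments are placeholder values):
# seconds_until_ttl_effective 1700000000 86400 "$(date +%s)"
```

A result of 0 means caches filled under the old TTL should have expired, so a change or rollback made now is cached for at most the new, lower TTL.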

    Frequently asked questions

    What is DNS operations?
    DNS operations is the set of practices used to run authoritative DNS safely: ownership, change review, TTL planning, migration control, troubleshooting, monitoring, incident response, and rollback.
    Which DNS changes are high risk?
    Apex A/AAAA/ALIAS records, MX, SPF, DKIM, DMARC, CAA, NS, DS, DNSKEY, low TTL changes, and any record used for production traffic, email, certificates, or customer onboarding deserve extra review.
    How should I verify a DNS change?
    Query the authoritative server directly first, then compare major recursive resolvers such as 1.1.1.1, 8.8.8.8, and 9.9.9.9. Use dig +trace and dig +dnssec for delegation and DNSSEC path issues.
    When should TTLs be lowered?
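    Lower TTLs at least one full previous-TTL period before a planned migration or cutover, typically 24-48 hours ahead. Lowering a TTL only helps after resolver caches holding the old TTL have expired; for example, a record previously served with an 86400-second TTL needs at least 24 hours.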
    What should a DNS rollback plan include?
    It should include the exact previous record values, authoritative verification commands, resolver verification commands, expected TTL behavior, owner contact, and the criteria for deciding whether to roll back.
    Is DNS operations different from DNS automation?
    Yes. DNS operations defines the safe workflow. DNS automation implements parts of that workflow with APIs, CI/CD, IaC, previews, and drift detection.
