01 / DPO ENGINE

Software-defined control plane that forecasts GPU load ahead of the transient and computes setpoints for the power shelf. Runs as a software process alongside the workload scheduler.

INTERFACES

  • Redfish API — telemetry ingestion
  • PMBus — power shelf control
  • OCP ORv3 PMI — rack coordination

Target: OCP certification 2027 Q4
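
As a rough illustration of the telemetry-ingestion side, the sketch below parses a Redfish Power resource payload and sums the reported draw. The sample payload and the helper name `total_power_watts` are hypothetical. `PowerControl` and `PowerConsumedWatts` are property names from the DMTF Redfish Power schema, but a given power shelf may expose a different resource layout.

```python
import json

# Illustrative payload shaped like a Redfish Power resource (values are made up).
SAMPLE_PAYLOAD = """
{
  "Id": "Power",
  "PowerControl": [
    {"MemberId": "0", "PowerConsumedWatts": 4200.0},
    {"MemberId": "1", "PowerConsumedWatts": 3950.5},
    {"MemberId": "2", "PowerConsumedWatts": null}
  ]
}
"""


def total_power_watts(payload: str) -> float:
    """Sum PowerConsumedWatts over all PowerControl entries, skipping missing readings."""
    doc = json.loads(payload)
    total = 0.0
    for entry in doc.get("PowerControl", []):
        watts = entry.get("PowerConsumedWatts")
        if watts is not None:
            total += float(watts)
    return total


print(total_power_watts(SAMPLE_PAYLOAD))
```

Skipping null readings rather than treating them as zero keeps a missing sensor visible to whatever logic consumes the total.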

02 / DPO GATEWAY

Rack-mounted hardware node that coordinates BBU discharge and PSU output inside the ORv3 power shelf. Designed as a pin-compatible replacement for the PMI module.

INTERFACES

  • OCP ORv3 PMI-compatible form factor
  • SMBus + PMBus control
  • Real-time BBU + PSU coordination

Target: UL safety certification 2027 Q4

US Patent US12510873B2 · Granted

CONTROL LOOP

How DPO responds

DPO Control-Loop Sequence
Phase 1 (Rack · PSU · BBU): Telemetry ingested via Redfish + PMBus
Phase 2 (DPO Engine): Transient forecast computed; setpoints determined
Phase 3 (DPO Gateway): Setpoints issued to BBU + PSU
Phase 4 (Grid interface): Grid sees shaped, compliant load profile
Continuous (DPO Engine): Compliance telemetry logged
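
The phases above can be sketched as a single loop iteration. This is a minimal illustration, not the DPO implementation: the function names, the linear-extrapolation forecast, and the PSU limit passed in are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Setpoints:
    psu_watts: float
    bbu_watts: float


def forecast_next(samples: list[float]) -> float:
    """Stand-in forecast: linear extrapolation from the last two samples."""
    if len(samples) < 2:
        return samples[-1]
    return samples[-1] + (samples[-1] - samples[-2])


def compute_setpoints(forecast_watts: float, psu_limit_watts: float) -> Setpoints:
    """PSU carries load up to its limit; the BBU covers the remainder."""
    psu = min(forecast_watts, psu_limit_watts)
    return Setpoints(psu_watts=psu, bbu_watts=max(0.0, forecast_watts - psu))


def control_step(samples: list[float], psu_limit_watts: float, log: list) -> Setpoints:
    """One pass: forecast (Phase 2), compute setpoints, log, return for issuing (Phase 3)."""
    predicted = forecast_next(samples)
    setpoints = compute_setpoints(predicted, psu_limit_watts)
    log.append((predicted, setpoints))  # continuous compliance logging
    return setpoints


log: list = []
print(control_step([8000.0, 9000.0], psu_limit_watts=9500.0, log=log))
```

Phase 1 (telemetry ingestion) is represented here only by the `samples` list, and `forecast_next` expects at least one sample.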

ARCHITECTURE

Predict, orchestrate, comply.

DPO sits between the data center EMS and the ORv3 rack. The diagram below shows the DPO Gateway Specification — XMight's open contribution to OCP Rack & Power — and the closed-loop architecture our commercial DPO implements.

[Architecture diagram. Inputs: the data center EMS (fleet scheduler, workload orchestrator) and the power grid (demand, capacity, frequency). DPO Engine, control plane: ML-driven rack-level power prediction, VRT policy (voltage ride-through envelope enforcement), PFAPR recovery control (post-fault active power recovery slope). DPO Gateway, shelf PMI extension: rack-mounted power orchestration module with VRT edge policy (local VRT enforcement), BBU/PSU orchestrator (discharge + output shaping), rack power prediction (sub-ms transient forecast), and telemetry aggregator (fleet data collection). ORv3 rack: AC input (grid power), power shelf (AC → DC conversion), 48V DC bus, BBU shelf (standard backup, charge ⇄ discharge), BBU for VRT/FRT/PFAPR grid compliance events (charge ⇄ discharge), AI servers ×6+ (GPU compute load).]

DPO Gateway Specification: XMight's open contribution to OCP Rack & Power

THE NUMBER THAT MATTERS

Conventional ORv3 power delivery reserves 20–30% of headroom for transient absorption: compute capacity that is permanently stranded. DPO reduces that headroom to 5–10% through millisecond-scale transient prediction and coordinated BBU discharge. Throttling is not eliminated; it becomes smaller, smarter, and continuous.

Source: ORv3 Base Specification headroom reservation values; DPO target band based on closed-loop simulation across reference workload mixes.
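
To make the headroom figures concrete, the arithmetic below compares usable compute power under both bands. The 100 kW rack budget is a hypothetical value chosen only for illustration; the 20–30% and 5–10% bands are the ones quoted above.

```python
def usable_power_kw(rack_budget_kw: float, headroom_fraction: float) -> float:
    """Power left for compute after reserving a fraction of the budget as headroom."""
    return rack_budget_kw * (1.0 - headroom_fraction)


RACK_BUDGET_KW = 100.0  # hypothetical rack power budget

# Worst case and best case of each headroom band.
conventional = [usable_power_kw(RACK_BUDGET_KW, h) for h in (0.30, 0.20)]
with_dpo = [usable_power_kw(RACK_BUDGET_KW, h) for h in (0.10, 0.05)]

print("conventional usable kW:", [round(x, 1) for x in conventional])
print("with DPO usable kW:", [round(x, 1) for x in with_dpo])
```

Under that assumed budget, usable power moves from 70–80 kW to 90–95 kW.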

VOLTAGE EVENT RESPONSE

How an ORv3 rack reacts, with and without DPO.

Two voltage-event scenarios, four traces each. Left column: conventional ORv3 power delivery. Right column: the same scenario with DPO coordinating BBU discharge and PSU output.

WITHOUT DPO
WITH DPO
[Figure: Scenario 1, 0.5 p.u. sag over 500 ms, without DPO. NOGRR 282 VRT envelope: fails both VRT and PFAPR. Traces: Vac (p.u.), PSU power, BBU power, 48V busbar. Timeline: sag begins at 0 ms; PFC trips at ≈3 ms and the grid sees a load disconnect; AC recovers at 500 ms; PFAPR limit at 2,500 ms. PSU dead, restart takes 6–17 s, far beyond the 2 s PFAPR budget.]

[Figure: Scenario 1, 0.5 p.u. sag over 500 ms, with DPO. DPO meets NOGRR 282 VRT + PFAPR. Traces: Vac (p.u.), PSU power, BBU power, 48V busbar. Timeline: sag begins at 0 ms; DPO triggers BBU at 3–27 ms; AC recovers at 500 ms; PFAPR limit at 2,500 ms. DPO predicts, the BBU discharges, and the busbar is held; a coordinated PSU ramp meets the PFAPR recovery slope.]

[Figure: Scenario 3, 0.35 p.u. full AC loss, without DPO. No keep-alive: PSU controller dies. Traces: Vac (p.u.), PSU power, BBU power, 48V busbar. Timeline: AC fully lost at 0 ms; PFC trips at ≈3 ms and the controller loses power; AC returns at 150 ms; PFAPR limit at 2,150 ms. Cold-boot delay of 0–15 s, far beyond the 2 s PFAPR budget (fails 5–17×).]

[Figure: Scenario 3, 0.35 p.u. full AC loss, with DPO. DPO keep-alive: controller survives the outage. Traces: Vac (p.u.), PSU power, BBU power, 48V busbar. Timeline: AC fully lost at 0 ms; DPO sustains the busbar from 3–22 ms; AC returns at 150 ms; PFAPR limit at 2,150 ms. DPO + BBU keep the busbar alive and the controller stays online; coordinated recovery within the 2 s PFAPR budget.]

Without DPO, the PFC trips within a few milliseconds and the PSU restart takes 6–17 seconds, far beyond the 2-second PFAPR budget. With DPO, the engine predicts the transient, the gateway coordinates BBU discharge to hold the 48V bus, and the PSU recovers along a PFAPR-compliant slope. Same hardware, software-defined control plane.
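
The pass/fail outcome comes down to how long active power takes to return after AC recovers, compared with the 2-second PFAPR budget. The sketch below measures that from a sampled power trace. The function names, the 90% recovery threshold, and the sample traces are assumptions for illustration; the source does not state how NOGRR 282 defines the recovered level.

```python
from typing import Optional

PFAPR_BUDGET_S = 2.0  # budget quoted in the text


def recovery_time_s(trace: list[tuple[float, float]], ac_recovery_s: float,
                    prefault_w: float, fraction: float = 0.9) -> Optional[float]:
    """Seconds from AC recovery until power first reaches fraction * prefault_w.

    trace is a list of (time_s, power_w) pairs sorted by time.
    Returns None if the trace never reaches the target.
    """
    target = fraction * prefault_w
    for t, p in trace:
        if t >= ac_recovery_s and p >= target:
            return t - ac_recovery_s
    return None


def meets_pfapr(recovery_s: Optional[float], budget_s: float = PFAPR_BUDGET_S) -> bool:
    """True only if recovery was observed and fits within the budget."""
    return recovery_s is not None and recovery_s <= budget_s


# Hypothetical traces: a coordinated ramp, and a PSU restart that takes 6 s.
coordinated = [(0.0, 10000.0), (0.5, 3000.0), (1.0, 6000.0), (1.5, 9500.0)]
restart = [(0.0, 10000.0), (0.5, 0.0), (3.0, 0.0), (6.5, 10000.0)]

print(meets_pfapr(recovery_time_s(coordinated, 0.5, 10000.0)))
print(meets_pfapr(recovery_time_s(restart, 0.5, 10000.0)))
```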

ENERGY FLOW

From GPU transient to the load the grid sees.

DPO intercepts the transient before it propagates to the grid point of coupling.

GPU compute
AI workload begins inference or training step — GPU draw spikes within milliseconds.
DPO Engine
Engine detects telemetry delta via Redfish + PMBus, runs MTL forecast, computes power shelf setpoints — before the transient reaches the PSU.
DPO Gateway
Gateway issues setpoints to BBU and PSU simultaneously — shaping the aggregate draw seen at the rack's grid connection point.
Grid interface
Grid sees a shaped, predictable load profile — within the envelope required for PFAPR compliance under ERCOT NOGRR 282.
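
One way to picture the shaping step: limit how fast the grid-facing PSU draw may change, and let the BBU supply the difference while the PSU ramps. The sketch below assumes a fixed per-step ramp limit and ignores BBU state of charge and recharging; the names and numbers are illustrative, not DPO parameters.

```python
def shape_load(load_w: list[float], start_psu_w: float,
               max_step_w: float) -> tuple[list[float], list[float]]:
    """Return (psu, bbu) series for a load series.

    The PSU setpoint follows the load but moves by at most max_step_w per step.
    The BBU discharges to cover any shortfall between load and PSU output.
    """
    psu: list[float] = []
    bbu: list[float] = []
    prev = start_psu_w
    for demand in load_w:
        # Clamp the change in PSU output to the allowed ramp per step.
        delta = max(-max_step_w, min(max_step_w, demand - prev))
        prev += delta
        psu.append(prev)
        bbu.append(max(0.0, demand - prev))
    return psu, bbu


psu, bbu = shape_load([5000.0, 9000.0, 9000.0, 9000.0],
                      start_psu_w=5000.0, max_step_w=2000.0)
print(psu)
print(bbu)
```

In this example the 4 kW GPU step reaches the grid side as two 2 kW steps, with the BBU covering the gap during the first one.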

INDUSTRY ANALYSIS

Four gaps that closed-loop control must address.

Mapped against current OCP ORv3 specifications. Each gap belongs to a different leaf-spec owner — none can be closed by a single vendor or a single spec change. DPO addresses all four at rack level.

VRT GAP
01

PSU PFC low-voltage behaviour

The ORv3 PSU operates over 180–305 V AC; the specification permits derating between 180 and 198 V. Below 180 V (~0.75 p.u.), the PFC shuts off in approximately 3 ms with 20 ms holdup. NOGRR 282 §2.14 requires ride-through to 0 p.u.
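
The ~0.75 p.u. figure follows from dividing the 180 V lower limit by the nominal voltage. The sketch below assumes a 240 V nominal, which is inferred from that ratio rather than stated in the text; the function names are illustrative.

```python
NOMINAL_V = 240.0  # assumption: inferred from 180 V being ~0.75 p.u.
PSU_MIN_V = 180.0  # lower operating limit quoted in the text


def per_unit(voltage_v: float, nominal_v: float = NOMINAL_V) -> float:
    """Express a voltage as a fraction of nominal."""
    return voltage_v / nominal_v


def psu_stays_in_range(sag_pu: float, nominal_v: float = NOMINAL_V) -> bool:
    """True if a sag to sag_pu keeps the input at or above the PSU lower limit."""
    return sag_pu * nominal_v >= PSU_MIN_V


print(per_unit(PSU_MIN_V))
print(psu_stays_in_range(0.5))
```

A 0.5 p.u. sag corresponds to 120 V under this assumption, well below the PSU operating range.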

PFAPR GAP
02

PSU controller power-on delay

The PSU controller loses power on AC drop; the cold boot on restart introduces a random 0–15 second delay for inrush staggering. The PFAPR budget is 2 seconds, so current behaviour fails by 5–17×.

COORDINATED SAG GAP
03

BBU trigger condition

BBU Module 1.4 §4.5 specifies discharge trigger at busbar < 48.5 V for 2 ms. This covers full AC loss but does not provide coordinated voltage-sag support.
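
The trigger condition described above (busbar below 48.5 V sustained for 2 ms) amounts to a threshold detector with a hold time. A minimal sketch over sampled busbar voltage follows; the sampling period, function name, and sample values are assumptions, and the actual BBU firmware logic is not described in the source.

```python
from typing import Optional


def discharge_trigger_index(samples_v: list[float], sample_period_ms: float,
                            threshold_v: float = 48.5,
                            hold_ms: float = 2.0) -> Optional[int]:
    """Index of the sample at which voltage has stayed below threshold for hold_ms.

    Each below-threshold sample is counted as one full sample period, which is an
    approximation. Any sample at or above the threshold resets the timer.
    """
    below_ms = 0.0
    for i, v in enumerate(samples_v):
        if v < threshold_v:
            below_ms += sample_period_ms
            if below_ms >= hold_ms:
                return i
        else:
            below_ms = 0.0
    return None


# Hypothetical busbar samples at 0.5 ms intervals: a brief dip, then a sustained one.
samples = [49.0, 48.4, 48.4, 49.0, 48.3, 48.2, 48.1, 48.0]
print(discharge_trigger_index(samples, sample_period_ms=0.5))
```

The brief dip at the start does not trigger because the timer resets when the voltage recovers.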

PMI GAP
04

PMI interface scope

PMI Specification 1.0 §2 defines pass-through monitoring only. No write commands, no open rack-level read-write coordination path is currently defined.

DPO Engine + DPO Gateway are designed to close all four gaps at rack level, without requiring coordination with component vendors.

OCP INTEGRATION

DPO Gateway is OCP ORv3 PMI-compatible.

We identified 12 modifications required across 7 OCP spec documents for ERCOT compliance — the gap between ORv3's current disconnect-on-fault behavior and grid ride-through requirements. DPO Gateway extends PMI Specification 1.0 to enable dynamic power orchestration at the shelf level.

Specification         Version   ERCOT requirement   Status
Open Rack V3 Base     1.0       VRT, FRT            Gap identified
Power Shelf           1.0       VRT, PFAPR          Gap identified
48V PSU               1.0       VRT, PFAPR, FRT     Gap identified
PMI                   1.0       VRT, PFAPR          DPO extends
BBU Shelf             1.1       VRT, PFAPR          Gap identified
BBU Module            1.4       VRT, PFAPR          Gap identified
Modbus Register Map   0.73      PFAPR               Under review

OUR CONTRIBUTION TO OCP

The Active BBU: Dynamic Power Orchestration for Stable and Efficient ORv3 AI Racks

Presented at the OCP Global Summit 2025

This paper describes the Active BBU architecture — the technical foundation of DPO — and its role in enabling PFAPR compliance while reclaiming compute headroom traditionally reserved for transient buffering.