Power Management high-level design

P-state/C-state management

ACPI Px/Cx data

CPU P-state/C-state are controlled by the guest OS. The ACPI P/C-state driver relies on some P/C-state-related ACPI data in the guest ACPI table.

The Service VM can run the ACPI driver with no problems because it can access the native ACPI table. For the User VM though, we need to prepare the corresponding ACPI data for the Device Model to build a virtual ACPI table.

The Px/Cx data includes four ACPI objects: _PCT, _PPC, and _PSS for P-state management, and _CST for C-state management. All these ACPI data must be consistent with the native data because the control method is a kind of passthrough.

These ACPI objects data are parsed by an offline tool and hard-coded in a Hypervisor module named CPU state table:

struct cpu_px_data {
    uint64_t core_frequency;        /* megahertz */
    uint64_t power;                 /* milliWatts */
    uint64_t transition_latency;    /* microseconds */
    uint64_t bus_master_latency;    /* microseconds */
    uint64_t control;               /* control value */
    uint64_t status;                /* success indicator */
} __attribute__((aligned(8)));

struct acpi_generic_address {
    uint8_t     space_id;
    uint8_t     bit_width;
    uint8_t     bit_offset;
    uint8_t     access_size;
    uint64_t    address;
} __attribute__((aligned(8)));

struct cpu_cx_data {
    struct acpi_generic_address cx_reg;
    uint8_t     type;
    uint32_t    latency;
    uint64_t    power;
} __attribute__((aligned(8)));

With these Px/Cx data, the Hypervisor is able to intercept the guest’s P/C-state requests with desired restrictions.

Virtual ACPI table build flow

Figure 285 shows how to build the virtual ACPI table with the Px/Cx data for User VM P/C-state management:

../../_images/hld-pm-image28.png

Figure 285 System block for building vACPI table with Px/Cx data

Some ioctl APIs are defined for the Device model to query Px/Cx data from the Service VM VHM. The Hypervisor needs to provide hypercall APIs to transit Px/Cx data from the CPU state table to the Service VM VHM.

The build flow is:

  1. Use an offline tool (e.g. iasl) to parse the Px/Cx data and hard-code to a CPU state table in the Hypervisor. The Hypervisor loads the data after the system boots.
  2. Before User VM launching, the Device mode queries the Px/Cx data from the Service VM VHM via ioctl interface.
  3. VHM transmits the query request to the Hypervisor by hypercall.
  4. The Hypervisor returns the Px/Cx data.
  5. The Device model builds the virtual ACPI table with these Px/Cx data

Intercept Policy

The Hypervisor should be able to restrict guest’s P/C-state request with a user-customized policy.

Hypervisor should intercept guest P-state request and validate whether it is a valid P-state. Any invalid P-state (e.g. doesn’t exist in CPU state table) should be rejected.

It is better not to intercept C-state request because the trap would impact both power and performance.

Note

For P-state control, you should pay attention to SoC core voltage domain design when doing P-state measurement. The highest P-state would win if different P-state requests on the cores shared same voltage domain. In this case, APERF/MPERF must be used to see what P-state was granted on that core.

S3/S5

ACRN is designed to support the system level S3/S5 with following assumptions:

  1. Guest has complete S3/S5 power state management.
  2. Guest follows the ACPI standard to implement S3/S5.
  3. Guest is responsible for saving its contents before entering S3/S5.
  4. Highest severity guest’s power state is promoted to system power state.
  5. Guest has lifecycle manager running to handle power state transaction requirement and initialize guest power state transaction.
  6. S3 is only available on configurations that have no DM launched RTVM.
  7. S3 is only supported at platform level - not VM level.

ACRN has a common implementation for notification between lifecycle manager in different guest. Which is vUART based cross-vm notification. But user could customize it according to their hardware/software requirements.

Figure 286 shows the basic system level S3/S5 diagram.

../../_images/hld-pm-image62.png

Figure 286 ACRN System S3/S5 diagram

System low power state entry process

Each time, when lifecycle manager of User VM starts power state transition, it will finally write the ACPI register per ACPI spec requirement. With help of ACRN I/O emulation framework, the User VM ACPI register writing will be dispatched to Device Model and Device Model will emulate the User VM power state (pause User VM for S3 and power off User VM for S5)

The VM Manager monitors all User VMs. If all active User VMs are in required power state, VM Manager will notify lifecycle manager of Service VM to start Service VM power state transition. lifecycle manager of Service VM follows a very similar process as User VM for power state transition. The difference is Service VM ACPI register writing is trapped to ACRN HV. And ACRN HV will emulate Service VM power state (pause Service VM for S3 and no special action for S5)

Once Service VM low power state is done, ACRN HV will go through its own low power state enter path.

The whole system is finally put into low power state.

Figure 287 shows the flow of low power S5 enter process with typical ISD configuration(S3 follows very similar process)

../../_images/hld-pm-image63.png

Figure 287 ACRN system S5 entry workflow

For system power state entry:

  1. Service VM received S5 request.
  2. Lifecycle manager in Service VM notifies User VM1 and RTVM through vUART for S5 request.
  3. Guest lifecycle manager initializes S5 action and guest enters S5.
  4. RTOS cleanup RT task, send response of S5 request back to Service VM and RTVM enter S5.
  5. After get response from RTVM and all User VM are shutdown, Service VM enter S5.
  6. OSPM in ACRN hypervisor checks all guests are in S5 state and shuts down whole system.

System low power state exit process

The low power state exit process is in reverse order. The ACRN hypervisor is awakened first. It will go through its own low power state exit path. Then, ACRN hypervisor will resume the Service VM to let Service VM go through Service VM low power state exit path. After that, the DM is resumed and let User VM go through User VM low power state exit path. The system is resumed to running state after at least one User VM is resumed to running state.

According to ACPI standard, S3 is mapped to suspend to RAM and S5 is mapped to shutdown. So the S5 process is a little different:

  • User VM enters S5 -> User VM powers off
  • System enters S5 -> System powers off
  • System resumes From S5 -> System fresh start
  • User VM resumes from S5 -> User VM fresh startup