Length: 00:04:49
11 Jun 2021

With the development of intelligent hardware, front-end devices can now perform DNN computation themselves. Because a deep neural network is composed of successive layers, part of a DNN model's computation can be offloaded to front-end devices, which relieves the cloud of load and shortens processing latency. This paper proposes an algorithm that allocates DNN computation between the front-end devices and the cloud server. In brief, we partition the DNN layers dynamically according to the current and predicted future status of the processing system, thereby obtaining a shorter end-to-end latency. Simulation results show an overall latency reduction of more than 70% compared with traditional cloud-centered processing.
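A minimal sketch of the kind of split-point search the abstract describes, assuming per-layer latency estimates and a known uplink bandwidth. This is not the paper's algorithm: the function name, timings, and data sizes are illustrative assumptions, and the paper's method additionally uses the predicted future system status, which this static sketch omits.

```python
def best_split(device_ms, cloud_ms, out_bytes, bandwidth_bps):
    """Return (split_index, latency_ms) minimizing end-to-end latency.

    Layers [0, k) run on the front-end device, layers [k, n) on the cloud;
    the output of layer k-1 (or the raw input when k == 0) is sent uplink.
    """
    n = len(device_ms)
    best_k, best_lat = 0, float("inf")
    for k in range(n + 1):
        compute = sum(device_ms[:k]) + sum(cloud_ms[k:])
        transmit = out_bytes[k] * 8 / bandwidth_bps * 1000.0  # bytes -> ms
        latency = compute + transmit
        if latency < best_lat:
            best_k, best_lat = k, latency
    return best_k, best_lat

# Hypothetical 4-layer model: the device is slower per layer, but early
# layers shrink the data that must cross the uplink.
device_ms = [3.0, 4.0, 20.0, 25.0]
cloud_ms = [2.0, 1.5, 1.0, 0.8]
# out_bytes[k] = bytes sent if we split before layer k (index 0 = raw input).
out_bytes = [600_000, 150_000, 40_000, 10_000, 1_000]

k, lat = best_split(device_ms, cloud_ms, out_bytes, bandwidth_bps=10_000_000)
print(f"split before layer {k}: {lat:.1f} ms end-to-end")
```

With these illustrative numbers the search picks a mid-network split: the first three layers run on the device to compress the intermediate output before transmission, and the rest run in the cloud. Re-running the search as bandwidth and load estimates change is what makes the partitioning dynamic.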

Chairs:
Ivan Bajic
