28 Oct 2020

Although unsupervised methods for monocular depth and camera motion estimation have made significant progress, most of them rely on a static-scene assumption and may perform poorly in dynamic scenes. In this paper, we propose a novel framework for unsupervised learning of monocular depth and camera motion estimation that is applicable to dynamic scenes. First, the framework is trained to obtain initial inference results under the static-scene assumption, by minimizing a photometric consistency loss and a 3D transformation consistency loss. The framework is then fine-tuned by joint learning with a motion rectification network (RecNet). Specifically, RecNet is designed to rectify the individual motion of moving objects and to generate motion-rectified images, enabling the framework to learn accurately in dynamic scenes. Extensive experiments are conducted on the KITTI dataset. Results show that our method achieves state-of-the-art performance on both the depth prediction and camera motion estimation tasks.
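The photometric consistency loss mentioned in the abstract is commonly realized as a view-synthesis loss: the source frame is warped into the target view using the predicted depth and relative camera pose, and the warped image is compared against the target image. The sketch below shows one standard formulation of such a loss; all function names, tensor shapes, and warping details are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a view-synthesis photometric consistency loss
# (SfMLearner-style). Assumed shapes and names, not the paper's code.
import torch
import torch.nn.functional as F

def photometric_consistency_loss(tgt_img, src_img, depth, pose, K):
    """L1 photometric loss between the target image and the source image
    warped into the target view with predicted depth and camera motion.

    tgt_img, src_img: (B, 3, H, W) images
    depth:            (B, 1, H, W) predicted depth of the target view
    pose:             (B, 4, 4) predicted relative pose, target -> source
    K:                (B, 3, 3) camera intrinsics
    """
    B, _, H, W = tgt_img.shape
    device = tgt_img.device

    # Pixel grid of the target view in homogeneous coordinates: (B, 3, H*W).
    ys, xs = torch.meshgrid(
        torch.arange(H, device=device, dtype=torch.float32),
        torch.arange(W, device=device, dtype=torch.float32),
        indexing="ij",
    )
    pix = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).view(1, 3, -1).expand(B, -1, -1)

    # Back-project pixels to 3D using depth, then move them into the source frame.
    cam_points = torch.inverse(K) @ pix * depth.view(B, 1, -1)                 # (B, 3, H*W)
    cam_points_h = torch.cat(
        [cam_points, torch.ones(B, 1, H * W, device=device)], dim=1)           # (B, 4, H*W)
    src_points = (pose @ cam_points_h)[:, :3]                                  # (B, 3, H*W)

    # Project into the source image and normalize coordinates to [-1, 1].
    src_pix = K @ src_points
    src_pix = src_pix[:, :2] / (src_pix[:, 2:3] + 1e-7)
    x_norm = 2.0 * src_pix[:, 0] / (W - 1) - 1.0
    y_norm = 2.0 * src_pix[:, 1] / (H - 1) - 1.0
    grid = torch.stack([x_norm, y_norm], dim=-1).view(B, H, W, 2)

    # Sample the source image at the projected locations and compare to the target.
    warped = F.grid_sample(src_img, grid, padding_mode="border", align_corners=False)
    return (tgt_img - warped).abs().mean()
```

Under the static-scene assumption this warp explains the whole image; pixels on independently moving objects violate it, which is the failure mode the abstract's RecNet fine-tuning stage is designed to correct.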
