Abstract

In a sensor system that records a rolling-shutter image stream together with a synchronized, higher-rate inertial measurement unit (IMU) stream, the intra-frame geometric skew produced by rolling-shutter readout is treated not as an artifact to be removed but as a usable, temporally structured motion signal. Because a rolling-shutter frame exposes its rows sequentially, the frame encodes camera and scene motion over the readout interval. When the frame is time-correlated with inertial data at sufficient resolution, the inertial stream can be used to estimate the camera motion during readout, which in turn supports two complementary representations of each frame: (1) an estimate of the stationary scene at a common reference time, as if captured with a global shutter; and (2) a residual representation of image motion not explained by the estimated camera trajectory, which tends to correspond to independently moving objects in the scene. This disclosure describes the capture configuration, the per-frame data record, the estimation method, the derived products usable as auxiliary supervision for machine-learning models, and the use of staged calibration sequences as part of a model's training distribution so that the visual-inertial relationship may be learned implicitly. A wide range of embodiments and variations is enumerated to make clear the breadth of the disclosed subject matter.
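The timing relationship described above can be illustrated with a minimal sketch. It is not the disclosed estimation method: it assumes uniform top-to-bottom readout, a gyroscope-only IMU stream, and a small-angle approximation in which angular rates are integrated per axis. The function names `row_timestamps` and `per_row_rotation` and all parameters are hypothetical, chosen only for illustration.

```python
import numpy as np

def row_timestamps(t_first_row, readout_time, n_rows):
    """Capture time of each image row, assuming uniform sequential readout."""
    return t_first_row + readout_time * np.arange(n_rows) / n_rows

def per_row_rotation(imu_t, gyro, row_t, t_ref):
    """Approximate rotation vector (rad) from time t_ref to each row time.

    imu_t : (N,) IMU sample times in seconds, increasing
    gyro  : (N, 3) angular rates in rad/s
    Small-angle assumption: rates are integrated independently per axis.
    """
    imu_t = np.asarray(imu_t, dtype=float)
    gyro = np.asarray(gyro, dtype=float)
    dt = np.diff(imu_t)
    # Cumulative trapezoidal integral of angular rate at each IMU sample.
    increments = 0.5 * (gyro[1:] + gyro[:-1]) * dt[:, None]
    cum = np.vstack([np.zeros(3), np.cumsum(increments, axis=0)])

    def angle_at(t):
        # Linearly interpolate the integrated angle at arbitrary times.
        return np.stack(
            [np.interp(t, imu_t, cum[:, k]) for k in range(3)], axis=-1
        )

    return angle_at(row_t) - angle_at(t_ref)

# Example: 1 kHz gyro, constant 1 rad/s about z, 4-row frame read out in 20 ms.
imu_t = np.arange(0.0, 0.051, 0.001)
gyro = np.tile([0.0, 0.0, 1.0], (imu_t.size, 1))
row_t = row_timestamps(t_first_row=0.010, readout_time=0.020, n_rows=4)
rot = per_row_rotation(imu_t, gyro, row_t, t_ref=0.020)
```

Under these assumptions, each row receives its own small rotation relative to the common reference time. A global-shutter-equivalent estimate would warp each row by the inverse of that rotation, and image motion that remains after the warp would be the residual attributed to independently moving objects.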

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
