COUPLED 2025

AI-assisted Robotic Endomicroscopy Tissue Scanning

  • Xu, Chi (Imperial College London)
  • Giannarou, Stamatia (Imperial College London)


Probe-based Confocal Laser Endomicroscopy (pCLE) enables direct visualisation of tissue morphology at microscopic scale in vivo and in situ. Recent studies have shown that pCLE has a role in characterising tissue intraoperatively to guide tumour resection [1]. To capture good-quality endomicroscopic information, which is a prerequisite for accurate diagnosis, it is important to maintain the probe within a micrometre-scale working range while keeping it perpendicular to the tissue surface. This can be achieved through micro-surgical robotic manipulation of the pCLE probe. The aim of this work is to develop Artificial Intelligence (AI) methods for the automatic estimation of the longitudinal distance and orientation of the probe with respect to the tissue surface, as required during robotic tissue scanning. In the literature, a blur metric-based method [2] has been proposed to approximate the location of the pCLE probe with respect to the tissue surface, but it does not regress their actual distance. Regarding the automatic estimation of the pCLE orientation, macroscopic vision has been used in [3] to recover the pose of the pCLE probe with respect to the tissue surface. This method cannot achieve the micrometre accuracy required to control the pCLE probe and is prone to errors in the presence of tissue deformation and varying lighting conditions. To address the above challenges, we developed SFFC-Net [4], the first approach to automatically regress the distance between the pCLE probe and the tissue surface by fusing data representations in the spatial and frequency domains. To incorporate robust image-based supervision and temporal information, a Generative Adversarial Network (GAN) and a Sequence Attention (SA) module [5] have been developed. Regarding the pCLE probe orientation, we proposed the Fast Fourier Vision Transformer (FF-ViT), the first microscopic vision method for automatic inference of the probe orientation. Our performance evaluation confirms the stable convergence of the distance regression and the strong generalisability of FF-ViT across out-of-distribution datasets. Both methods outperform state-of-the-art (SOTA) approaches on challenging datasets.
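
To make the spatial-frequency fusion idea behind the distance regression concrete, the PyTorch sketch below fuses convolutional features of a pCLE frame with features of its FFT log-magnitude spectrum and regresses a single probe-tissue distance. This is a minimal illustrative sketch only: the abstract does not describe SFFC-Net's actual architecture, so the class name, channel sizes, and fusion scheme are assumptions rather than the authors' implementation.

# Illustrative sketch only: SFFC-Net's architecture is not given in the abstract,
# so module names, channel sizes, and the fusion scheme here are assumptions.
import torch
import torch.nn as nn


class SpatialFrequencyFusionRegressor(nn.Module):
    """Toy regressor that fuses spatial CNN features with FFT-magnitude
    features of a pCLE frame and outputs one probe-tissue distance."""

    def __init__(self, in_channels: int = 1, feat: int = 32):
        super().__init__()
        # Spatial branch: plain convolutional encoder on the raw frame.
        self.spatial = nn.Sequential(
            nn.Conv2d(in_channels, feat, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Frequency branch: same encoder on the log-magnitude spectrum,
        # where defocus blur appears as attenuated high frequencies.
        self.frequency = nn.Sequential(
            nn.Conv2d(in_channels, feat, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(2 * feat, 64), nn.ReLU(),
            nn.Linear(64, 1),  # scalar distance (e.g. in micrometres)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Centred log-magnitude of the 2D FFT of each frame.
        spectrum = torch.fft.fftshift(torch.fft.fft2(x), dim=(-2, -1))
        log_mag = torch.log1p(spectrum.abs())
        fused = torch.cat([self.spatial(x), self.frequency(log_mag)], dim=1)
        return self.head(fused)


if __name__ == "__main__":
    frames = torch.rand(4, 1, 128, 128)                      # batch of pCLE frames
    print(SpatialFrequencyFusionRegressor()(frames).shape)   # torch.Size([4, 1])

The frequency branch is the key design choice: out-of-focus frames lose high-frequency content, so the spectrum carries distance cues that complement the spatial texture features.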
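
Likewise, the sketch below shows one plausible reading of a Fourier-based vision transformer for orientation inference: patch embedding followed by FNet-style Fourier token mixing and a head that regresses two tilt angles of the probe axis. FF-ViT's actual design is not specified in the abstract, so OrientationNet, the block structure, and the two-angle output are assumptions used only to illustrate the general idea.

# Illustrative sketch only: FF-ViT's exact design is not given in the abstract;
# the FNet-style Fourier mixing block and two-angle head below are assumptions.
import torch
import torch.nn as nn


class FourierMixerBlock(nn.Module):
    """Replaces self-attention with a 2D FFT over (tokens, channels),
    keeping only the real part, followed by a standard MLP."""

    def __init__(self, dim: int = 64):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        mixed = torch.fft.fft2(self.norm1(tokens)).real  # Fourier token mixing
        tokens = tokens + mixed
        return tokens + self.mlp(self.norm2(tokens))


class OrientationNet(nn.Module):
    """Patch-embeds a pCLE frame and regresses two tilt angles of the probe
    axis relative to the tissue-surface normal (hypothetical output)."""

    def __init__(self, dim: int = 64, patch: int = 16, depth: int = 4):
        super().__init__()
        self.embed = nn.Conv2d(1, dim, kernel_size=patch, stride=patch)
        self.blocks = nn.Sequential(*[FourierMixerBlock(dim) for _ in range(depth)])
        self.head = nn.Linear(dim, 2)  # (tilt_x, tilt_y), e.g. in radians

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        tokens = self.embed(x).flatten(2).transpose(1, 2)  # (B, N, dim)
        return self.head(self.blocks(tokens).mean(dim=1))


if __name__ == "__main__":
    print(OrientationNet()(torch.rand(2, 1, 128, 128)).shape)  # torch.Size([2, 2])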