The article focuses on developing an algorithm for automation based on stereo computer vision and dynamic registration in a surgical robotic context. The performance of the algorithm was further tested experimentally utilizing the block transfer task which corresponds to tissue manipulation as designed by Fundamentals of Laparoscopic Surgery (FLS). The surgical robotics field as a whole progresses towards the reduction of invasiveness limiting the trauma at the periphery of the surgical site and increase of semi-autonomous operation while positioning the surgeon as a decision maker rather than as an operator. The autonomous FLS task is implemented successfully and tested experimentally with the Raven II surgical robot system. The data indicate that the autonomous operational mode has better overall performance and limited tool-environment interaction compared with the human teleoperation mode. Surgeon’s intention may also be extracted from a database that may lead to seamless switching between the human operator and the autonomous system and in that sense, it may allow the autonomous algorithm to cope with more complex surgical environments.
When a surgical robotic system is introduced to the surgical scene two human-machine interfaces are established and define its primary operation: (1) the surgeon-robot interface (S-R) and (2) the patient-robot interface (R-P). Each has a unique set of requirements that dictates its design capabilities and functions. Figure 1 maps several commercial systems, research systems and systems during commercialization process that were classified based on these interfaces .
The S-R interface is defined by a wide spectrum of control levels provided to the surgeon over the surgical robotic system. Assuming that a certain level of control is required to complete a task, it can be distributed between the human operator and the robotic system at different ratios which in turn defines the level of automation allocated for the task. This level of automation is bounded by two extreme operational modes. The right hand side of the horizontal axis in Figure 1 describes a mode in which the surgical robotic system is fully autonomous [2–5]. The left hand side of the horizontal axis in Figure 1 describes a mode of operation in which any movement of the surgical robotic system is in direct response to a real time position/ orientation command input provided by the surgeon. The system architecture used to enable this approach is teleoperation, utilizing a master/ slave configuration. The master is defined as the surgical console and the slave serves as the surgical robot itself interacting with the patient's tissue through the surgical tools. [6–8].
The robot-patient (R-P) interface determines the level of invasiveness (vertical axis in Figure 1). The level of Invasiveness spectrum spans across a range of surgical approaches including (1) the invasive open-procedure approach, which requires a large incision to expose the targeted anatomy, (2) variations of minimally invasive surgical (MIS) approaches with a gradual reduction of invasiveness, such as multiple tools inserted through ports, natural orifice transluminal endoscopic surgery (NOTES), catheters [9–11] and needles [12,13] and (3) a noninvasive approach in which energy (radiation) is provided by an external source to a localized space to provide a localized therapy . As the level of invasiveness decreases, the level of manipulation also decreases and, as a result, the surgeon has fewer degrees of freedom to mechanically manipulate the tissues. The surgical robotics field as a whole progresses towards the reduction of invasiveness limiting the trauma at the periphery of the surgical site and increase of semi-autonomous operation while positioning the surgeon as a decision maker rather than as an operator.
The reported study is focused on developing an algorithm for automation based on stereo computer vision and dynamic registration in a surgical robotic context. The performance of the algorithm was further tested experimentally utilizing the block transfer task which corresponds to tissue manipulation as defined by Fundamentals of Laparoscopic Surgery (FLS) . The surgical task was performed autonomously by a surgical robot (Raven II) and then compared with the performance of a human teleoperating the same surgical robotic system.
Raven II (UCLA/UW/Applied Dexterity Inc.) was used as the surgical robotic system for experimentally evaluating the performance of surgical task both in an autonomous mode and in a teleoperation mode . A compact commercial stereo Point Grey Bumblebee2 camera (BB2-03S2C-38) was positioned 0.23m to 0.3m above the surgical site pointing down. This position and orientation allow to encapsulate all the surgical tools into the field of view while eliminating potential collision between the camera and the four surgical robotic arms (Figure 2a). The camera has image update rate of 48 FPS at full resolution of 640x480. A custom support for the camera in OpenCV was developed enabling the use of OpenCV as the primary tool for real time image processing. The stereo vision was used for surgical tool detection, surgical tool visual servoing and surgical environment perception.
Given the master/slave architecture of the system a block diagram of the software architecture (Figure 2b) depicts the corresponding two components. The slave components software consists of the robot low-level real time servo control software, in a teleoperation mode the surgeon generates the position and orientation command signals using the master. In particular the reference command information sent from master to slave consists of the incremental Cartesian positions, the absolute orientation transformation matrix, and the absolute tool joint angles. However in an autonomous mode the operator is replaced with an intelligent agent generating autonomously the same inputs to the surgical robotic system. The autonomous intelligent software component substituting the operator includes the following modules: (1) computer vision module, (2) task and path planning module, (3) visual servo module, (4) network module and human interface module. A UDP layer is used for the data communication between the two software components. Both manual and automatic switching are included in the software between the teleoperation mode and the autonomous mode.
Task Definition and Decomposition
The FLS are a set of tasks that are used widely and primarily as part of a curriculum for surgeon training in MIS and performance assessment tools. In addition the FLS tasks provide a standard platform for comparing performance of manual operation as well as various teleoperated surgical robot systems. The FLS block transfer is a task that simulates tissue manipulation. It may be also defined as a “pick and place” task in which a set of blocks mounted on pegs are picked with one MIS surgical tool, transferred to the other tools and placed on a new set of pegs one at a time.
The FLS task was further decomposed into subtasks and potential failure modes were identified. The FLS block transfer subtasks are (Figure 3): (1) Starting configuration - Three triangle objects are placed in three left pins; (2) Move tool from the initial position to the location of the block; (3) Pick a block from the left pin and place it in the right pin, and then repeat until all three left blocks are transferred to three right pins (4) Move tool back to the initial position. The failure modes are (1) Grasping Failure: failing to grasp the block or dropping the block during the grasping process (2) Transport Failure: dropping the block during the transportation between the pegs (3) Place Failure: Failing to place the block on the peg (4) Collision Failure: Collision or an application of a large force by the tool on the peg board that causes it to move.
Surgical Tool Detection
A high precision but low rate computer vision based method was developed to detect the position / orientation of the surgical tools in the camera frame and enhance the high rate but lower precision forward kinematics approach which is compromised due to the compliance of the cables incorporated into the mechanical system as well as the limited information regarding the exact position and orientations of the robotic arms’ bases.Markers detected by the computer vision were placed on several locations on the shaft of the tools away from the tip in a known location that is not occluded by the potential tool tip tissue interaction. Forward kinemics which is limited to the last two DOF was then used to estimate the tool tip position and orientation. Figure 4 depicts the Point A and B that were detected in 3D by the computer vision system which led to the estimation of point C marking the tool tip.
Each triangular block is placed on a peg with a random rotation angle and with a non-coincided axis with respect to the peg. A dynamic real time algorithm was developed to detect the location and orientations of the blocks and the pegs–necessary information for the path planning. A dynamics real time algorithm is required to deal with potential changes in the environment. In the current experimental setup the environment may change due to collision between the tool and the blocks or the peg board. However in a clinical setting the surgical site is constantly subject to change due to tissue manipulations, dissections, and suturing.
Tool-Environment Collision Detection
Given the lack of force sensor incorporated into the tools substantial tool/environment collision can be detected by a significant translation or deformation of the environment or the tools. In the context of the experimental setup substantial collision is defined by a movement of the pegboard. Such a collision triggered the dynamic registration and facilitated uninterrupted completion of the task autonomously. Furthermore, collision that led to pegboard displacement was also used to quantify as an error for performance evaluation. In a clinical setting fiducial markers or key anatomical structures pointed by laser dots may be used for detecting a significant change on the operational field.
Visual Servo Control of the Surgical Robot
A hybrid Cartesian based visual servo approach was developed to mediate the requirement to update the control loop at a rate of 1 KHz for stable operation and the visual performance rate of maximum 48FPS and visual image processing rate of 25 Hz.
The Raven II robot (slave) is controlled with its low level joint controllers at a 1 KHz rate. The visual servo running at a rate of 25 Hz was incorporated into the automation algorithms controlling the master to provide delta Cartesian position commands (X,Y,Z), which are calculated as the difference between the desired position based on the planned trajectory and the actual position and orientation of the tip as acquired by the stereo camera. The proportional gains of the visual servo controller were selected experimentally to achieve fast and stable response. The visual servo control error at steady state along each axis is within 0.4 mm as measured in 3D camera frame.
Task Planning and State Machine
In order to automate the task, the FLS block transfer subtasks were decomposed into nine states defined in Table I and formed a state machine repeating state 1 to 8 three times for transferring the three blocks and terminating the process in state 9. An internal verification mechanism was used to check the completion of each state prior to every switch to the following state. An internal error correction mechanism was incorporated to correct for potential failures within each state and potentially moving to a different state to recover from the potential error. If the failure is not recoverable, such as object is dropped out of camera view, then the state machine will continue to next subtask cycle to transfer the other remaining objects.
Generic path is predefined offline for each state. However the actual 3D path points are dynamically generated to accommodate changes in the operational environment as detected by vision system, such that the path is adjusted in real time. The speed limits were set to 10 mm/s for high precision manipulation and to 30 mm/s for low precision translation.
The block transfer task was completed 20 times (60 block transfers in each mode) in the following modes (1) Autonomous operation (2) Teleoperation by a human subject. Robotic arm kinematics, tool tip trajectory, task completion time, peg board marker motion trajectories and videos from the stereo camera and webcam in teleoperation were all recorded and collected for off line analysis.
Table II summaries the performance difference between the two modes of operation in terms of goal completion success rate, performance measures and safety measure.
Task Goal Achievement
The success rate for block grasping task is 100% in both modes. Although a grasping force sensor is not incorporated into the current design of the tool, the accurate tool tracking and object detection makes the grasping a success in the autonomous mode of operation. During the block transportation the block grasping success rate is again 100% in both modes.
For transporting grasped objects from one location to another location without dropping the blocks, the success rate is 100%. The slight decrease in the success rate of the block placement of 96.7% is accounted for in 4 cases out of 60 in which the blocks didn’t fully drop to the base of the peg as a result of small misalignment between the center hole of the block and the peg. The success rate of the human teleoperation mode is as previously 100%.
Task Completion Time
For autonomous operation, the completion time is identical for all the repetitions (25 s). It takes a human operator about two times longer to complete the task (49 ± 5.7) with a standard deviation of about 5% in a teleoperation mode.
Surgical Tool Tip Trajectory
The surgical tool tip trajectory is used to analyze the efficiency of the motion. Given a specific task, a shorter trajectory is also perceived as a more effective trajectory with a lower potential for collisions. The average tool tip trajectory length in the autonomous mode is about 60% shorter than the tool tip trajectory during the human teleoperation mode as depicted in Figure 5. As indicated graphically the trajectories of the autonomous mode tend to be straight with smooth transitions between the individual segments. However the trajectories of the tip during the human teleoperation mode are composed of arches which are typically longer than the other mode. The arch like trajectories aim to clear the block from the array of pegs in an attempt to avoid potential collisions which in turn leads to longer trajectories and completion time.
Tool-Object Interaction / Environment Collision
The motion of the peg board is a result of force generated as the tool or the object interact with the pegs. The peg board motion is measured to evaluate the tool/environment interaction and defined as the sum of the trajectories of the four markers of the pegboard. Smaller peg board motion means fewer tool environment collisions and a smaller interaction force that may potentially damage the patient's tissue. The pegboard trajectory as a result of collision with the tool during teleoperation mode was more than 12 times longer than the autonomous mode, meaning that during the autonomous operation the collision between the tool and the environment is significantly lower than during the other mode. The trajectories of the markers are depicted in Figure 5 as well surrounding the trajectory of the tool tip.
Conclusion and Future Works
As part of the reported research effort a fully autonomous algorithm was developed for a block transfer of the FLS simulating surgical tissue manipulation in a surgical robotic MIS setting. The algorithm for the autonomous operation is composed of stereo vision based surgical tool detection, surgical tool visual servo control, pegboard environment detection and object detection. The FLS peg transfer task is decomposed into a state machine for task planning and path planning.
The autonomous FLS task is implemented successfully and tested experimentally with the Raven II surgical robot system. The data indicate that the autonomous operational mode has better overall performance and limited tool-environment interaction compared with the human teleoperation mode. In addition the proposed computer vision based automation approach doesn’t need the typical precise calibration of the robot arms since the autonomous agent software functions as an intelligent teleoperation master that is independent of the low level robot control system and can potentially be applied to different surgical robot systems.
Since the FLS peg transfer task includes the basic surgical skills and subtasks that are common in other surgical tasks it is likely that the proposed approach can be applied to the rest of the FLS tasks as well as to other surgical procedures’ subtasks. One should note that the goal of autonomous mode is not to replace the surgeon but to remove the surgeon from his or her role as an executer of every single motion of the robotic system to the position of a decision maker. A potential expansion of the reported research is the use of trajectories learned from expert surgeon (learn by demonstration) as a substitution for artificially generated trajectories and speed patterns. Furthermore, surgeon's intention may also be extracted from a database [17–19] that may lead to seamless switching between the human operator and the autonomous system [20-21] and in that sense it may allow the autonomous algorithm to cope with more complex surgical environments.
This work is supported by the U.S. National Science Foundation award IIS-1227184: Multilateral Manipulation by Human-Robot.