Troubleshooting and Debugging
==============================

This page covers common failure modes across every RoboFlock subsystem and the
commands needed to diagnose them.

----

Quick Diagnostic Checklist
---------------------------

Run these commands over SSH first. They answer the most common questions in
under 30 seconds.

.. code-block:: bash

   # 1. Are the nodes running?
   ros2 node list

   # 2. Are the topics publishing?
   ros2 topic list

   # 3. Is cmd_vel getting values?
   ros2 topic echo /cmd_vel

   # 4. Is the LiDAR spinning?
   ros2 topic hz /scan  # should be ~10 Hz

   # 5. What is the service doing?
   sudo systemctl status robot.service

   # 6. Recent service logs
   journalctl -u robot.service -n 50 --no-pager

   # 7. Are all USB devices assigned?
   ls /dev/rplidar /dev/odesc_* /dev/hc12 /dev/gps

   # 8. Jetson resource usage
   tegrastats

----

Environment / ROS2 Discovery
------------------------------

**Symptom:** ``ros2 node list`` returns nothing over SSH.

This is almost always a ``ROS_LOCALHOST_ONLY`` mismatch between the running
service and your SSH terminal.

.. code-block:: bash

   # Check what the service process sees
   sudo cat /proc/$(pgrep -f self_destruct | head -1)/environ \
     | tr '\0' '\n' | grep -E "ROS|DOMAIN|LOCALHOST"

   # Check your terminal
   env | grep -E "ROS|DOMAIN|LOCALHOST"

If ``ROS_LOCALHOST_ONLY`` differs between the two, fix your terminal:

.. code-block:: bash

   export ROS_LOCALHOST_ONLY=0
   ros2 daemon stop
   ros2 daemon start
   ros2 node list

To make this permanent, add the export to ``~/.bashrc``:

.. code-block:: bash

   echo "export ROS_LOCALHOST_ONLY=0" >> ~/.bashrc
   source ~/.bashrc

----

robot.service
-------------

**Symptom:** Service shows ``failed`` or ``inactive``.

.. code-block:: bash

   sudo systemctl status robot.service
   journalctl -u robot.service -n 100 --no-pager

Common causes:

.. list-table::
   :widths: 40 60
   :header-rows: 1

   * - Log message
     - Fix
   * - ``source: No such file or directory`` on ``install/setup.bash``
     - Package was never built. Run ``colcon build --packages-select bring_up``,
       then restart the service.
   * - ``PermissionError: [Errno 13] Permission denied: 'log'``
     - Service is running as root. Add ``User=roboflock`` to the ``[Service]``
       section of ``/etc/systemd/system/robot.service``, run
       ``sudo systemctl daemon-reload``, then restart the service.
   * - ``[ERROR] device or resource busy``
     - USB device claimed by another process. Run ``sudo fuser /dev/rplidar``
       to find the process holding the device, then stop that process.
   * - Service exits immediately
     - The last ``ros2 run`` command failed. Run every node except the last in
       the background (``&``) in ``bring_up.sh``.

----

Controller / Teleop
--------------------

**Symptom:** Robot does not move when R2 is pressed.

The most common cause is that R1 (the dead-man switch) is not being held. R1
must be held continuously for any motion command to be sent.

.. code-block:: bash

   # Watch raw joy input
   ros2 topic echo /joy

Press each button and confirm the correct ``axes[]`` or ``buttons[]`` index
changes.

**Symptom:** Controller will not pair (light bar keeps flashing).

.. code-block:: bash

   # Restart the Bluetooth service
   sudo systemctl restart bluetooth
   # Wait 5 seconds, then press PS button on controller

**Symptom:** ``joy_node`` is running but ``/joy`` topic is empty.

.. code-block:: bash

   ls /dev/input/js*            # controller should appear here after pairing
   ros2 param get /joy_node dev # confirm device path

If no ``js*`` device appears, the OS has not recognised the controller. Try:

.. code-block:: bash

   sudo dmesg | tail -20  # look for Bluetooth HID events

----

ODESC Motor Controllers
-----------------------

**Symptom:** One or more wheels not spinning.

.. code-block:: bash

   # Check USB devices
   ls /dev/odesc_*  # should show 4 entries

   # Check which ODESC UIDs are present
   udevadm info /dev/odesc_0

If a device is missing, the ODESC may not have initialised. Disconnect and
reconnect its USB cable, then restart the service.

**Symptom:** ``/cmd_vel`` publishes but wheels do not respond.

The ODESC driver node may have crashed silently. Check:

.. code-block:: bash

   ros2 node list | grep odesc

If the node is absent, restart ``robot.service``. If it keeps crashing, check
``journalctl -u robot.service`` for ODESC-specific errors.

----

LiDAR (RPLIDAR A1)
------------------

**Symptom:** RPLIDAR motor does not spin on boot.

.. code-block:: bash

   ls /dev/rplidar      # should exist
   ros2 topic hz /scan  # should be ~10 Hz

If ``/dev/rplidar`` does not exist:

.. code-block:: bash

   dmesg | grep -i "cp210x\|ch34\|rplidar"  # look for USB serial events

Re-seat the USB cable. If the device still does not appear, try a different
USB port to rule out a faulty port or cable.

**Symptom:** ``/scan`` publishes but costmap is empty in RViz.

.. code-block:: bash

   ros2 topic echo /scan --field header.frame_id

The ``frame_id`` must exactly match the child link of the LiDAR joint in your
URDF. If they differ (e.g. ``/scan`` says ``laser_frame`` but the URDF says
``laser``), either rename the child link in the URDF or set ``frame_id`` in the
``rplidar_ros`` node parameters so that the two names agree.

----

GPS / Beacon Tracking
----------------------

**Symptom:** ``/beacon_gps`` topic is empty.

.. code-block:: bash

   ls /dev/hc12  # HC-12 device must exist
   ros2 topic echo /beacon_gps

If ``/dev/hc12`` is missing:

.. code-block:: bash

   dmesg | grep -i "cp210x\|ch34"  # HC-12 uses a CH340 or CP2102 chip

Check that the beacon HC-12 is powered and within range (typically < 100 m
line-of-sight). The HC-12 TX LED on the beacon should blink at 1 Hz when
transmitting.

**Symptom:** Robot GPS (``/fix``) has no fix.

.. code-block:: bash

   ros2 topic echo /fix

Check ``status.status``. A value of ``-1`` means no fix. Move to an area with a
clear view of the sky and wait up to 90 seconds for the first fix.

----

Nav2
----

**Symptom:** Nav2 refuses to accept goals.

.. code-block:: bash

   ros2 topic echo /initialpose  # check initial pose is set

   # Clear both costmaps (standard Nav2 service names)
   ros2 service call /global_costmap/clear_entirely_global_costmap nav2_msgs/srv/ClearEntireCostmap
   ros2 service call /local_costmap/clear_entirely_local_costmap nav2_msgs/srv/ClearEntireCostmap

Check that the Nav2 lifecycle nodes are active:

.. code-block:: bash

   ros2 lifecycle nodes                   # list all lifecycle nodes
   ros2 lifecycle get /controller_server  # print the current state of one node

All nodes should be in the ``active`` state. If any are in ``unconfigured``,
the params file likely has a YAML error. Validate it with:

.. code-block:: bash

   python3 -c "import yaml; yaml.safe_load(open('nav2_params.yaml'))"

**Symptom:** Robot reaches goal area but circles endlessly.

Increase goal tolerances in ``nav2_params.yaml``:

.. code-block:: yaml

   goal_checker:
     plugin: "nav2_controller::SimpleGoalChecker"
     xy_goal_tolerance: 0.5
     yaw_goal_tolerance: 0.785

----

Viewing the TF Tree
--------------------

A broken TF tree is behind most Nav2 planning failures.

.. code-block:: bash

   # Save TF tree to PDF (some ROS 2 distros add a timestamp to the file name)
   ros2 run tf2_tools view_frames
   evince frames*.pdf

   # Live TF monitor
   ros2 run tf2_ros tf2_monitor

   # Check specific transform
   ros2 run tf2_ros tf2_echo base_link laser

If ``map → odom → base_link`` is not present, Nav2 cannot plan. Ensure
``robot_state_publisher`` is running with a valid URDF and that your odometry
or GPS localisation node is publishing to ``/tf``.

----

Useful One-Liners
-----------------

.. code-block:: bash

   # All topics with type and publisher count
   ros2 topic list -v

   # Node graph (text)
   ros2 node list && ros2 topic list

   # Shut down a specific lifecycle node cleanly
   ros2 lifecycle set /controller_server shutdown

   # Replay a rosbag for debugging (record first)
   ros2 bag record -o my_bag /scan /cmd_vel /fix /joy
   ros2 bag play my_bag

   # Check Jetson CPU throttling
   sudo cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq

   # Reload the udev rules and re-apply them to connected devices
   sudo udevadm control --reload-rules
   sudo udevadm trigger

----

Further Reading
---------------

- ROS 2 command line tools reference
- Debugging with ros2 topic / ros2 node
- Nav2 troubleshooting guide
- tf2 debugging
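
----

Scripted Device Check
---------------------

The USB device check from the Quick Diagnostic Checklist can be wrapped in a
small shell function that reports each path separately, which is easier to read
than a single ``ls`` error. This is a minimal sketch and not part of the
RoboFlock tooling: the function name ``check_devices`` is hypothetical, and the
device paths are passed in as arguments rather than hard-coded.

.. code-block:: bash

   #!/usr/bin/env bash

   # check_devices (hypothetical helper): print OK or MISSING for every path
   # given as an argument. Returns 0 when all paths exist, 1 otherwise.
   check_devices() {
       local missing=0
       local dev
       for dev in "$@"; do
           if [ -e "$dev" ]; then
               printf 'OK       %s\n' "$dev"
           else
               printf 'MISSING  %s\n' "$dev"
               missing=$((missing + 1))
           fi
       done
       # The exit status of this test becomes the function's return value.
       [ "$missing" -eq 0 ]
   }

For example, ``check_devices /dev/rplidar /dev/odesc_* /dev/hc12 /dev/gps``
prints one line per path and returns a non-zero status when any path is
missing. With default bash settings, a glob that matches nothing is passed
through literally and reported as ``MISSING``.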