Troubleshooting and Debugging

This page covers common failure modes across every RoboFlock subsystem and the commands needed to diagnose them.


Quick Diagnostic Checklist

Run these commands over SSH first. They answer the most common questions in under 30 seconds.

# 1. Are the nodes running?
ros2 node list

# 2. Do the expected topics exist?
ros2 topic list

# 3. Is cmd_vel getting values?
ros2 topic echo /cmd_vel

# 4. Is the LiDAR spinning?
ros2 topic hz /scan        # should be ~10 Hz

# 5. What is the service doing?
sudo systemctl status robot.service

# 6. Recent service logs
journalctl -u robot.service -n 50 --no-pager

# 7. Are all USB devices assigned?
ls /dev/rplidar /dev/odesc_* /dev/hc12 /dev/gps

# 8. Jetson resource usage
tegrastats
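
Step 7 of the checklist can be wrapped in a small helper so that a missing device stands out instead of being buried in ls error output. This is a minimal sketch: the function name report_missing is an illustration and not part of RoboFlock, while the device paths are the ones listed above.

```bash
#!/usr/bin/env bash
# report_missing PATH...: print every path that does not exist and
# return non-zero if at least one is missing.
# (Hypothetical helper name, shown for illustration only.)
report_missing() {
  local path missing=0
  for path in "$@"; do
    if [[ ! -e "$path" ]]; then
      echo "MISSING: $path"
      missing=1
    fi
  done
  return "$missing"
}

# Usage on the robot:
#   report_missing /dev/rplidar /dev/odesc_* /dev/hc12 /dev/gps
```

If no ODESC device exists, the unexpanded pattern /dev/odesc_* is itself reported as missing, which is the behaviour you want here.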

Environment / ROS2 Discovery

Symptom: ros2 node list returns nothing after SSH.

This is almost always a ROS_LOCALHOST_ONLY mismatch between the running service and your SSH terminal.

# Check what the service process sees
sudo cat /proc/$(pgrep -f self_destruct | head -1)/environ \
  | tr '\0' '\n' | grep -E "ROS|DOMAIN|LOCALHOST"

# Check your terminal
env | grep -E "ROS|DOMAIN|LOCALHOST"

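If ROS_LOCALHOST_ONLY differs between the two, set your terminal to the value the service uses (0 in this example) and restart the ROS 2 daemon so it picks up the change: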

export ROS_LOCALHOST_ONLY=0
ros2 daemon stop
ros2 daemon start
ros2 node list

To make this permanent, add the export to ~/.bashrc:

echo "export ROS_LOCALHOST_ONLY=0" >> ~/.bashrc
source ~/.bashrc
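
The service-versus-terminal comparison described in this section can also be scripted. The sketch below assumes you have first saved the service environment to a file, for example with sudo cat /proc/<pid>/environ > /tmp/service.environ. The function names and that file path are illustrative, and including RMW_ variables is an extra assumption beyond what the page greps for.

```bash
#!/usr/bin/env bash
# ros_env FILE: print the ROS-related variables from a NUL-separated
# environ dump (the format of /proc/<pid>/environ), sorted for comparison.
ros_env() {
  tr '\0' '\n' < "$1" | grep -E '^(ROS_|RMW_)' | sort
}

# compare_ros_env FILE: compare the variables in FILE with the current shell.
# Returns 0 on a match, 1 (and prints both sides) on a mismatch.
compare_ros_env() {
  local service_env shell_env
  service_env=$(ros_env "$1")
  shell_env=$(env | grep -E '^(ROS_|RMW_)' | sort)
  if [ "$service_env" = "$shell_env" ]; then
    echo "ROS environment matches"
  else
    echo "--- service";    echo "$service_env"
    echo "--- this shell"; echo "$shell_env"
    return 1
  fi
}

# Usage:
#   compare_ros_env /tmp/service.environ
```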

robot.service

Symptom: Service shows failed or inactive.

sudo systemctl status robot.service
journalctl -u robot.service -n 100 --no-pager

Common causes:

| Log message | Fix |
| --- | --- |
| source: No such file or directory on install/setup.bash | The package was never built. Run colcon build --packages-select bring_up, then restart the service. |
| PermissionError: [Errno 13] Permission denied: 'log' | The service is running as root. Add User=roboflock to the [Service] section of /etc/systemd/system/robot.service, run sudo systemctl daemon-reload, then restart the service. |
| [ERROR] device or resource busy | The USB device is claimed by another process. Run sudo fuser -v /dev/rplidar to identify it, then stop that process (sudo fuser -k /dev/rplidar kills it). |
| Service exits immediately | The last ros2 run command failed. In bring_up.sh, run every node except the last in the background (&); the last node must stay in the foreground to keep the service alive. |
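
The "background all but the last node" fix can be pictured with the sketch below. It is not the actual bring_up.sh. The helper launch_nodes and the placeholder commands are assumptions made for illustration, and cleaning up the background nodes when the foreground one exits is a design choice added here.

```bash
#!/usr/bin/env bash
# launch_nodes CMD...: start every command except the last in the
# background, run the last one in the foreground, then stop the
# background jobs once the foreground command exits.
# (Hypothetical helper; adapt the commands to your own bring_up.sh.)
launch_nodes() {
  local pids=() last="${!#}" cmd
  for cmd in "${@:1:$#-1}"; do
    bash -c "$cmd" &          # background node
    pids+=("$!")
  done
  bash -c "$last"             # foreground node keeps the script alive
  local status=$?
  if ((${#pids[@]} > 0)); then
    kill "${pids[@]}" 2>/dev/null
  fi
  wait                        # reap the background jobs
  return "$status"
}

# Usage with placeholder package and node names:
#   launch_nodes \
#     "ros2 run <pkg_a> <node_a>" \
#     "ros2 run <pkg_b> <node_b>"
```

Because the function returns the exit status of the foreground command, systemd still sees a failure when the last node crashes.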


Controller / Teleop

Symptom: Robot does not move when R2 is pressed.

The most common cause is R1 (dead-man switch) not being held. R1 must be held continuously for any motion to be sent.

# Watch raw joy input
ros2 topic echo /joy

Press each button and confirm the correct axes[] or buttons[] index changes.

Symptom: Controller will not pair (light bar keeps flashing).

# Restart the Bluetooth service
sudo systemctl restart bluetooth
# Wait 5 seconds, then press PS button on controller

Symptom: joy_node is running but /joy topic is empty.

ls /dev/input/js*        # controller should appear here after pairing
ros2 param get /joy_node dev   # confirm device path

If no js* device appears, the OS has not recognised the controller. Try:

sudo dmesg | tail -20    # look for Bluetooth HID events

ODESC Motor Controllers

Symptom: One or more wheels not spinning.

# Check USB devices
ls /dev/odesc_*          # should show 4 entries

# Check which ODESC UIDs are present
udevadm info /dev/odesc_0

If a device is missing, the ODESC may not have initialised. Disconnect and reconnect its USB cable, then restart the service.

Symptom: /cmd_vel publishes but wheels do not respond.

The ODESC driver node may have crashed silently. Check:

ros2 node list | grep odesc

If absent, restart robot.service. If it keeps crashing, check journalctl -u robot.service for ODESC-specific errors.


LiDAR (RPLIDAR A1)

Symptom: RPLIDAR motor does not spin on boot.

ls /dev/rplidar           # should exist
ros2 topic hz /scan       # should be ~10 Hz

If /dev/rplidar does not exist:

sudo dmesg | grep -i "cp210x\|ch34\|rplidar"   # look for USB serial events

Re-seat the USB cable. If the device still does not appear, try a different USB port and then a different cable.
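
The "~10 Hz" expectation for /scan can be checked automatically by parsing the "average rate:" line that ros2 topic hz prints. This is a sketch only: the helper name rate_near and the tolerance of 1 Hz are assumptions, not RoboFlock requirements.

```bash
#!/usr/bin/env bash
# rate_near TARGET TOLERANCE: read `ros2 topic hz` output on stdin and
# succeed if the last reported average rate is within TOLERANCE of TARGET.
# Exit codes: 0 = in range, 1 = out of range, 2 = no rate seen.
rate_near() {
  awk -v target="$1" -v tol="$2" '
    /average rate:/ { rate = $3; seen = 1 }
    END {
      if (!seen) { print "no rate reported"; exit 2 }
      diff = rate - target
      if (diff < 0) diff = -diff
      printf "average rate: %.2f Hz\n", rate
      exit ((diff <= tol) ? 0 : 1)
    }'
}

# Usage (PYTHONUNBUFFERED may be needed so output reaches the pipe
# before timeout stops the command):
#   PYTHONUNBUFFERED=1 timeout -s INT 5 ros2 topic hz /scan | rate_near 10 1
```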

Symptom: /scan publishes but costmap is empty in RViz.

ros2 topic echo /scan --field header.frame_id

The frame_id must exactly match the child link of the LiDAR joint in your URDF. If they differ (e.g. /scan says laser_frame but the URDF says laser), either rename the URDF link to laser_frame or set frame_id: laser in the rplidar_ros node parameters.


GPS / Beacon Tracking

Symptom: /beacon_gps topic is empty.

ls /dev/hc12              # HC-12 device must exist
ros2 topic echo /beacon_gps

If /dev/hc12 is missing:

sudo dmesg | grep -i "cp210x\|ch34"   # HC-12 uses a CH340 or CP2102 chip

Check that the beacon HC-12 is powered and within range (typically < 100 m line-of-sight). The HC-12 TX LED on the beacon should blink at 1 Hz when transmitting.

Symptom: Robot GPS (/fix) has no fix.

ros2 topic echo /fix

Check status.status. A value of -1 means no fix; 0 or higher means a fix is available. If there is no fix, move to an area with a clear view of the sky and wait up to 90 seconds for the first fix.
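
For reference, the status.status codes come from sensor_msgs/NavSatStatus. The small lookup below is a sketch; the helper name fix_status_name is made up for illustration.

```bash
#!/usr/bin/env bash
# fix_status_name CODE: translate a sensor_msgs/NavSatStatus status code
# into a readable description. Returns 1 for codes it does not know.
fix_status_name() {
  case "$1" in
    -1) echo "no fix" ;;
    0)  echo "unaugmented fix" ;;
    1)  echo "fix with satellite-based augmentation (SBAS)" ;;
    2)  echo "fix with ground-based augmentation (GBAS)" ;;
    *)  echo "unknown status: $1"; return 1 ;;
  esac
}

# Usage (assumes your ros2 CLI supports --once and --field):
#   fix_status_name "$(ros2 topic echo /fix --once --field status.status | head -n 1)"
```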



Viewing the TF Tree

A broken TF tree is behind most Nav2 planning failures.

# Save TF tree to PDF (newer ROS 2 releases add a timestamp to the file name)
ros2 run tf2_tools view_frames
evince frames*.pdf

# Live TF monitor
ros2 run tf2_ros tf2_monitor

# Check specific transform
ros2 run tf2_ros tf2_echo base_link laser

If the chain map → odom → base_link is not present, Nav2 cannot plan. Ensure robot_state_publisher is running with a valid URDF and that your odometry or GPS localisation node is publishing to /tf.


Useful One-Liners

# All topics with type and publisher count
ros2 topic list -v

# Node graph (text)
ros2 node list && ros2 topic list

# Shut down a lifecycle node cleanly (works only for managed nodes such as Nav2's controller_server)
ros2 lifecycle set /controller_server shutdown

# Replay a rosbag for debugging (record first)
ros2 bag record -o my_bag /scan /cmd_vel /fix /joy
ros2 bag play my_bag

# Check Jetson CPU throttling
sudo cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq

# Reload the udev rules and re-apply them to connected devices
sudo udevadm control --reload-rules
sudo udevadm trigger
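
The CPU throttling one-liner above prints raw frequencies, which are hard to judge on their own. The sketch below compares each core's current frequency with its configured maximum. The helper name throttled_cpus is an assumption. A core below its maximum is normal at idle with a dynamic governor, so treat the output as a hint only when the robot is under load.

```bash
#!/usr/bin/env bash
# throttled_cpus [BASE]: list CPUs whose current frequency is below their
# configured maximum. BASE defaults to the standard sysfs location and is
# a parameter only so the function can be pointed at a test directory.
throttled_cpus() {
  local base="${1:-/sys/devices/system/cpu}" dir cur max
  for dir in "$base"/cpu[0-9]*/cpufreq; do
    [[ -r "$dir/scaling_cur_freq" && -r "$dir/scaling_max_freq" ]] || continue
    cur=$(<"$dir/scaling_cur_freq")   # values are in kHz
    max=$(<"$dir/scaling_max_freq")
    if (( cur < max )); then
      echo "$(basename "$(dirname "$dir")"): ${cur} kHz of ${max} kHz"
    fi
  done
}

# Usage:
#   throttled_cpus
```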

Further Reading