DJI Osmo/Zenmuse X3 & X5 ArduPilot UAV Integration

They might be a bit old now, but the DJI Zenmuse X3/X5 can still be an attractive option for a UAV camera. Originally used with the Osmo handle or the Inspire 1, they feature 3-axis stabilisation, sensor sizes up to 4/3 with interchangeable lenses (on the X5), and a nice quick-release mechanism for the whole unit. The catch? They use a proprietary connector and a communications protocol that isn't well documented.

The hope is to release a Lua driver for ArduPilot that can interface with the gimbal to control it, and to use CBU's OpenWRT router to forward the video feed to a ground station. This blog is mostly split into three parts: The Hardware, The Control Protocol and The Video Feed. The test hardware is a DJI Osmo handle with either the X3 or the X5 (with adapter).

Current resources:

https://www.rcgroups.com/forums/showthread.php?2439038-Hacking-Inspire-camera-and-gimbal

https://www.rcgroups.com/forums/showthread.php?2834744-DJI-OSMO-FOCUS-RONIN-etc-CAN-BUS-protocol-investigations

http://www.g0l.ru/blog/n4101

https://b3yond.d3vl.com/duml

https://github.com/strazzere/duml-packet

https://github.com/o-gs/dji-firmware-tools/tree/master/comm_dissector

https://docs.rs-online.com/da3c/A700000006655609.pdf


The Hardware:

This section is a general area to cover the physical mounting hardware, connectors, circuit layout, locking rings etc. Low detail unless relevant.

DJI Osmo (handle):

The handle has an NXP IC along with some other supporting componentry. Pictured is the back side of the joystick, which is where most of the useful test pads are.

DJI Zenmuse X3 (black, non-zoom):

Nothing very interesting here: the gimbal connects directly to the handle.

DJI Zenmuse X5:

The X5 is more interesting. There are reports that it works directly with the Osmo handle (with a modified lower locking ring), but mine came with the X5 Pro adapter, which moves the gimbal away from the handle and also adds a connector for accessories such as a focus wheel. Internally the adapter contains two GL850G USB hub ICs, which I assume allow a direct USB connection to a mobile device instead of the Osmo handle's normal Wi-Fi link.

Connector & Breakout

The main connector between the Osmo handle (or Inspire) and the gimbal is, I think, an FSI-110-3-G-D-AD.
https://docs.rs-online.com/da3c/A700000006655609.pdf

The same connector comes in different heights; the 10 mm version gives the most clearance if needed.




Control Protocol:

A deep dive into DJI's internal protocol, and reverse engineering it for our own gains. Work in progress.

CAN

Overall, the gimbal uses a CAN bus to exchange information between sub-components. I have been using a "CANable" adapter connected to the CAN High and CAN Low pads in the Osmo handle. This lets me sniff traffic, but it won't let me send commands of my own because they would conflict with the commands from the main handle. That is no problem to start with. The CAN messages are sent at a bit rate of 1 Mbit/s (1,000,000 bps). The following assumes some basic knowledge of typical CAN bus systems and skips some of the more boring parts.
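As a rough illustration of the passive sniffing setup, here is a minimal Python sketch. It assumes the adapter shows up as a SocketCAN interface on Linux (named can0 here, which is my assumption) and that the third-party python-can package is installed. The interface would be brought up beforehand with something like "ip link set can0 type can bitrate 1000000 listen-only on" followed by "ip link set can0 up". Whether listen-only mode is available depends on the adapter firmware. The helper names sniff and open_canable are made up for this example.

```python
from typing import Iterator, Optional, Tuple


def sniff(bus, wanted_id: Optional[int] = None, max_frames: int = 100,
          timeout: float = 1.0) -> Iterator[Tuple[float, int, bytes]]:
    """Yield (timestamp, CAN ID, payload) for received frames. Never transmits.

    `bus` is anything with a python-can style recv(timeout) method.
    """
    seen = 0
    while seen < max_frames:
        msg = bus.recv(timeout=timeout)
        if msg is None:
            # recv() timed out: the bus went quiet, so stop.
            break
        if wanted_id is not None and msg.arbitration_id != wanted_id:
            continue  # frame from a sender we are not interested in
        seen += 1
        yield msg.timestamp, msg.arbitration_id, bytes(msg.data)


def open_canable(channel: str = "can0"):
    """Open a SocketCAN interface via python-can (channel name is an assumption)."""
    import can  # third-party package: pip install python-can
    return can.interface.Bus(channel=channel, interface="socketcan")
```

With the hardware attached, list(sniff(open_canable(), wanted_id=0x413)) would collect up to 100 frames from the handle's CAN ID without writing anything to the bus.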

After a bit of research and digging around trying to work out the meaning of the payloads, it turns out to be fairly simple. DJI uses an internal protocol nicknamed DJI Universal Mark-up Language (or DUML) for most of its devices of a given era. These messages start with a leading byte of 0x55, followed by the length of the message and some identifiers that we will get into later. The Osmo/X3/X5 all seem to use this protocol; the messages are just spread over multiple CAN frames because of the 8-byte payload limit of classic CAN. Other products use the same DUML protocol over UART (Phantom 2) or over the network (DJI Mavic, USB as Ethernet).

DJIOsmoCANLog.csv

For example, taking a log of the communication and filtering it down to the sender CAN ID 0x413 (the Osmo handle) gives a payload stream as highlighted below. It looks a lot like each of those blocks is its own message, just spread over multiple payloads!
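The reassembly idea can be sketched in Python as below. This is only a sketch under assumptions taken from my reading of the public DUML tools listed under Current resources: the length is the low 10 bits of bytes 1 and 2 (little-endian), byte 3 is a CRC-8 of the first three bytes (reflected polynomial 0x8C, seed 0x77), and a packet is at least 13 bytes long. The CRC-8 parameters do reproduce the fourth byte (0xe4) of the example packet used later in this post. The function names are made up for illustration.

```python
from typing import Iterable, Iterator

SOF = 0x55           # DUML start byte
MIN_PACKET_LEN = 13  # assumed: 11 header bytes + 2 trailing checksum bytes


def crc8_duml(data: bytes, seed: int = 0x77) -> int:
    """Bitwise CRC-8, reflected polynomial 0x8C, with an assumed seed of 0x77."""
    crc = seed
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8C if crc & 1 else crc >> 1
    return crc


def reassemble(payloads: Iterable[bytes]) -> Iterator[bytes]:
    """Join CAN payloads from ONE sender ID and yield whole DUML packets."""
    buf = bytearray()
    for payload in payloads:
        buf += payload
        while True:
            start = buf.find(SOF)
            if start < 0:
                buf.clear()  # no start byte anywhere, drop the noise
                break
            del buf[:start]  # discard bytes before the candidate start
            if len(buf) < 4:
                break        # header not complete yet
            length = buf[1] | ((buf[2] & 0x03) << 8)
            header_ok = crc8_duml(bytes(buf[:3])) == buf[3]
            if not header_ok or length < MIN_PACKET_LEN:
                del buf[0]   # false start byte, resynchronise
                continue
            if len(buf) < length:
                break        # wait for more CAN frames
            yield bytes(buf[:length])
            del buf[:length]
```

Feeding it the 8-byte payloads of a single CAN ID in arrival order yields one bytes object per DUML packet. Payloads from different CAN IDs should be kept in separate streams, otherwise interleaved senders would corrupt each other's packets.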

How do we confirm this? There are already tools for decoding DUML messages, so putting the blue byte stream (551904e404021e00000405f8f87201b2fc80004e032201cee8) into one of them should give some useful data.
https://b3yond.d3vl.com/duml/#551904e404021e00000405f8f87201b2fc80004e032201cee8

Looks like it!
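The same check can be done offline. Below is a minimal Python sketch that splits the byte stream above into header fields. The field layout and names are my reading of the public decoders linked under Current resources, not an official specification, so treat them as assumptions. The checksum fields are extracted but not verified here.

```python
from dataclasses import dataclass


@dataclass
class DumlPacket:
    length: int          # total packet length in bytes (low 10 bits)
    version: int         # protocol version (high 6 bits)
    header_crc8: int     # checksum over the first three bytes (not verified)
    sender_type: int     # low 5 bits of the sender byte
    sender_index: int    # high 3 bits of the sender byte
    receiver_type: int
    receiver_index: int
    sequence: int
    cmd_flags: int       # raw command type / flags byte
    cmd_set: int
    cmd_id: int
    payload: bytes
    crc16: int           # trailing checksum, little-endian (not verified)


def parse_duml(raw: bytes) -> DumlPacket:
    """Split one complete DUML packet into fields (assumed layout)."""
    if len(raw) < 13 or raw[0] != 0x55:
        raise ValueError("not a DUML packet")
    length_version = int.from_bytes(raw[1:3], "little")
    length = length_version & 0x3FF
    if length != len(raw):
        raise ValueError("length field does not match packet size")
    return DumlPacket(
        length=length,
        version=length_version >> 10,
        header_crc8=raw[3],
        sender_type=raw[4] & 0x1F,
        sender_index=raw[4] >> 5,
        receiver_type=raw[5] & 0x1F,
        receiver_index=raw[5] >> 5,
        sequence=int.from_bytes(raw[6:8], "little"),
        cmd_flags=raw[8],
        cmd_set=raw[9],
        cmd_id=raw[10],
        payload=raw[11:-2],
        crc16=int.from_bytes(raw[-2:], "little"),
    )


example = bytes.fromhex("551904e404021e00000405f8f87201b2fc80004e032201cee8")
print(parse_duml(example))
```

For the example stream this gives a length of 25 bytes, sequence number 30, command set 0x04, command ID 0x05 and a 12-byte payload. What those IDs mean is something the linked decoders can answer.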

These are some concatenated messages from one of the CAN IDs (I can't remember which one!) pulled from the log above.

concatenated_messages.csv



TODO:

  • Write proper Python to decode & encode DUML packets on the fly

  • Log and understand the initialisation

  • Start Sending CAN Messages (Waiting on Breakout PCB Above)

  • Understand the overall control theory and implement basic gimbal control with ArduPilot


Video Feed:

The main video feed is provided over USB by either the DM386 or the Ambarella in the gimbal. They enumerate as RNDIS (Ethernet over USB) devices and might support gimbal control over the network, rather than over CAN.

To be filled in.