Close to 4 years ago I talked about reverse engineering the Stream Deck to gain full control of the device and remove the dependence on the Stream Deck software. Well, I still really enjoy the hardware, but the software has gotten worse: it now goes as far as requiring users to have an account to download extensions.
If we’ve interacted in the past, you know that I am big on respecting customer privacy and choices, and that means that if I want to use a device without an account, I better be damn able to do that. Luckily, building on my past work with DeckSurf, I was finally determined to push the pedal to the metal and make my project a viable alternative to proprietary software for this extremely versatile and flexible button box.
This post is going to be looking at the workings of the Stream Deck Plus, a $179.99 device (discounted by $20 at the time of this writing), and how you, dear reader, can use it even if you don’t want to install Elgato’s own software.
What are we working with #
Here is the device we’ll be talking about:
Stream Deck Plus sports a few capabilities that I should outline before I go more in-depth about the actual reverse-engineering process:
- 8 buttons. This is effectively the same as with all the other Stream Deck products. Numbers vary, but the behavior is consistent.
- A narrow screen. Right below the buttons is a narrow color screen band that can be used to provide auxiliary contextual information.
- 4 dials. Each dial can turn right or left an unlimited number of times. They can also be pressed down (click).
We’re going to dive into each of the features and how they work. I will also mention that to actually reverse-engineer this device, I used the following tooling:
The way Stream Deck is built, it’s a generic HID device and it does not require you to have Stream Deck software installed to function. That is, once I reverse-engineer the protocol for the hardware, I can build my own client software that does whatever I want and doesn’t depend in any capacity on Elgato’s software stack.
Setting up the inspection process #
To get started, let’s launch Wireshark and select a USB capture interface.
Depending on your machine configuration, you may have more than one interface available. You might need to try opening a few until you’re able to spot the connected Stream Deck.
To find the Stream Deck Plus device in my case, I can filter by the product ID (PID), which is 0x0084 (the Elgato VID, should you need it, is 0x0FD9). Narrowing down the traffic to just the connected Stream Deck Plus device can be done by applying the following filter string:
And just like that, the Stream Deck lights up in the list (in the likely sea of other USB traffic):
Based on the above, the source I need to look for is 3.5.0. That value is also the destination – the “address” of the device that we can use to inspect outbound traffic that actually sets things like brightness or images on the different surfaces available on the Stream Deck.
With this data at hand, I can now set up the filter string like this:
usb.dst matches "3\\.5\\..*"
Hold on a second… You said that the address is 3.5.0 above, but in your second filter string it looks like you’re matching to everything that follows the 3.5. pattern. Why is that?
Aha! Keen eye. Indeed, I am not filtering just by 3.5.0. The USB address is made of three components – the bus, device, and endpoint. In our case, the Stream Deck Plus is operating as bus 3, device 5, and endpoint 0, but what we also need to know is that a single device can have multiple endpoints. So for us to properly look at all Stream Deck Plus device traffic, we exclude the endpoint identifier.
We can also simplify the filter string like this:
This is a bit cleaner and you don’t need to worry about RegEx-ing a relatively simple constraint.
With these basics out of the way, let’s take a look at how the actual hardware interacts with my computer, and vice-versa.
Stream Deck hardware #
The buttons #
The behavior for the buttons is the same as I’ve outlined with the Stream Deck XL. I really appreciate the consistency here, and I guess from a supportability perspective this makes sense – once you have an API more or less working, why change how things act from one hardware release to another? Kudos to Elgato on that.
Every button supports a 120×120 color image. The content is not dynamically updated by the device itself but rather by the host – that is, whatever computer you’re connecting it to. If there is updated status displayed on the button, that’s only because the computer is pushing new images to the Stream Deck constantly. On Windows and macOS, that responsibility typically falls on the Stream Deck software (that I am aiming to fully replace with DeckSurf).
Setting images #
When images are set on the computer, a JPEG-encoded and usually compressed (if you use a larger resolution) image is being sent over the wire, along with some other generic packet metadata.
The packets that set the image can be recognized by looking for the following pattern in the header (values are hexadecimal):
+-------+----+----+----+----+----+----+----+----+
| Byte | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
+-------+----+----+----+----+----+----+----+----+
| Value | 02 | 07 | 18 | 00 | F8 | 03 | 00 | 00 |
+-------+----+----+----+----+----+----+----+----+
The header can be described like this:
Byte Index | Description |
---|---|
0 | Always 02 |
1 | Always 07 |
2 | Hexadecimal ID of the button for which the image is set. This value is zero-indexed. |
3 | Determines whether the current packet is the final packet that sets an image. Larger images are broken down into multiple packets, and this value can be either 00 or 01. |
4 and 5 | 16-bit Little Endian representation of the image payload length in the current packet. |
6 and 7 | 16-bit Little Endian representation of the zero-based iteration (or, page) for cases where the image is split in multiple packets. |
Everything that follows the header is the image payload. If the image is split, we will see several packets, like this:
Those URB_INTERRUPT out packets are what we’re after. Don’t worry if you haven’t learned about this terminology yet. URB stands for “USB Request Block” and is used as a structure to describe a USB transfer between the host and the device. INTERRUPT refers to the transfer type. USB interrupt transfers are designed for devices that require low-latency communication, typically for small amounts of data. These devices can be keyboards, mice, or, in our case, a Stream Deck Plus. Basically, anything that sends or receives frequent updates. Lastly, out indicates the direction of the transfer – out of the host and to the device. URB_INTERRUPT out means that the host (my computer) is sending data to a device using an interrupt transfer.
Now, notice that the packets being sent are uniform: each one shows up in Wireshark as 1,051 bytes long. The first packet contains the JPEG header (starting bytes of the image), and every subsequent packet contains the rest of the image, split in chunks. The packet header is always 8 bytes and the length of the content (image payload) is declared in the header, but is usually 1,016 bytes, making the total payload 1,024 bytes long. The remaining 27 bytes of each captured frame appear to be capture metadata rather than data sent to the device.
Do always check the header for the real length, though. Never rely on these kinds of assumptions as the de facto truth, as things may change in the future.
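To make the header layout concrete, here is a minimal Python sketch of how a JPEG could be split into packets that follow the structure described above. It is an illustration, not how Elgato or DeckSurf do it: the function name `build_key_image_packets` is made up for this example, the final-packet flag value of 01 and the zero-padding of the last packet to 1,024 bytes are assumptions, and nothing here actually talks to the device.

```python
import struct
from typing import List

PACKET_SIZE = 1024  # header + payload, as observed in the capture
HEADER_SIZE = 8     # 02 07 <key> <final> <length LE> <page LE>


def build_key_image_packets(key_index: int, jpeg: bytes) -> List[bytes]:
    """Split JPEG bytes into packets carrying the 8-byte header."""
    max_payload = PACKET_SIZE - HEADER_SIZE  # 1,016 bytes
    chunks = [jpeg[i:i + max_payload]
              for i in range(0, len(jpeg), max_payload)] or [b""]
    packets = []
    for page, chunk in enumerate(chunks):
        # Assumption: 01 marks the final packet, 00 every other packet.
        is_final = 1 if page == len(chunks) - 1 else 0
        header = struct.pack("<BBBBHH", 0x02, 0x07, key_index,
                             is_final, len(chunk), page)
        # Assumption: a short final packet is zero-padded to full size.
        packets.append((header + chunk).ljust(PACKET_SIZE, b"\x00"))
    return packets


# 2,000 bytes of dummy data for button 0x18 yields two packets.
demo = build_key_image_packets(0x18, bytes(2000))
print(len(demo), demo[0][:8].hex(" "))
```

The first header produced by the demo has the same shape as the one in the table above: a payload length of 1,016 is encoded as F8 03, followed by page 00 00.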
To verify what image is being set, we can use a bit of command line magic. In Wireshark, select the URB_INTERRUPT out packets that contain the image. To make it easier to spot them, you can apply a more restrictive filter:
usb.dst ~ "3.5" && _ws.col.info == "URB_INTERRUPT out"
This will look for URB_INTERRUPT out packets to the specific destination only. As I mentioned, select the packets, and then from the File menu select Export Specified Packets….
In the following dialog, in the Packet Range section, click on Selected packets only.
Give the file a descriptive name and store it somewhere on disk. Next, we will use a command-line tool that comes with Wireshark, called tshark.
On Windows, tshark is typically located in the Wireshark installation folder. In my case, it was in C:\Program Files\Wireshark:
For easier consumption of tshark, you can add the Wireshark path to your system PATH environment variable. Assuming that is done, we can now invoke this from the terminal:
tshark -r .\test-image-extraction.pcapng -T fields -e usb.capdata > data.txt
What this command does is extract the HID data and dump it all in a text file. Because we’re already operating on a *.pcapng file that only contains the image packets we’re interested in, we don’t need to fiddle more with filtering, and just put everything in a text file.
The content will look like this:
Not super helpful, but we spot the things I mentioned earlier – the 02 07 header starter, for example. As a quick and dirty “hack” to dump image data from this text file, I have a PowerShell script:
param (
    [string]$DataFile,
    [string]$OutputFileName
)

function Process-HIDData {
    param (
        [string]$DataFile,
        [string]$OutputFileName
    )

    $lines = Get-Content -Path $DataFile
    $imageBytes = @()

    foreach ($line in $lines) {
        $hexBytes = $line.Trim()

        # Each line is one packet as a hex string; 16 hex characters are the 8-byte header.
        if ($hexBytes.Length -gt 16) {
            $processedBytes = $hexBytes.Substring(16)
            $byteArray = for ($i = 0; $i -lt $processedBytes.Length; $i += 2) {
                [Convert]::ToByte($processedBytes.Substring($i, 2), 16)
            }
            $imageBytes += $byteArray
        }
    }

    $binaryData = [byte[]]::new($imageBytes.Length)
    [System.Array]::Copy($imageBytes, $binaryData, $imageBytes.Length)

    # The output file is written next to the script itself.
    $scriptDirectory = $PSScriptRoot
    $outputFilePath = Join-Path -Path $scriptDirectory -ChildPath $OutputFileName
    [System.IO.File]::WriteAllBytes($outputFilePath, $binaryData)
    Write-Output "Image saved as $outputFilePath"
}

if (-not $DataFile) {
    Write-Error "Data file path is required."
    exit 1
}

if (-not $OutputFileName) {
    Write-Error "Output file name is required."
    exit 1
}

Process-HIDData -DataFile $DataFile -OutputFileName $OutputFileName
All it really does is strip out the first 8 bytes from each line (one line represents one batch of HID data) and store the binary representation as a JPEG file. It can be invoked as such:
.\exportimage.ps1 -DataFile .\data.txt -OutputFile image.jpg
Once the script executes, we can see my test hedgehog image:
Nice! We have an idea of how images are passed over the wire. But another characteristic of Stream Deck buttons is that, just like any other buttons, they can be pressed. I’ve also talked about this in my previous blog post, but the gist is that you have to look at the reverse of what we were doing with images.
That is, your filter string is now this (make sure to adjust the src argument):
_ws.col.info == "URB_INTERRUPT in" && usb.src == 3.5.1
I am looking for interrupt data flowing in (to the host) from the USB device at 3.5.1. That data, as it turns out, is very easy to parse because what happens is that with each button press and release we get the entire button map in the HID data.
The first four bytes are the header. The third byte of that header always indicates the number of buttons on the panel – in the Stream Deck Plus case, that’s 8. For the Stream Deck XL, that is 32. It also tells us how many bytes after the header contain the button map, with one byte per button. So, if I press the 4th button on the Stream Deck Plus, the data it will send to my computer is this:
0000 01 00 08 00 00 00 00 01 00 00 00 00 00 00 00 00 ................
0010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00A0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00B0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00C0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00F0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0100 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0120 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0140 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0150 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0160 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0180 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0190 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
01A0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
01B0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
01C0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
01D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
01E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
01F0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
Yet another piece of hardware infrastructure that is not different across SKUs, which I really appreciate Elgato doing.
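Parsing that button map takes only a few lines. The Python sketch below is an illustration under a few assumptions: the function name `parse_button_report` is made up, the check on the first two bytes (01 00) is inferred from the capture above rather than from any documentation, and any non-zero byte in the map is treated as "pressed".

```python
from typing import List


def parse_button_report(report: bytes) -> List[bool]:
    """Return one pressed/released flag per button, in button order."""
    # Assumption: button reports start with 01 00, as in the capture above.
    if len(report) < 4 or report[0] != 0x01 or report[1] != 0x00:
        raise ValueError("not a button report")
    button_count = report[2]                 # 8 on the Stream Deck Plus
    button_map = report[4:4 + button_count]  # one byte per button
    return [state != 0 for state in button_map]


# The report from above: 4-byte header, then the 4th button set to 01.
sample = bytes.fromhex("01000800" + "00000001" + "00000000").ljust(512, b"\x00")
states = parse_button_report(sample)
print([index for index, pressed in enumerate(states) if pressed])
```

Because the whole map arrives with every event, a client would need to compare the new map with the previous one to tell which button changed state.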
The screen #
Let’s now talk about the second component on a Stream Deck Plus – the screen. It’s a narrow band that we can use to display a bunch of information. By default it is used to display information about the knobs right below it. From this statement, we can assume that there are four distinct sections of this screen associated with each knob, but that would be only a partially correct assumption.
In practice, the entire screen area is just one big image. Well, big is relative here – it’s an 800×100 image. The way I found this is by setting the background to the screen and then doing the PowerShell “hack” I mentioned above (but with a header offset of 16 bytes instead of 8) to inspect the traffic from my computer to the connected Stream Deck Plus. What I saw was this:
What the Stream Deck software does is create a composite image of all the things you’re associating with the knobs, and then pass it to the device as one blob. If we inspect the outbound traffic from the host to the device, we see packets like this:
0000 02 0C 00 00 00 00 20 03 64 00 00 00 00 F0 03 00 ...... .d....ð..
0010 FF D8 FF E0 00 10 4A 46 49 46 00 01 01 00 00 01 ÿØÿà..JFIF......
0020 00 01 00 00 FF DB 00 43 00 03 02 02 03 02 02 03 ....ÿÛ.C........
0030 03 03 03 04 03 03 04 05 08 05 05 04 04 05 0A 07 ................
0040 07 06 08 0C 0A 0C 0C 0B 0A 0B 0B 0D 0E 12 10 0D ................
0050 0E 11 0E 0B 0B 10 16 10 11 13 14 15 15 15 0C 0F ................
0060 17 18 16 14 18 12 14 15 14 FF DB 00 43 01 03 04 .........ÿÛ.C...
0070 04 05 04 05 09 05 05 09 14 0D 0B 0D 14 14 14 14 ................
We now have a 16-byte header, which is structured like this (judging from the starting packet that is used to send the image):
+-------+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
| Byte | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
+-------+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
| Value | 02 | 0C | 00 | 00 | 00 | 00 | 20 | 03 | 64 | 00 | 00 | 00 | 00 | F0 | 03 | 00 |
+-------+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+----+
Looking at all packets for a single image, I started jotting down the following assumptions for the screen image setting headers:
Byte Index | Description |
---|---|
0 | Always 02 |
1 | Always 0C |
2 to 5 | Always 00 |
6 | Always 20 |
7 | Always 03 |
8 | Always 64 |
9 | Always 00 |
10 | Indicates whether this is the final chunk (i.e., “page”) when setting an image via a multi-part packet. Can be 00 or 01. |
11 | Chunk (i.e., “page”) index when setting an image via a multi-part packet. |
12 | Always 00 |
13 and 14 | Little Endian representation of the payload length. |
15 | Always 00 |
And this would seem good enough, except there is a twist. I mentioned earlier that the assumption that the whole screen is always managed as a single image is only partially correct, because when you use one of the knobs (whether you press it or turn it), there is an overlay displayed briefly on the screen showing what is happening.
These overlays, just like everything else, are managed by the Stream Deck software. What was peculiar about it, though, was that it was sent as a segment – that is, the image that you see above is what was sent from my PC to Stream Deck Plus. The software didn’t send the full composite, but rather just that one part of the screen.
Could it be that every segment of the screen is addressable? I started comparing the headers between overlay sets. The data below shows the headers as individual bytes, followed by a short extract of the image payload (trust me, you don’t want the whole thing here). To make it easier to analyze things, I’ve split the screen in four segments, from left to right – A, B, C, and D.
Segment A #
02 0c 00 00 00 00 c8 00 64 00 00 00 00 f0 03 00 ffd8ffe000104a46
02 0c 00 00 00 00 c8 00 64 00 00 01 00 f0 03 00 62719049eff74526
02 0c 00 00 00 00 c8 00 64 00 00 02 00 f0 03 00 6b8844808f976955
02 0c 00 00 00 00 c8 00 64 00 00 03 00 f0 03 00 e7d81a2c2bbb151b
02 0c 00 00 00 00 c8 00 64 00 00 04 00 f0 03 00 0f7c678f7a666ddf
02 0c 00 00 00 00 c8 00 64 00 00 05 00 f0 03 00 434b7255d6e66e9d
02 0c 00 00 00 00 c8 00 64 00 00 06 00 f0 03 00 fc54da1adeecad78
02 0c 00 00 00 00 c8 00 64 00 00 07 00 f0 03 00 7e4095f72f5c6bc2
02 0c 00 00 00 00 c8 00 64 00 00 08 00 f0 03 00 918ba9de8b9d4edf
02 0c 00 00 00 00 c8 00 64 00 00 09 00 f0 03 00 12ebd3dd2116a218
02 0c 00 00 00 00 c8 00 64 00 01 0a 00 0a 00 00 d38bb9152295be67
Segment B #
02 0c c8 00 00 00 c8 00 64 00 00 00 00 f0 03 00 ffd8ffe000104a46
02 0c c8 00 00 00 c8 00 64 00 00 01 00 f0 03 00 7662eade285d4963
02 0c c8 00 00 00 c8 00 64 00 00 02 00 f0 03 00 23241008207b8c96
02 0c c8 00 00 00 c8 00 64 00 00 03 00 f0 03 00 920734728fdaf35d
02 0c c8 00 00 00 c8 00 64 00 00 04 00 f0 03 00 f41d3afd723f98ed
02 0c c8 00 00 00 c8 00 64 00 00 05 00 f0 03 00 4b14e7701b82b291
02 0c c8 00 00 00 c8 00 64 00 00 06 00 f0 03 00 78f61540d6b165b6
02 0c c8 00 00 00 c8 00 64 00 00 07 00 f0 03 00 481bb70eb8fc7fcf
02 0c c8 00 00 00 c8 00 64 00 01 08 00 05 02 00 fce4dbc726e6ea09
Segment C #
02 0c 90 01 00 00 c8 00 64 00 00 00 00 f0 03 00 ffd8ffe000104a46
02 0c 90 01 00 00 c8 00 64 00 00 01 00 f0 03 00 1260fa52b10e4d00
02 0c 90 01 00 00 c8 00 64 00 00 02 00 f0 03 00 e52b2649911c1ce1
02 0c 90 01 00 00 c8 00 64 00 00 03 00 f0 03 00 8cadbf5d1a6bf53d
02 0c 90 01 00 00 c8 00 64 00 00 04 00 f0 03 00 51e403cc601c8127
02 0c 90 01 00 00 c8 00 64 00 00 05 00 f0 03 00 bfd26ffc7fa25e69
02 0c 90 01 00 00 c8 00 64 00 00 06 00 f0 03 00 91d352f3fe933f3d
02 0c 90 01 00 00 c8 00 64 00 00 07 00 f0 03 00 e90ffb69fc22540d
02 0c 90 01 00 00 c8 00 64 00 01 08 00 13 01 00 9f434156b8d8df76
Segment D #
02 0c 58 02 00 00 c8 00 64 00 00 00 00 f0 03 00 ffd8ffe000104a46
02 0c 58 02 00 00 c8 00 64 00 00 01 00 f0 03 00 9e49ea2ad547ccd3
02 0c 58 02 00 00 c8 00 64 00 00 02 00 f0 03 00 33d453a105caa44e
02 0c 58 02 00 00 c8 00 64 00 00 03 00 f0 03 00 2d7d0e89d08caad3
02 0c 58 02 00 00 c8 00 64 00 00 04 00 f0 03 00 d7b45dcc67503a1e
02 0c 58 02 00 00 c8 00 64 00 00 05 00 f0 03 00 9e0ed72437f7b6f6
02 0c 58 02 00 00 c8 00 64 00 00 06 00 f0 03 00 b8b9d1aea3d7a351
02 0c 58 02 00 00 c8 00 64 00 01 07 00 a4 00 00 4e735a459338a458
Spotting the delta #
Between the packets I listed above, the only things that changed in the header are the third and fourth bytes. This makes me think that those are the screen segment addresses. We have:
Segment | Address |
---|---|
A | 00 00 |
B | C8 00 |
C | 90 01 |
D | 58 02 |
If we convert the values to decimal, the table suddenly will start making more sense:
Segment | Address | Little-Endian Address (decimal) |
---|---|---|
A | 00 00 | 0 |
B | C8 00 | 200 |
C | 90 01 | 400 |
D | 58 02 | 600 |
Amazing. The third and fourth bytes in the header represent the horizontal pixel offset from the left edge of the screen (remember how I mentioned that the full image is 800×100). Now, let’s compare all those packets to what we see when we set the full image:
02 0c 00 00 00 00 20 03 64 00 00 00 00 f0 03 00 ffd8ffe000104a46
02 0c 00 00 00 00 20 03 64 00 00 01 00 f0 03 00 89b062cee0cd8e84
02 0c 00 00 00 00 20 03 64 00 00 02 00 f0 03 00 5805c9c63771d327
02 0c 00 00 00 00 20 03 64 00 00 03 00 f0 03 00 3dab5a71737721b3
02 0c 00 00 00 00 20 03 64 00 00 04 00 f0 03 00 618246734b6342fe
02 0c 00 00 00 00 20 03 64 00 00 05 00 f0 03 00 aa159464a0cce71d
02 0c 00 00 00 00 20 03 64 00 00 06 00 f0 03 00 8996e51482344335
02 0c 00 00 00 00 20 03 64 00 00 07 00 f0 03 00 45ae683e24bbff00
02 0c 00 00 00 00 20 03 64 00 00 08 00 f0 03 00 8bc53378a7c07e1c
02 0c 00 00 00 00 20 03 64 00 00 09 00 f0 03 00 ff005ff807e3af0b
02 0c 00 00 00 00 20 03 64 00 00 0a 00 f0 03 00 b299080709c9c02a
02 0c 00 00 00 00 20 03 64 00 00 0b 00 f0 03 00 4fb45746ef817c56
02 0c 00 00 00 00 20 03 64 00 00 0c 00 f0 03 00 c62390ac84897272
02 0c 00 00 00 00 20 03 64 00 00 0d 00 f0 03 00 27030d8e36a8e001
02 0c 00 00 00 00 20 03 64 00 00 0e 00 f0 03 00 d95ab7fa562dd956
02 0c 00 00 00 00 20 03 64 00 00 0f 00 f0 03 00 ba2eaf650de69fa4
02 0c 00 00 00 00 20 03 64 00 00 10 00 f0 03 00 efe19c9f0f3c1da1
02 0c 00 00 00 00 20 03 64 00 00 11 00 f0 03 00 7e140462b850800c
02 0c 00 00 00 00 20 03 64 00 00 12 00 f0 03 00 ec37b1e8da7b416d
02 0c 00 00 00 00 20 03 64 00 00 13 00 f0 03 00 ee1636963dec088e
02 0c 00 00 00 00 20 03 64 00 00 14 00 f0 03 00 3f06e9f2c02596f4
02 0c 00 00 00 00 20 03 64 00 00 15 00 f0 03 00 49a0d8d94b7b55d3
02 0c 00 00 00 00 20 03 64 00 00 16 00 f0 03 00 8a2ea63df238c374
02 0c 00 00 00 00 20 03 64 00 00 17 00 f0 03 00 f4b8d34433bc65c4
02 0c 00 00 00 00 20 03 64 00 00 18 00 f0 03 00 2033824b02a3712a
02 0c 00 00 00 00 20 03 64 00 00 19 00 f0 03 00 1c8ee4f526bf4da7
02 0c 00 00 00 00 20 03 64 00 00 1a 00 f0 03 00 e3da788745f19f88
02 0c 00 00 00 00 20 03 64 00 00 1b 00 f0 03 00 2755734755b7dc43
02 0c 00 00 00 00 20 03 64 00 00 1c 00 f0 03 00 e0cad297dd8f5c35
02 0c 00 00 00 00 20 03 64 00 00 1d 00 f0 03 00 c00217a1cfd8f259
02 0c 00 00 00 00 20 03 64 00 00 1e 00 f0 03 00 c1cfc8ec33f91ac7
02 0c 00 00 00 00 20 03 64 00 00 1f 00 f0 03 00 b6f30aecddb72492
02 0c 00 00 00 00 20 03 64 00 00 20 00 f0 03 00 ff00b3fb552c8b08
02 0c 00 00 00 00 20 03 64 00 00 21 00 f0 03 00 baf3fc47d99f25fe
02 0c 00 00 00 00 20 03 64 00 00 22 00 f0 03 00 c6adc4510f3644f3
02 0c 00 00 00 00 20 03 64 00 00 23 00 f0 03 00 dcb5a79842477124
02 0c 00 00 00 00 20 03 64 00 00 24 00 f0 03 00 73950000a48af069
02 0c 00 00 00 00 20 03 64 00 00 25 00 f0 03 00 032c919120c6eca0
02 0c 00 00 00 00 20 03 64 00 00 26 00 f0 03 00 63f0d3e1a787127d
02 0c 00 00 00 00 20 03 64 00 00 27 00 f0 03 00 6efb25e5add3b7a9
02 0c 00 00 00 00 20 03 64 00 00 28 00 f0 03 00 7874fe1883c53e1d
02 0c 00 00 00 00 20 03 64 00 00 29 00 f0 03 00 20d4632b5ceda74e
02 0c 00 00 00 00 20 03 64 00 01 2a 00 92 02 00 bf32e9c5367927c4
The seventh and eighth bytes all of a sudden became 20 03, and that is because, once again, we need to look at the Little Endian representation of the values. For each segment, this value has been C8 00, which translates to 200. 20 03 is 800. This is the image width. The next two bytes, 64 00, are 100, so that’s the image height.
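If Little Endian conversions are not second nature, the arithmetic is easy to double-check. This short Python snippet (the helper name `le16` is just for this example) decodes the two-byte values seen in the headers so far:

```python
def le16(two_bytes: bytes) -> int:
    """Decode a 16-bit Little Endian value (least significant byte first)."""
    return int.from_bytes(two_bytes, byteorder="little")


# Segment offsets, then full-screen width and height, as seen in the headers.
for raw in ("00 00", "c8 00", "90 01", "58 02", "20 03", "64 00"):
    print(raw, "->", le16(bytes.fromhex(raw)))
```

For example, 20 03 is 0x0320 once the bytes are swapped, which is 3 × 256 + 32 = 800.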
My assumed table now took a much better shape:
Byte Index | Description |
---|---|
0 | Always 02 |
1 | Always 0C |
2 and 3 | Offset from the left corner. |
4 and 5 | Always 00 00. |
6 and 7 | Image width. |
8 and 9 | Image height. |
10 | Indicates whether this is the final chunk (i.e., “page”) when setting an image via a multi-part packet. Can be 00 or 01. |
11 and 12 | Chunk (i.e., “page”) index when setting an image via a multi-part packet. |
13 and 14 | Little Endian representation of the payload length. |
15 | Always 00 |
And that’s it, we now know how screen data is set! Not overly complicated once I started looking at the delta between packets.
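As a rough sketch of how this header could be put to use, the Python below builds the packets for a screen image or a single segment. Treat it as an illustration of the table, not a reference implementation: the names `SCREEN_HEADER` and `build_screen_packets` are made up, the zero-padding of the final packet to 1,024 bytes is an assumption, and bytes 4 and 5 are simply written as zero because that is all the captures show.

```python
import struct
from typing import List

# 02 0C, x offset, bytes 4-5 (always zero in the captures), width, height,
# final flag, page index, payload length, trailing 00. All 16-bit fields
# are Little Endian.
SCREEN_HEADER = struct.Struct("<BBHHHHBHHB")  # 16 bytes in total
PACKET_SIZE = 1024


def build_screen_packets(jpeg: bytes, x_offset: int, width: int,
                         height: int = 100) -> List[bytes]:
    """Split a JPEG into screen packets for the region starting at x_offset."""
    max_payload = PACKET_SIZE - SCREEN_HEADER.size  # 1,008 bytes (F0 03)
    chunks = [jpeg[i:i + max_payload]
              for i in range(0, len(jpeg), max_payload)] or [b""]
    packets = []
    for page, chunk in enumerate(chunks):
        is_final = 1 if page == len(chunks) - 1 else 0
        header = SCREEN_HEADER.pack(0x02, 0x0C, x_offset, 0, width, height,
                                    is_final, page, len(chunk), 0)
        # Assumption: a short final packet is zero-padded to full size.
        packets.append((header + chunk).ljust(PACKET_SIZE, b"\x00"))
    return packets


# Dummy payload targeting segment B (offset 200, width 200).
demo = build_screen_packets(bytes(2026), x_offset=200, width=200)
print(demo[0][:16].hex(" "))
print(demo[-1][:16].hex(" "))
```

The first header printed by the demo matches the first Segment B header captured earlier, byte for byte.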
The last thing we need to talk about here is the fact that the screen is a touch screen, so we also need to be able to spot the user pressing on a segment. Pressing on any of the screen parts is functionally equivalent to pressing the knob – the same overlay will be shown if you’re using the Stream Deck software. But how does it show up in Wireshark?
To check, let’s once again set the filter to this, because we want to track events that originate on the device and go to the PC:
usb.src ~ "3.5" && _ws.col.info == "URB_INTERRUPT in"
Here is the data we get, tapping from left to right.
01 02 0e 00 01 01 47 00 40 00 00000000000000000
01 02 0e 00 01 01 07 01 1c 00 00000000000000000
01 02 0e 00 01 01 ee 01 32 00 00000000000000000
01 02 0e 00 01 01 b3 02 29 00 00000000000000000
That looks very random. A little too varied for us to make any definitive conclusions. But you know what, let’s try tapping the same segment in different parts of the touch screen:
01 02 0e 00 01 01 b2 02 33 00 00000000000000000
01 02 0e 00 01 01 13 03 20 00 00000000000000000
01 02 0e 00 01 01 a9 02 46 00 00000000000000000
01 02 0e 00 01 01 06 03 26 00 00000000000000000
01 02 0e 00 01 01 bb 02 20 00 00000000000000000
01 02 0e 00 01 01 00 03 52 00 00000000000000000
01 02 0e 00 01 01 b6 02 3e 00 00000000000000000
01 02 0e 00 01 01 f7 02 2f 00 00000000000000000
01 02 0e 00 01 01 f1 02 1c 00 00000000000000000
01 02 0e 00 01 01 8d 02 1c 00 00000000000000000
This variability in values instantly gave me a clue – we’re looking at coordinates on the screen! The structure ends up being this:
+-------+----+----+----+----+----+----+----------+----------+
| Byte  | 0  | 1  | 2  | 3  | 4  | 5  | 6 - 7    | 8 - 9    |
+-------+----+----+----+----+----+----+----------+----------+
| Value | 01 | 02 | 0E | 00 | 01 | 01 | X coord. | Y coord. |
+-------+----+----+----+----+----+----+----------+----------+
Unlike the buttons, there is no release event – we just get a tap, and that’s it. That means that once we get an event marked by the header above, we can compute based on the X and Y coordinates which part of the screen was tapped and react accordingly.
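Here is what that computation could look like in Python. This is a sketch under assumptions: the function name `parse_touch_report` is hypothetical, the segment width of 200 pixels comes from the segment offsets observed earlier, and the clamp to segment D simply guards against an X value at the very edge.

```python
import struct
from typing import Tuple

SEGMENT_WIDTH = 200  # 800-pixel-wide screen, four segments (A to D)


def parse_touch_report(report: bytes) -> Tuple[int, int, str]:
    """Return (x, y, segment) for a touch report."""
    if report[:2] != b"\x01\x02":
        raise ValueError("not a touch report")
    # Bytes 6-7 and 8-9 are 16-bit Little Endian X and Y coordinates.
    x, y = struct.unpack_from("<HH", report, 6)
    segment = "ABCD"[min(x // SEGMENT_WIDTH, 3)]
    return x, y, segment


# The four left-to-right taps captured above.
for line in ("01 02 0e 00 01 01 47 00 40 00",
             "01 02 0e 00 01 01 07 01 1c 00",
             "01 02 0e 00 01 01 ee 01 32 00",
             "01 02 0e 00 01 01 b3 02 29 00"):
    print(parse_touch_report(bytes.fromhex(line)))
```

Run against the four captured taps, the X coordinates come out as 71, 263, 494, and 691, which land in segments A, B, C, and D respectively.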
The knobs #
Last but not least, we should talk about the knobs. Each knob can be used in three ways:
- Turn right.
- Turn left.
- Press.
To make sure I log things properly, for each knob, labeled similarly to how I labeled screen segments, I took four turns to the right, four turns to the left, and then a press. The data I captured is below.
Knob A #
Right turns #
01 03 05 00 01 01 00 00 00 0000000000
01 03 05 00 01 01 00 00 00 0000000000
01 03 05 00 01 01 00 00 00 0000000000
01 03 05 00 01 01 00 00 00 0000000000
Left turns #
01 03 05 00 01 ff 00 00 00 0000000000
01 03 05 00 01 ff 00 00 00 0000000000
01 03 05 00 01 ff 00 00 00 0000000000
01 03 05 00 01 ff 00 00 00 0000000000
Press #
01 03 05 00 00 01 00 00 00 0000000000
01 03 05 00 00 00 00 00 00 0000000000
Knob B #
Right turns #
01 03 05 00 01 00 01 00 00 0000000000
01 03 05 00 01 00 01 00 00 0000000000
01 03 05 00 01 00 01 00 00 0000000000
01 03 05 00 01 00 01 00 00 0000000000
Left turns #
01 03 05 00 01 00 ff 00 00 0000000000
01 03 05 00 01 00 ff 00 00 0000000000
01 03 05 00 01 00 ff 00 00 0000000000
01 03 05 00 01 00 ff 00 00 0000000000
Press #
01 03 05 00 00 00 01 00 00 0000000000
01 03 05 00 00 00 00 00 00 0000000000
Knob C #
Right turns #
01 03 05 00 01 00 00 01 00 0000000000
01 03 05 00 01 00 00 01 00 0000000000
01 03 05 00 01 00 00 01 00 0000000000
01 03 05 00 01 00 00 01 00 0000000000
Left turns #
01 03 05 00 01 00 00 ff 00 0000000000
01 03 05 00 01 00 00 ff 00 0000000000
01 03 05 00 01 00 00 ff 00 0000000000
01 03 05 00 01 00 00 ff 00 0000000000
Press #
01 03 05 00 00 00 00 01 00 0000000000
01 03 05 00 00 00 00 00 00 0000000000
Knob D #
Right turns #
01 03 05 00 01 00 00 00 01 0000000000
01 03 05 00 01 00 00 00 01 0000000000
01 03 05 00 01 00 00 00 01 0000000000
01 03 05 00 01 00 00 00 01 0000000000
Left turns #
01 03 05 00 01 00 00 00 ff 0000000000
01 03 05 00 01 00 00 00 ff 0000000000
01 03 05 00 01 00 00 00 ff 0000000000
01 03 05 00 01 00 00 00 ff 0000000000
Press #
01 03 05 00 00 00 00 00 01 0000000000
01 03 05 00 00 00 00 00 00 0000000000
Slicing and dicing #
Looking at the signals above, commonalities emerged rather quickly because the same values kept shifting to the right.
That’s basically the button press pattern that we’ve seen with, well, button presses!
+-------+----+----+----+----+------------+---------------+---------------+---------------+---------------+
| Byte | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
+-------+----+----+----+----+------------+---------------+---------------+---------------+---------------+
| Value | 01 | 03 | 05 | 00 | Is turning | Knob A action | Knob B action | Knob C action | Knob D action |
+-------+----+----+----+----+------------+---------------+---------------+---------------+---------------+
For each knob, we can get a combo of:
- Byte 4 set to 01 (turning), knob-specific bytes set to 01 (turn right) or FF (turn left).
- Byte 4 set to 00 (press), knob-specific bytes set to 01 (pressed) or 00 (released).
Very nice – if you followed along all this time, you now know how the binary data is set for every control on the Stream Deck Plus.
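To tie the knob data together, here is a small Python sketch that decodes these reports. The function name `parse_knob_report` is made up for illustration. Reading the per-knob bytes as signed values, so that FF becomes -1, is my assumption; the captures only ever show 01 and FF, so whether larger magnitudes can appear on fast turns is not something the data above confirms.

```python
import struct
from typing import Dict, Tuple


def parse_knob_report(report: bytes) -> Tuple[str, Dict[str, int]]:
    """Return ("turn", deltas) or ("press", states), keyed by knob A to D."""
    if report[:2] != b"\x01\x03":
        raise ValueError("not a knob report")
    is_turning = report[4] == 0x01
    # Assumption: each knob byte is a signed value (01 = right, FF = -1 = left).
    values = struct.unpack_from("<4b", report, 5)
    kind = "turn" if is_turning else "press"
    return kind, dict(zip("ABCD", values))


# Knob B turned left, then knob C pressed and released.
for line in ("01 03 05 00 01 00 ff 00 00",
             "01 03 05 00 00 00 00 01 00",
             "01 03 05 00 00 00 00 00 00"):
    print(parse_knob_report(bytes.fromhex(line)))
```

For press reports the values are 1 (pressed) or 0 (released); as with the buttons, all four knob states arrive in every report.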
Writing a wrapper #
Alright, now that we went through the slog that is binary data analysis, it’s time to have a nicer way to deal with all this mess. To do that, I updated the DeckSurf SDK to support the Stream Deck Plus.
With the latest release of the DeckSurf SDK on NuGet (0.0.4 at the time of this writing), you can use it to manage a Stream Deck Plus device!
Here is a fully-functioning sample in C# that shows how to handle events and set images on a Stream Deck Plus:
using DeckSurf.SDK.Core;
using DeckSurf.SDK.Models;
using DeckSurf.SDK.Util;
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

namespace DeckSurf.SDK.StartBoard
{
    class Program
    {
        static void Main(string[] args)
        {
            var exitSignal = new ManualResetEvent(false);
            var devices = DeviceManager.GetDeviceList();

            Console.WriteLine("The following Stream Deck devices are connected:");
            foreach (var connectedDevice in devices)
            {
                Console.WriteLine(connectedDevice.Name);
            }

            // Assumes the Stream Deck Plus is the first connected device.
            var device = ((List<ConnectedDevice>)devices)[0];
            device.StartListening();
            device.OnButtonPress += Device_OnButtonPress;

            // The path to a test image is passed as the first command-line argument.
            byte[] testImage = File.ReadAllBytes(args[0]);

            var image = ImageHelpers.ResizeImage(testImage, device.ScreenWidth, device.ScreenHeight, device.IsButtonImageFlipRequired);
            device.SetScreen(image, 250, device.ScreenWidth, device.ScreenHeight);

            var keyImage = ImageHelpers.ResizeImage(testImage, device.ButtonResolution, device.ButtonResolution, device.IsButtonImageFlipRequired);
            device.SetKey(1, keyImage);

            device.SetBrightness(29);

            Console.WriteLine("Done");

            // Keep the process alive so that device events keep arriving.
            exitSignal.WaitOne();
        }

        private static void Device_OnButtonPress(object source, ButtonPressEventArgs e)
        {
            Console.WriteLine($"Button with ID {e.Id} was pressed. It's identified as {e.ButtonKind}. Event is {e.EventKind}. If this is a touch screen, coordinates are {e.TapCoordinates.X} and {e.TapCoordinates.Y}. Is knob rotated: {e.IsKnobRotating}. Rotation direction: {e.KnobRotationDirection}.");
        }
    }
}
Now, of course – this sample makes the assumption that the Stream Deck Plus device is the first one connected (index zero), so you might want to tweak that if you have more than one Stream Deck plugged in. But nonetheless, this shows you how easy I try to make Stream Deck interactions with the DeckSurf SDK. It’s all still in early preview, so things might break with future releases until I get it to a stable version, but until then – feel free to experiment!
As always, the latest documentation is available on https://docs.deck.surf.