• Dataset for RSSI Measurements of Beacon Frames From Wi-Fi Radio Waves

Description:

The data collection phase involves collecting beacon frame characteristics and RSSI values from Wi-Fi access points (APs) using two Raspberry Pi devices. The purpose of this phase is to gather enough data to train the ML module of the proposed system to accurately determine the location of the user's devices based on these characteristics and values. To collect the data, we defined a threshold distance of 7 feet. This is the maximum distance between the user's devices that we consider acceptable for the purposes of this experiment. We then collected two datasets: one while the two Raspberry Pis were within 7 feet of each other, and another while the distance between them was over 7 feet. In the first dataset collection stage, we followed these steps:

  1. Began collecting data by placing the two Raspberry Pis 7 feet from each other.

  2. Moved the two Raspberry Pis closer to and farther from each other while keeping the distance within the predefined threshold.

  3. Repeated the data collection process at different locations to capture the variation in beacon frame characteristics and RSSI values that may exist in different environments.

In the second dataset collection stage, we followed these steps:

  1. Began collecting data by placing the two Raspberry Pis 7.5 feet from each other. This helped to determine the "gray area" between the acceptable threshold distance and the distance at which access should be denied.

  2. Moved the two Raspberry Pis closer to and farther from each other while never letting the distance between them drop below 7.5 feet.

  3. Repeated the data collection process at different locations to capture the variation in beacon frame characteristics and RSSI values that may exist in different environments.
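The distance rule implied by the two collection stages can be sketched in a few lines of Python. This is only an illustration of the labeling logic: the function name and constants are hypothetical, and the published dataset has no distance column, so the rule below is a reading of the procedure, not code used in the experiment.

```python
from typing import Optional

# Values taken from the description above; the names are hypothetical.
AUTHENTIC_MAX_FT = 7.0      # threshold distance for the first stage
UNAUTHORIZED_MIN_FT = 7.5   # minimum distance used in the second stage


def label_for_distance(distance_ft: float) -> Optional[int]:
    """Return 1 ("authentic"), 0 ("unauthorized"), or None for the gray area."""
    if distance_ft < 0:
        raise ValueError("distance must be non-negative")
    if distance_ft <= AUTHENTIC_MAX_FT:
        return 1
    if distance_ft >= UNAUTHORIZED_MIN_FT:
        return 0
    # Between 7 and 7.5 feet no samples were collected in either stage.
    return None
```

Returning None for distances strictly between 7 and 7.5 feet makes the gray area explicit instead of forcing it into one of the two classes.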

We collected a total of 4,825 samples from two Raspberry Pis (RPi 1 and RPi 2) measuring the SSID and RSSI values of 10 different Wi-Fi APs at different locations and times. The Raspberry Pis were positioned 7 feet or less apart in the "authentic" dataset and 7.5 feet or more apart in the "unauthorized" dataset. Each dataset includes six columns: "RPi," "SSID," "Frequency (Hz)," "RSSI (dBm)," "Location," and "Label." The "RPi" column indicates which Raspberry Pi collected the data, the "SSID" column lists the name of the Wi-Fi AP, the "Frequency (Hz)" column specifies the frequency of the Wi-Fi AP in Hz, the "RSSI (dBm)" column shows the RSSI value in dBm, the "Location" column specifies where the data was collected, and the "Label" column is a categorical column whose value is 1 for every "authentic" row and 0 for every "unauthorized" row. The resulting dataset is approximately balanced, with 2,442 samples in the "authentic" dataset and 2,383 samples in the "unauthorized" dataset.
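As a minimal sketch of how the six-column layout might be read and checked, the Python snippet below parses rows and counts the labels. It assumes the data is distributed as CSV with the column names listed above as the header row, which the description does not confirm. The sample rows, SSIDs, and locations are made up for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import Counter

# Column names as given in the description; CSV as the file format is an assumption.
EXPECTED_COLUMNS = ["RPi", "SSID", "Frequency (Hz)", "RSSI (dBm)", "Location", "Label"]

# Made-up rows for illustration only (not real measurements).
SAMPLE_CSV = """RPi,SSID,Frequency (Hz),RSSI (dBm),Location,Label
RPi 1,ExampleAP,2412000000,-48,Lab,1
RPi 2,ExampleAP,2412000000,-51,Lab,1
RPi 1,ExampleAP,2412000000,-67,Hallway,0
"""


def load_rows(handle):
    """Parse rows from an open CSV handle and convert the numeric columns."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    rows = []
    for raw in reader:
        row = dict(raw)
        row["Frequency (Hz)"] = float(raw["Frequency (Hz)"])
        row["RSSI (dBm)"] = float(raw["RSSI (dBm)"])
        row["Label"] = int(raw["Label"])  # 1 = authentic, 0 = unauthorized
        rows.append(row)
    return rows


def label_counts(rows):
    """Count how many rows carry each label value."""
    return Counter(row["Label"] for row in rows)


rows = load_rows(io.StringIO(SAMPLE_CSV))
counts = label_counts(rows)
```

Run against the full dataset, `label_counts` would be expected to report 2,442 rows with label 1 and 2,383 rows with label 0, matching the totals stated above.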


The dataset is available on IEEE DataPort.