Under the Hood of AFD.sys Part 3: Sending TCP packets
A deep-dive into the IOCTL_AFD_SEND Fast-I/O path: snaring AfdFastIoDeviceControl hits in WinDbg, reverse-engineering the AFD_SEND_INFO / WSABUF chain, and blasting raw TCP payloads straight from user space on Windows 11—still no Winsock, just pure AFD.sys magic.
Introduction
By way of introduction, this post is the third in a series of articles in which we take a closer look at the AFD.sys driver. So far we have managed to create a socket and perform a three-way handshake using only I/O request packets sent to AFD.sys, bypassing Winsock and mswsock.dll. Now it is time to send and receive data.
As tradition dictates, here is our Winsock code for reference:
void createTCPv4() {
    const size_t PAYLOAD = 8;

    SOCKET s = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (s == INVALID_SOCKET) { std::cerr << "socket: " << WSAGetLastError() << '\n'; return; }

    sockaddr_in dst{};
    dst.sin_family = AF_INET;
    dst.sin_port = htons(80);
    InetPtonA(AF_INET, "192.168.1.1", &dst.sin_addr);

    if (connect(s, reinterpret_cast<sockaddr*>(&dst), sizeof(dst)) == SOCKET_ERROR) {
        std::cerr << "connect: " << WSAGetLastError() << '\n';
        closesocket(s); return;
    }

    std::string big(PAYLOAD, 'A');
    size_t sent = 0;
    while (sent < big.size()) {
        int n = send(s, big.data() + sent, static_cast<int>(big.size() - sent), 0);
        if (n == SOCKET_ERROR) {
            std::cerr << "send: " << WSAGetLastError() << '\n';
            break;
        }
        sent += n;
    }

    closesocket(s);
}
Given how easy it was to intercept the communication between our Winsock example program and AFD.sys, I thought send and recv would be equally easy, but I was wrong. Setting breakpoints on afd!AfdSend and afd!AfdReceive did nothing. The previously adopted method was not effective in this case.
Starting with send, I thought that maybe, just maybe, AfdSend is not the function that is actually called to send TCP data. I searched the available symbols for the phrase Send and got nearly 96 different hits…
How do drivers differentiate between requests?
Unlike a normal program, where the start function is main (a simplification), in Windows drivers that function is DriverEntry. This is where the driver fills in the DRIVER_OBJECT handed to it by the I/O manager and creates the device it exposes. Between the driver object and the device object you will find information such as:
- The name of the device that will be visible, e.g. \Device\Afd.
- The functions called when the driver is initialised and unloaded.
- The dispatch functions that are called when an IRP comes in.
For the driver to distinguish between specific control codes there is such a thing as a dispatch function. This is a function that decodes the IoControlCode and passes the data and control to the function responsible for handling that particular request. For example, below is the pseudo C code (Binary Ninja) of the AfdDispatchDeviceControl function:
uint64_t AfdDispatchDeviceControl(int64_t arg1, IRP* arg2) {
    // Tail.Overlay.CurrentStackLocation, i.e. the current IO_STACK_LOCATION
    void* Overlay = *(uint64_t*)((char*)arg2->Tail + 0x40);

    if (NetioNrtIsTrackerDevice()) {
        int32_t rax_6 = NetioNrtDispatch(arg1, arg2);
        arg2->IoStatus.Status = rax_6;
        IofCompleteRequest(arg2, 0);
        return (uint64_t)rax_6;
    }

    // Parameters.DeviceIoControl.IoControlCode
    int32_t r8 = *(uint32_t*)((char*)Overlay + 0x18);
    // Bits 2..11 of the control code hold the request number
    uint64_t rax_3 = (uint64_t)(r8 >> 2) & 0x3ff;

    // The code must match the table entry for its request number
    if (rax_3 < 0x4a && *(uint32_t*)((rax_3 << 2) + &AfdIoctlTable) == r8) {
        *(uint8_t*)((char*)Overlay + 1) = rax_3;   // stored as MinorFunction
        if ((&AfdIrpCallDispatch)[rax_3])
            return _guard_dispatch_icall();        // indirect call through the table
    }

    if ((*(int64_t*)((char*)g_rgFastWppLevelEnabledFlags + 0xe)) & 0x10)
        WPP_SF_D(0xb, &WPP_750cd5b025b73ac1a6ce4c47647b8469_Traceguids, r8);

    arg2->IoStatus.Status = 0xc0000010;            // STATUS_INVALID_DEVICE_REQUEST
    IofCompleteRequest(arg2, AfdPriorityBoost);
    return 0xc0000010;
}
There are a number of ways to perform such a dispatch. One is simply to write a series of if or switch/case statements and, based on the IoControlCode value, call the specific function responsible for performing the operation.
The second way (used in AFD.sys) is to create a call table (see AfdIrpCallDispatch). Instead of complex conditional expressions, the driver keeps an array of pointers to functions and, depending on the decoded request number, executes the corresponding call. In the snippet above this is the part that extracts the request number from the control code, validates it against AfdIoctlTable, and makes the indirect call through AfdIrpCallDispatch.
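The call-table idea can be sketched in a few lines of portable C++. This is a minimal user-mode model, not AFD.sys code: the handler functions, their return values, and the three-entry tables are made up for illustration. Only the index extraction `(code >> 2) & 0x3ff` and the validation of the code against a table of known values follow the pseudocode above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical handlers standing in for AfdBind, AfdConnect, ...
using Handler = int32_t (*)(uint32_t ioctl);
int32_t handleBind(uint32_t)    { return 100; }
int32_t handleConnect(uint32_t) { return 101; }

// 0xC0000010 is STATUS_INVALID_DEVICE_REQUEST.
constexpr int32_t kInvalidDeviceRequest = static_cast<int32_t>(0xC0000010u);

// Known control codes, indexed by request number (the role of AfdIoctlTable).
constexpr std::array<uint32_t, 3> kIoctlTable = {0x12003, 0x12007, 0x1200B};
// Matching handlers (the role of AfdIrpCallDispatch); nullptr = not handled here.
const std::array<Handler, 3> kCallTable = {handleBind, handleConnect, nullptr};

int32_t dispatch(uint32_t ioctl) {
    const std::size_t index = (ioctl >> 2) & 0x3ff;   // request number
    if (index < kIoctlTable.size() && kIoctlTable[index] == ioctl) {
        if (Handler handler = kCallTable[index]) {
            return handler(ioctl);                    // one indirect call, no if/switch chain
        }
    }
    return kInvalidDeviceRequest;                     // unknown or unhandled code
}
```

Calling `dispatch(0x12007)` reaches `handleConnect` with a single table lookup, while a code whose low bits do not match its table entry (for example `0x12004`) is rejected by the comparison before any handler runs.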
We can go further and see what the content of the AfdIrpCallDispatch table looks like:
1c0059410 void* AfdIrpCallDispatch = AfdBind
1c0059418 void* data_1c0059418 = AfdConnect
1c0059420 void* data_1c0059420 = AfdStartListen
1c0059428 void* data_1c0059428 = AfdWaitForListen
1c0059430 void* data_1c0059430 = AfdAccept
1c0059438 void* data_1c0059438 = AfdReceive
1c0059440 void* data_1c0059440 = AfdReceiveDatagram
1c0059448 void* data_1c0059448 = AfdSend
1c0059450 void* data_1c0059450 = AfdSendDatagram
1c0059458 void* data_1c0059458 = AfdPoll
1c0059460 void* data_1c0059460 = AfdDispatchImmediateIrp
1c0059468 void* data_1c0059468 = AfdGetAddress
1c0059470 void* data_1c0059470 = AfdDispatchImmediateIrp
1c0059478 void* data_1c0059478 = AfdDispatchImmediateIrp
...
We see there, for example, that operation 0 is AfdBind, operation 1 is AfdConnect, and operation 7 is AfdSend. These indexes are reflected in how we build the IoControlCode used to communicate with AFD.sys: the control code encodes which operation we want to perform:
...
#define AFD_BIND 0
#define AFD_CONNECT 1
...
#define FSCTL_AFD_BASE FILE_DEVICE_NETWORK
#define _AFD_CONTROL_CODE(Request, Method) (FSCTL_AFD_BASE << 12 | (Request) << 2 | (Method))
...
#define IOCTL_AFD_BIND _AFD_CONTROL_CODE(AFD_BIND, METHOD_NEITHER) // 0x12003
#define IOCTL_AFD_CONNECT _AFD_CONTROL_CODE(AFD_CONNECT, METHOD_NEITHER) // 0x12007
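To double-check the arithmetic, here is a small sketch that rebuilds the macro as constexpr functions and decodes a code back into its parts. The function names are mine; the constants are the standard values of FILE_DEVICE_NETWORK (0x12) and METHOD_NEITHER (3).

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kFileDeviceNetwork = 0x12; // FILE_DEVICE_NETWORK
constexpr uint32_t kMethodNeither     = 3;    // METHOD_NEITHER

// Same bit layout as _AFD_CONTROL_CODE(Request, Method).
constexpr uint32_t afdControlCode(uint32_t request, uint32_t method) {
    return (kFileDeviceNetwork << 12) | (request << 2) | method;
}

// Inverse operations: the request number is what the driver uses as table index.
constexpr uint32_t requestOf(uint32_t code) { return (code >> 2) & 0x3ff; }
constexpr uint32_t methodOf(uint32_t code)  { return code & 0x3; }

static_assert(afdControlCode(0, kMethodNeither) == 0x12003, "IOCTL_AFD_BIND");
static_assert(afdControlCode(1, kMethodNeither) == 0x12007, "IOCTL_AFD_CONNECT");
static_assert(afdControlCode(7, kMethodNeither) == 0x1201F, "request 7 is the AfdSend slot");
static_assert(requestOf(0x1201F) == 7, "decoding gives the table index back");
```

The static_asserts make the compiler confirm the values: request 7 with METHOD_NEITHER yields 0x1201F, and decoding 0x1201F gives index 7 back.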
Intercepting AfdDispatchDeviceControl
So instead of setting a breakpoint on afd!AfdSend, let’s try setting one on the afd!AfdDispatchDeviceControl function. What I want to do at this point is simply check which IoControlCode values are sent to the driver and see whether one of them is IOCTL_AFD_SEND (0x1201F). To do this we will use the JavaScript below, which reads the IoControlCode value on each hit:
"use strict";
function GetIoctl(irpAddr){
    // Get _IRP object
    const irp = host.createTypedObject(irpAddr, "nt", "_IRP");
    // Get _IO_STACK_LOCATION address
    const stackPtr = irp.Tail.Overlay.CurrentStackLocation;
    // Get _IO_STACK_LOCATION object
    const isl = stackPtr.dereference();
    const code = isl.Parameters.DeviceIoControl.IoControlCode;
    return code;
}
Now we need to load our script and set a breakpoint that prints the returned value and continues instead of stopping on each hit:
10: kd> .scriptrun D:\afddispatch.js;
JavaScript script successfully loaded from 'D:\afddispatch.js'
JavaScript script 'D:\afddispatch.js' has no main function to invoke!
14: kd> .foreach /pS 1 (ep { !process 0 0 afd_re.exe }) { bm /p ${ep} afd!AfdDispatchDeviceControl "dx Debugger.State.Scripts.afddispatch.Contents.GetIoctl(@rdx);gc;" }
17: fffff800`515b2db0 @!"afd!AfdDispatchDeviceControl"
14: kd> g
Debugger.State.Scripts.afddispatch.Contents.GetIoctl(@rdx) : 0x120bf // IOCTL_AFD_TRANSPORT_IOCTL
Debugger.State.Scripts.afddispatch.Contents.GetIoctl(@rdx) : 0x12003 // IOCTL_AFD_BIND
Debugger.State.Scripts.afddispatch.Contents.GetIoctl(@rdx) : 0x12007 // IOCTL_AFD_CONNECT
From the IoControlCode values obtained we can already see that, apart from the transport IOCTL, only AfdBind and AfdConnect pass through here, but where is our AfdSend? After many hours of reversing AFD.sys and mswsock.dll and searching the Internet for information, I came across something called Fast I/O.
What is Fast I/O?
I will use the book Windows® Internals Part 2 - 7th edition (especially Chapter 11) (Allievi et al.) as one source of information here. As we can read on page 375, Fast I/O is Windows’ mechanism for performing fast operations, bypassing all the anguish involved in generating I/O request packets. The I/O manager first asks the driver whether the request can be handled as Fast I/O; if so, it is served by a separate dispatch function and no IRP is created. Although the book discusses this in the context of file system drivers, as we will see it does not apply only to file handling. One of the requirements for Fast I/O is that the request must be synchronous, and the Winsock send function does, after all, wait until it receives the result. I don’t know whether that is the deciding factor, since mswsock.dll may handle it differently, but it is something to go on. Importantly, requests that are handled as Fast I/O do not reach the traditional dispatch function.
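The order of events can be modelled in plain C++. This is a simplified user-mode sketch, not kernel code: `Driver`, `fastDeviceControl`, `irpDeviceControl` and `ioManagerDeviceControl` are hypothetical names. It shows only the one idea that matters here: the fast routine is tried first, and the classic IRP-based dispatch is used only if the fast routine is missing or declines by returning false.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>

struct Driver {
    // Returns true if the request was fully handled on the fast path.
    std::function<bool(uint32_t ioctl)> fastDeviceControl;  // may be empty
    // Classic path: in the real kernel this receives an IRP.
    std::function<void(uint32_t ioctl)> irpDeviceControl;
};

// Returns which path handled the request: "fast" or "irp".
std::string ioManagerDeviceControl(const Driver& driver, uint32_t ioctl) {
    if (driver.fastDeviceControl && driver.fastDeviceControl(ioctl)) {
        return "fast";               // handled without building an IRP
    }
    driver.irpDeviceControl(ioctl);  // fall back to the dispatch routine
    return "irp";
}
```

A consequence of this ordering is that a breakpoint on the classic dispatch routine stays silent for every request the fast routine accepts, while requests it declines show up in both places.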
Looking for send
We have some suspicion that AFD.sys supports send as Fast I/O, so let’s look for confirmation in the code. Like the traditional dispatch, the fast dispatch is also set up in DriverEntry:
NTSTATUS DriverEntry(DRIVER_OBJECT* arg1) {
    ...
    // Default handler for all 0x1c major function codes
    rdi_3 = __memfill_u64(&arg1->MajorFunction, AfdDispatch, 0x1c);
    arg1->MajorFunction[0xe] = AfdDispatchDeviceControl;             // IRP_MJ_DEVICE_CONTROL
    arg1->MajorFunction[0xf] = AfdWskDispatchInternalDeviceControl;  // IRP_MJ_INTERNAL_DEVICE_CONTROL
    arg1->MajorFunction[0x17] = AfdEtwDispatch;                      // IRP_MJ_SYSTEM_CONTROL
    arg1->FastIoDispatch = &AfdFastIoDispatch;                       // Fast I/O entry points
    arg1->DriverUnload = AfdUnload;
    void* AfdDeviceObject_1 = AfdDeviceObject;
    ...
}
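And so, right next to AfdDispatchDeviceControl, we have AfdFastIoDispatch, and it is worth taking a closer look at it. AfdFastIoDispatch is not a function but a table: a FAST_IO_DISPATCH structure that starts with its own size (0xe0) followed by an array of function pointers: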
1c0065000 AfdFastIoDispatch:
1c0065000 e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
1c0065010 void* data_1c0065010 = AfdFastIoRead
1c0065018 void* data_1c0065018 = AfdFastIoWrite
1c0065020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
1c0065030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
1c0065040 void* data_1c0065040 = AfdSanFastUnlockAll
1c0065048 00 00 00 00 00 00 00 00 ........
1c0065050 void* data_1c0065050 = AfdFastIoDeviceControl
In this table we can see the entry AfdFastIoDeviceControl, which is a dispatch function, but for Fast I/O. Why not put a breakpoint there and collect the IoControlCode values? This time we won’t have to delve into the _IRP structure, because the operation code is passed as one of the arguments of the PFAST_IO_DEVICE_CONTROL call:
typedef
BOOLEAN
(*PFAST_IO_DEVICE_CONTROL) (
IN struct _FILE_OBJECT *FileObject,
IN BOOLEAN Wait,
IN PVOID InputBuffer OPTIONAL,
IN ULONG InputBufferLength,
OUT PVOID OutputBuffer OPTIONAL,
IN ULONG OutputBufferLength,
IN ULONG IoControlCode,
OUT PIO_STATUS_BLOCK IoStatus,
IN struct _DEVICE_OBJECT *DeviceObject
);
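When reading the arguments of such a call in a debugger, it helps to recall where the Microsoft x64 calling convention places them: the first four integer or pointer arguments travel in rcx, rdx, r8 and r9, and the rest live on the stack above the return address and the 32-byte shadow space. The helper below is a small sketch of that rule (the function name is made up for illustration); offsets are relative to rsp at the moment the function is entered, before its prologue runs.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Location of the n-th (1-based) integer/pointer argument at function entry
// under the Microsoft x64 calling convention.
std::string argumentLocation(unsigned n) {
    static const char* const kRegisters[] = {"rcx", "rdx", "r8", "r9"};
    if (n == 0) return "";                       // arguments are counted from 1
    if (n <= 4) return kRegisters[n - 1];
    // [rsp] holds the return address and each argument owns an 8-byte slot
    // above it (the first four slots are the shadow space), so slot n is 8*n.
    char text[32];
    std::snprintf(text, sizeof text, "[rsp+0x%x]", 8u * n);
    return text;
}
```

For the typedef above this gives rcx for FileObject, r8 for InputBuffer, and `[rsp+0x38]` for IoControlCode, which in WinDbg could be read at function entry with something like `dd @rsp+0x38 L1`.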
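So all we need to do is read the seventh argument of the call. Strictly speaking, the x64 calling convention passes only the first four arguments in registers and puts the seventh on the stack, but on this build the caller still has the control code in rdi when the function is entered, so reading @rdi is the easiest option. We do this by setting the following breakpoint: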
6: kd> .foreach /pS 1 (ep { !process 0 0 afd_re.exe }) { bm /p ${ep} afd!AfdFastIoDeviceControl ".printf \"IoControlCode=%p\\n\", @rdi;gc;" }
2: fffff800`515c4c20 @!"afd!AfdFastIoDeviceControl"
Couldn't resolve error at 'SessionId: afd!AfdFastIoDeviceControl ".printf \"IoControlCode=%p\\n\", @rdi;gc;" '
6: kd> g
IoControlCode=000000000001207b // request 30: IOCTL_AFD_GET_INFORMATION
IoControlCode=000000000001207b // request 30: IOCTL_AFD_GET_INFORMATION
IoControlCode=0000000000012047 // IOCTL_AFD_SET_CONTEXT
IoControlCode=00000000000120bf // IOCTL_AFD_TRANSPORT_IOCTL
IoControlCode=0000000000012047 // IOCTL_AFD_SET_CONTEXT
IoControlCode=0000000000012003 // IOCTL_AFD_BIND
IoControlCode=0000000000012047 // IOCTL_AFD_SET_CONTEXT
IoControlCode=0000000000012007 // IOCTL_AFD_CONNECT
IoControlCode=0000000000012047 // IOCTL_AFD_SET_CONTEXT
IoControlCode=000000000001201f // IOCTL_AFD_SEND
OK, there we have it! Our send is treated as Fast I/O. Let’s look at the AFD.sys code and find which function is called when the driver receives 0x1201f:
1c0034be0 int64_t AfdFastIoDeviceControl(struct _FILE_OBJECT* FileObject,
1c0034be0 BOOLEAN Wait, PVOID InputBuffer, ULONG InputBufferLength,
1c0034be0 PVOID OutputBuffer, ULONG OutputBufferLength, ULONG IoControlCode,
1c0034be0 PIO_STATUS_BLOCK IoStatus, struct _DEVICE_OBJECT* DeviceObject) {
...
1c0034c9b if (IoControlCode == 0x1201f)
1c0034c9b goto label_1c0034d7d;
...
1c0034d7d label_1c0034d7d:
1c0034d7d __builtin_memset(&s_2, 0, 0x14);
1c0034d8d int128_t s_3;
1c0034d8d __builtin_memset(&s_3, 0, 0x48);
...
1c00350f7 rbx = (uint64_t)AfdFastConnectionSend(FsContext,
1c00350f7 &s_2, rax_30, IoStatus);
1c00350fa goto label_1c003646b;
...
1c0034be0 }
The code of the entire AfdFastIoDeviceControl is quite extensive, so I have only shown the parts related to our 0x1201f. We can see there that if IoControlCode == 0x1201f, execution jumps to 0x1c0034d7d. This is where the initialisation of all the necessary memory areas, variables, etc. starts. A little further on we have a call to the AfdFastConnectionSend function. This could be the function responsible for sending the data. Of course, to confirm this we should now set a breakpoint there:
6: kd> .foreach /pS 1 (ep { !process 0 0 afd_re.exe }) { bm /p ${ep} afd!AfdFastConnectionSend }
4: fffff800`515aac90 @!"afd!AfdFastConnectionSend"
Couldn't resolve error at 'SessionId: afd!AfdFastConnectionSend '
6: kd> g
Breakpoint 4 hit
afd!AfdFastConnectionSend:
fffff800`515aac90 4053 push rbx
Hit! We found the function responsible for sending data via TCP! Now it is time to analyse the input buffer. Here, as usual, our invaluable sources (killvxk), (unknowncheats.me ICoded post), (ReactOS Project), (DynamoRIO / Dr. Memory), (DeDf), (diversenok) will help us.
Analyzing the data passed to AfdFastConnectionSend
From the signature of PFAST_IO_DEVICE_CONTROL we know that InputBuffer and InputBufferLength are passed to the dispatch function as the third and fourth arguments, respectively. We cannot be sure that they are passed to AfdFastConnectionSend at the same positions, or even as arguments at all, but we can assume that they are still held in some register. So what we will be looking for among the register values is an address from user space (canonical lower half) and some (relatively) small buffer length.
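The two filters used in this search can be written down as a rough sketch. Both functions are heuristics of my own naming, and they assume classic 48-bit virtual addressing, where user space occupies the canonical lower half; the 0x1000 cut-off for a "small" length is an arbitrary choice.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 48-bit virtual addresses, user space in
// 0x0000000000010000..0x00007FFFFFFFFFFF (the first 64 KiB is never mapped).
constexpr bool looksLikeUserAddress(uint64_t value) {
    return value >= 0x10000 && value <= 0x00007FFFFFFFFFFFull;
}

// Arbitrary threshold: a structure describing a send is unlikely to be large.
constexpr bool looksLikeSmallLength(uint64_t value) {
    return value != 0 && value <= 0x1000;
}

static_assert(looksLikeUserAddress(0x000000c9532ff128ull), "lower half: user space");
static_assert(!looksLikeUserAddress(0xffffbd8bfa8dda80ull), "upper half: kernel space");
```

Kernel pointers start with ffff and fail the first test immediately, which is what makes user-space values easy to pick out in a register dump.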
12: kd> r
rax=0000000000000002 rbx=000000c9532ff128 rcx=ffffbd8bfa8dda80
rdx=ffffce0958bcef70 rsi=0000000000000001 rdi=0000000000000000
rip=fffff800515aac90 rsp=ffffce0958bcee88 rbp=ffffce0958bcf4e0
r8=0000000000000008 r9=ffffce0958bcf1c8 r10=fffff800bbc17c70
r11=ffff88f9d7c00000 r12=ffffbd8bfa8dda80 r13=0000000000000000
r14=0000000000000018 r15=000000000000afd1
iopl=0 nv up ei pl zr na po nc
cs=0010 ss=0018 ds=002b es=002b fs=0053 gs=002b efl=00040246
afd!AfdFastConnectionSend:
fffff800`515aac90 4053 push rbx
Here we see that the rbx register holds something that resembles a user-space address, while r14 looks like the size of our buffer (r8, incidentally, holds 8, the length of our payload). So let’s read the memory rbx points to:
12: kd> db 000000c9532ff128 L18
000000c9`532ff128 08 f2 2f 53 c9 00 00 00-01 00 00 00 00 00 00 00 ../S............
000000c9`532ff138 00 00 00 00 00 00 00 00 ........
Again we have something that resembles an address and some size. Let’s try to read 0x000000c9532ff208 (note that the dump is little-endian):
12: kd> db 000000c9532ff208 L18
000000c9`532ff208 08 00 00 00 00 00 00 00-e0 f2 2f 53 c9 00 00 00 ........../S....
000000c9`532ff218 00 00 00 00 00 00 00 00 ........
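The byte swapping done by eye on the two dumps above can be written down as a tiny helper. This is only an illustration of the little-endian rule (least significant byte first); the function name is mine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Interpret up to 8 bytes as a little-endian unsigned integer:
// byte 0 is the least significant, byte 1 the next one, and so on.
constexpr uint64_t readLittleEndian(const uint8_t* bytes, std::size_t count) {
    uint64_t value = 0;
    for (std::size_t i = 0; i < count && i < 8; ++i) {
        value |= static_cast<uint64_t>(bytes[i]) << (8 * i);
    }
    return value;
}
```

Feeding it the first eight bytes of the first dump (08 f2 2f 53 c9 00 00 00) yields 0x000000c9532ff208, the address read in the second dump, and the four bytes that follow (01 00 00 00) decode to 1.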
Once again we see a size (0x08), which corresponds to the AAAAAAAA payload we sent. Let’s try another dereference and see what is at 0x000000c9532ff2e0:
12: kd> db 0x000000c9532ff2e0 L8
000000c9`532ff2e0 41 41 41 41 41 41 41 41 AAAAAAAA
We’ve got it! There is our payload! But how are these buffers laid out? The answer can be found in (diversenok):
// ref: https://learn.microsoft.com/en-us/windows/win32/api/ws2def/ns-ws2def-wsabuf
typedef struct _WSABUF {
ULONG len;
CHAR *buf;
} WSABUF, *LPWSABUF;
typedef struct _AFD_SEND_INFO {
_Field_size_(BufferCount) LPWSABUF BufferArray;
ULONG BufferCount;
ULONG AfdFlags;
ULONG TdiFlags; // TDI_RECEIVE_*
} AFD_SEND_INFO, *PAFD_SEND_INFO;
Breaking this down step by step, we first have an _AFD_SEND_INFO structure containing a pointer to an array of buffers and the number of these buffers. Each buffer, in turn, holds its length and a pointer to the data. A fairly good analogy is the standard use of argv in the main function. There, too, we are dealing with an array of pointers to the buffers holding the arguments passed to the program.
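To make the chain concrete, here is a portable sketch that mirrors the two structures with fixed-width types and walks them the way a consumer would: from the info structure to the array, and from each array entry to its bytes. The type and function names are stand-ins of mine, not Windows definitions, and `gather` is only a model of the traversal, not what AFD.sys does internally.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-in for WSABUF: a 32-bit length followed by a data pointer.
struct WsaBuf {
    uint32_t    len;   // ULONG len
    const char* buf;   // CHAR* buf (on 64-bit builds, 4 bytes of padding precede it)
};

// Stand-in for AFD_SEND_INFO.
struct AfdSendInfo {
    const WsaBuf* bufferArray;  // LPWSABUF BufferArray
    uint32_t      bufferCount;  // ULONG BufferCount
    uint32_t      afdFlags;     // ULONG AfdFlags
    uint32_t      tdiFlags;     // ULONG TdiFlags
};

// On a 64-bit build each array element is 16 bytes, matching the dump above.
static_assert(sizeof(void*) != 8 || sizeof(WsaBuf) == 0x10, "WSABUF layout");

// Follow info -> array -> buffer and concatenate all payload bytes.
std::string gather(const AfdSendInfo& info) {
    std::string out;
    for (uint32_t i = 0; i < info.bufferCount; ++i) {
        out.append(info.bufferArray[i].buf, info.bufferArray[i].len);
    }
    return out;
}
```

With two buffers holding "AAAA" and "BBBB", `gather` returns "AAAABBBB", which is the same two-level indirection as iterating over argv.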
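A keen eye can spot a certain inconsistency: the InputBuffer from Winsock is 0x18 bytes, while a version of this structure with every field widened to 8 bytes would be 0x20 bytes. With the declaration above the numbers do agree on x64: an 8-byte pointer followed by three 4-byte ULONG fields is 20 bytes, which alignment pads to 24 (0x18), and the dump shows BufferCount = 1 with AfdFlags and TdiFlags both zero. I have experimentally verified that, in principle, TdiFlags can be left unset. Presumably, if we had specified a TransportDevice (e.g. \Device\Tcp) when creating the socket, we would have had to provide it. This leaves the conundrum of what values AfdFlags can take.
According to what we have in (diversenok) this could be: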
#define AFD_NO_FAST_IO 0x0001
#define AFD_OVERLAPPED 0x0002
AFD_NO_FAST_IO seems to be the most interesting from the perspective of our work so far. Indeed, when we set AfdFlags to 0x0001, AFD.sys goes through the classic dispatch and the breakpoint on AfdSend is triggered:
12: kd> .foreach /pS 1 (ep { !process 0 0 afd-networking.exe }) { bm /p ${ep} afd!AfdSend }
6: fffff800`515a18c0 @!"afd!AfdSend"
Couldn't resolve error at 'SessionId: afd!AfdSend '
12: kd> g
Breakpoint 6 hit
afd!AfdSend:
fffff800`515a18c0 4c8bdc mov r11,rsp
So, that’s cool: we can control how this particular request is dispatched. It’s worth saving this for later research into where and how that decision is made. What about TCPv6? Generally it looks the same, and there are no big differences in sending data: the socket is created, the connection is established, and the interface for sending is the same.
The question now is how many buffers we can send and how big they can be. Does the total number of bytes count? Let’s find out!
Playing with buffers
So let’s start by trying to send 100 megabytes using Winsock and see whether it gets broken up somehow, to get a general idea of what we’re dealing with. As before, I set my breakpoint on afd!AfdFastIoDeviceControl and print the IoControlCode to see whether, for example, Winsock splits this data into multiple requests:
14: kd> .foreach /pS 1 (ep { !process 0 0 afd_re.exe }) { bm /p ${ep} afd!AfdFastIoDeviceControl ".printf \"IoControlCode=%p\\n\", @rdi;gc;" }
10: fffff800`515c4c20 @!"afd!AfdFastIoDeviceControl"
Couldn't resolve error at 'SessionId: afd!AfdFastIoDeviceControl ".printf \"IoControlCode=%p\\n\", @rdi;gc;" '
14: kd> g
IoControlCode=000000000001207b
IoControlCode=000000000001207b
IoControlCode=0000000000012047
IoControlCode=00000000000120bf
IoControlCode=0000000000012047
IoControlCode=0000000000012003
IoControlCode=0000000000012047
IoControlCode=0000000000012007
IoControlCode=0000000000012047
IoControlCode=000000000001201f
Despite our loop to make sure all the data was sent, Winsock managed to hand over the whole payload in a single call:
while (sent < big.size()) {
    int n = send(s, big.data() + sent, static_cast<int>(big.size() - sent), 0);
    std::cerr << "sent portion: " << n << '\n';
    if (n == SOCKET_ERROR) {
        std::cerr << "send: " << WSAGetLastError() << '\n';
        break;
    }
    sent += n;
}
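The loop is still the right shape even though one call was enough here, because send is allowed to report fewer bytes than requested and the caller then has to resume from the first unsent byte. The sketch below isolates that logic from Winsock by taking the sending function as a parameter; `sendAll` and the `Sender` callback type are names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// A sender takes (data, length) and returns the number of bytes it accepted,
// or a value <= 0 on failure.
using Sender = std::function<int(const char*, int)>;

// Returns how many bytes were handed over before completion or the first failure.
std::size_t sendAll(const std::string& data, const Sender& sendSome) {
    std::size_t sent = 0;
    while (sent < data.size()) {
        const int n = sendSome(data.data() + sent,
                               static_cast<int>(data.size() - sent));
        if (n <= 0) break;                    // error or no progress: stop, do not spin
        sent += static_cast<std::size_t>(n);  // resume after the accepted bytes
    }
    return sent;
}
```

With a sender that accepts at most three bytes per call, an eight-byte payload takes three calls (3 + 3 + 2); with a sender that accepts everything, the loop body runs exactly once, which is what the trace above shows.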
And what does the buffer that is passed to AfdFastConnectionSend look like?
8: kd> .foreach /pS 1 (ep { !process 0 0 afd_re.exe }) { bm /p ${ep} afd!AfdFastConnectionSend }
12: fffff800`515aac90 @!"afd!AfdFastConnectionSend"
Couldn't resolve error at 'SessionId: afd!AfdFastConnectionSend '
8: kd> g
Breakpoint 12 hit
afd!AfdFastConnectionSend:
fffff800`515aac90 4053 push rbx
8: kd> r
rax=0000000000000002 rbx=000000ce1a6ff328 rcx=ffffbd8bfa8dac00
rdx=ffffce0956db6f70 rsi=0000000000000001 rdi=0000000000000000
rip=fffff800515aac90 rsp=ffffce0956db6e88 rbp=ffffce0956db74e0
r8=0000000006400000 r9=ffffce0956db71c8 r10=fffff800bbc17c70
r11=ffff88f9d7c00000 r12=ffffbd8bfa8dac00 r13=0000000000000000
r14=0000000000000018 r15=000000000000afd1
iopl=0 nv up ei pl zr na po nc
cs=0010 ss=0018 ds=002b es=002b fs=0053 gs=002b efl=00040246
afd!AfdFastConnectionSend:
fffff800`515aac90 4053 push rbx
8: kd> dq 000000ce1a6ff328 L3
000000ce`1a6ff328 000000ce`1a6ff408 00000000`00000001
000000ce`1a6ff338 00000000`00000000
8: kd> dq 000000ce`1a6ff408 L2
000000ce`1a6ff408 00000000`06400000 00000242`e3249080
Everything flies in one big buffer, and the same goes for 1 gigabyte. So I am curious how AFD.sys actually interprets these buffers. Maybe n buffers will be sent as n packets? This time we verify it without using Winsock:
NTSTATUS sendAfdPacketTCP(HANDLE socket) {
    const int BUF_NUM = 16;
    const int BUF_SIZE = 16;

    // 16 separate buffers of 16 bytes each, all filled with 'B' (0x42).
    // They live on the stack, which is fine because we wait for completion below.
    uint8_t data[BUF_NUM][BUF_SIZE];
    AFD_BUFF payload[BUF_NUM] = {};
    for (int i = 0; i < BUF_NUM; i++) {
        memset(data[i], 0x42, BUF_SIZE);
        payload[i].buf = data[i];
        payload[i].len = BUF_SIZE;
    }

    AFD_SEND_PACKET afdSendPacket = {};   // zero-initialise every field first
    afdSendPacket.buffersArray = payload;
    afdSendPacket.buffersCount = BUF_NUM;
    afdSendPacket.afdFlags = AFD_NO_FAST_IO;

    IO_STATUS_BLOCK ioStatus;
    NTSTATUS status = NtDeviceIoControlFile(socket, NULL, NULL, NULL, &ioStatus, IOCTL_AFD_SEND,
                                            &afdSendPacket, sizeof(AFD_SEND_PACKET),
                                            NULL, 0);
    if (status == STATUS_PENDING) {
        WaitForSingleObject(socket, INFINITE);
        status = ioStatus.Status;
    }
    return status;
}
As it turns out, this changes nothing: the data goes out as a single packet. TCP treats everything we hand over as one byte stream, so the stack would only split it once the total exceeded what fits in one segment (and a single IPv4 packet can never exceed 0xFFFF bytes), regardless of how many buffers the data came from. I checked experimentally that AFD.sys will also accept 1024*1024 buffers of 1024 bytes each. The practical limit, of course, remains our hardware.
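A back-of-the-envelope sketch of why the buffer count does not matter: TCP sees one byte stream, so the number of segments depends only on the total length and the maximum segment size. The MSS of 1460 bytes below is an assumption (a typical value on Ethernet without TCP options), the function names are mine, and offloads such as LSO can make a capture taken on the sending host show much larger frames.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

constexpr uint64_t kAssumedMss = 1460;  // assumption: typical Ethernet MSS

// Total payload described by a list of buffer lengths.
uint64_t totalBytes(const std::vector<uint32_t>& bufferLengths) {
    return std::accumulate(bufferLengths.begin(), bufferLengths.end(), uint64_t{0});
}

// Segments needed for a byte stream of the given length (rounded up).
constexpr uint64_t segmentsNeeded(uint64_t total, uint64_t mss = kAssumedMss) {
    return (total + mss - 1) / mss;
}
```

Sixteen buffers of 16 bytes and one buffer of 256 bytes both add up to 256 bytes and therefore to a single segment under this assumption; only the total changes the result.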
Next steps
Although I originally intended to discuss both send and receive in this part, this article is already long enough, so receiving TCP data will be covered in the next part.
Final code
Below you can find the full code for the current state of our knowledge:
#include <stdint.h>
#include <Windows.h>
#include <winternl.h>
#include <iostream>
#include "afd_defs.h"
#include "afd_ioctl.h"
#pragma comment(lib, "ntdll.lib")
NTSTATUS createAfdSocket(PHANDLE socket) {...}
NTSTATUS bindAfdSocket(HANDLE socket) {...}
NTSTATUS connectAfdSocket(HANDLE socket) {...}
// AFDFLAGS
#define AFD_NO_FAST_IO 0x0001
#define AFD_OVERLAPPED 0x0002
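// Field widths follow WSABUF and AFD_SEND_INFO: ULONG is 32 bits, so on x64
// the structures are 0x10 and 0x18 bytes. Declaring the fields as uint64_t
// would move afdFlags and tdiFlags away from the offsets the driver reads.
struct AFD_BUFF {
    uint32_t len;
    uint8_t* buf;
};
struct AFD_SEND_PACKET {
    AFD_BUFF* buffersArray;
    uint32_t buffersCount;
    uint32_t afdFlags;
    uint32_t tdiFlags; // optional
};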
NTSTATUS sendAfdPacketTCP(HANDLE socket) {
    const int BUF_NUM = 1;
    const int BUF_SIZE = 16;

    // Payload buffers live on the stack; we wait for completion before returning.
    uint8_t data[BUF_NUM][BUF_SIZE];
    AFD_BUFF payload[BUF_NUM] = {};
    for (int i = 0; i < BUF_NUM; i++) {
        memset(data[i], 0x42, BUF_SIZE);
        payload[i].buf = data[i];
        payload[i].len = BUF_SIZE;
    }

    AFD_SEND_PACKET afdSendPacket = {};   // zero-initialise every field first
    afdSendPacket.buffersArray = payload;
    afdSendPacket.buffersCount = BUF_NUM;
    afdSendPacket.afdFlags = 0;

    IO_STATUS_BLOCK ioStatus;
    NTSTATUS status = NtDeviceIoControlFile(socket, NULL, NULL, NULL, &ioStatus, IOCTL_AFD_SEND,
                                            &afdSendPacket, sizeof(AFD_SEND_PACKET),
                                            NULL, 0);
    if (status == STATUS_PENDING) {
        WaitForSingleObject(socket, INFINITE);
        status = ioStatus.Status;
    }
    return status;
}
int main() {
    HANDLE socket;
    NTSTATUS status = createAfdSocket(&socket);
    if (!NT_SUCCESS(status)) {
        std::cout << "[-] Could not create socket: " << std::hex << status << std::endl;
        return 1;
    }
    std::cout << "[+] Socket created!" << std::endl;

    status = bindAfdSocket(socket);
    if (!NT_SUCCESS(status)) {
        std::cout << "[-] Could not bind: " << std::hex << status << std::endl;
        return 1;
    }
    std::cout << "[+] Socket bound!" << std::endl;

    status = connectAfdSocket(socket);
    if (!NT_SUCCESS(status)) {
        std::cout << "[-] Could not connect: " << std::hex << status << std::endl;
        return 1;
    }
    std::cout << "[+] Connected!" << std::endl;

    status = sendAfdPacketTCP(socket);
    if (!NT_SUCCESS(status)) {
        std::cout << "[-] Could not send TCP packet: " << std::hex << status << std::endl;
        return 1;
    }
    std::cout << "[+] Sent!" << std::endl;
    return 0;
}
References
- Vittitoe, Steven. “Reverse Engineering Windows AFD.sys: Uncovering the Intricacies of the Ancillary Function Driver.” Proceedings of REcon 2015, 2015, https://doi.org/10.5446/32819.
- killvxk. CVE-2024-38193 Nephster PoC. 2024, https://github.com/killvxk/CVE-2024-38193-Nephster/blob/main/Poc/poc.h.
- unknowncheats.me ICoded post. Native TCP Client Socket. n.d., https://www.unknowncheats.me/forum/c-and-c-/500413-native-tcp-client-socket.html.
- ReactOS Project. Afd.h. n.d., https://github.com/reactos/reactos/blob/master/drivers/network/afd/include/afd.h.
- DynamoRIO / Dr. Memory. afd_shared.h. n.d., https://github.com/DynamoRIO/drmemory/blob/master/wininc/afd_shared.h.
- Dr. Memory - GH issue#376. Issue #376: AFD Support Improvements. n.d., https://github.com/DynamoRIO/drmemory/issues/376.
- Microsoft. NtCreateFile Function (Winternl.h). n.d., https://learn.microsoft.com/en-us/windows/win32/api/winternl/nf-winternl-ntcreatefile.
- ---. x64 Calling Convention. n.d., https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170.
- ---. WSASocketA Function (winsock2.h). n.d., https://learn.microsoft.com/pl-pl/windows/win32/api/winsock2/nf-winsock2-wsasocketa.
- DeDf. AFD Repository. n.d., https://github.com/DeDf/afd/tree/master.
- Allievi, Andrea, et al. Windows® Internals Part 2 - 7th Edition. 7th ed., Microsoft Press (Pearson Education), 2022, https://learn.microsoft.com/sysinternals/resources/windows-internals.
- diversenok. ntafd.h – Ancillary Function Driver Definitions. commit 2dda0dd, Hunt & Hackett, April 2025, https://github.com/winsiderss/systeminformer/blob/master/phnt/include/ntafd.h.