APC Suspension: Suspending the Unsuspendable
Weaponizing APCs to reliably suspend all threads.
Suspension Techniques
The ideas by diversenok show us that reliably suspending threads or processes on Windows is inherently hard, because the operation is not atomic at the kernel level. He found a few ways around this, most notably state changes, but those require suspending the entire process. Today I found a technique that lets us suspend every thread individually, and reliably.
APCs Crash Course
Asynchronous Procedure Calls (APCs) are perhaps the most interesting thing in the entire Windows NT kernel. They come in two modes: special and normal. A normal APC is only delivered once the thread explicitly waits for APCs, via NtDelayExecution or a similar function (an alertable wait). A special APC, on the other hand, blasts through the thread’s context without permission.
APCs can be queued from both kernel mode and user mode; in this case we will queue them from user mode.
The moment a user-mode thread returns from the kernel, its queued special APCs begin to execute. If the thread is already in user mode, it gets “blasted” (I started to like this term).
So how do we do it?
In the SuspendMe tool, the author simply runs a loop that calls ResumeThread on each of the threads. While it might seem ineffective, as he notes, it’s actually highly effective against anyone who tries to suspend the threads of a process.
It is effective because it scales with the number of active threads: they continuously loop and resume one another. Approaches like NtSuspendProcess or manually enumerating threads fail here. For instance, in NtSuspendProcess, instead of freezing the process instantly, the kernel simply iterates through the existing threads sequentially using PsGetNextProcessThread and PsSuspendThread. This non-atomic execution in the kernel creates a race condition: as soon as the kernel has suspended thread N, the still-running thread N+1 resumes it. This is the killer that makes enumerated suspension hard against an anti-suspension loop.
We will use APCs to blast through whatever the threads are currently doing and redirect them to a Sleep call or, even better, an int3 for our debugger. As you might have read in my “Debugger From Scratch” series, this will come in handy in upcoming blog posts. We can force the software to stop or throw an exception by pointing it at a guard page or at an int3 instruction.
The real deal is that, since the threads aren’t “suspended”, their ResumeThread calls achieve nothing. We redirect their control flow and put them to sleep ourselves, so they are “sleeping”, not “suspended”.
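To make the difference concrete, here is a toy, single-threaded model of one suspension pass. It is not how the kernel works; it is an idealized worst case in which any thread still running its resume loop always wins the race. The function name and the `parked` flag are mine, purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_THREADS 64

/* Toy model: returns how many of n target threads are still suspended
 * after one sequential suspend pass. If `parked` is true, the targets
 * are stuck in Sleep and cannot run their resume loop. */
static size_t simulate_suspend_pass(size_t n, bool parked)
{
    bool suspended[MAX_THREADS] = { false };
    size_t still_suspended = 0;

    assert(n <= MAX_THREADS);

    for (size_t i = 0; i < n; i++) {
        suspended[i] = true;            /* the suspender freezes thread i */

        if (parked)
            continue;                   /* targets sit in Sleep, no resume loop runs */

        /* Is any target still running its resume loop? */
        bool any_running = false;
        for (size_t r = 0; r < n; r++)
            if (!suspended[r])
                any_running = true;

        /* If so, it resumes every sibling before the suspender moves on. */
        if (any_running)
            for (size_t t = 0; t < n; t++)
                suspended[t] = false;
    }

    for (size_t t = 0; t < n; t++)
        if (suspended[t])
            still_suspended++;
    return still_suspended;
}
```

In this model, simulate_suspend_pass(4, false) leaves no thread suspended, while simulate_suspend_pass(4, true) leaves all four suspended, which is exactly the situation we want to create with APCs.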
We will need the function NtQueueApcThreadEx2 to be able to queue special APCs.
// taken from ntdoc
NTSTATUS
NTAPI
NtQueueApcThreadEx2(
    _In_ HANDLE ThreadHandle,
    _In_opt_ HANDLE ReserveHandle, // NtAllocateReserveObject
    _In_ ULONG ApcFlags, // QUEUE_USER_APC_FLAGS
    _In_ PPS_APC_ROUTINE ApcRoutine, // RtlDispatchAPC
    _In_opt_ PVOID ApcArgument1,
    _In_opt_ PVOID ApcArgument2,
    _In_opt_ PVOID ApcArgument3
    );
Now we need to inject shellcode that does exactly what we want. To get memory for it inside the target process, we will use the good old VirtualAllocEx.
PVOID addr = VirtualAllocEx(hProcess, NULL, 1024, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
UINT8 shell[] = {
    0x48, 0x83, 0xEC, 0x28,                         // sub rsp, 40      ; allocate shadow space
    0x48, 0xC7, 0xC1, 0x50, 0xC3, 0x00, 0x00,       // mov rcx, 50000   ; wait for 50000 ms
    0x48, 0xB8,                                     // mov rax, <imm64> ; set rax to point to the Sleep function
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, // addr (patched below)
    0xFF, 0xD0,                                     // call rax         ; execute!
    0x48, 0x83, 0xC4, 0x28,                         // add rsp, 40      ; only reached if Sleep returns
    0xC3                                            // ret
};
// kernel32.dll is normally mapped at the same base in every process during a boot session,
// so the local address of Sleep is also valid in the target.
UINT64 sleep = (UINT64)GetProcAddress(GetModuleHandleA("kernel32.dll"), "Sleep");
memcpy(&shell[13], &sleep, sizeof(sleep)); // offset 13 = start of the imm64 operand
SIZE_T bytesWritten = 0; // WriteProcessMemory expects a SIZE_T*, not a DWORD*
WriteProcessMemory(hProcess, addr, shell, sizeof(shell), &bytesWritten);
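The magic numbers 13 and 28 fall straight out of the instruction lengths, so they can be derived instead of hard-coded. Here is a small, portable sketch of that idea; the helper build_stub and the constant names are hypothetical, and it only builds the buffer locally (writing it into the target is still done with WriteProcessMemory as above).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same bytes as the stub in the post, with a zeroed placeholder for the target address. */
static const uint8_t kStubTemplate[] = {
    0x48, 0x83, 0xEC, 0x28,                         /* sub rsp, 40      */
    0x48, 0xC7, 0xC1, 0x50, 0xC3, 0x00, 0x00,       /* mov rcx, 50000   */
    0x48, 0xB8,                                     /* mov rax, <imm64> */
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, /* imm64 placeholder */
    0xFF, 0xD0,                                     /* call rax         */
    0x48, 0x83, 0xC4, 0x28,                         /* add rsp, 40      */
    0xC3                                            /* ret              */
};

enum {
    /* sub rsp (4 bytes) + mov rcx (7 bytes) + the 48 B8 opcode bytes (2 bytes) */
    STUB_IMM64_OFFSET = 4 + 7 + 2,
    STUB_SIZE = sizeof(kStubTemplate)
};

/* Copies the template into `out` and patches `target` into the imm64 slot.
 * Returns the stub size, or 0 if `out` is too small. */
static size_t build_stub(uint8_t *out, size_t out_size, uint64_t target)
{
    assert(out != NULL);
    if (out_size < STUB_SIZE)
        return 0;

    memcpy(out, kStubTemplate, STUB_SIZE);
    memcpy(out + STUB_IMM64_OFFSET, &target, sizeof(target));
    return STUB_SIZE;
}
```

The return value can be passed directly as the size argument of the write, so the length stays correct if the stub is later changed, for example to an int3.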
After this, we go through each thread and queue an APC.
for (INT32 i = 0; i < count; i++) {
    NtQueueApcThreadEx2(threads[i], NULL, QUEUE_USER_APC_FLAGS_SPECIAL_USER_APC, (PPS_APC_ROUTINE)addr, 0, 0, 0);
}
Then, while they all sleep, we put the final nail in the coffin:
for (INT32 i = 0; i < count; i++) {
    SuspendThread(threads[i]);
}
Note that you can use the QUEUE_USER_APC_FLAGS_CALLBACK_DATA_CONTEXT flag, which gives us the CONTEXT of the thread we just blasted through. This will come in handy in the next episodes of “Debugger From Scratch”.
We open the SuspendMe tool for our tests. 
As you can see, we successfully suspended each thread without having to use the state changes NT provides.
