Understanding Kernel Stack Overflows
Kernel stack overflows are a common cause of the crashes that customers report to us. They occur when drivers consume too much space on the kernel stack. The resulting overflow crashes the system with one of the following bugchecks:
- STOP 0x7F: UNEXPECTED_KERNEL_MODE_TRAP with Parameter 1 set to EXCEPTION_DOUBLE_FAULT, which is caused by running off the end of a kernel stack.
- STOP 0x1E: KMODE_EXCEPTION_NOT_HANDLED, 0x7E: SYSTEM_THREAD_EXCEPTION_NOT_HANDLED, or 0x8E: KERNEL_MODE_EXCEPTION_NOT_HANDLED, with an exception code of STATUS_ACCESS_VIOLATION, which indicates a memory access violation.
- STOP 0x2B: PANIC_STACK_SWITCH, which usually occurs when a kernel-mode driver uses too much stack space.
Each thread in the system is allocated a kernel-mode stack. Code running on any kernel-mode thread (whether it is a system thread or a thread created by a driver) uses that thread's kernel-mode stack unless the code is a deferred procedure call (DPC), in which case it uses the processor's DPC stack on certain platforms.
The stack grows downward, toward lower addresses. This means that the beginning (bottom) of the stack has a higher address than the end (top) of the stack. For example, suppose the beginning of your stack is 0x80f1000, and this is where your stack pointer (ESP) is pointing. If you push a DWORD value onto the stack, its address would be 0x80f0ffc. The next DWORD value would be stored at 0x80f0ff8, and so on until the limit (top) of the allocated stack is reached. The top of the stack is bordered by a guard page to detect overruns.
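The address arithmetic in this example can be reproduced with a small model. The following Python sketch is illustrative only: the `DownwardStack` class and its method names are hypothetical, the 12 KB size is an assumed value, and the `OverflowError` merely stands in for the point at which real hardware would touch the guard page.

```python
DWORD_SIZE = 4  # bytes in a 32-bit DWORD


class DownwardStack:
    """Toy model of a stack that grows toward lower addresses (hypothetical helper)."""

    def __init__(self, base, size):
        self.base = base          # beginning (bottom): the highest address
        self.limit = base - size  # end (top): the lowest usable address
        self.sp = base            # the stack pointer starts at the base

    def push_dword(self):
        """Move the stack pointer down by one DWORD and return the new address."""
        new_sp = self.sp - DWORD_SIZE
        if new_sp < self.limit:
            # Stand-in for hitting the guard page in the real system.
            raise OverflowError("push would run past the top of the stack")
        self.sp = new_sp
        return self.sp


# Base address taken from the example above; the 12 KB size is an assumption.
stack = DownwardStack(base=0x80F1000, size=12 * 1024)
first = stack.push_dword()
second = stack.push_dword()
print(hex(first), hex(second))  # 0x80f0ffc 0x80f0ff8
```

Each push subtracts four bytes from the stack pointer, so successive values land at progressively lower addresses until the limit is reached.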
The size of the kernel-mode stack varies among different hardware platforms. For example, on 32-bit platforms, the kernel-mode stack is 12 KB, and on 64-bit platforms, the kernel-mode stack is 24 KB. These sizes are hard limits imposed by the system, and all drivers need to use stack space conservatively so that they can coexist. Once the stack pointer reaches the top of the stack, one more push causes an exception, which in turn can lead to a Stop error. The push can be an explicit push instruction or a call instruction, which also pushes the return address onto the stack.
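To get a feel for how quickly these limits are reached, the rough Python estimate below counts how many nested calls fit on each stack. It is a deliberately simplified model: the function name `max_nested_calls` is hypothetical, each call is assumed to consume only its return address plus a fixed amount of local storage, and the stack is assumed to be empty at the start. Real frames also hold saved registers, parameters, and alignment padding, and a driver normally runs with part of the stack already used by its callers, so actual depths are lower.

```python
KB = 1024

# Stack sizes quoted in the text above.
STACK_BYTES = {"32-bit": 12 * KB, "64-bit": 24 * KB}
# Size of the return address that a call instruction pushes.
RETURN_ADDRESS_BYTES = {"32-bit": 4, "64-bit": 8}


def max_nested_calls(platform, locals_bytes):
    """Count how many nested calls fit before the next one would overflow.

    Simplified model: each call consumes its return address plus
    locals_bytes of local variables, and nothing else.
    """
    frame = RETURN_ADDRESS_BYTES[platform] + locals_bytes
    remaining = STACK_BYTES[platform]
    depth = 0
    while remaining >= frame:
        remaining -= frame
        depth += 1
    return depth


for platform in STACK_BYTES:
    # A 1 KB local buffer per call is an arbitrary illustrative value.
    print(platform, max_nested_calls(platform, locals_bytes=1 * KB))
```

Under these assumptions, a routine that keeps a 1 KB buffer on the stack can nest only 11 levels deep on a 12 KB stack and 23 levels deep on a 24 KB stack, which is why large local buffers and deep recursion are risky in kernel-mode code.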