The execve service implements the POSIX execve() system call and related execution mechanisms in the Linux kernel. It is responsible for replacing the current process image with a new executable, handling binary format detection, credential management, memory layout, and security checks.
Location: fs/exec.c (primary), architecture-specific syscall wrappers
Category: Process Management / System Calls
Key Files: fs/exec.c, include/linux/binfmts.h, fs/binfmt_elf.c
Dependencies: Security subsystem, MM subsystem, Credential management, Binary format handlers
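As a user-space illustration of the call this service implements, the sketch below forks a child, replaces its image with execve(), and collects the exit status. The helper name run_and_wait is hypothetical and exists only for this example; the exit status 127 for a failed exec is a shell convention, not something the kernel mandates.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `path` in a child process and return its exit
 * status, or -1 if fork/waitpid fails or the child died from a signal. */
int run_and_wait(const char *path, char *const argv[], char *const envp[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* On success execve() never returns: the child's image is replaced. */
        execve(path, argv, envp);
        /* Reached only if execve() failed (errno holds ENOENT, EACCES, ...). */
        _exit(127);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller passes NULL-terminated argv and envp arrays, for example argv = { "sh", "-c", "exit 7", NULL } with path "/bin/sh".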
The execve system enables:
User Space: execve(path, argv, envp)
↓
Syscall Entry: SYSCALL_DEFINE3(execve, ...)
↓
do_execve() → do_execveat_common(AT_FDCWD, ...)
↓
1. Validation & Setup
   - Check RLIMIT_NPROC
   - Allocate linux_binprm structure
   - Count and validate argc/envc
↓
2. Argument/Environment Copying
   - copy_string_kernel(filename)
   - copy_strings(envc, envp)
   - copy_strings(argc, argv)
   - Handle empty argv edge case
↓
3. bprm_execve(bprm)
   ├─ prepare_bprm_creds() - Setup credentials
   ├─ check_unsafe_exec() - Security validation
   ├─ current->in_execve = 1
   ├─ sched_exec() - CPU migration optimization
   ├─ security_bprm_creds_for_exec() - LSM hook
   └─ exec_binprm(bprm)
      └─ search_binary_handler(bprm)
         └─ Iterate registered binfmt handlers
            └─ fmt->load_binary(bprm) [e.g., load_elf_binary]
               └─ begin_new_exec(bprm) [Point of no return]
↓
4. Post-Execution Cleanup
   - sched_mm_cid_after_execve()
   - rseq_execve()
   - current->in_execve = 0
   - user_events_execve()
   - acct_update_integrals()
   - task_numa_free()
search_binary_handler()
↓
Iterate through registered formats:
   1. binfmt_script (#! interpreter)
   2. binfmt_elf (ELF binaries)
   3. binfmt_misc (Custom handlers)
   4. binfmt_flat (Flat binaries, embedded)
↓
First handler whose load_binary() returns anything other than -ENOEXEC ends the search
↓
If interpreter needed (scripts):
   exec_binprm() repeats the search with bprm->interpreter as the new file
   Max depth: 5 levels (prevents infinite loops)
struct linux_binprm
Core structure holding execution parameters throughout the exec process.
struct linux_binprm {
    // Memory management
    struct vm_area_struct *vma;   // VMA for argument stack
    unsigned long vma_pages;
    unsigned long argmin;         // Stack limit marker
    struct mm_struct *mm;
    unsigned long p;              // Current top of memory
    // Flags
    unsigned int
        have_execfd:1,            // Exec fd passed to userspace
        execfd_creds:1,           // Use script creds (binfmt_misc)
        secureexec:1,             // Privilege-gaining exec occurred
        point_of_no_return:1,     // Cannot return errors to userspace
        comm_from_dentry:1,       // Comm from dentry
        is_check:1;               // Check executability only
    // File references
    struct file *executable;      // Executable file
    struct file *interpreter;     // Interpreter (for scripts)
    struct file *file;            // Current binary file
    // Credentials
    struct cred *cred;            // New credentials
    int unsafe;                   // LSM_UNSAFE_* mask
    unsigned int per_clear;       // Personality bits to clear
    // Arguments
    int argc, envc;               // Argument/environment counts
    const char *filename;         // Binary name (for procps)
    const char *interp;           // Actual executed binary
    const char *fdpath;           // Generated path for execveat
    // Execution state
    unsigned interp_flags;
    int execfd;                   // Executable file descriptor
    unsigned long exec;           // Stack address of the filename string (exported as AT_EXECFN)
    // Resource limits
    struct rlimit rlim_stack;     // Saved RLIMIT_STACK
    // Magic number buffer (first BINPRM_BUF_SIZE bytes of file; 256 in current kernels)
    char buf[BINPRM_BUF_SIZE];
};
struct linux_binfmt
Binary format handler interface.
struct linux_binfmt {
    struct list_head lh;                          // Linked list node
    struct module *module;                        // Owning module
    int (*load_binary)(struct linux_binprm *);    // Load handler
#ifdef CONFIG_COREDUMP
    int (*core_dump)(struct coredump_params *);   // Core dump handler
    unsigned long min_coredump;                   // Minimum core dump size
#endif
};
execve(const char *filename, char *const argv[], char *const envp[])
Standard POSIX execve implementation.
Parameters:
filename: Path to executable
argv: NULL-terminated argument array
envp: NULL-terminated environment array
Implementation:
SYSCALL_DEFINE3(execve,
        const char __user *, filename,
        const char __user *const __user *, argv,
        const char __user *const __user *, envp)
{
    return do_execve(getname(filename), argv, envp);
}
execveat(int fd, const char *filename, char *const argv[], char *const envp[], int flags)
Extended version supporting relative paths and additional flags.
Flags:
AT_EMPTY_PATH: Allow empty filename (operate on fd directly)
AT_SYMLINK_NOFOLLOW: Don't follow symlinks
Implementation:
SYSCALL_DEFINE5(execveat,
        int, fd, const char __user *, filename,
        const char __user *const __user *, argv,
        const char __user *const __user *, envp,
        int, flags)
{
    return do_execveat(fd, getname_uflags(filename, flags), argv, envp, flags);
}
For 32-bit compatibility on 64-bit kernels:
COMPAT_SYSCALL_DEFINE3(execve, ...)
COMPAT_SYSCALL_DEFINE5(execveat, ...)
kernel_execve(const char *filename, const char *const *argv, const char *const *envp)
Execute a program from kernel space (e.g., init process, hotplug helpers).
Restrictions:
Format | Handler | Module | Description |
ELF | load_elf_binary | binfmt_elf | Standard ELF executables |
Script | load_script | binfmt_script | #! interpreter scripts |
Misc | load_misc_binary | binfmt_misc | Custom format handlers |
Flat | load_flat_binary | binfmt_flat | Flat binaries (embedded) |
ELF FDPIC | load_elf_fdpic_binary | binfmt_elf_fdpic | ELF with FDPIC (MMU-less) |
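Each handler decides whether a file is its own by inspecting the leading bytes held in bprm->buf. A minimal user-space sketch of that idea for the two best-known formats follows; the enum and the function name classify_magic are hypothetical, and binfmt_misc, flat and FDPIC matching rules are deliberately not modelled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical classification used only in this sketch. */
enum exec_format { FMT_UNKNOWN, FMT_ELF, FMT_SCRIPT };

/* Classify a file by its first bytes: ELF files start with 0x7f 'E' 'L' 'F',
 * interpreter scripts start with "#!". Anything else is left unclassified. */
enum exec_format classify_magic(const unsigned char *buf, size_t len)
{
    if (len >= 4 && memcmp(buf, "\177ELF", 4) == 0)
        return FMT_ELF;
    if (len >= 2 && buf[0] == '#' && buf[1] == '!')
        return FMT_SCRIPT;
    return FMT_UNKNOWN;
}
```

In the kernel, a handler that does not recognise the bytes returns -ENOEXEC so the next registered format gets a chance; FMT_UNKNOWN plays that role here.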
// Register at end of list (lower priority)
void register_binfmt(struct linux_binfmt *fmt);
// Insert at beginning (higher priority)
void insert_binfmt(struct linux_binfmt *fmt);
// Unregister format
void unregister_binfmt(struct linux_binfmt *fmt);
Example (from binfmt_elf.c):
static struct linux_binfmt elf_format = {
    .module = THIS_MODULE,
    .load_binary = load_elf_binary,
#ifdef CONFIG_COREDUMP
    .core_dump = elf_core_dump,
    .min_coredump = ELF_EXEC_PAGESIZE,
#endif
};
static int __init init_elf_binfmt(void)
{
    register_binfmt(&elf_format);
    return 0;
}
Execve handles privilege changes based on file permissions:
security_bprm_check(bprm);             // Pre-execution check
security_bprm_creds_for_exec(bprm);    // Credential setup
security_bprm_committing_creds(bprm);  // Before committing
security_bprm_committed_creds(bprm);   // After committing
Tracked via bprm->unsafe flags:
Flag | Meaning |
LSM_UNSAFE_SHARE | Sharing state (fs_struct) with a process outside the thread group |
LSM_UNSAFE_PTRACE | Process is being ptraced |
LSM_UNSAFE_NO_NEW_PRIVS | No-new-privs bit set |
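The no-new-privs state in the table is one a process sets on itself from user space, so it can be inspected and toggled with prctl(2). The sketch below wraps the two relevant operations; the wrapper names are hypothetical, while PR_GET_NO_NEW_PRIVS and PR_SET_NO_NEW_PRIVS are the real prctl options.

```c
#include <sys/prctl.h>

/* Returns 1 if no_new_privs is set for the calling thread, 0 if it is not,
 * and -1 on error. */
int no_new_privs_enabled(void)
{
    return prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
}

/* Sets no_new_privs for the calling thread. The bit cannot be cleared again
 * and is inherited across fork() and execve(). Returns 0 on success. */
int enable_no_new_privs(void)
{
    return prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
}
```

Once the bit is set, a later execve() of a set-user-ID binary still runs the program but does not grant the elevated IDs.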
When unsafe states detected:
After begin_new_exec() succeeds:
bprm->point_of_no_return
High Addresses
┌─────────────────────┐
│ Filename String     │ ← Program path (AT_EXECFN)
├─────────────────────┤
│ Environment Strings │ ← envp[0], envp[1], ...
├─────────────────────┤
│ Argument Strings    │ ← argv[0], argv[1], ...
├─────────────────────┤
│ Auxiliary Vector    │ ← AT_ENTRY, AT_PHDR, etc.
├─────────────────────┤
│ Environment Pointers│ ← envp[] array + NULL
├─────────────────────┤
│ Argument Pointers   │ ← argv[] array + NULL
├─────────────────────┤
│ Argument Count      │ ← argc
└─────────────────────┘
Low Addresses
MAX_ARG_STRINGS (0x7FFFFFFF, i.e. 2^31 - 1)
MAX_ARG_STRLEN (PAGE_SIZE * 32)
RLIMIT_STACK
Randomization applied to:
Controlled via:
/proc/sys/kernel/randomize_va_space
personality(ADDR_NO_RANDOMIZE) flag
do_execveat_common()
Main entry point for all exec variants.
Responsibilities:
Allocate the linux_binprm structure
Invoke bprm_execve()
bprm_execve()
Orchestrates the execution process.
Flow:
prepare_bprm_creds(bprm);
check_unsafe_exec(bprm);
current->in_execve = 1;
sched_exec();
security_bprm_creds_for_exec(bprm);
exec_binprm(bprm); // Returns 0 once the new image is installed
// Post-exec cleanup runs whether exec succeeded or failed
exec_binprm()
Searches for and invokes the appropriate binary format handler.
Features:
search_binary_handler()
Try each registered binary format until one succeeds.
Algorithm:
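read_lock(&binfmt_lock);
list_for_each_entry(fmt, &formats, lh) {
    if (!try_module_get(fmt->module))
        continue;
    // Drop the lock while the handler runs: load_binary() may sleep
    read_unlock(&binfmt_lock);
    retval = fmt->load_binary(bprm);
    read_lock(&binfmt_lock);
    put_binfmt(fmt);
    if (retval != -ENOEXEC) {
        // Release the lock before leaving the loop early
        read_unlock(&binfmt_lock);
        return retval;
    }
}
read_unlock(&binfmt_lock);
return -ENOEXEC;
Function | Purpose |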
alloc_bprm() | Allocate and initialize linux_binprm |
free_bprm() | Free bprm resources |
copy_strings() | Copy user-space strings to kernel |
copy_string_kernel() | Copy kernel-space string |
count() | Count NULL-terminated string array |
begin_new_exec() | Commit to new executable (point of no return) |
setup_new_exec() | Setup memory layout for new executable |
finalize_exec() | Final post-exec cleanup |
set_binfmt() | Set current mm's binary format |
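The counting step listed in the table can be illustrated with a user-space analogue. This is a sketch only: the name count_strings is hypothetical, and the kernel version additionally fetches each pointer from user memory and checks for pending fatal signals, neither of which is modelled here.

```c
#include <errno.h>
#include <stddef.h>

/* Count the entries of a NULL-terminated string array, refusing arrays that
 * hold more than `max` entries. Returns the count, or -E2BIG when the limit
 * is exceeded. A NULL array counts as empty. */
int count_strings(char *const *vec, int max)
{
    int n = 0;

    if (vec == NULL)
        return 0;

    while (vec[n] != NULL) {
        if (n >= max)
            return -E2BIG;
        n++;
    }
    return n;
}
```

Bounding the walk matters because the array comes from an untrusted caller and may be extremely long.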
linux_binprm structure
bprm->p)
load_binary() handler
begin_new_exec() called by loader
current->in_execve
Errors returned to userspace normally:
-ENOENT: File not found
-EACCES: Permission denied
-ENOMEM: Out of memory
-E2BIG: Argument list too long
-ENOEXEC: Invalid executable format
-ELOOP: Too many levels of symbolic links or interpreters
Cannot return to userspace safely:
trace_sched_process_exec(current, old_pid, bprm); // Scheduler trace
audit_bprm(bprm); // Audit subsystem
ptrace_event(PTRACE_EVENT_EXEC, old_vpid); // Ptrace notification
proc_exec_connector(current); // Proc connector
sched_exec(): Optimize CPU selection at exec time
sched_mm_cid_before_execve(): Save concurrency ID
sched_mm_cid_after_execve(): Restore concurrency ID
Config Option | Purpose |
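CONFIG_BINFMT_ELF | Enable ELF binary support |
CONFIG_BINFMT_SCRIPT | Enable #! script support |
CONFIG_BINFMT_MISC | Enable custom format handlers |
CONFIG_BINFMT_FLAT | Enable flat binary support |
CONFIG_COREDUMP | Enable core dump support |
CONFIG_COMPAT | Enable 32-bit compatibility |
CONFIG_SYSCTL | Enable /proc/sys/fs/suid_dumpable |
CONFIG_EXEC_KUNIT_TEST | Enable exec unit tests |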
Cause: Total size of arguments + environment exceeds stack limit
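How the stack limit turns into a budget for argument and environment strings can be sketched as follows. The arithmetic is modelled on bprm_stack_limits() in recent kernels and should be read as an assumption rather than a specification: the function name arg_space_limit and both SKETCH_ constants are hypothetical stand-ins (128 KiB for ARG_MAX, 8 MiB for the default stack limit), and the kernel's further deduction for the argv/envp pointer arrays is left out.

```c
#include <stddef.h>

/* Assumed values for this sketch. */
#define SKETCH_ARG_MAX 131072UL               /* guaranteed minimum: 128 KiB */
#define SKETCH_STK_LIM (8UL * 1024 * 1024)    /* default stack limit: 8 MiB  */

/* Approximate byte budget for argv + envp strings given the soft
 * RLIMIT_STACK value `rlim_stack_cur`. */
unsigned long arg_space_limit(unsigned long rlim_stack_cur)
{
    /* Hard ceiling: three quarters of the default stack limit. */
    unsigned long limit = SKETCH_STK_LIM / 4 * 3;

    /* Strings may use at most a quarter of the current stack limit. */
    if (rlim_stack_cur / 4 < limit)
        limit = rlim_stack_cur / 4;

    /* Never drop below the historical ARG_MAX guarantee. */
    if (limit < SKETCH_ARG_MAX)
        limit = SKETCH_ARG_MAX;

    return limit;
}
```

Under these assumptions, raising the stack limit only helps up to the fixed ceiling, after which the budget stops growing.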
Solution:
Raise RLIMIT_STACK
Cause:
Filesystem mounted with nosuid
no_new_privs bit set
Debugging:
mount | grep nosuid
cat /proc/self/status | grep NoNewPrivs
dmesg | grep AVC # SELinux denials
Cause: Script interpreter chain exceeds 5 levels
Example:
#!/usr/bin/python3 # Level 1
#!/usr/bin/env python3 # Level 2 (via env)
...
Solution: Reduce interpreter nesting depth
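The depth limit behind this error can be modelled with a small loop. This is a simplified sketch, not kernel code: resolve_interpreters and SKETCH_MAX_DEPTH are hypothetical names, the bound of 5 is taken from the text above, and each "rewrite" stands for one handler handing the exec off to an interpreter.

```c
#include <errno.h>

#define SKETCH_MAX_DEPTH 5   /* assumed bound, taken from the text above */

/* `rewrites` is how many times a handler names an interpreter before some
 * handler finally loads an image. Returns 0 when the chain resolves within
 * the depth bound and -ELOOP when it does not. */
int resolve_interpreters(int rewrites)
{
    for (int depth = 0; ; depth++) {
        if (depth > SKETCH_MAX_DEPTH)
            return -ELOOP;
        if (rewrites == 0)
            return 0;      /* a handler loaded the image directly */
        rewrites--;        /* handler named an interpreter; search again */
    }
}
```

In this model a chain with five hand-offs still resolves, while a sixth yields -ELOOP.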
Cause:
Debugging:
File | Purpose |
| Main execve implementation |
| Binary format interfaces |
| ELF binary loader |
| Script interpreter handler |
| Custom format handler |
| Flat binary loader |
| Task structure definitions |
| Credential management |
| LSM hook definitions |
| x86 syscall interface |
Enable with CONFIG_EXEC_KUNIT_TEST=y:
#ifdef CONFIG_EXEC_KUNIT_TEST
#include "tests/exec_kunit.c"
#endif
# Test basic execve
strace -e execve ./test_program
# Test with different formats
./elf_binary
./script.sh
echo "test" | binfmt_misc_handler
# Test SUID behavior
chmod u+s suid_program
ls -l suid_program
./suid_program
# Test resource limits
ulimit -s 8192 # 8MB stack
./large_argv_program
# Profile execve latency
perf record -e syscalls:sys_enter_execve,syscalls:sys_exit_execve ./program
perf report
# Trace binary format selection
trace-cmd record -p function_graph -g search_binary_handler
Last Updated: 2026-04-09
Maintainer: Linux Kernel Process Management Maintainers
Status: Stable - Core System Call