Introduction to Embedded Systems
Dr. Jerry Shiao, Silicon Valley University
Spring 2014
SILICON VALLEY UNIVERSITY CONFIDENTIAL

Section 5: Kernel Internals

Kernel Modules

Functionality
- Object file that extends the running kernel without rebuilding and rebooting the kernel.
- Loaded and unloaded on demand.
- Drivers to support new hardware or filesystems.
- Support system calls.

Module Organization
- Invoked by a user application needing its service. Does not execute on its own.
- Allows an application to communicate with the kernel through:
  - The proc file system (/proc).
  - System calls and ioctl calls:
    - Process control (e.g. load/execute/create/terminate process, allocate/free memory).
    - File management (e.g. create/delete/open/close/read/write file).
    - Device management (e.g. request/release/read/write device).
    - Information maintenance (e.g. get/set system data, get/set device attributes).
- Entry points: module_init() and module_exit().
- Module installation (insmod, modprobe), removal (rmmod), listing (lsmod).

Kernel Modules: Without Loadable Kernel Modules
- Need a large, monolithic kernel for all possible functionality.
- Kernel code memory is not fragmented. LKM load/unload fragments kernel code memory.
- Rebuild and reboot required for new functionality.
- Memory is wasted on unused modules.
- NOTE: With embedded systems, hardware will not change. During embedded system development, modules are loadable; for production, modules are made static and placed in the kernel.

License Issues
- LKMs are derived work of the kernel.
- Kernel is "tainted" when a non-GPL LKM is loaded (code not available to the public).
- MODULE_LICENSE macro for GPL.
- EXPORT_SYMBOL_GPL macro resolves ONLY for GPL modules.

Configuration
- .config file:
    CONFIG_INSMOD=y
    CONFIG_RMMOD=y
    CONFIG_LSMOD=y
- /lib/modules (extension ".ko").

Kernel Modules: Loading Modules Into the Kernel
- Kernel initialization (/etc/rc.d scripts):
    depmod -a -- Create dependency file to be used by modprobe.
- Kernel module: request_module(). The kmod subsystem execs modprobe, which loads the module (insmod).
- /etc/modprobe.conf -- Alias to module.
    …
    alias eth0 e100
- /lib/modules/<version>/modules.dep -- Dependencies to load before the module.
    …
    /lib/modules/…/e100.ko: /lib/modules/…/mii.ko
    /lib/modules/…/ext2.ko: /lib/modules/…/mbcache.ko

Kernel Modules: System Calls
- System call: explicit request to the kernel made through a software interrupt.
- Service is provided in the kernel; the call crosses the user-space / kernel boundary.
- Linux C library: provides the wrapper routines to issue a system call.
- Wrapper routine loads arguments (32-bit x86):
  1) Register eax = system call index.
  2) Software interrupt (interrupt 0x80).

User Mode
  User application:             main() { … xyz(); … }
  Linux C library (wrapper):    xyz() { … "SYSCALL" … }
Kernel Mode
  System call handler:          system_call() { … sys_call_table[] -> sys_xyz() … "SYSEXIT" }
  System call service routine:  sys_xyz() { … }

Kernel Modules: Module Skeleton

    #include <linux/init.h>
    #include <linux/module.h>   /* Required for all modules. Defines the MODULE_XXX macros. */
    #include <linux/kernel.h>
    …
    /* MODULE_XXX information is stored in the .modinfo section of the ELF object. */
    MODULE_LICENSE("GPL");                        /* Identifies whether the module is GPL. */
    MODULE_DESCRIPTION("Mytime Module");          /* Describes what the module does. */
    MODULE_AUTHOR("Linux:Embedded Class (SVU)");  /* Author of the module. */
    …
    /* Kernel module initialization entry point. Called by insmod. */
    static int __init init_mytime_module(void)
    {
        int ret = 0;

        printk(KERN_ALERT "Hello, world\n");
        …
        return ret;
    } /* init_mytime_module */

    /* Kernel module exit entry point. Called by rmmod. */
    static void __exit cleanup_mytime_module(void)
    {
        …
    } /* cleanup_mytime_module */

    module_init(init_mytime_module);
    module_exit(cleanup_mytime_module);

- NOTE: EXPORT_SYMBOL_GPL() symbols are only available to modules with a GPL-compatible license.
- Modules run only in kernel space.
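To make the system-call path described in this section more concrete, the following user-space sketch issues the same kernel request twice: once through the C library wrapper and once directly by system call number. It assumes Linux with glibc, where syscall() and SYS_getpid are available. The function names raw_getpid, wrapped_getpid and check_paths_agree are illustrative and do not come from the course material.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask the kernel for our PID by system call number, skipping the
 * named wrapper. syscall() loads the number and traps into the kernel. */
long raw_getpid(void)
{
    return syscall(SYS_getpid);
}

/* Same request through the ordinary C library wrapper routine. */
long wrapped_getpid(void)
{
    return (long)getpid();
}

/* Both paths end in the same kernel service routine, so the
 * results must match. */
void check_paths_agree(void)
{
    assert(raw_getpid() == wrapped_getpid());
}
```

Calling check_paths_agree() from any program should pass silently, since the wrapper and the raw call reach the same service routine in the kernel.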
Linux Device Drivers
- Provide access to hardware devices, hiding details behind a set of standardized APIs that are independent of the specific driver.
- Drivers provide APIs through system calls or through ioctl for the user to access the actual hardware device.
- Independent modules separate from the kernel. Dynamically loadable modules, linkable and removable at runtime.
- The /dev directory contains all hardware devices accessible through the file system abstraction (required before the device driver can be accessed).
- Device driver categories:
  - Character drivers.
    - Do not need buffering; byte I/O.
    - Accessed as a stream of bytes.
    - Do not need a fixed block size.
    - Examples: /dev/console, /dev/ttyS0
  - Block drivers.
    - Use buffers to respond to requests.
    - Send and receive in blocks.
    - The block size is determined by the driver module.
    - Usually hold a filesystem (e.g. RAMDisk, USB stick).
  - Network drivers.
    - Interface is usually a hardware device that transmits/receives packets, but could be software.
    - Part of the network subsystem, not in /dev.
    - Accessed through a name (e.g. eth0).

Linux Device Drivers: Device Numbers, /dev and /sys
- Major number is used by the kernel to identify the device driver.
- Minor number is used by the device driver to distinguish devices of the same type.
  - Hard drive with multiple partitions: each partition has its own minor number.
  - Multiple serial ports.
- The mknod command creates the device file in /dev.
- Udev manages the /dev directory.
  - Automatically selects the major numbers.
  - Prevents clutter of /dev with unused devices. Only present devices appear in /dev.
- Linux Assigned Names and Numbers Authority (LANANA) assigns major numbers.
- /sys was initially created to move driver-level debugging out of /proc.
- /sys became useful to other subsystems. Contains information about devices and drivers exported to user space.
  - /sys/devices contains all devices in the system.
  - /sys/class shows devices grouped according to classes.
  - /sys/block contains the block devices.

Linux Char Device Driver Components

    #include <linux/module.h>
    #include <linux/kernel.h>
    …
    MODULE_LICENSE("GPL");
    MODULE_DESCRIPTION("Device Driver");
    MODULE_AUTHOR("Linux:Embedded Class (SVU)");
    …
    /* "fops" is the structure with callback functions for the file
       operations of this device. */
    struct file_operations fops = {
        .owner   = THIS_MODULE,
        .open    = <device_open_cb>,
        .release = <device_release_cb>,
        .read    = <device_read_cb>,
        .write   = <device_write_cb>,
        .ioctl   = <device_ioctl_cb>   /* .unlocked_ioctl in kernels since 2.6.36 */
    };

    int init_device_driver(void)
    {
        /* Registers the device driver for the character device. */
        major = register_chrdev(major, "foo", &fops);
        …
    } /* init_device_driver */

    void cleanup_device_driver(void)
    {
        /* Unregisters the device driver by releasing the major number
           for the character device "foo". */
        unregister_chrdev(major, "foo");
        …
    } /* cleanup_device_driver */

    module_init(init_device_driver);
    module_exit(cleanup_device_driver);

Character device created using:
    # mknod /dev/foo c <major> 0

Linux Block Device Driver Components

    …
    MODULE_LICENSE("GPL");
    …
    int init_device_driver(void)
    {
        /* Registers the device driver for the block device. */
        major = register_blkdev(major, "foo");
        /* Gendisk structure created with 16 minors: the whole disk
           plus 15 partition descriptors. */
        foo.gd = alloc_disk(16);
        foo.gd->major = major;
        foo.gd->minors = 16;
        /* Initialize the gendisk structure capacity. */
        set_capacity(foo.gd, <disk_size_in_sectors>);
        strcpy(foo.gd->disk_name, "foo");
        /* "foo_ops" is the structure with callback functions for the
           operations of this block device. */
        foo.gd->fops = &foo_ops;
        …
    } /* init_device_driver */

Block device created using:
    # mknod /dev/foo b <major> 0
    void cleanup_device_driver(void)
    {
        /* Unregisters the device driver by releasing the major number
           for the block device "foo". */
        unregister_blkdev(major, "foo");
        …
    } /* cleanup_device_driver */

    /* A gendisk's fops field points to a struct block_device_operations.
       Reads and writes are served through the request queue, not through
       .read/.write callbacks. */
    struct block_device_operations foo_ops = {
        .owner   = THIS_MODULE,
        .open    = <device_open_cb>,
        .release = <device_release_cb>,
        .ioctl   = <device_ioctl_cb>
    };

    module_init(init_device_driver);
    module_exit(cleanup_device_driver);

Assignment 5: Kernel Modules
- The /proc file system is used to store many system configuration parameters.
- Use the /proc file system to return the number of clock ticks since the system initialized.
  - Kernel module creates a /proc file structure (struct proc_dir_entry) with file system read/write APIs.
  - Kernel variable jiffies represents the number of clock ticks since the system initialized.
  - Kernel variable HZ represents the clock tick rate, the number of clock ticks per second (e.g. HZ = 250 means the hardware generates 250 ticks/second, or 4 milliseconds per tick).
- User application uses system calls (e.g. read/write) to interface with the kernel module.
- insmod installs the loadable kernel module into the running kernel.

Section 6: Kernel Scheduling

- Processes are the heart of the Linux system.
- Processes and files are the fundamental concepts in the Unix operating system.
- A process is a program in execution. It consists of:
  - Executing program code.
  - Set of resources (e.g. open files).
  - Internal kernel data (e.g. struct task_struct).
  - Address space.
  - One or more threads of execution.
  - Data section (e.g. global variables).

Kernel Scheduling: Processes
- Linux is a multitasking operating system (tasks running concurrently).
- A process is an independently running program that has its own set of resources (e.g. output file).
- Processes are managed with process descriptors (struct task_struct) in a circular doubly-linked list.
- Lightweight processes, or threads, are used to support multithreaded operation.
  - Lightweight processes share the same address space with each other and with the parent process. Communication is simple.
  - Heavyweight processes each contain their own address space. Communication is through IPCs.
- Linux process states:
  - TASK_RUNNING: Executing on a CPU or waiting to be executed.
  - TASK_INTERRUPTIBLE: Suspended, waiting on some condition.
  - TASK_UNINTERRUPTIBLE: Suspended until a given event.
  - TASK_STOPPED: Stopped after receiving a signal.
  - TASK_ZOMBIE: Child process terminated, but not yet waited on by its parent. Does not release its process ID.

Kernel Scheduling: ps (Process Status) command

    [sau@localhost ~]$ ps -f
    UID     PID  PPID  STIME     TTY    TIME  CMD
    …
    hermie   24     1  00:35:28  pts/1  0:01  bash
    hermie  137    24  00:36:39  pts/1  0:00  ps -f
    …

    [sau@localhost ~]$ ps -ax
    PID  TTY    STAT  TIME  COMMAND
    …
    181  pts/1  S     0:01  -bash
    231  ?      T     0:07  emacs
    274  pts/1  R     0:00  ps
    …

- UID: Owning user.
- PID: Process ID (used in the command "kill").
- PPID: Parent process ID.
- STIME: When the task started.
- TTY: Controlling terminal or running from a shell.
- STAT: State. S = Sleeping (here, the shell is suspended). T = Stopped (suspended). R = Running or runnable.
- TIME: CPU time used so far.
- CMD / COMMAND: Command running.

Kernel Scheduling: Process Scheduling
- The scheduler is responsible for best utilizing processor time by deciding which process to run next from a set of runnable processes.
- Scheduling gives the impression that multiple processes are executing concurrently.
- Multitasking operating systems have two options:
  - Preemptive multitasking.
    - Process is involuntarily suspended.
    - Scheduler decides when a process is context switched, using the time slice.
    - Scheduler manages the time slice and prevents a process from monopolizing the CPU.
    - Interrupts will cause a context switch.
  - Cooperative multitasking.
    - Process must voluntarily suspend (e.g. the driver's responsibility).
    - Scheduler does not decide when a process is switched.
    - Process can monopolize the CPU and hang the system.
- Group scheduling by userid or by groups of processes, so that processes are not starved.
- Real-time systems are multitasked, with key activities taking a greater share of processor time.

Kernel Scheduling: Scheduling Policy
- Set of rules determining when and how a process is to run.
- Objectives: fast process response time, good throughput for background jobs, avoiding process starvation, reconciling low and high priority processes.
- Time sharing principle based on timer interrupts.
  - Each process has a time slice, or quantum.
  - When the quantum expires, a context switch occurs (processes appear to run concurrently).
  - An epoch ends when all runnable processes have used up their quantum.
  - Timer interrupts are used by Linux to measure the process quantum.
  - Quantum in Linux 2.4: 210 ms. Linux 2.6: 100 ms.
- Linux dynamically changes process priority.
  - Priority adjusted higher for releasing the CPU before the quantum expires (I/O-bound process).
  - Priority adjusted lower for using the entire quantum (CPU-bound process).
- A process can dynamically change its own priority: nice() and setpriority().
  - Example: a device driver executing I/O that it expects to take a long time.

Kernel Scheduling: Scheduling Policy, Scheduling Algorithm

[Figure: timeline of one epoch, in which Process_1, Process_2 and Process_3 run in turn and consume slices of Quantum_1, Quantum_2 and Quantum_3.]

o An epoch is a division of CPU time, i.e. a period of time.
o Every process has a maximum time quantum, computed at the beginning of each epoch.
o When a process waits for I/O, its quantum is suspended; the remainder can be used again within the epoch until it is fully used.
o The epoch ends when all runnable processes have used all of their quantum.

Kernel Scheduling: Scheduling Policy, Process Priority
- Static priorities: real-time processes.
  - Range 1 to 99.
  - Always higher than the dynamic priority of other processes.
- Dynamic priorities: other processes.
  - I/O-bound or CPU-bound, based on time used.
  - Sum of the base time quantum and the number of "clock ticks" left in the current epoch.
  - I/O-bound priority bonus ( -5 ).
  - CPU-bound priority bonus ( +5 ).
- Process quantum.
  - Each process is assigned at least the base time quantum.
  - If a process does not use its entire quantum, a bonus calculated from the unused quantum is carried over into the next epoch. The increased quantum is used to calculate priority in the next epoch.
  - When a new child process is created, the parent's quantum is split into two halves, one for the parent and one for the child.

Kernel Scheduling: Scheduling Policy, Process Priority
- Normal process static priority range: 100 to 139.
  - A new process inherits the static priority of its parent.
- Dynamic priority: used by the scheduler to select the next process to run.
  - A process is treated as I/O-bound or CPU-bound based on time used or on its average sleep time.
  - Dynamic priority = max( 100, min( static priority - bonus + 5, 139 ) )
  - Bonus is determined by the average sleep time:
      Sleep 0-99 msec, bonus = 0.
      Sleep 100-199 msec, bonus = 1.
      …
      Sleep 900-1000 msec, bonus = 10.
  - Highest process priority (static priority 100, bonus 10): max( 100, min( 100-10+5, 139 ) ) = 100.
  - Lowest process priority (static priority 139, bonus 0): max( 100, min( 139-0+5, 139 ) ) = 139.
  - CPU-bound process (bonus 0): static priority - 0 + 5, an adjustment of +5.
  - I/O-bound process (bonus 10): static priority - 10 + 5, an adjustment of -5.
- Real-time process static priority range: 1 to 99.
  - Always higher than the dynamic priority of normal processes.

Kernel Scheduling: Scheduling Policy, Process Classification: Interactive Processes
- Interactive processes are I/O-bound, spending a lot of time waiting for I/O operations.
  - An I/O-bound process is "blocked" pending the arrival of data; it receives an interrupt when the data has arrived and continues.
- Interactive processes wait for human input. Response time must be quick (latency 50-150 ms).
  - Command shells, text editors, graphical applications, mouse and keyboard processes.
- Processes heavily utilizing I/O (e.g. hard drive, CD).
- Higher priority access to the CPU.
  - Scheduled multiple times within an epoch.
  - CPU released before the quantum completes.

Kernel Scheduling: Scheduling Policy, Process Classification: Batch Processes
- Batch processes are CPU-bound and heavily utilize CPU time for computations.
- Batch processes typically execute in the background, e.g. compilers, database search engines, and scientific computations.
- No user interaction.
- Lower priority access to the CPU.
  - Preempted by I/O-bound processes.
  - Scheduled less often than I/O-bound processes within an epoch.
  - CPU usually utilized up to the full quantum.

Kernel Scheduling: Scheduling Policy, Process Classification: Real-time Processes
- Real-time processes have the highest priority, always higher than I/O-bound or batch processes.
- "Hard" real-time requires a bounded response time (an absolute deadline must be met to guarantee deterministic behavior).
- Linux is "soft" real-time.
  - Allows time tolerance when an event occurs, in an effort to minimize latency.
  - Internal latency means deterministic behavior cannot be guaranteed.
- Examples: streaming video/audio, robot controllers, programs collecting data from physical sensors.

Kernel Scheduling: Linux 2.6 Scheduler Algorithm, Runqueues
- Keep track of all runnable tasks for a CPU.
- Two priority arrays: the active array and the expired array.
- Tasks at the same priority are scheduled round robin (RR).
- Dynamic priority changes based on CPU usage.
- Special threads: idle thread and migration thread.
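The dynamic priority rule for normal processes, max( 100, min( static priority - bonus + 5, 139 ) ), is what gives the interactive processes described above an advantage over batch processes. The following is a minimal sketch of that arithmetic only. The function names clamp_int and dynamic_priority are illustrative, and the bonus is taken as an input (0 to 10) rather than derived from a measured sleep time.

```c
#include <assert.h>

/* Limit v to the closed range [lo, hi]. */
int clamp_int(int v, int lo, int hi)
{
    if (v < lo)
        return lo;
    if (v > hi)
        return hi;
    return v;
}

/* Dynamic priority of a normal (non-real-time) process.
 * static_prio: 100..139, lower value means higher priority.
 * bonus:       0..10, larger for processes that sleep more.
 * Computes max(100, min(static_prio - bonus + 5, 139)). */
int dynamic_priority(int static_prio, int bonus)
{
    assert(static_prio >= 100 && static_prio <= 139);
    assert(bonus >= 0 && bonus <= 10);
    return clamp_int(static_prio - bonus + 5, 100, 139);
}
```

For a process with static priority 120, a fully interactive bonus of 10 yields 115, while a CPU-bound bonus of 0 yields 125, so the interactive process is selected first.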
runqueue structure:
- *curr: Pointer to the currently running task.
- *idle: Pointer to the CPU's idle task.
- cpu_load: CPU load information used by the migration thread to balance CPU loads by scheduling threads onto other CPUs.
- *active: Pointer to the active priority array, containing tasks with unexpired quantums.
- *expired: Pointer to the expired priority array, containing tasks with used-up quantums.
- arrays[]: The actual active and expired priority arrays.
- *migration_thread: Thread that handles task migrations. Every 200 ms, it checks whether the CPU loads are unbalanced.

Kernel Scheduling: Linux 2.6 Scheduler Algorithm, Priority Queues
- Priority arrays are the data structure that provides O(1) scheduling.
- 140 priority levels. The lower the value, the higher the priority.
- Two priority-ordered arrays (active and expired) per CPU, managed by the runqueue structure.
  - The transition between active and expired is handled by moving pointers in the runqueue structure.
  - When a task's quantum reaches zero, its quantum is recalculated before the task is moved to the expired array.
- Active/expired array elements are linked lists of task_structs.
  - Multiple tasks at the same priority are scheduled round robin.

queue structure:
- bitmap[]: Bitmap representing the priorities for which active tasks exist in the priority array.
- *list_head_queue[]: Array of linked lists. One list in the array for each priority level (140 priority levels).

[Figure: one runqueue per CPU. Each runqueue holds an Active and an Expired priority array, each with a bitmap and a list of tasks per priority level (Priority 1 to Priority 140), plus the Migration Thread.]

Kernel Scheduling: Linux 2.6 Scheduler Algorithm, Linux 2.6.23 CFS (Completely Fair Scheduler)
- Fair queuing algorithm.
- Aims to maximize overall CPU utilization while also maximizing interactive performance.
- Uses a red-black tree instead of runqueues.
  - The red-black tree implements a "time-line" of future task execution.
  - Spent processor time is the key, so the process with the least amount of time (leftmost node) is found efficiently.
  - Choosing a process is O(1). Inserting a process is O(log N), because of the red-black tree.
- Nanosecond granularity means dynamic +/- bonus priority adjustments are not needed.
  - Uses the same concept of rewarding sleeping or waiting processes.

Kernel Scheduling: Linux 2.6 Scheduler Algorithm, schedule() Kernel Function
- Implements the Linux scheduler: finds a process in the runqueue and assigns the CPU to it.
- Direct invocation:
  - Current process is blocking (sleeping), waiting for a resource.
    - Current process state is changed to TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE and the process is placed in a wait queue.
    - schedule() is called.
    - When the resource becomes available, the process is removed from the wait queue.
  - Device driver checks for long iterations, and schedules if they run too long.
- Lazy invocation:
  - Current process has used up its quantum.
  - A process is added to the runqueue with higher priority than the currently executing process (e.g. wake_up_process()).
  - Process calls sched_yield() to release the CPU.

Kernel Scheduling: Linux 2.6 Scheduler Algorithm, schedule() Kernel Function
- Handles Linux internal "housekeeping" tasks (e.g. accounting, maintaining system uptime).
  - Interrupt handler bottom-half processing.
- Scans the runqueue for the highest priority process.
  - Real-time process: if the current process policy is SCHED_RR and its quantum is used up, provide a new quantum and place the process at the end of the runqueue.
  - If there is no runnable process in the runqueue on a multi-processor system, attempt load balancing.
- If all processes have used up their quantum, start a new epoch.
  - Assign a new quantum to all processes.
- If a process with higher priority than the current process is found, the scheduler performs a context switch.
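The priority-array scan that schedule() performs in the Linux 2.6 O(1) scheduler can be sketched in user space as follows. This is a simplified model under stated assumptions, not the kernel's code: priority levels are indexed 0 to 139, the per-priority task lists are reduced to counters, and the names prio_array, rq_init, enqueue, dequeue, first_prio, expire and pick_next are illustrative. The kernel uses a dedicated find-first-bit routine where this sketch uses a short loop of bounded length.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_PRIO 140                      /* priority levels 0..139 (assumed indexing) */
#define BITMAP_WORDS ((MAX_PRIO + 63) / 64)

struct prio_array {
    uint64_t bitmap[BITMAP_WORDS];        /* bit p set: level p has runnable tasks */
    int nr_queued[MAX_PRIO];              /* stand-in for the per-level task lists */
};

struct runqueue {
    struct prio_array arrays[2];          /* storage for both arrays */
    struct prio_array *active;            /* tasks with quantum left */
    struct prio_array *expired;           /* tasks that used up their quantum */
};

void rq_init(struct runqueue *rq)
{
    memset(rq, 0, sizeof *rq);
    rq->active = &rq->arrays[0];
    rq->expired = &rq->arrays[1];
}

void enqueue(struct prio_array *a, int prio)
{
    assert(prio >= 0 && prio < MAX_PRIO);
    a->nr_queued[prio]++;
    a->bitmap[prio / 64] |= (uint64_t)1 << (prio % 64);
}

void dequeue(struct prio_array *a, int prio)
{
    assert(prio >= 0 && prio < MAX_PRIO && a->nr_queued[prio] > 0);
    if (--a->nr_queued[prio] == 0)        /* last task at this level: clear its bit */
        a->bitmap[prio / 64] &= ~((uint64_t)1 << (prio % 64));
}

/* Lowest set bit is the highest priority. Work is bounded by the
 * fixed bitmap size, independent of the number of tasks. */
int first_prio(const struct prio_array *a)
{
    for (int w = 0; w < BITMAP_WORDS; w++) {
        uint64_t word = a->bitmap[w];
        if (word == 0)
            continue;
        int bit = 0;
        while ((word & 1) == 0) {
            word >>= 1;
            bit++;
        }
        return w * 64 + bit;
    }
    return -1;                            /* array is empty */
}

/* A task at level prio used up its quantum: move it to the expired array. */
void expire(struct runqueue *rq, int prio)
{
    dequeue(rq->active, prio);
    enqueue(rq->expired, prio);
}

/* Pick the highest priority level to run. If the active array is empty,
 * start a new epoch by swapping the active and expired pointers. */
int pick_next(struct runqueue *rq)
{
    int prio = first_prio(rq->active);
    if (prio < 0) {
        struct prio_array *tmp = rq->active;
        rq->active = rq->expired;
        rq->expired = tmp;
        prio = first_prio(rq->active);
    }
    return prio;
}
```

The pointer swap in pick_next is the reason a new epoch costs the same regardless of how many tasks are queued: no task is moved, only the roles of the two arrays change.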
Kernel Scheduling: Linux 2.6 Scheduler Algorithm, Scheduling Classes
- SCHED_FIFO: Real-time FIFO scheduling; preempts other processes. Does not have a quantum. Runs until:
  - Process completes.
  - Process is blocked by an I/O call.
  - A higher priority SCHED_FIFO process becomes runnable.
  - Process makes the sched_yield() system call.
  - Group scheduling reserves minimum time for other processes (prevents a FIFO process from locking the CPU).
- SCHED_RR: Real-time round-robin scheduling, similar to SCHED_FIFO, but with quantums; can be preempted by a SCHED_FIFO process.
- SCHED_NORMAL: Conventional time-shared process. Non-real-time.
- SCHED_BATCH: Similar to SCHED_NORMAL, but the process cannot be considered an interactive process (quantum stays the same).
  - Always a background process (e.g. tape backup process).
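From user space, the scheduling classes above are selected through the POSIX scheduling calls in <sched.h>. The sketch below queries the valid static priority range for a policy and requests SCHED_FIFO for the calling process. The helper names prio_valid_for_policy and try_set_fifo are illustrative, and the request is expected to fail with an error on most systems unless the caller has the privilege to use real-time policies.

```c
#include <assert.h>
#include <sched.h>

/* Return 1 if prio is a valid static priority for the given policy
 * on this system, 0 otherwise. */
int prio_valid_for_policy(int policy, int prio)
{
    int lo = sched_get_priority_min(policy);
    int hi = sched_get_priority_max(policy);

    if (lo < 0 || hi < 0)     /* policy not supported */
        return 0;
    return prio >= lo && prio <= hi;
}

/* Ask the kernel to run the calling process under SCHED_FIFO at the
 * given static priority. Returns 0 on success, -1 on failure (for
 * example when the caller lacks real-time privileges). */
int try_set_fifo(int prio)
{
    struct sched_param sp = { .sched_priority = prio };

    assert(prio_valid_for_policy(SCHED_FIFO, prio));
    return sched_setscheduler(0, SCHED_FIFO, &sp);
}
```

A caller would typically check prio_valid_for_policy() first, then call try_set_fifo() and fall back to the default time-shared policy if it returns -1.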