Implement a simple Embedded Operating System

Source: Internet
Author: User
Implement a simple Embedded Operating System (1)  

Implement an embedded operating system that can't do anything

1. First choose the CPU. For simplicity we use an embedded CPU, such as one from the ARM family. The convenience of this kind of reduced instruction set (RISC) CPU is that there is no distinction between real mode and protected mode, and it uses flat, unified linear addressing, so neither segmented nor paged memory management is required. In addition, the chip integrates common peripheral controllers, such as an Ethernet controller and serial ports, so it does not need the many peripheral chips found on a PC motherboard.
2. Decide which modules and functions to implement. For simplicity, only multitasking scheduling (limited, for example, to at most 10 tasks) and basic interrupt handling (without interrupt priorities) are implemented. There is no interactive shell, no dynamic module loading, and no fork (that is, to add user programs to your operating system you can only compile them statically into the kernel). File systems, networking, PCI, USB, disks, and other peripherals are not supported (except the serial port, which is the easiest to use). Virtual memory management is not supported either (that is, every task can access any address, so a single faulty program can bring down the whole operating system).
3. Choose the compiler. GCC is used here, and the object files use the ELF format; the final image is, of course, a raw binary (bin) file. GCC is closely tied to Linux, and programs built with it normally rely on C library and system call support from the host operating system. For your own operating system you therefore need to trim down a C library yourself and implement the system calls yourself.
4. Implementation steps: first select the CPU and set up a cross-compilation environment, then write the bootloader, and finally write the operating system.

Implement a simple Embedded Operating System (2)  

How to implement the bootloader

1. One reason for implementing a dedicated bootloader is that it makes the system easier to port and upgrade; another is that it makes debugging the operating system easier. You can, of course, merge its functions into the operating system instead.
2. Decide what a simple bootloader must do. Here it needs only two main functions: first, load the operating system into memory and run it; second, write itself and the operating system kernel into the ROM storage area (the ROM here can be many kinds of device, such as the flash in an embedded chip or, on a PC, a floppy disk, USB drive, or hard disk).

3. Writing the bootloader:
Step 1: Perform the initial hardware setup. Take the AT91RM9200 embedded board as an example (this chip is used from here on, mainly because I am familiar with it). First, switch the CPU to system mode, disable interrupts, disable the watchdog, set up the memory mapping according to the actual board, and initialize the memory controller, including the parameters of the memory in use, the refresh rate, and so on. Second, set the system operating frequency, including the use of the external crystal oscillator and the CPU, bus, and peripheral clock frequencies. Third, configure the system interrupts, including the timer interrupt, whether to use FIQ, the external interrupts, and the interrupt priorities. Only two priority levels are implemented here: the clock interrupt is higher and all the others share the same lower level. During initialization, point all of these interrupt vectors at 0x18 and disable all interrupts. If the board has a flash device, also set the flash-related control registers. Fourth, disable the cache. At this point the chip-specific initialization is complete.

Step 2: The ARM interrupt vector table differs from that of a PC chip. When an exception or interrupt occurs, the CPU jumps directly into a small region starting at 0x0. The ARM core itself fixes this, and the exact address depends on the exception type: reset, undefined instruction, SWI, prefetch abort, data abort, IRQ, or FIQ. Once the CPU has entered the corresponding vector table entry, the user's code has to take over the handling, so you must write the interrupt vector table yourself. The vector table holds jump instructions. For example, when an IRQ occurs the CPU automatically jumps to 0x18, where it finds a jump instruction written by the user. If that instruction jumps to 0x20010000, then that address is the common entry point for all IRQ handling. One CPU may have many IRQ sources, so how are they told apart at this common entry point? That is decided by the user's code; see the relevant section later for a concrete implementation. The vector table usually lives in a file such as vector.s. You can name it whatever you like, but it must be placed at 0x0 at link time.
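
As a rough illustration of the vector table described above, the following host-side sketch builds the words a tool might place at 0x00 to 0x3C. It assumes the common ARM idiom `ldr pc, [pc, #0x18]` (encoded as 0xE59FF018), which loads the handler address from a literal word 0x20 bytes after the vector; a plain branch instruction could not reach 0x20010000 from 0x18. The function and constant names are illustrative and not from the original article; only the IRQ entry address 0x20010000 comes from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offsets of the ARM exception vectors (fixed by the architecture). */
enum {
    VEC_RESET = 0x00, VEC_UNDEF = 0x04, VEC_SWI = 0x08, VEC_PABORT = 0x0C,
    VEC_DABORT = 0x10, VEC_RESERVED = 0x14, VEC_IRQ = 0x18, VEC_FIQ = 0x1C
};

#define LDR_PC_PC_0x18 0xE59FF018u  /* ldr pc, [pc, #0x18] */

/*
 * Fill a 16-word image meant to be linked at address 0x0:
 * words 0..7 are the vectors, words 8..15 the handler addresses they load.
 * On ARM the PC reads as (vector address + 8), so vector + 8 + 0x18
 * lands on the literal word 0x20 bytes after the vector.
 */
void build_vector_table(uint32_t image[16], const uint32_t handlers[8])
{
    for (int i = 0; i < 8; i++) {
        image[i] = LDR_PC_PC_0x18;
        image[8 + i] = handlers[i];
    }
}

/* Address the CPU ends up at after taking the exception at vec_offset. */
uint32_t vector_target(const uint32_t image[16], uint32_t vec_offset)
{
    assert(vec_offset <= VEC_FIQ && vec_offset % 4 == 0);
    uint32_t literal_addr = vec_offset + 8 + 0x18;  /* PC + immediate */
    return image[literal_addr / 4];
}
```

With the IRQ handler set to 0x20010000, `vector_target(image, VEC_IRQ)` returns that address, which matches the behaviour the step describes.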

Step 3: Set up the stacks. Several stacks are generally used: one is the IRQ stack and another is the system-mode stack (system mode and user mode share the same registers and memory space, which is chosen here mainly for simplicity). The stacks are needed mainly for calling functions and storing local variables, since it is impractical to write everything in assembly or to do without local variables.

Step 4: Copy the code and data segments that will be used later into memory, and clear the BSS segment.
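
The copy-and-clear step could look roughly like the sketch below. On real hardware the five addresses would normally come from symbols defined in the linker script; here they are plain parameters so the routine can be tried on a host machine. The function name `relocate_image` and its parameter names are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Copy the initialized image from its load address (flash) to its run
 * address (RAM), then zero the BSS range. All ranges are word-aligned
 * and half-open: [start, end).
 */
void relocate_image(const uint32_t *load_start,
                    uint32_t *run_start, uint32_t *run_end,
                    uint32_t *bss_start, uint32_t *bss_end)
{
    assert(run_start <= run_end && bss_start <= bss_end);

    const uint32_t *src = load_start;
    for (uint32_t *dst = run_start; dst < run_end; dst++)
        *dst = *src++;              /* code and data: copy word by word */

    for (uint32_t *p = bss_start; p < bss_end; p++)
        *p = 0;                     /* BSS: C expects it to start as zero */
}
```

Copying word by word keeps the routine free of library calls, which matters because no C library is available this early in boot.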

Step 5: Initialize the serial port (mainly for user interaction and file transfer with the PC) and initialize the flash (the flash stores the bootloader and the kernel), then write the flash driver. The driver here differs from what we usually call a driver. Flash is not like SDRAM, where once the controller is configured you can directly read and write data at any address: a flash write operates on a block of data at a time rather than byte by byte. For details, refer to the relevant materials.

Step 6: Wait a set number of seconds for user input. If the user enters no character within that time, the bootloader reads the whole kernel from the specified location in flash (which the user can specify) into memory and jumps to the first instruction of the kernel. The location in memory is also up to the user; you can do as Linux does and use the start of memory plus 0x8000. If the user does type a character within that time (this is mainly for convenience during development, and the code can be removed once development is finished), the bootloader interacts with the user over the serial port and accepts commands, for example a request to download a file to a specified location in flash. For the details, refer to U-Boot and other open source projects.

At this point the bootloader is complete. It is very simple: it writes files uploaded from the PC into flash, loads the operating system kernel from flash into memory, and hands control of the CPU to the operating system. The next part explains how to write the simplest operating system.
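
The wait-for-key decision in step 6 can be sketched as below. This is only an outline under stated assumptions: the timeout is expressed as a number of polls rather than seconds, the serial port is represented by a callback, and the names `wait_for_key`, `key_poll_fn`, and `key_on_nth_poll` are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true if a character has arrived on the serial port. */
typedef bool (*key_poll_fn)(void *ctx);

enum boot_action { BOOT_KERNEL, ENTER_CONSOLE };

/*
 * Poll the serial port up to `polls` times. A key press drops into the
 * interactive console; otherwise the kernel is booted automatically.
 */
enum boot_action wait_for_key(unsigned polls, key_poll_fn poll, void *ctx)
{
    assert(poll != 0);
    for (unsigned i = 0; i < polls; i++) {
        if (poll(ctx))
            return ENTER_CONSOLE;
    }
    return BOOT_KERNEL;
}

/* Stand-in for the UART: reports a key on the n-th poll, never if n is 0. */
bool key_on_nth_poll(void *ctx)
{
    unsigned *remaining = ctx;
    if (*remaining == 0)
        return false;
    return --*remaining == 0;
}
```

On a real board the poll callback would read the UART status register, and the poll count would be derived from a timer so that the wait lasts the intended number of seconds.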

Implement the simplest Embedded Operating System (3)  

How to Implement the simplest Operating System

For simplicity, we do not consider portability, do not accept parameters from the bootloader, do not perform hardware detection, and do not relocate the data or code segments. I have only read the Linux kernel and have never implemented an operating system myself, so what follows is only conceptual:

1. Take over the system's interrupt handling. The boot code sets up the interrupt vector table, so it determines where the system's interrupts go, but the bootloader does not know where the operating system's interrupt handling functions are. What can be done? There are several methods. One is this: if your board can remap addresses, remap the memory so that it starts at 0x0, and link the kernel so that the operating system's interrupt vector table is located at 0x0. The remap is performed at the very end of the bootloader, after which the CPU jumps to 0x0 to execute. If the board has no such function, I do not know the proper way, but I can think of a compromise: at the end of the bootloader (that is, when control of the CPU is handed over to the operating system kernel), rewrite the 0x0 region of flash so that the kernel's interrupt entries are written at 0x0 in flash. For example, when an IRQ occurs the CPU jumps to 0x18 (assume that flash occupies addresses 0x0 to 0x0fffffff and memory occupies 0x20000000 to 0x2fffffff). At the end, the bootloader changes the code at 0x18 into a jump to 0x20000000 plus 0x18. That address holds the corresponding jump instruction in the kernel's interrupt vector table, so this is equivalent to hooking up the kernel's IRQ handling function. When the system is powered on again, the interrupt vector table has already been modified, which is acceptable only if the bootloader itself does not need interrupts.

2. For simplicity, paged memory management is not used, so there is no need to create page tables or perform related operations. The system stacks are set up in the same way as in the bootloader. Then the BSS segment is cleared. Here this means the operating system's BSS segment, which has the same meaning as the bootloader's BSS segment but is used in a different place. Finally, jump to the main function.

3. To keep things as simple as possible, use a static array of task structures. For example, to create only ten tasks, first allocate memory for the ten task structures. It can be allocated on the heap (in which case it is not released until the operating system ends), or you can designate a memory area that no other part of the operating system uses (a rather amateurish approach). The pointer to the task structure array is a global variable, stored in the BSS segment or the data segment. Since a system stack was already allocated in the previous step, the ten tasks share that overall stack area. The key point here is how to define the fields of each task structure; you can refer to the relevant parts of Linux for the design.

4. Interrupt handling. Step 1 fixed the jump addresses for the CPU's exception types, and each type of interrupt has only one entry address. The interrupt handling code performs several actions:
First, push all registers onto the stack. The stack here is the IRQ stack set up in step 2.
Second, mask all interrupts. For simplicity, an interrupt handler is not allowed to be interrupted in turn.
Third, read the interrupt-related registers to identify which interrupt occurred, and jump to the corresponding interrupt handling function (there are only two kinds of interrupt here: the clock interrupt and the SWI interrupt, that is, the so-called soft interrupt).
Fourth, when the handling is finished, re-enable interrupts, pop the registers to restore the context, and return control of the CPU to the interrupted task.
In main, first decide which interrupts the whole system will handle, that is, which interrupt handling functions are needed, and then write those functions.
Also, this operating system does not handle virtual memory, and it does not even handle CPU exceptions (everything is kept simple). If an exception occurs, the system crashes.
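
One possible shape for the "identify the source and jump to its handler" step is a table of function pointers indexed by interrupt source, as sketched below. The bitmask `pending` stands in for whatever the chip's interrupt controller reports, and the names `register_irq`, `dispatch_irq`, and `timer_isr`, as well as the source numbers, are assumptions for illustration; the real register layout comes from the chip documentation.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IRQ_SOURCES 32

typedef void (*irq_handler_t)(void);

static irq_handler_t irq_table[MAX_IRQ_SOURCES];
static unsigned timer_ticks;

/* Install a handler for one interrupt source; returns 0 on success. */
int register_irq(unsigned source, irq_handler_t handler)
{
    if (source >= MAX_IRQ_SOURCES || handler == NULL)
        return -1;
    irq_table[source] = handler;
    return 0;
}

/*
 * Called from the common IRQ entry once registers are saved and interrupts
 * are masked. Runs the handler of the lowest-numbered pending source and
 * returns that source number, or -1 if nothing could be handled.
 */
int dispatch_irq(unsigned long pending)
{
    for (unsigned source = 0; source < MAX_IRQ_SOURCES; source++) {
        if ((pending & (1ul << source)) && irq_table[source] != NULL) {
            irq_table[source]();
            return (int)source;
        }
    }
    return -1;
}

/* Example handler: counts clock interrupts. */
void timer_isr(void) { timer_ticks++; }
unsigned timer_tick_count(void) { return timer_ticks; }
```

Because interrupts stay masked for the whole handler in this design, handling one source per entry is enough; any source still pending simply raises the IRQ again after the return.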

5. Timer implementation. First decide the time slice. To make the system more stable, and because we do not need real-time behaviour, the time slice can be set a little longer. For example, suppose we want a task to run for 20 ticks; the timer frequency then determines how many milliseconds each system tick lasts. Here the system timer interrupts once every 5 milliseconds, which requires programming the clock registers (for details, refer to the chip documentation). It follows that a task can run for at most 100 milliseconds at a time.
Note: our operating system does not support kernel preemption and supports only two levels of interrupt priority: the clock interrupt is a little higher and all the others are a little lower. Even this is cancelled out in the interrupt handling code, because interrupts are disabled as soon as interrupt handling starts, so however high the priority of another interrupt is, it has no effect. The advantage is simplicity, but the drawbacks are obvious: if an interrupt handling function enters an endless loop, the whole system hangs, and the time slices become inaccurate. Neither real-time behaviour nor a real-time clock is needed here. For how to set the interrupt priority, see the chip documentation.
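
The arithmetic behind these numbers can be written down as a small sketch. The slice length follows directly from the text (20 ticks of 5 ms each). The reload calculation assumes a simple down-counting timer clocked at `timer_hz`; the function names are illustrative, and the actual register and clock source depend on the chip.

```c
#include <assert.h>
#include <stdint.h>

#define TICK_MS          5u    /* one timer interrupt every 5 ms */
#define TICKS_PER_SLICE  20u   /* ticks a task may run before switching */

/* Longest time a task runs before it is switched out, in milliseconds. */
uint32_t slice_ms(void)
{
    return TICK_MS * TICKS_PER_SLICE;
}

/* Counter reload value for a timer counting timer_hz cycles per second. */
uint32_t timer_reload(uint32_t timer_hz, uint32_t tick_ms)
{
    assert(tick_ms > 0);
    return (uint32_t)(((uint64_t)timer_hz * tick_ms) / 1000u);
}
```

For instance, with an assumed 32768 Hz timer clock a 5 ms tick needs a reload value of 163 (the division truncates), so the real tick is slightly shorter than 5 ms.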

6. Process scheduling is implemented in the do_timer function (the clock interrupt handling function). A global pointer points to the current entry in the task structure array (or linked list). When the clock interrupt occurs, do_timer is entered and first checks whether the time slice in the task structure has been used up. If not, it decrements the time slice by one and returns from the interrupt, so that the CPU continues to run the current task. If the time slice is used up, it resets the time slice and searches the task structure array for the next task waiting to run. If one is found, it switches to the new task (for how to switch, see the next part); if none is found, it switches to the idle task (similar to Linux; everything here imitates Linux, because I am not skilled enough to do otherwise). Note: for simplicity, neither task priorities nor task sleep are implemented. Once the ten tasks are fixed statically, they are executed one after another in order. In addition, a task can never end: the last code in each task must be an endless loop, otherwise the system runs out of control. Tasks also do not support signals, and there are no sleep or wake-up operations. The CPU is not human, so it needs no human rights! Could scheduling be any simpler?
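
A host-side sketch of this round-robin logic follows. It is an interpretation of the description rather than the author's code: tasks are array slots, slot 0 is the idle task, and `do_timer` returns the index of the task that should run next instead of performing the real context switch. The names `sched_init` and `task_enable` are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_TASKS   10
#define TIME_SLICE 20

struct task {
    int  ticks_left;   /* remaining time slice, in ticks */
    bool runnable;     /* task body exists and may be scheduled */
};

static struct task tasks[NR_TASKS];
static int current;    /* index of the running task; 0 is the idle task */

void sched_init(void)
{
    for (int i = 0; i < NR_TASKS; i++) {
        tasks[i].ticks_left = TIME_SLICE;
        tasks[i].runnable = false;
    }
    tasks[0].runnable = true;
    current = 0;
}

void task_enable(int id)
{
    assert(id > 0 && id < NR_TASKS);
    tasks[id].runnable = true;
}

/* Clock interrupt handler: returns the task to run after this tick. */
int do_timer(void)
{
    struct task *t = &tasks[current];

    if (t->ticks_left > 0) {        /* slice not used up: keep running */
        t->ticks_left--;
        return current;
    }

    t->ticks_left = TIME_SLICE;     /* slice used up: reset and search */
    for (int step = 1; step < NR_TASKS; step++) {
        int next = (current + step) % NR_TASKS;
        if (next != 0 && tasks[next].runnable) {
            current = next;
            return current;
        }
    }
    current = 0;                    /* nothing else runnable: go idle */
    return current;
}
```

Following the description literally, the switch happens on the tick that finds the counter already at zero, so a task is interrupted 21 times per slice; checking after the decrement instead would give exactly 20 ticks.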

7. The serial port does not use interrupts, which does the most to reduce the difficulty. It is accessed by polling (in blocking mode, of course), and only writing is supported, not reading. Reading would need the interrupt method, because polling does not work well for reads: while a task is reading, its time slice may run out and the system may switch to another task, and the data you type on the PC's serial port is then lost. (Keeping it simple.)
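
A polled, write-only serial routine might look like the sketch below. The register layout in `struct uart_regs` and the `UART_TX_READY` bit are hypothetical placeholders, since the real offsets and bits come from the chip documentation; passing the register block as a pointer lets the same code run against a fake device on a host machine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical UART registers; real layouts come from the chip manual. */
struct uart_regs {
    volatile uint32_t status;    /* bit 0 set: transmitter can take a byte */
    volatile uint32_t tx_data;   /* writing here sends one character */
};

#define UART_TX_READY 0x1u

/* Blocking write of one character: spin until the transmitter is ready. */
void uart_putc(struct uart_regs *uart, char c)
{
    while ((uart->status & UART_TX_READY) == 0)
        ;                        /* busy-wait, no interrupt involved */
    uart->tx_data = (uint8_t)c;
}

/* Write a NUL-terminated string; returns the number of characters sent. */
unsigned uart_puts(struct uart_regs *uart, const char *s)
{
    unsigned sent = 0;
    assert(uart != 0 && s != 0);
    while (*s != '\0') {
        uart_putc(uart, *s++);
        sent++;
    }
    return sent;
}
```

The busy-wait burns part of the caller's time slice, but a task switch in the middle of a write only delays the output; nothing is lost, which is why polling is acceptable for writes but not for reads.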

8. The last step is the final part of the main function. The current flow of execution is treated as the idle task (which amounts to modifying data in the task structure array). Enable interrupts, and put the current task into an endless loop so that it never exits.

9. Compile your bootloader and kernel, burn them to flash, and debug repeatedly.

10. Connect the serial port of your AT91RM9200 (or a similar chip) to the PC, open HyperTerminal, and turn on the board power. With luck your operating system will print "Hello, world!!!", and one of the simplest operating systems is born.

The next part describes the implementation of specific functional modules.

Implement a simple Embedded Operating System (4)  

Task Structure array (or linked list) Implementation

Our task structures are organized as a linked list, but its length is limited. The head pointer is a global pointer variable (an unsigned integer pointer; the pointer itself lives in the BSS segment, but it points to memory allocated on the heap). Kernel memory is allocated with kmalloc, which you must write yourself. For simplicity, this function accepts only one parameter, the size to allocate. The function is easy to write. There is a global pointer that is set during initialization to the start of the whole heap. The first region has a fixed size and is called the kernel heap; after the kernel heap come the user heaps. There are ten tasks in total, not counting the kernel's own task, so the whole heap is divided evenly into eleven parts. (Note: after all tasks are initialized, one further step moves the kernel task to user state, which amounts to modifying the stack pointer in its task structure.) kmalloc must check whether the requested size exceeds what the kernel heap can provide. To maintain the kernel heap and the heaps of the other tasks, the memory is divided into blocks and a global memory-usage map is kept. An array is the simple choice: 0 means the corresponding block is free and 1 means it is occupied (the corresponding kfree simply sets the flag back to 0). Memory maintenance is complicated, so for simplicity the block size is fixed at 4 KB and requests larger than 4 KB are not allowed: since there is no concept of virtual addresses, contiguous allocation beyond 4 KB cannot be guaranteed on the heap. Allocations on the stack can, of course, be larger than 4 KB; the stack is managed by the compiler and the CPU.
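
A minimal sketch of such a block allocator is shown below, assuming a 64 KB kernel heap split into sixteen 4 KB blocks and tracked by a 0/1 usage array as described. The heap is a static array here so the code can run on a host; on the board it would be the memory region starting at the heap base address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE   4096u
#define KHEAP_BLOCKS 16u                     /* 64 KB kernel heap (assumed) */

static uint8_t kheap[KHEAP_BLOCKS * BLOCK_SIZE];
static uint8_t block_used[KHEAP_BLOCKS];     /* 0 = free, 1 = occupied */

/* Allocate one 4 KB block; requests above 4 KB are refused. */
void *kmalloc(size_t size)
{
    if (size == 0 || size > BLOCK_SIZE)
        return NULL;
    for (size_t i = 0; i < KHEAP_BLOCKS; i++) {
        if (!block_used[i]) {
            block_used[i] = 1;
            return &kheap[i * BLOCK_SIZE];
        }
    }
    return NULL;                             /* kernel heap exhausted */
}

/* Release a block by clearing its flag in the usage map. */
void kfree(void *ptr)
{
    if (ptr == NULL)
        return;
    size_t offset = (size_t)((uint8_t *)ptr - kheap);
    assert(offset < sizeof kheap && offset % BLOCK_SIZE == 0);
    block_used[offset / BLOCK_SIZE] = 0;
}
```

Every request consumes a whole block regardless of its size, which wastes memory but avoids all bookkeeping about fragments; that trade-off matches the article's preference for simplicity.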

The task structure contains:
1. The remaining time slice.
2. The memory address of the task's code segment, which is also its function entry address.
3. The address of the task's data segment. The data segment is part of the whole kernel image, so this field is unused and reserved.
4. Whether the task's function body exists and whether it can be scheduled.
5. The stack pointer used by the task.
6. The heap pointer used by the task.
7. The task ID. 0 indicates the idle task; 1 and above indicate the other tasks.
8. The values of all registers.
9. The current PC value, which is set to the function entry address during initialization.
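
The nine fields listed above could map onto a C structure like the one sketched here. The field names, the use of 32-bit fields throughout, and the choice of sixteen saved registers are assumptions; the article only lists what the structure must hold. The compile-time check reflects the later requirement that ten structures fit in one 4 KB block.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_TASKS   10
#define NR_REGS    16     /* assumed: r0-r15 */
#define TIME_SLICE 20

struct task_struct {
    uint32_t time_slice;      /* 1. remaining time slice, in ticks */
    uint32_t code_addr;       /* 2. code segment / function entry address */
    uint32_t data_addr;       /* 3. data segment address (reserved) */
    uint32_t schedulable;     /* 4. nonzero if the task exists and may run */
    uint32_t stack_ptr;       /* 5. stack pointer */
    uint32_t heap_ptr;        /* 6. heap pointer */
    uint32_t id;              /* 7. 0 = idle task */
    uint32_t regs[NR_REGS];   /* 8. saved register values */
    uint32_t pc;              /* 9. PC to resume at */
};

/* Ten task structures must fit in one 4 KB allocation. */
_Static_assert(sizeof(struct task_struct) * NR_TASKS <= 4096,
               "task array must fit in one 4 KB block");

/* Fill in one task so that its first run starts at `entry`. */
void task_init(struct task_struct *t, uint32_t id, uint32_t entry,
               uint32_t stack_top, uint32_t heap_base)
{
    memset(t, 0, sizeof *t);
    t->time_slice  = TIME_SLICE;
    t->code_addr   = entry;
    t->schedulable = 1;
    t->stack_ptr   = stack_top;
    t->heap_ptr    = heap_base;
    t->id          = id;
    t->pc          = entry;
}
```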

First, the initialization of the task structure array:
Define a global pointer, cast it to a task structure pointer, and use the kmalloc function to allocate, from the heap owned by the kernel (as mentioned earlier, the kernel heap starts at the beginning of the whole heap), the memory for the ten task structures. This never exceeds 4 KB. Then assign values to the ten task structures. Set the first task to be the idle task: its time slice is 20, its code segment address is the address of the main function, its data segment address is ignored, and its function body exists and can be scheduled. The positions of the heap pointer and stack pointer are calculated as follows:
Assume that each task is given a 64 KB region and that the whole heap starts at 0x20030000. The first heap pointer then points to 0x20030000 and the first stack pointer to 0x20030000 + 64 KB; the second task's region comes after that, and so on.
Note: before the task structures are initialized, the system may not use the heap, but it may use the stack. The kernel task's stack therefore has two phases. Before scheduling starts, the stack is the one set up in step 2 of the previous part, so when you set up that stack you must leave room for the ten 64 KB regions; in this step, use as much of the previously reserved stack space as possible.
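
The address calculation can be checked with a few lines of C. The base address 0x20030000, the 64 KB region size, and the eleven regions come from the text; the helper names are invented, and placing each stack at the top of its region (growing downwards towards the heap at the bottom) is an assumption about how heap and stack share the region.

```c
#include <assert.h>
#include <stdint.h>

#define HEAP_BASE   0x20030000u
#define REGION_SIZE (64u * 1024u)
#define NR_REGIONS  11u            /* kernel task plus ten user tasks */

/* Start of region `index`; the task's heap grows up from here. */
uint32_t region_heap_base(uint32_t index)
{
    assert(index < NR_REGIONS);
    return HEAP_BASE + index * REGION_SIZE;
}

/* Top of region `index`; the task's stack grows down from here. */
uint32_t region_stack_top(uint32_t index)
{
    return region_heap_base(index) + REGION_SIZE;
}
```

Region 0 therefore spans 0x20030000 to 0x20040000, and the last of the eleven regions ends at 0x200E0000.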

Next, what happens during task switching:
On entry to the common interrupt handling code, all registers are pushed onto the IRQ stack and their values are copied to the corresponding fields of the current task structure. The current PC value of the interrupted task is also extracted and stored in the corresponding field of the current task structure. The interrupt type is then identified so that the corresponding handling function can be entered; here that is the do_timer function. The process after entering do_timer is as follows.
The kernel has another global pointer, the current task pointer, which also lives in the system's BSS segment and is defined in the same way as the global pointer in the previous step. After the initialization in the previous step, it points to the first task structure, at 0x20030000. When the clock interrupt occurs, do_timer takes this pointer and reads the time slice field of the task structure.
If the time slice is 0: save the user-mode stack pointer and the heap pointer into the current task structure, search for the structure of a task that can be scheduled, assign it to the current task pointer, and set the task switching flag. This flag is also a global variable, but because it has an initial value it is placed in the data segment of the system. Then return from do_timer.
If the time slice is not 0: decrement it by one and return from do_timer.
After do_timer returns, check the task switching flag.
If the flag is 0, no task switch is needed. Pop all registers from the stack (the IRQ stack), re-enable interrupts, switch to user mode, and exit the interrupt handler by loading the current PC value field of the current task structure.
If the flag is 1, a task switch is needed. Pop all registers from the stack (the IRQ stack), restore the register values saved in the current task structure to the corresponding registers, restore the user-mode stack pointer and the heap pointer from the current task structure, reset the task switching flag to 0, re-enable interrupts, and switch to user mode. The task switch itself is performed by loading the PC value: the interrupt handler exits by loading the current PC value field of the current task structure, which now belongs to the new task.
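
The save, switch, and restore sequence can be modelled on a host as plain structure copies, as in the sketch below. It deliberately ignores ARM processor modes and banked registers, which real code has to deal with in assembly; the names `irq_save`, `schedule_to`, and `irq_restore` are invented, and the structures are simplified for the example.

```c
#include <assert.h>
#include <stdint.h>

#define NR_GP_REGS 13              /* r0-r12 */

struct context {
    uint32_t r[NR_GP_REGS];
    uint32_t sp;
    uint32_t pc;
};

struct task {
    struct context ctx;            /* saved state of a suspended task */
};

struct task *current_task;         /* global current task pointer */
int need_switch;                   /* task switching flag */

/* IRQ entry: record the interrupted state in the current task structure. */
void irq_save(const struct context *interrupted)
{
    assert(current_task != 0);
    current_task->ctx = *interrupted;
}

/* Called by the scheduler when the time slice has run out. */
void schedule_to(struct task *next)
{
    current_task = next;
    need_switch = 1;
}

/*
 * IRQ exit: the state to resume always comes from the current task
 * structure. If a switch was requested, that structure now belongs to
 * the new task, so loading its PC completes the switch.
 */
struct context irq_restore(void)
{
    need_switch = 0;
    return current_task->ctx;
}
```

Because the interrupted state is always written to the task structure on entry, the exit path is the same whether or not a switch happened; only the task that `current_task` points to differs.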

Implementation of system calls

This system does not really need system calls, because kernel mode and user mode are not protected from each other. In your own C library, every function can be implemented like kmalloc, by using the function prototype in the kernel directly. As an extension, let us look at how a system call would work, using a malloc system call as the example.

There is a second heap pointer (kmalloc above also has a heap pointer, but that one belongs to the kernel task). This heap pointer is used for user state. It is given its initial value before system initialization completes: the start of the heap used by the first task structure, that is, the start of the kernel heap plus 64 KB.
The malloc function in the library is implemented as follows:
1. Check whether the requested size exceeds 4 KB. If it does, return an error.
2. Invoke the system call (here _syscall1, which passes a single parameter, the size to allocate).
The system call function _syscall1 is implemented as follows:
1. Push the registers onto the stack (the stack of the current task).
2. Put system call number 1 in R0 and the parameter in R1.
3. Issue a SWI instruction to generate a SWI interrupt (that is, a soft interrupt, or trap).
When the interrupt occurs, the CPU enters the SWI interrupt entry. The SWI entry function is implemented as follows:
1. Read R0, examine its value, and branch to the corresponding handling code.
2. Here that is the _malloc handling code: read R1, get the current heap pointer mentioned above, mark the requested blocks in the corresponding entries of the memory-usage map, put the current heap pointer in R0, advance the current heap pointer, update the heap pointer in the current task structure, switch to user state, and return from the SWI interrupt.
Handling the return in _syscall1 (for simplicity, tasks are not rescheduled on the return from kernel state to user state, so these steps are fairly simple):
1. After the return from the SWI interrupt, the system is running in user state. Take the value of R0 and assign it to the pointer that receives the allocated memory.
2. Pop the registers in user state and return to the calling function.
Return from malloc: malloc returns the pointer directly, and the whole malloc process is over.
Other system calls follow a similar process.
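
The call path from malloc through _syscall1 to the SWI entry can be imitated on a host by passing a small register frame to an ordinary function, as sketched below. This is a simplification under stated assumptions: the real SWI instruction and mode switch are replaced by a direct call, the memory-usage map is omitted, the user heap is limited to one assumed 64 KB region, and the names `swi_handler`, `syscall1`, and `user_malloc` are illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define SYS_MALLOC 1u
#define BLOCK_SIZE 4096u

/* The two registers the calling convention in the text uses. */
struct swi_frame {
    uint32_t r0;    /* in: system call number, out: result */
    uint32_t r1;    /* in: the single parameter */
};

/* User heap pointer: starts at the kernel heap start plus 64 KB. */
static uint32_t user_heap_ptr = 0x20040000u;
static const uint32_t user_heap_end = 0x20050000u;   /* assumed limit */

/* Kernel side of malloc: hand out the next 4 KB block, or 0 on failure. */
static uint32_t sys_malloc(uint32_t size)
{
    if (size == 0 || size > BLOCK_SIZE ||
        user_heap_end - user_heap_ptr < BLOCK_SIZE)
        return 0;
    uint32_t block = user_heap_ptr;
    user_heap_ptr += BLOCK_SIZE;
    return block;
}

/* SWI entry: branch on the call number in r0, leave the result in r0. */
void swi_handler(struct swi_frame *frame)
{
    assert(frame != 0);
    switch (frame->r0) {
    case SYS_MALLOC:
        frame->r0 = sys_malloc(frame->r1);
        break;
    default:
        frame->r0 = 0;              /* unknown system call */
        break;
    }
}

/* Library side: stand-in for _syscall1 (a real one would execute SWI). */
uint32_t syscall1(uint32_t number, uint32_t arg)
{
    struct swi_frame frame = { number, arg };
    swi_handler(&frame);
    return frame.r0;
}

/* Library malloc: reject oversized requests before trapping. */
uint32_t user_malloc(uint32_t size)
{
    if (size > BLOCK_SIZE)
        return 0;
    return syscall1(SYS_MALLOC, size);
}
```

Addresses are returned as 32-bit integers rather than pointers because they refer to the board's memory map, not to memory of the host the sketch runs on.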

So far this operating system is implemented in outline, though it seems unable to do anything. If you extend it to support serial port interrupts, it may be able to do a little, for example the kind of work a single-chip microcontroller does. The difficult parts of the whole system are interrupt handling and task switching. In this example, ARM has no CPU-level protected-mode support like x86, so when you switch tasks you have to load the PC value yourself and take care of pushing and popping the registers. When an interrupt occurs, the registers must be saved. But if rescheduling is needed, you must switch from the interrupt context to a process context. How can this be done? The method I use here is clumsy:
1. Push the registers onto the stack.
2. Store the registers in the current task structure, and save the PC value of the interrupted task in the task structure.
3. Handle the timer interrupt.
4. If a task switch is needed, find the next schedulable task and make its structure the current task structure.
5. Pop the registers from the stack, restore the values in the current task structure to the registers, restore the stack pointer, switch to user state, and resume the suspended task by loading the PC value from the current task structure.
Here the task structure is used in the interrupt context, which Linux does not do: there, the interrupt context and the process context are two different concepts, and the interrupt context cannot access the task structure of the process context. I really could not find any other way to implement process scheduling, so if you know a better method, please let me know.
