The Simplest Embedded Operating System

Source: Internet
Author: User
Implementing an embedded operating system that can do almost nothing

1. First, choose the CPU. For simplicity we use an embedded CPU such as the ARM series. The convenience of a reduced instruction set (RISC) CPU of this kind is that it makes no distinction between real mode and protected mode and uses a flat, linear address space, so neither segmentation nor paging is required for memory management. In addition, the chip integrates common peripheral controllers, such as the Ethernet controller and the serial port, so it does not need the many peripheral chips found on a PC motherboard.
2. Decide which modules and functions to implement. For simplicity, only multitask scheduling (with a limit, for example at most 10 tasks) and interrupt handling (without interrupt priorities) are implemented. There is no interactive shell, no dynamic module loading, and no fork (that is, to add a user program to your operating system you can only compile it statically into the kernel). File systems, networking, PCI, USB, disks, and other peripherals are not supported (except the serial port, which is the easiest peripheral to use). Virtual memory management is not supported either (that is, every task can access any address, so a single misbehaving program can bring down the whole operating system).
3. Choose the compiler. GCC is used here, and the object files are in ELF format; the final image is of course a raw binary (bin) file. GCC is closely tied to Linux, while code built for your own operating system still needs C library support and system call support. Therefore you have to trim down a C library yourself and implement the system calls yourself.
4. Implementation steps: first select the CPU and set up a cross-compilation environment, then write the bootloader, and then write the operating system. For details of the process, see the next page.

Next page:

How to Implement the Bootloader

1. There are two reasons to implement a dedicated bootloader: it makes the bootloader itself easier to port and upgrade, and it makes things easier for the operating system. Of course, you can also fold these functions entirely into the operating system.
2. Decide what a simple bootloader has to do. Here it needs only two main functions: first, load the operating system into memory and run it; second, burn itself and the operating system kernel into ROM storage (the ROM here can be many kinds of device, such as the flash in an embedded chip, or a floppy disk, USB flash drive, or hard disk on a PC).
3. Writing the bootloader:
Step 1: Initialize the relevant hardware. Taking the AT91RM9200 embedded board as the example (this chip is used from here on, because I am familiar with it), the work is as follows. First, switch the CPU to system mode, disable interrupts, turn off the watchdog, map the memory regions as the hardware requires, and initialize the memory controller, including the parameters of the memory in use, the refresh rate, and so on. Second, set the operating frequencies of the system, including the external crystal oscillator, the CPU frequency, the bus frequency, and the peripheral frequency. Third, set up the system interrupts, including the timer interrupt, whether to use the FIQ interrupt, external interrupts, and so on. The interrupt priorities are also set here: only two priority levels are implemented, with the clock interrupt at the higher level and everything else at the same lower level. During initialization all of these interrupt sources are directed to the vector at 0x18, and all caches are turned off. If the board also has a flash device attached, the flash-related control registers must be set as well. At this point the chip-specific initialization is complete.
Step 2: The interrupt vector table. ARM interrupt handling differs slightly from the interrupt vector table of a PC chip, and it is the simpler of the two. When an interrupt occurs, the CPU jumps directly into a region starting at 0x0 (the ARM core itself fixes this), and the exact address within that region depends on the type of interrupt. The types are reset, FIQ, IRQ, SWI, undefined instruction, data abort, and prefetch abort. Once the CPU has entered the corresponding vector address, user code has to take over the interrupt handling, which means the user has to write the interrupt vector table. The vector table holds jump instructions. For example, when an IRQ occurs, the CPU automatically jumps to 0x18, where it finds a jump instruction written by the user. If the user wrote a jump to 0x20010000, then that address is the common entry point for all IRQ handling. A CPU may have many IRQ sources. How are they told apart? That is decided by user code; see the relevant section later. The vector table usually lives in a file such as vector.s (the name is up to you), but it must be placed at 0x0 at link time.
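A minimal sketch of such a table, built here as a plain array so that it can be inspected on a PC. One detail worth knowing: an ARM `B` instruction only reaches about 32 MB, so a jump from 0x18 to 0x20010000 is usually written as `LDR pc, [pc, #offset]`, which loads the handler address from a word stored just after the eight vector slots. The names `build_vector_image` and `VEC_*` are illustrative, not from the original article.

```c
#include <stdint.h>

/* ARM exception vector slots, 4 bytes apart, starting at address 0x0. */
enum {
    VEC_RESET    = 0,  /* 0x00 */
    VEC_UNDEF    = 1,  /* 0x04 undefined instruction */
    VEC_SWI      = 2,  /* 0x08 software interrupt */
    VEC_PREFETCH = 3,  /* 0x0C prefetch abort */
    VEC_DATA     = 4,  /* 0x10 data abort */
    VEC_RESERVED = 5,  /* 0x14 unused */
    VEC_IRQ      = 6,  /* 0x18 */
    VEC_FIQ      = 7,  /* 0x1C */
    VEC_COUNT    = 8
};

/* Encoding of "LDR pc, [pc, #0x18]". The pc reads as the instruction
 * address plus 8, so slot i loads from 4*i + 8 + 0x18 = 0x20 + 4*i,
 * which is word (VEC_COUNT + i) of the image. */
#define LDR_PC_PC_0x18 0xE59FF018u

/* Fill a 16-word image: 8 load instructions, then 8 handler addresses. */
void build_vector_image(uint32_t image[2 * VEC_COUNT],
                        const uint32_t handlers[VEC_COUNT])
{
    for (int i = 0; i < VEC_COUNT; i++) {
        image[i] = LDR_PC_PC_0x18;
        image[VEC_COUNT + i] = handlers[i];
    }
}
```

In a real project the same layout is normally written in assembly and placed at 0x0 by the linker script; the array form only makes the layout visible.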
Step 3: Set up the stacks. Generally three stacks are used, among them the IRQ stack and the system-mode stack (system mode and user mode share registers and memory space, which is mainly for simplicity). The stacks are needed mainly for function calls and for storing local variables; without them you could write only assembly, or code with no local variables.
Step 4: Copy all code segments and data segments into memory and clear the BSS segment.
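As a rough illustration of Step 4, the two helpers below copy an image word by word and zero a BSS range. In a real bootloader the bounds would come from linker-script symbols (their names vary by project); here they are passed as parameters so the sketch can run anywhere. `copy_words` and `clear_bss` are made-up names.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy nwords 32-bit words from src to dst, the way early boot code
 * does before any C library is available. */
void copy_words(uint32_t *dst, const uint32_t *src, size_t nwords)
{
    while (nwords--)
        *dst++ = *src++;
}

/* Zero every word in the half-open range [start, end). */
void clear_bss(uint32_t *start, uint32_t *end)
{
    while (start < end)
        *start++ = 0;
}
```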
Step 5: Initialize the serial port (mainly for user interaction and for file transfer with the PC) and the flash (the bootloader and the kernel are stored in flash), and write the flash driver. This driver is different from what we usually call a driver: unlike SDRAM, which can be read and written directly at any address once its controller is set up, flash is written in blocks rather than one byte at a time. For details, refer to the relevant materials.
Step 6: Wait a set number of seconds for user input. If the user types nothing within that time, the bootloader reads the whole kernel from a specified location in flash (you choose the location yourself; a fixed one is used here only for simplicity) into memory (you also choose the location in memory; you can follow the Linux approach and place it at a fixed offset from the start of memory) and jumps to the first instruction of the kernel. If the user types a character within the time limit (this is mainly a development convenience; the code can be removed once development is finished), the bootloader offers a prompt over the serial port that lets the user download files to specified locations in flash. For details, see open source projects such as U-Boot.
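The decision in Step 6 can be sketched as a polling loop over a serial status register. The register address, the ready bit, and the names `wait_for_key`, `BOOT_KERNEL`, and `BOOT_DOWNLOAD` are assumptions for illustration; the real values come from the chip documentation, and a real loader would pace the loop with a timer rather than a poll count.

```c
#include <stdint.h>

typedef enum { BOOT_KERNEL, BOOT_DOWNLOAD } boot_action;

/* Poll the (assumed) "receive ready" bit up to timeout_polls times.
 * Any received character selects download mode; otherwise boot the kernel. */
boot_action wait_for_key(const volatile uint32_t *uart_status,
                         uint32_t rx_ready_mask,
                         unsigned timeout_polls)
{
    for (unsigned i = 0; i < timeout_polls; i++) {
        if (*uart_status & rx_ready_mask)
            return BOOT_DOWNLOAD;
    }
    return BOOT_KERNEL;
}
```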

So far the bootloader is complete. It is very simple: it burns the files uploaded from the PC into flash, then loads the operating system kernel from flash into memory and hands control of the CPU to the operating system. The next page explains how to write the simplest operating system. Only now are we really getting started!

How to Implement the Simplest Operating System

For simplicity, we do not consider portability, do not accept parameters from the bootloader, do not perform hardware detection, and do not relocate data or code segments. I have only read the Linux kernel and have never implemented an operating system myself, so what follows is only conceptual:
1. Take over the system's interrupt handling. The boot code installs the interrupt vector table, so it determines where interrupts go, but the bootloader does not know where the operating system's interrupt handlers are located. What can be done? There are several methods. One is this: if your board can remap addresses, remap the region where the RAM sits so that it starts at 0x0, and link the kernel so that the operating system's interrupt vector table is located at 0x0. The remapping is done at the end of the bootloader, after which the CPU jumps to 0x0 to execute. I do not know how to do that, so I thought of a compromise: when the bootloader finishes (that is, when it hands CPU control to the kernel), it rewrites the 0x0 region of flash so that jumps to the kernel's interrupt handlers are written at 0x0 in flash. For example, when an IRQ occurs the CPU jumps to 0x18 (assume flash occupies addresses 0x0 to 0x0fffffff and memory occupies 0x20000000 to 0x2fffffff). As its last act, the bootloader changes the code at 0x18 into a jump to 0x20000000 plus 0x18. That address holds the corresponding jump instruction in the kernel's own interrupt vector table, so the IRQ is now tied to the kernel's IRQ handler. When the system is powered on again, the vector table has already been modified; this does no harm unless the bootloader itself needs interrupts.

2. For simplicity, paged memory management is not used, so there is no need to create page tables or do related work. The system stacks are set up the same way as in the bootloader. Then the BSS segment is cleared; this is the operating system's own BSS segment, which has the same meaning as the bootloader's BSS segment but belongs to a different image. Finally, jump into the main function.

3. To keep things as simple as possible, use a static array of task structures. For example, to create only ten tasks, first allocate memory for the ten task structures. It can be allocated on the heap (the memory is not released until the operating system ends), or you can designate a memory region that no other part of the operating system uses (which is a bit amateurish). The pointer to the task structure array is a global variable, stored in the BSS segment or the data segment. Since a system stack was already allocated in the previous step, the ten tasks share that overall stack area. The key point here is how to define the fields of each task structure; you can refer to the relevant parts of Linux for the design.

4. Interrupt handling. Step 1 fixed the jump addresses of the CPU's interrupt types, and each type of interrupt has only one entry address. The interrupt handler performs these actions:
First, push all registers onto the stack; the stack here is the IRQ stack set up in step 2.
Second, mask all interrupts. For simplicity, an interrupt handler may not itself be interrupted.
Third, read the interrupt-related registers to identify which interrupt occurred, and jump to the corresponding handler function (only two kinds of interrupt are handled here: the clock interrupt and the SWI interrupt, that is, the so-called software interrupt).
Fourth, when the handler finishes, re-enable interrupts, pop the registers, restore the context, and return CPU control to the interrupted code.
Note:
First, in main you must decide which interrupts the whole system will handle, that is, register the interrupt handler functions, and then write those functions.
Second, this operating system does not handle virtual memory and does not even handle CPU exceptions (everything is kept simple). Once an exception occurs, the system crashes.
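The "identify the source and jump to its handler" step might look like the sketch below. It assumes the interrupt controller exposes a 32-bit pending mask with one bit per source; the source numbers, `irq_register`, `irq_dispatch`, and the sample `clock_tick` handler are illustrative names, and the real register layout has to be taken from the chip documentation.

```c
#include <stddef.h>
#include <stdint.h>

#define IRQ_SOURCES 32

typedef void (*irq_handler)(void);

static irq_handler irq_table[IRQ_SOURCES];  /* one slot per source */
unsigned long ticks;                        /* bumped by the sample handler */

/* Sample handler: count clock interrupts. */
void clock_tick(void)
{
    ticks++;
}

/* Called from main before interrupts are enabled. */
void irq_register(unsigned source, irq_handler fn)
{
    if (source < IRQ_SOURCES)
        irq_table[source] = fn;
}

/* Called from the single IRQ entry, after the registers are saved,
 * with the controller's pending bits. Sources without a registered
 * handler are ignored. */
void irq_dispatch(uint32_t pending)
{
    for (unsigned s = 0; s < IRQ_SOURCES; s++) {
        if ((pending & (UINT32_C(1) << s)) && irq_table[s] != NULL)
            irq_table[s]();
    }
}
```

For example, `irq_register(1, clock_tick)` followed by `irq_dispatch(UINT32_C(1) << 1)` runs the clock handler once.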

5. Timer implementation. First decide the time slice. To make the system more stable, and because real-time behavior is not needed, the time slice can be fairly long. For example, suppose a task may run for 20 ticks; the timer frequency then determines how many milliseconds each tick lasts. Here the system timer interrupts once every 5 milliseconds, which requires programming the clock registers (see the chip documentation). A task can therefore run for up to 100 milliseconds (20 ticks x 5 ms).
Note: this operating system does not support kernel preemption and supports only two interrupt priority levels: the clock interrupt is slightly higher and all others are lower. Even that is defeated in the interrupt handling section, because interrupts are disabled as soon as interrupt handling begins, so the priority of the other interrupts makes no difference. The advantage is simplicity. The disadvantage is obvious: if an interrupt handler enters an endless loop, the whole system hangs, and the time slice becomes inaccurate. Neither real-time behavior nor a real-time clock is required here. For how to set interrupt priorities, see the chip documentation.

6. Process scheduling is implemented in the do_timer function (the clock interrupt handler). A global pointer points to the current entry in the task structure array (or linked list). On each clock interrupt the function first checks whether the time slice in the current task structure is used up. If it is not, the function decrements the time slice and returns from the interrupt so that the current task continues to run. If the time slice is used up, it resets the time slice and searches the task structure array for the next runnable task. If one is found, it switches to that task (for how to switch, see the next page); if none is found, it switches to the idle task (similar to Linux; all of this imitates Linux, because my own skills are limited). Note: for simplicity, there are no task priorities, no sleeping, and so on. Once the ten tasks are fixed statically, they are executed one after another in order. In addition, no task may end: the last code in each task must be an endless loop, otherwise the program runs away. Processes also do not support signals, and there are no sleep or wake-up operations. The CPU is not human, so it needs no human rights! Could scheduling be any simpler?
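A minimal sketch of the round-robin logic described above, leaving out the register save and restore. The struct is cut down to the two fields the scheduler needs, and `sched_init`, `need_switch`, and the field names are assumptions for illustration. Note that with this logic a task is switched out on the tick after its counter reaches 0.

```c
#include <stddef.h>

#define MAX_TASKS        10
#define TIME_SLICE_TICKS 20   /* 20 ticks x 5 ms = 100 ms */

struct task {
    int ticks_left;   /* remaining time slice */
    int runnable;     /* 1 if the task exists and may be scheduled */
};

struct task tasks[MAX_TASKS];       /* tasks[0] is the idle task */
struct task *current = &tasks[0];   /* global current-task pointer */
int need_switch;                    /* 1 when do_timer chose a new task */

/* Mark the first ntasks entries runnable and start with the idle task. */
void sched_init(int ntasks)
{
    for (int i = 0; i < MAX_TASKS; i++) {
        tasks[i].ticks_left = TIME_SLICE_TICKS;
        tasks[i].runnable = (i < ntasks);
    }
    current = &tasks[0];
    need_switch = 0;
}

/* Clock interrupt handler: spend one tick, or pick the next task. */
void do_timer(void)
{
    if (current->ticks_left > 0) {
        current->ticks_left--;
        need_switch = 0;
        return;
    }
    current->ticks_left = TIME_SLICE_TICKS;   /* reset for its next turn */
    size_t cur = (size_t)(current - tasks);
    for (size_t step = 1; step < MAX_TASKS; step++) {
        size_t next = (cur + step) % MAX_TASKS;
        if (tasks[next].runnable) {
            current = &tasks[next];
            need_switch = 1;
            return;
        }
    }
    /* Nothing else is runnable: fall back to the idle task. */
    current = &tasks[0];
    need_switch = (cur != 0);
}
```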

7. The serial port does not use interrupts, which reduces the difficulty as much as possible. The serial port is accessed by polling (which is, of course, blocking). Only writing is supported, not reading, because reading really requires interrupts. Polling is a poor fit for reading: while a task is reading, its time slice may run out and the system may switch to another task, and whatever you type at the PC's serial terminal in the meantime is lost. Supporting only writes keeps things easy.
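A polled, write-only serial routine can be as small as the sketch below. The register block is hypothetical: the two-register layout and the `UART_TX_READY` bit are assumptions, and on real hardware the structure would be mapped at the address given in the chip documentation.

```c
#include <stdint.h>

/* Hypothetical register block; real offsets and bits come from the datasheet. */
struct uart_regs {
    volatile uint32_t status;    /* bit 0: transmitter ready (assumed) */
    volatile uint32_t tx_data;   /* write a byte here to send it */
};

#define UART_TX_READY 0x1u

/* Blocking write of one character: spin until the transmitter is ready. */
void uart_putc(struct uart_regs *uart, char c)
{
    while ((uart->status & UART_TX_READY) == 0)
        ;   /* busy-wait; this is the blocking part */
    uart->tx_data = (uint8_t)c;
}

/* Write a NUL-terminated string one character at a time. */
void uart_puts(struct uart_regs *uart, const char *s)
{
    while (*s)
        uart_putc(uart, *s++);
}
```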

8. The last step is the final part of the main function. Turn the current flow of execution into the idle task (which amounts to filling in the corresponding entry of the task structure array), enable interrupts, and enter an endless loop so that it never exits.

9. Compile your bootloader and kernel, burn them into flash, and debug repeatedly.

10. Connect the serial port of your AT91RM9200 (or a similar chip) to the PC, open HyperTerminal, and turn on the board power. With luck your operating system prints "Hello, world!!!" and the simplest operating system is born.

The next page covers the implementation of the individual functional modules.

Task Structure array (or linked list) Implementation

Our task structures are organized as a linked list, but one of limited length. The head pointer is a global pointer variable (it is an unsigned integer pointer; the pointer itself lives in the BSS segment, but it points to memory allocated on the heap). The memory is allocated with kmalloc, which you must write yourself. For simplicity, kmalloc takes a single parameter, the size to allocate. The function is easy to write. First there is a global pointer that is set during initialization to the start of the whole heap. The first region, of fixed size, is called the kernel heap, and after the kernel heap come the user heaps. Because there are ten tasks in total, not counting the kernel's own task, the whole heap is divided evenly into eleven parts. (Note: after all tasks are initialized, there is one more step, moving the kernel task to user mode, which amounts to modifying the stack pointer in its task structure.) kmalloc must check whether the requested size exceeds what the kernel heap can provide. To maintain the kernel heap and the heaps of the other tasks, divide them into blocks and keep a global memory-usage map. A simple array is enough: 0 means the corresponding block is free and 1 means it is in use, and the matching kfree just sets the flag back to 0. Memory management is complicated, so for simplicity the block size is fixed at 4 KB and requests larger than 4 KB are refused: because there is no concept of virtual addresses, contiguous allocation beyond 4 KB cannot be guaranteed on the heap. Allocations on the stack can of course be larger than 4 KB; the stack is managed by the compiler and the CPU.
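One possible shape for such a kmalloc and kfree, using a static array as the kernel heap and a byte array as the usage map. The heap size of 16 blocks is an assumption (it matches a 64 KB region), and the array stands in for the fixed physical range that a real kernel would use.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE         4096u
#define KERNEL_HEAP_BLOCKS 16u    /* 64 KB kernel heap, an assumed size */

static uint8_t kernel_heap[KERNEL_HEAP_BLOCKS * BLOCK_SIZE];
static uint8_t block_used[KERNEL_HEAP_BLOCKS];   /* 0 = free, 1 = in use */

/* Hand out one whole 4 KB block per request; refuse anything larger. */
void *kmalloc(size_t size)
{
    if (size == 0 || size > BLOCK_SIZE)
        return NULL;
    for (size_t i = 0; i < KERNEL_HEAP_BLOCKS; i++) {
        if (!block_used[i]) {
            block_used[i] = 1;
            return &kernel_heap[i * BLOCK_SIZE];
        }
    }
    return NULL;   /* kernel heap exhausted */
}

/* Freeing just clears the flag of the block the pointer starts at. */
void kfree(void *ptr)
{
    if (ptr == NULL)
        return;
    uintptr_t offset = (uintptr_t)ptr - (uintptr_t)kernel_heap;
    if (offset < sizeof kernel_heap && offset % BLOCK_SIZE == 0)
        block_used[offset / BLOCK_SIZE] = 0;
}
```

Giving every request a full block wastes memory for small allocations, but it keeps the usage map trivial, which is the trade-off the text argues for.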

The task structure includes:
1. The remaining time slice.
2. The memory address of the code segment for this task, which is also the function entry address.
3. The data segment address for this task. The data segment is part of the whole kernel image, so this field is unused and reserved.
4. Whether the function body of this task exists and whether it can be scheduled.
5. The stack pointer used by this task.
6. The heap pointer used by this task.
7. The ID of the task. 0 indicates the idle task, and 1 indicates any other process.
8. The values of all registers.
9. The current PC value, set to the function entry address during initialization.
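Written as a C structure, the nine fields above could look like this. The field names, the types, and the choice of 15 saved registers (r0 to r14, with the PC kept separately as field 9) are assumptions; a real ARM port would also have to save the status register, which the list does not mention.

```c
#include <stdint.h>

#define NUM_SAVED_REGS 15   /* r0-r14; the PC is stored separately below */

struct task_struct {
    uint32_t  ticks_left;            /* 1. remaining time slice */
    void    (*entry)(void);          /* 2. code address, also the function entry */
    void     *data_segment;          /* 3. reserved, unused */
    uint32_t  runnable;              /* 4. body exists and may be scheduled */
    uint32_t *stack_ptr;             /* 5. stack pointer of this task */
    uint8_t  *heap_ptr;              /* 6. heap pointer of this task */
    uint32_t  id;                    /* 7. 0 = idle, 1 = any other task */
    uint32_t  regs[NUM_SAVED_REGS];  /* 8. saved register values */
    uint32_t  pc;                    /* 9. saved PC, initially the entry address */
};
```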

First, the initialization of the task structure array.
Define a global pointer, cast it to a task structure pointer, and use kmalloc to allocate the memory for the ten task structures from the kernel heap (as mentioned earlier, the kernel heap starts at the start of the whole heap). The total must not exceed 4 KB.
Assign values to the ten task structures. Set the first task to idle, its time slice to 20, and its code segment address to its function entry address; the data segment address is ignored; the function body exists and can be scheduled. The stack pointer is calculated as follows: assuming each task gets 64 KB and the whole heap starts at 0x20030000, the first task's heap pointer points to 0x20030000 and its stack pointer to 0x20030000 + 64 KB, and so on for the second and later tasks.
Note: before the task structures are initialized, the system must not use the heap, but it can use the stack. The kernel task's stack therefore has two phases. Before scheduling starts, the stack is the one set in step 2 on the previous page, so take care when setting it there: leave room for the ten 64 KB regions, and until this step keep using the stack set up earlier, which should be as large as possible.
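The address arithmetic in this layout is small enough to write down directly. The sketch assumes what the text states: a heap starting at 0x20030000, one 64 KB region per task, the heap growing up from the bottom of the region and the stack starting at its top. The function names are made up for illustration.

```c
#include <stdint.h>

#define HEAP_START   0x20030000u       /* start of the whole heap */
#define TASK_REGION  (64u * 1024u)     /* 64 KB per task */

/* Bottom of task i's region: where its heap pointer starts. */
uint32_t task_heap_base(unsigned i)
{
    return HEAP_START + i * TASK_REGION;
}

/* Top of task i's region: where its stack pointer starts. */
uint32_t task_stack_top(unsigned i)
{
    return task_heap_base(i) + TASK_REGION;
}
```

For the first task this gives a heap base of 0x20030000 and a stack top of 0x20040000, which is also the heap base of the second task.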

Next, what happens during a task switch.
On entering the common interrupt entry, all registers are pushed onto the IRQ stack, their values are copied into the corresponding fields of the current task structure, and the PC value of the interrupted task is stored in its field in the current task structure. Next the interrupt type is identified in order to enter the corresponding handler; for the clock interrupt that is the do_timer function. The process inside do_timer is as follows.
The kernel has another global pointer, the current task pointer. It also lives in the system's BSS segment and is defined in the same way as the global pointer in the previous step. On the first clock interrupt this pointer points to the first task structure, that is, 0x20030000. do_timer checks whether the time slice field of the current task is 0.

If it is 0, perform the following operations:
Save the user-mode stack pointer and the heap pointer into the current task structure, search for a task structure that can be scheduled, assign it to the current task pointer, and set the task-switch flag. The flag is also a global variable, but because it has an initial value it is placed in the data segment of the system. Then return from do_timer.

If it is not 0, perform the following operations:
Decrement the time slice and return from do_timer.

Next, check the task-switch flag. If it is 0, perform the following operations:
No task switch is needed. Pop all registers from the stack (the IRQ stack), re-enable interrupts, switch to user mode, and load the current PC value field of the current task structure to leave the interrupt handler.

If it is 1, perform the following operations:
A task switch is needed. Pop all registers from the stack (the IRQ stack), load the register values stored in the current task structure into the corresponding registers, restore the user-mode stack pointer and the heap pointer from the current task structure, reset the task-switch flag to 0, re-enable interrupts, and switch to user mode. The switch itself is done by loading the PC, that is, by loading the current PC value field of the current task structure to leave the interrupt handler.

Implementation of system calls

This system does not really implement system calls, because kernel mode and user mode are not protected from each other. In your own C library, all functions can be implemented like kmalloc: you can write the function directly in the kernel. As an extension, though, here is how a system call would work, using malloc as the example.

There is another heap pointer here (kmalloc also has a heap pointer, but that one belongs to the kernel task; this one is for user mode). It is given its initial value before system initialization completes: the start of the heap used by the first task structure, that is, the start of the kernel heap plus 64 KB.
The malloc function in the library works as follows:
1. Check whether the requested size exceeds 4 KB. If it does, return an error.
2. Make the system call (here through _syscall1, which passes only one parameter, the size to allocate).
The system call function _syscall1 works as follows:
1. Push the registers onto the stack (the stack of the current task).
2. Put system call number 1 in R0 and the parameter in R1.
3. Execute an SWI instruction to raise an SWI interrupt (that is, a software interrupt, or trap).
The CPU then enters the SWI interrupt entry. The SWI entry function works as follows:
1. Read R0 and branch to the code for that system call number.
2. Here that is the _malloc code. Read R1, get the current heap pointer mentioned above, claim a block of the requested size, mark the corresponding entry in the memory-usage map as in use, put the current heap pointer in R0, advance the current heap pointer, update the heap pointer in the current task structure, switch to user mode, and return from the SWI.
Returning from the system call in _syscall1 (for simplicity, tasks are not rescheduled on the way back from kernel mode to user mode, so these steps are short):
1. After returning from the SWI, the system is running in user mode. Take the value in R0 and assign it to the pointer that receives the allocated memory.
2. Pop the registers in user mode and return to the calling function.
Return from malloc: malloc returns the pointer directly, and the whole malloc process is complete.
Other system calls follow a similar process.
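The flow above can be simulated in portable C by replacing the SWI instruction with an ordinary function call. Everything here is a stand-in: `swi_dispatch` plays the SWI entry, `syscall1` plays `_syscall1`, `user_malloc` plays the library malloc (renamed to avoid clashing with the host C library), and the user heap is a static array with a simple advancing pointer rounded up to 8 bytes.

```c
#include <stddef.h>
#include <stdint.h>

#define SYS_MALLOC      1u
#define MAX_ALLOC       4096u
#define USER_HEAP_SIZE  (64u * 1024u)

static uint8_t user_heap[USER_HEAP_SIZE];
static size_t heap_offset;   /* stands in for the current user heap pointer */

/* Kernel side: what the SWI entry does once it has r0 and r1. */
uintptr_t swi_dispatch(uint32_t r0, uint32_t r1)
{
    switch (r0) {
    case SYS_MALLOC: {
        if (r1 == 0 || r1 > MAX_ALLOC)
            return 0;
        size_t n = ((size_t)r1 + 7u) & ~(size_t)7u;   /* keep 8-byte alignment */
        if (n > USER_HEAP_SIZE - heap_offset)
            return 0;
        uintptr_t result = (uintptr_t)&user_heap[heap_offset];
        heap_offset += n;                             /* advance the heap pointer */
        return result;                                /* would be left in r0 */
    }
    default:
        return 0;   /* unknown system call number */
    }
}

/* User side: on ARM this would load r0/r1 and execute SWI. */
uintptr_t syscall1(uint32_t number, uint32_t arg)
{
    return swi_dispatch(number, arg);
}

/* Library side: check the size first, then trap into the kernel. */
void *user_malloc(size_t size)
{
    if (size > MAX_ALLOC)
        return NULL;
    return (void *)syscall1(SYS_MALLOC, (uint32_t)size);
}
```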

So far this operating system is implemented in outline, although it seems unable to do anything. If it also supported serial port interrupts, it could do a little work, roughly what a single-chip microcontroller does. The difficult parts of the whole system are interrupt handling and task switching. In this example, because ARM has no CPU-level support like the protected mode of x86, the switch has to be made by loading the PC yourself. I cannot think of a better way, and this method has a weak point, namely protecting the registers as they are pushed and popped. Registers must be saved when an interrupt occurs. But if rescheduling is needed, how do you switch from the interrupt context to a process context? The method I used here is clumsy:
1. Push the registers onto the stack.
2. Copy the registers into the current task structure, and save the PC value of the interrupted process into the task structure.
3. Handle the timer interrupt.
4. If a task switch is needed, find the next schedulable process and make its task structure the current one. Pop the registers from the stack, load the values in the current task structure into the registers, restore the stack pointer, switch to user mode, and resume the suspended process by loading the PC value from the current task structure.

Here the task structure is used in interrupt context, which Linux does not do. Interrupt context and process context are two different concepts, and interrupt context should not access the task structure that belongs to process context. I do not know any other way to implement process scheduling, so I hope readers of this article can suggest a better method.
