A simple way to port Linux code to Windows
One: Objective
Linux has a wealth of source code, but most of it does not compile as-is on Windows, so Windows cannot use these resources directly. To use the code in full, you have to port it. Because the C/C++ libraries differ between the platforms, among other reasons, porting is a difficult task. This article uses a practical example (tar) to show how to port Linux code to Windows. The port modifies the code as little as possible, so that the program's logic is unchanged and most of its main functions are retained.
Two: Preparatory work
Tar is one of the common archiving tools on Linux. What does it take to port such a program to Windows?
First comes some preparation. Install the latest version of Cygwin on Windows, and install development tools such as GCC inside Cygwin. A Windows development environment is also required; this article uses Microsoft Visual Studio .NET 2003. Get the latest source code for tar, version 1.13, from www.gnu.org. Unpack the source package tar-1.13.tar.gz under Cygwin. Do not use WinRAR or WinZip under Windows to decompress it: they have problems extracting some tar.gz packages, which leaves the unpacked directories and files in an abnormal state, and a source package unpacked this way may not compile correctly under Cygwin. After unpacking, go to the tar-1.13 directory and enter
./configure
When that command finishes, enter
make
to start compiling the Cygwin version of tar.
The build should complete without problems. In the src directory you can see the newly compiled program, tar.exe.
Cygwin is a Linux emulation environment at the API level. If a program compiles and runs under Cygwin, it can in principle be compiled and run on Windows, although it needs an intermediate API layer to emulate certain Linux-specific operations. So a simple test of whether a Linux program can be ported to Windows is whether its source code compiles, and the resulting program runs, under Cygwin.
Checking whether the port is feasible is only one reason to compile tar under Cygwin. The other is that the port needs a special header file, config.h, which is the most important source file in the porting process. config.h is not part of the source distribution itself. It is generated when you run "./configure", and when you run that command under Cygwin it is generated for the Cygwin development environment. Compilation needs config.h to control what gets compiled, and the porting work is based on it as well.
The next step is to set up the Windows project. First create an empty project named Wintar with Visual Studio .NET 2003. Judging by the build output under Cygwin, the main tar code lives in the src and lib directories. Copy the two directories into the new project and add the code to the project. Then copy config.h into the Wintar project directory.
With that, the preparation is essentially complete and the port itself can begin. The porting process can be divided into three parts.
Three: First goal: make Wintar compile
The first goal is achieved mainly by working on config.h. The big difference between the Linux and Windows development environments is that the C library header files and the various type definitions differ. config.h provides a complete set of compilation switches for handling such differences between platforms. You now need to edit this file by hand so that the tar source code adapts to Windows.
First, adjust the inclusion of the C library header files. config.h defines many macros of the form HAVE_XXXX_H. For example, defining HAVE_CONFIG_H as 1 indicates that config.h can be used in the project, and #define HAVE_MALLOC_H 1 indicates that the malloc.h header file can be used in the project. By adjusting these values you can drop the header files that do not exist on Windows. Other include relationships elsewhere in the code may still need attention, but these definitions solve most of the header inclusion problems.
/* Define if you have the <linux/fd.h> header file. */
/* #undef HAVE_LINUX_FD_H */
/* Define if you have the <locale.h> header file. */
#define HAVE_LOCALE_H 1
/* Define if you have the <malloc.h> header file. */
#define HAVE_MALLOC_H 1
/* Define if you have the <memory.h> header file. */
#define HAVE_MEMORY_H 1
/* Define if you have the <ndir.h> header file. */
/* #undef HAVE_NDIR_H */
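To illustrate how such switches are consumed, here is a minimal sketch of source code that tests HAVE_XXXX_H macros before including a header. The two macro values are assumptions standing in for a Windows-adjusted config.h, and the helper functions name_length and has_linux_fd_header are hypothetical; they are not taken from the tar sources.

```c
/* Values as they might appear in a Windows-adjusted config.h (assumed). */
#define HAVE_STRING_H 1
/* #undef HAVE_LINUX_FD_H */

#if HAVE_STRING_H
# include <string.h>      /* available on both platforms */
#endif

#if HAVE_LINUX_FD_H
# include <linux/fd.h>    /* skipped: the macro is not defined */
#endif

/* Hypothetical helper that relies on <string.h> being included. */
size_t name_length(const char *name)
{
    return strlen(name);
}

/* Reports at run time which branch the preprocessor took. */
int has_linux_fd_header(void)
{
#if HAVE_LINUX_FD_H
    return 1;
#else
    return 0;
#endif
}
```

An undefined macro evaluates to 0 inside #if, which is why commenting out a definition is enough to switch a header off.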
The second step is to adjust the definitions of the various data types. Linux may define many special data types, and config.h contains some data type definitions that can be changed; these generally redefine the special types as basic data types. Patch them according to the data types available on Windows. For example, the Cygwin development environment has a data type mode_t, which the Visual Studio C library does not provide, and the tar code uses mode_t extensively. config.h lets developers define mode_t themselves and suggests that, if mode_t is not defined in <sys/types.h>, it be defined as int. So add #define mode_t int to config.h, which solves the problem of the undefined mode_t. Other data types are handled the same way.
/* Define to `int' if <sys/types.h> doesn't define. */
#define mode_t int
/* Define to `long' if <sys/types.h> doesn't define. */
/* #undef off_t */
/* Define to `int' if <sys/types.h> doesn't define. */
#define pid_t int
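As a rough sketch of the effect, the fragment below pretends to be code compiled after such a config.h on a platform whose C library lacks these types. The function permission_bits is hypothetical. Note that a #define like this must not be followed by a system header that declares the real type, so the sketch includes none.

```c
/* Tail of a Windows-adjusted config.h (assumed): map the missing
   POSIX types onto a basic type. */
#define mode_t int
#define pid_t int

/* Hypothetical helper: keep only the permission bits of a file mode.
   After preprocessing, mode_t is simply int. */
mode_t permission_bits(mode_t mode)
{
    return mode & 07777;
}
```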
The third step is to adjust the function definitions. Besides HAVE_XXXX_H, config.h also predefines macros of the form HAVE_XXXX. These are switches for optional functions. #define HAVE_MEMSET 1 indicates that the memset function can be used in the project, meaning that the library used by the project already implements it. If it does not, you need #undef HAVE_MEMSET, and you can of course supply the function yourself.
/* Define if you have the memset function. */
#define HAVE_MEMSET 1
/* Define if you have the mkdir function. */
#define HAVE_MKDIR 1
/* Define if you have the mkfifo function. */
#define HAVE_MKFIFO 1
/* Define if you have the munmap function. */
#define HAVE_MUNMAP 1
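One possible way to "provide these functions yourself" is sketched below, assuming HAVE_MKFIFO has been switched off for the Windows build. The replacement wintar_mkfifo is a hypothetical name and its behavior (always failing with ENOSYS) is an illustrative choice, not what tar's lib directory actually does.

```c
#include <errno.h>

/* Assumed state after editing config.h for Windows: */
/* #undef HAVE_MKFIFO */

#if !HAVE_MKFIFO
/* Hypothetical replacement: FIFOs are not supported, so always fail
   and tell the caller why through errno. */
int wintar_mkfifo(const char *path, int mode)
{
    (void)path;
    (void)mode;
    errno = ENOSYS;
    return -1;
}
/* Route the original call sites to the replacement without editing them. */
# define mkfifo wintar_mkfifo
#endif
```

Mapping the name with a macro keeps the original call sites untouched, in line with the principle of not modifying the ported source.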
Finally, besides the switches for header files, functions, and data types, config.h contains other items such as environment variables and further compilation options. This content varies greatly from project to project, but config.h gives a good indication of how much work the port will be.
Even after these adjustments, compilation is bound to fail in places, because some header files do not exist in the Windows environment: without poll.h there is no poll function, and without dirent.h there is no dirent structure. To get Wintar to compile, you now have to fix the details one by one according to the specific compiler errors. When you need Windows-specific definitions, don't forget to add #include <windows.h> at the top of config.h.
An example illustrates the details. Take the option HAVE_INTTYPES_H:
/* Define if <inttypes.h> exists, doesn't clash with <sys/types.h>,
and declares uintmax_t. */
#define HAVE_INTTYPES_H 1
Analyzing the code shows that it does not need a complete inttypes.h, only the definition of uintmax_t. The Visual Studio C library has neither an inttypes.h file nor a definition of uintmax_t. Looking in the inttypes.h file in Cygwin's include directory, we find the definition of uintmax_t:
typedef unsigned long long uintmax_t;
It is a very simple type redefinition. Because it is so simple, it is perfectly feasible to add a dedicated, cut-down inttypes.h to the Wintar project, based on the one in Cygwin's include directory. This solves the problem of uintmax_t being undefined during compilation. The general approach to this type of problem is to take the relevant header file from Cygwin's include directory and copy it into the Wintar directory. The principle when modifying or copying code is not to pull in further definitions or header files: take only the parts that are required. Similar problems include the dirent structure definition and its related functions.
During compilation, many errors come from files in the lib directory, but not all of those files are needed. The lib directory is only a supplementary library for tar, and only the code that is actually required has to be compiled. One way to decide is to check the Windows C library: if it already defines the same function or data type, you do not need the corresponding definition or implementation from the lib directory. Another way is to remove the C files of the lib directory, keeping only the header files, get the compilation to pass, and then use the link errors to see which lib C files are needed.
Besides modifying the various peripheral header files, don't forget to adjust the project's compilation options, especially the preprocessor definitions. The definitions HAVE_CONFIG_H, _POSIX_SOURCE, and MSDOS are required for the tar port. HAVE_CONFIG_H indicates that compilation requires a config.h file; for the tar port it was placed in the project's preprocessor definitions for convenience. MSDOS is defined because tar is a console program on Linux, and the Windows environment closest to the Linux console is DOS; it mainly affects some environment variable settings and global constants. Parts of the tar code have already been adapted for MSDOS, and the port can take advantage of that. There is also the option __CYGWIN__. Some Linux programs contain special code for the Cygwin platform. When you encounter such code, defining __CYGWIN__ can greatly reduce the amount of porting work. The Cygwin code brought in during the port may also need __CYGWIN__ (and sometimes other definitions, such as _POSIX_SOURCE or __INSIDE_CYGWIN__).
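The following sketch shows the general shape of code that such a predefined macro switches on. It is not taken from the tar sources: is_path_separator is a hypothetical helper, and MSDOS is defined inside the fragment only to keep it self-contained, whereas in the Wintar project it would come from the preprocessor definitions.

```c
/* Normally set in the project's preprocessor definitions; defined here
   only so that the sketch is self-contained. */
#define MSDOS 1

/* Hypothetical helper: return nonzero if c separates path components.
   On DOS-like systems both slash and backslash count. */
int is_path_separator(int c)
{
#ifdef MSDOS
    return c == '/' || c == '\\';
#else
    return c == '/';
#endif
}
```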
With the steps above, the first goal, getting the code to compile, should present no major problems. Just keep to two basic principles when changing code. First, introduce new code rather than modify the original code. Avoid modifying the source before the program can be debugged: a bad change can break the program's logic, and such problems are hard to find before the code runs. So unless you are very confident, do not modify the source code of the project being ported. Second, when you introduce new code, make sure it does not in turn require yet more new code. Otherwise you enter an endless loop: to resolve the definition of one data type, you introduce another data type that cannot be resolved. It is best to introduce as little new code as possible. When you do introduce new code, especially large header files, trim it before adding it: keep only the parts the project needs and remove the rest, until the project compiles.
Three: Second goal: make the code link
Once you have reached the first goal, you will face a large number of link errors. The reason is that many external functions and external global variables have been declared without anything defining them. What is needed now is to supply definitions for these functions and global variables. The global variables are usually far fewer than the functions.
To resolve the link errors, you need to understand how functions differ between the platforms, and especially how some underlying concepts differ. Two resources are particularly useful here. One is the help file of Windows Services for UNIX (SFU); the other is an MSDN publication, the "UNIX Application Migration Guide". SFU is a UNIX-compatible environment offered by Microsoft, somewhat like Cygwin. Installing SFU provides a help file. Part of it describes UNIX and Linux functions, and for some functions it states which functions in the Windows libraries can substitute for them, which makes porting much easier. The UNIX Application Migration Guide is less an article than a book. It explains many concepts that differ between Windows and UNIX (and UNIX-like) systems, and gives a lot of information on how to emulate them. For example, signals on UNIX can be replaced with events on Windows, and SIGALRM can be replaced with a Windows message.
The SFU help file shows which low-level functions (C library functions) on Windows can replace the related UNIX functions. The UNIX Application Migration Guide shows how to convert some OS-level UNIX concepts to Windows. In fact, Cygwin has already done many of these conversions, so to solve link problems you can also refer to the implementation of Cygwin itself.
But some concepts, such as security permissions, cannot be mapped between Linux and Windows at all, and the permission functions on Windows are more complex. This affects some Linux functions. For getuid, for example, you can follow Cygwin's approach: do nothing and simply return 0. When you encounter such functions in the code, you can copy a getuid from the Cygwin code into the project.
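A stub in the spirit of that description could be as small as the sketch below. The stub_ names and the stub_uid_t type are hypothetical, and returning 0 simply follows the treatment described above; the sketch does not reproduce Cygwin's actual source.

```c
/* Hypothetical stand-ins for identity functions that the Windows C
   library does not provide. */
typedef int stub_uid_t;

/* Do nothing and report user ID 0, as described above. */
stub_uid_t stub_getuid(void)
{
    return 0;
}

stub_uid_t stub_geteuid(void)
{
    return 0;
}

#ifdef _WIN32
/* On Windows, let the unmodified call sites resolve to the stubs. */
# define getuid  stub_getuid
# define geteuid stub_geteuid
#endif
```

Code that makes decisions based on the user ID will then behave as if it always ran as user 0, which is worth keeping in mind when testing permission-related paths.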
With this information, and with a tool such as Source Insight to search the source code of Cygwin itself, the link problems are not hard to deal with. While fixing link errors you may find yourself back at compile errors. Code changes at this stage must be careful not to introduce too much new code, or the problems become more and more complex.
Four: Making the code run correctly
Once the link problems are resolved, the program can run in a Windows environment and everything is under your control. If you do not care about keeping the program multi-platform, you can now modify it freely. During debugging, however, you may need a reference to see how the program behaves when it works correctly, because a freshly ported program will not work right away in many places. Go back to Cygwin, compile a debuggable version (add the GCC option -g3), and debug the program under Cygwin when you need to. You can debug with gdb or Insight. If you are used to programming on Windows, you can use Insight, a Tcl/Tk front end that provides a friendly windowed interface for debugging; Insight ultimately calls gdb. The details of debugging are not covered here.
Five: Multi-platform code
If the ported code has to run on more than one platform, the work is concentrated in the lib directory: provide your own library of functions and adjust it for each platform. The tar code is controlled by config.h and a number of compilation options so that it compiles on various platforms. lib provides C library functions, or alternative versions of other functions, for the different platforms, so that the build does not fail because a function is missing on some platform. Multi-platform support usually means adding many compilation switches to the code, which separate the code specific to Linux, Windows, or other platforms at compile time. The utime.h header file is an example. Its location differs between Linux (GCC) and the Windows C library (CL), so it has to be included differently. Something like the following may be needed to include it correctly everywhere.
#if HAVE_UTIME_H            /* if there is a utime.h file */
# ifdef WIN32               /* if it is a Win32 environment */
#  include <sys/utime.h>    /* include sys/utime.h */
# endif
# ifdef linux               /* if it is a Linux environment */
#  include <utime.h>        /* include utime.h */
# endif
#else                       /* no utime.h: define the required structure */
struct utimbuf
{
  long actime;
  long modtime;
};
#endif
Other code is handled in basically the same way: different code is compiled depending on the build environment. These dividing definitions exist mainly to handle differences between platforms. Some differences are constants that are not defined, and others are functions that do not exist. If a function called in your code does not exist on some platform, you need a lib that provides it. That is exactly what tar's lib does.
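As a further sketch of the same technique, the wrapper below hides a function whose signature differs between platforms: POSIX mkdir takes a path and a mode, while the Microsoft C runtime's _mkdir takes only a path. The names portable_mkdir and portable_rmdir are hypothetical and are not part of tar's lib directory.

```c
#include <errno.h>

#ifdef _WIN32
# include <direct.h>      /* _mkdir, _rmdir */
#else
# include <sys/types.h>
# include <sys/stat.h>    /* mkdir */
# include <unistd.h>      /* rmdir */
#endif

/* Create a directory; the mode is ignored where the platform has no
   such concept. Returns 0 on success, -1 on failure (errno is set). */
int portable_mkdir(const char *path, int mode)
{
#ifdef _WIN32
    (void)mode;
    return _mkdir(path);
#else
    return mkdir(path, (mode_t)mode);
#endif
}

/* Remove an empty directory. Returns 0 on success, -1 on failure. */
int portable_rmdir(const char *path)
{
#ifdef _WIN32
    return _rmdir(path);
#else
    return rmdir(path);
#endif
}
```

Callers use only the portable_ names, so the platform switch lives in one place instead of at every call site.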
In general, porting is hard at first and easier later. First of all, the logic of the source code must not change, so when modifying code you should try to change only the peripheral code rather than the source itself. Once the program links, what remains is ordinary Windows programming, and you can modify the ported code as required. The hardest part is probably replacing concepts that differ at the OS level. The C libraries differ between platforms but are always close, and where they differ you can supply your own code as a replacement. OS-level concepts and platform dependencies, however, are generally not easy to replace.
Six: Extension issues still to be solved
Suppose you need to turn the ported code into a DLL or LIB to be called from other projects, for example to give other projects a function that unpacks tar files. Without modification, the ported code has many shortcomings for this purpose.
First, multithreading support. If the code contains many global variables, it cannot safely be called from multiple threads once it becomes a DLL or LIB.
Second, the DLL's interface table. The entry point of the ported code is the main function. The project contains many independent functions, but they are invoked by passing different command-line parameters. Exporting an interface table that other projects can use takes some effort.
Third, control flow. The original console program is driven by its command-line parameters and generally runs straight through to the end, possibly asking for some input along the way. How can such a program be integrated into, and controlled by, another project, for example by returning when an error occurs? The tar code exits the program directly when it encounters an error. Such places are clearly incompatible with the requirements of a DLL, and you may need to redesign the structure of the code.
Fourth, output. The tar project writes a lot of information to the console. This output needs to be redirected or suppressed.
For the third point you can look at front-end programs on Linux, which simply provide a GUI for a particular program. The front end controls the operation of the main program and redirects its output to the GUI.
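One possible shape for such a redesign is sketched below; it is an illustration under stated assumptions, not a description of how tar is or should be structured. All names (archive_context, archive_run, report, fatal, capture_sink) are hypothetical. State that would otherwise live in global variables moves into a context structure, console output goes through a caller-supplied callback, and a fatal error jumps back to the library entry point with longjmp instead of calling exit.

```c
#include <setjmp.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

typedef void (*message_sink_fn)(const char *text, void *user_data);

/* Per-call state that replaces file-scope globals. */
struct archive_context {
    jmp_buf fatal_jump;        /* where fatal() returns to */
    int fatal_status;          /* status that exit() would have received */
    message_sink_fn sink;      /* output callback; NULL means stderr */
    void *sink_data;
    unsigned files_processed;
};

/* Replacement for direct console output. */
static void report(struct archive_context *ctx, const char *format, ...)
{
    char buffer[256];
    va_list args;
    va_start(args, format);
    vsnprintf(buffer, sizeof buffer, format, args);
    va_end(args);
    if (ctx->sink)
        ctx->sink(buffer, ctx->sink_data);
    else
        fputs(buffer, stderr);
}

/* Replacement for exit(): unwind to archive_run() instead. */
static void fatal(struct archive_context *ctx, int status, const char *reason)
{
    report(ctx, "fatal: %s", reason);
    ctx->fatal_status = status;
    longjmp(ctx->fatal_jump, 1);
}

/* Stand-in for real work: an empty member name is treated as fatal. */
static void process_member(struct archive_context *ctx, const char *name)
{
    if (name[0] == '\0')
        fatal(ctx, 2, "empty member name");
    ctx->files_processed++;
}

/* Library entry point: returns 0 on success or the fatal status. */
int archive_run(struct archive_context *ctx,
                const char *const *names, size_t count)
{
    size_t i;
    ctx->files_processed = 0;
    ctx->fatal_status = 0;
    if (setjmp(ctx->fatal_jump) != 0)
        return ctx->fatal_status;   /* reached through fatal() */
    for (i = 0; i < count; i++)
        process_member(ctx, names[i]);
    return 0;
}

/* Example sink: keep the last message in a 256-byte caller buffer. */
void capture_sink(const char *text, void *user_data)
{
    char *dest = user_data;
    strncpy(dest, text, 255);
    dest[255] = '\0';
}
```

Because every call carries its own context, two threads can run archive_run at the same time as long as they use separate archive_context objects. Jumping out with longjmp skips any cleanup in between, so real code would also have to release files and memory held at the point of failure.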
Note 1. Cygwin is a Linux emulation environment for the Windows platform. It can be downloaded in full from www.cygwin.com.
Note 2. The Windows Services for UNIX (SFU) SDK can be obtained from the Microsoft Web site: http://www.microsoft.com/windows/sfu/
Note 3. The UNIX Application Migration Guide is included in MSDN. If you do not have MSDN, it can be obtained from the Microsoft MSDN website: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnucmg/html/ucmglp.asp
Note 4. Cygwin already includes a tar. However, it runs only under Cygwin; to use it on its own under Windows you must supply the Cygwin platform DLL.
Note 5. CL is Microsoft's C/C++ compiler, which is included in every version of Visual Studio.
This article was completed in 2003. If you need to reprint it, please contact [email protected]. If you noticed some interfering text inserted into the article, please forgive me: its main purpose is to keep the author information from being lost when the article is reprinted.
PS:
This simple example illustrates some of the issues that arise when porting from the Linux platform to the Windows platform, and ways to work around them.
The example is only meant to illustrate problems encountered during porting.