A ramble on dependency management tools: from Maven and Gradle to Go

Last Update:2017-02-09 Source: Internet

Author: User

Tags comparison table naming convention


Starting from Maven, this article analyzes Maven's main ideas and Gradle's improvements over Maven, and finally discusses dependency management in the Go language.

Why do we need dependency management tools?

Before we talk about dependency management itself, let's first talk about why we need dependency management tools.

When we learn a programming language, write "Hello World", and announce that we have learned the language, there is really no need to worry about dependencies.

However, as soon as you want to write a slightly more complex application, even just a message board that reads and writes a database, you need a database driver, and you run into the problem of dependency management.

Further, if you write a library and want to share it with others, you need to understand dependency management even more.

Of course, if the project is simple enough, you can copy the source code of the dependency directly into your own project, or place the dependency's binaries (such as JAR or DLL files) in the project's lib directory. And to offer your own library to others? Let them download the binary package, or send it to them. This is mostly how things were done before dependency management tools appeared.

But what if things get more complicated and the library you depend on has dependencies of its own? You could bundle all the dependencies into a compressed archive and add a README describing them; that seems to work.

What if your project depends on several or even dozens of libraries, and those libraries have dependencies that in turn have dependencies? How do you detect a version conflict among the dependencies? What do you do when upgrading? How do you tell whether a file under the lib directory is still being depended on?

At this point you have to acknowledge the need for a dependency management tool, whatever language you use. We also now have a rough idea of what dependency management has to do. Suppose there were no dependency management tools and we had to design one ourselves; where would we start?

A naming convention for libraries, that is, a rule for defining what are called coordinates, so that a library can be located exactly by its coordinates.
A configuration file format for describing and declaring dependencies.
A central repository that holds the libraries together with their metadata, for users to pull from.
A local tool that parses the configuration file and pulls the dependencies.

These are the core elements of every dependency management tool.
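As a rough illustration of how these four elements fit together, here is a minimal Python sketch. Everything in it is hypothetical: the `Coordinate` class, the `group:name:version` line format, and the in-memory `REPOSITORY` dictionary merely stand in for a real coordinate scheme, configuration file, and central repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coordinate:
    """Element 1: a coordinate that identifies exactly one library version."""
    group: str
    name: str
    version: str

    @classmethod
    def parse(cls, text: str) -> "Coordinate":
        group, name, version = text.strip().split(":")
        return cls(group, name, version)


# Element 3: a stand-in for a central repository (hypothetical contents).
REPOSITORY = {
    Coordinate("com.example", "db-driver", "1.2.0"): "db-driver-1.2.0.jar",
    Coordinate("com.example", "json", "2.0.1"): "json-2.0.1.jar",
}


def parse_config(text: str) -> list:
    """Element 2: a configuration format, here one coordinate per line."""
    return [Coordinate.parse(line) for line in text.splitlines() if line.strip()]


def pull(config_text: str) -> list:
    """Element 4: the local tool that reads the config and fetches each library."""
    files = []
    for coord in parse_config(config_text):
        if coord not in REPOSITORY:
            raise LookupError(f"not found in repository: {coord}")
        files.append(REPOSITORY[coord])
    return files
```

Real tools add a great deal on top of this (transitive dependencies, caching, checksums), but the division of labor is the same.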

A look at Maven

Maven was born in 2004 (source: Wikipedia), which makes it an early entrant among the dependency management tools of the various languages. Ruby's gem also appeared in 2004, but gem fell short of being a complete dependency management tool until Ruby's Bundler appeared. Python's pip appeared later.

Maven's convention is to define coordinates as groupId (usually the organization's reversed domain name, following the Java package naming convention) + artifactId (the name of the library itself) + version, to write the configuration file in XML, and to provide a central repository (repo.maven.org) and a local tool (mvn).
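Coordinates are useful precisely because they can be turned mechanically into a location. The sketch below shows the standard Maven repository layout, in which the dots of the groupId become directory separators; the function name `repository_path` is made up for this illustration.

```python
def repository_path(group_id: str, artifact_id: str, version: str,
                    packaging: str = "jar") -> str:
    """Return the relative path of an artifact in a Maven-style repository."""
    group_path = group_id.replace(".", "/")
    return f"{group_path}/{artifact_id}/{version}/{artifact_id}-{version}.{packaging}"


print(repository_path("com.google.guava", "guava", "18.0"))
# com/google/guava/guava/18.0/guava-18.0.jar
```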

Dependency definition:

<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>18.0</version>
</dependency>

Repository definition:

<repository>
    <id>repo.default</id>
    <name>Internal Release Repository</name>
    <url>http://repo.xxxxxx.com/nexus/content/repositories/releases</url>
    <releases>
        <enabled>true</enabled>
        <updatePolicy>interval:60</updatePolicy>
        <checksumPolicy>warn</checksumPolicy>
    </releases>
    <snapshots>
        <enabled>false</enabled>
        <updatePolicy>always</updatePolicy>
        <checksumPolicy>warn</checksumPolicy>
    </snapshots>
</repository>

To avoid dependency conflicts, Maven's dependency configuration provides an exclusion mechanism that blocks selected transitive dependencies of a library.
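To make the effect of an exclusion concrete, here is a small Python sketch of transitive resolution. The `DEPENDS_ON` graph and the library names are invented, and the sketch simplifies one thing: in Maven an exclusion is attached to a single dependency declaration, whereas here it is applied to the whole resolution for brevity.

```python
# Hypothetical dependency metadata: library -> its direct dependencies.
DEPENDS_ON = {
    "app": ["web-framework", "guava"],
    "web-framework": ["logging-api", "old-json"],
    "logging-api": [],
    "old-json": [],
    "guava": [],
}


def resolve(root: str, exclusions: frozenset = frozenset()) -> list:
    """Collect the transitive dependencies of root, skipping excluded libraries."""
    resolved = []
    stack = list(reversed(DEPENDS_ON[root]))
    while stack:
        lib = stack.pop()
        if lib in exclusions or lib in resolved:
            continue
        resolved.append(lib)
        # Follow the library's own dependencies (the transitive part).
        stack.extend(reversed(DEPENDS_ON[lib]))
    return resolved
```

Calling `resolve("app")` pulls in `old-json` through `web-framework`, while `resolve("app", frozenset({"old-json"}))` leaves it out.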

Ruby's gem, Node's npm, Python's pip, and iOS's CocoaPods are all similar; they differ only in configuration file syntax and coordinate naming conventions.

Seen this way Maven looks simple, so why do many people think Maven is complicated?

There are two main reasons:

Java is a compiled language, and a published library is a binary JAR package, so a compilation step is required before publishing, and dependencies and compilation are closely tied together. In scripting ecosystems such as Ruby and Node, by contrast, it is enough to put the source code and configuration files into the repository.
Maven does not define itself merely as a dependency management tool, but as a project management tool that covers the entire life cycle of a project.

The second point is the difference between Ant+Ivy and Maven. Ivy's view is that a build and packaging tool such as Ant already exists, so a plug-in that solves the dependency problem is enough; Maven's view is that Ant itself has room for improvement, so it reworked the build as well.

The core idea of Maven's improvement is:

Convention over Configuration

That is, convention takes precedence over configuration. Since most people are used to naming the source directory src, Maven simply agrees on that directory by convention, and nothing needs to be configured. Similarly, clean, compile, package, and so on are agreed by convention, so there is no need to define Ant tasks for them. This simplifies the configuration file and lowers the learning cost. With an Ant project you have to read the help file or the build.xml file to find out how to compile and package; with a Maven project you simply run "mvn package".
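One way to picture convention over configuration is as a table of defaults that a project overrides only where it really differs. The Python sketch below is an illustration, not Maven's implementation; the default values follow Maven's standard directory layout, while the setting names and the `effective_settings` function are invented.

```python
# Conventional defaults, modeled on Maven's standard directory layout.
CONVENTIONS = {
    "source_dir": "src/main/java",
    "test_source_dir": "src/test/java",
    "output_dir": "target",
}


def effective_settings(project_overrides: dict) -> dict:
    """Start from the conventions and apply only what the project overrides."""
    unknown = set(project_overrides) - set(CONVENTIONS)
    if unknown:
        raise KeyError(f"unknown settings: {sorted(unknown)}")
    return {**CONVENTIONS, **project_overrides}
```

A project that follows the conventions passes an empty dictionary and configures nothing; a project with a legacy layout might pass `{"source_dir": "src"}` and still get the other defaults.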

The Java language was invented relatively early, before this idea became popular, so Java itself has no conventions for project structure. Newer languages have largely absorbed the idea and come with project conventions and norms. Take Go, for example: where C/C++ needs complex Makefiles to define the rules for compiling and for running test cases, Go settles all of this by convention.

Maven is defined as a project management tool that covers the entire life cycle of a project, from source to release:

validate → generate-sources → process-sources
→ generate-resources → process-resources → compile 
→ process-classes → generate-test-sources
→ process-test-sources → generate-test-resources
→ process-test-resources → test-compile
→ test → prepare-package → package 
→ pre-integration-test → integration-test 
→ post-integration-test → verify → install → deploy
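The practical consequence of this ordering is that asking for one phase runs every phase before it. A minimal Python sketch of that rule follows, using the phases as listed above (the full Maven lifecycle has a few more, such as initialize); the function name `phases_to_run` is invented.

```python
# Phases of Maven's default lifecycle, in the order listed above.
PHASES = [
    "validate", "generate-sources", "process-sources",
    "generate-resources", "process-resources", "compile",
    "process-classes", "generate-test-sources",
    "process-test-sources", "generate-test-resources",
    "process-test-resources", "test-compile",
    "test", "prepare-package", "package",
    "pre-integration-test", "integration-test",
    "post-integration-test", "verify", "install", "deploy",
]


def phases_to_run(target: str) -> list:
    """Return every phase up to and including the requested one."""
    if target not in PHASES:
        raise ValueError(f"unknown phase: {target}")
    return PHASES[: PHASES.index(target) + 1]
```

So a request for the package phase implies compile and test as well, because both come earlier in the list.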

Because so many features and phases are involved, Maven introduced a plug-in mechanism. Maven's own compiling and packaging functions are implemented as plug-ins, and users can define their own plug-ins.

Because different phases of the build life cycle are involved, a dependency also has to say whether it is a compile-time dependency, a test dependency, or a run-time dependency. This is why dependency definitions gained the scope setting.
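A scope can be read as a rule about which classpaths a dependency appears on. The Python sketch below encodes that rule for Maven's four common scopes (the system and import scopes are left out); the `classpath` function and the sample dependency names are made up for illustration.

```python
# Which classpaths each Maven scope contributes to.
SCOPE_CLASSPATHS = {
    "compile": {"compile", "test", "runtime"},
    "provided": {"compile", "test"},
    "runtime": {"test", "runtime"},
    "test": {"test"},
}

# Hypothetical project dependencies: name -> scope.
EXAMPLE_DEPENDENCIES = {
    "guava": "compile",
    "servlet-api": "provided",
    "mysql-connector": "runtime",
    "junit": "test",
}


def classpath(dependencies: dict, kind: str) -> list:
    """List the dependencies visible on the given classpath (compile, test, runtime)."""
    return sorted(
        name for name, scope in dependencies.items()
        if kind in SCOPE_CLASSPATHS[scope]
    )
```

With the sample data, the test classpath sees all four libraries, while the runtime classpath drops `junit` and `servlet-api`.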

So can Maven simply be understood as a standardized Ant+Ivy plus an extensible plug-in framework? Not quite, because real-world projects tend to be more complex.

We have functions for grouping blocks of logic, objects for grouping a set of methods, and packages or namespaces for grouping a set of related objects, but there is a still higher level of grouping: the module, or sub-project. Different source directories under the same project may need to be compiled and packaged into different binaries, which together make up the whole project. This is tied to source management habits: should each independent module live in its own source repository, or should all related modules be kept together? From the standpoint of reducing communication costs, it is better to organize them in one large repository.

So Maven introduced the concept of the module. One project can have multiple modules, each defined by its own POM file. To avoid duplication, POM files support a parent mechanism, in which a child project's POM inherits the basic configuration of the parent POM. The module mechanism arguably raises Maven's complexity by a whole level, and most of the pitfalls people hit with Maven are here.
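The parent mechanism amounts to a merge in which the child wins wherever it says something and inherits wherever it is silent. Here is a deliberately simplified Python sketch, with POMs modeled as plain dictionaries; real Maven inherits many more elements and has its own merge rules, and the `INHERITED_KEYS` subset and `effective_pom` function are assumptions of this illustration.

```python
# A simplified subset of what a child takes from its parent.
INHERITED_KEYS = ("groupId", "version")


def effective_pom(child: dict, parent: dict) -> dict:
    """Merge a child POM over its parent: the child inherits whatever it omits."""
    merged = {key: parent[key] for key in INHERITED_KEYS if key in parent}
    merged.update(child)
    # Properties are merged key by key rather than replaced wholesale.
    merged["properties"] = {**parent.get("properties", {}),
                            **child.get("properties", {})}
    return merged


PARENT = {
    "groupId": "com.example",
    "artifactId": "parent",
    "version": "1.0.0-SNAPSHOT",
    "properties": {"java.version": "1.8"},
    "modules": ["core", "web"],
}
print(effective_pom({"artifactId": "core"}, PARENT))
```

The child declares only its artifactId, yet ends up with the parent's groupId, version, and properties, which is what keeps multi-module POM files short.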

Here are some best practices for Maven multi-project version management:

Configure the version number in the parent project; do not configure it explicitly in the sub-projects, which inherit the parent's version number directly.
Reference dependencies between sub-projects through ${project.version}; do not write the version number explicitly.
When a new version is released, release all sub-projects together, even those that have not changed.
Preferably publish through Maven's release plugin, to avoid the inconsistencies caused by editing version numbers by hand.

Even so, multi-project version management in Maven often runs into problems. The main reason is that dependencies between Maven sub-projects are configured the same way as dependencies on third-party libraries, so the sub-project's version number must be specified. In addition, a sub-project must configure its parent explicitly, including the parent's version number. Once any of these version numbers is wrong, all sorts of strange problems can follow.

Maven's release plugin is also fairly complex to use. The plugin actually does several things:

Build the project first, to confirm that it builds properly.
Change the version number in the POM file to the release version, commit it to the source repository, and tag it.
Check out the tagged source and build it again; the JAR built this time carries the release version and is uploaded to the Maven repository.
Increment the version number, change the version in the POM file to the next snapshot, and commit it to the source repository again.

Because this process builds twice, commits to the source repository twice, and uploads the JAR once, a failure at any step makes the release fail, which is why it is complex to use.
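The version arithmetic in the second and fourth steps can be sketched in a few lines of Python. This is an illustration only: the function names are invented, and bumping the last numeric component is an assumed rule for choosing the next development version.

```python
SNAPSHOT_SUFFIX = "-SNAPSHOT"


def release_version(dev_version: str) -> str:
    """Strip the snapshot suffix to get the release version."""
    if not dev_version.endswith(SNAPSHOT_SUFFIX):
        raise ValueError(f"not a snapshot version: {dev_version}")
    return dev_version[: -len(SNAPSHOT_SUFFIX)]


def next_snapshot(released: str) -> str:
    """Bump the last numeric component and mark it as a snapshot again."""
    *head, last = released.split(".")
    return ".".join(head + [str(int(last) + 1)]) + SNAPSHOT_SUFFIX


released = release_version("1.2.0-SNAPSHOT")
print(released, next_snapshot(released))
```

Each of these version changes is written to every POM and committed, which is where manual edits tend to go wrong.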

That covers Maven's core concepts; everything else is an extension built on the plug-in mechanism. By now you should also understand why Maven ended up so complicated.

But in any case, Maven is basically the benchmark for project management tools. Some languages are managed with Maven directly through extensions, such as C++ and C# (NMaven), or through ports such as Byldan (C#), but these seem not to have succeeded. My guess is that the main reason is that Maven is written in Java, which creates a barrier between the communities.

Gradle's improvements to Maven

Having covered Maven's ideas and strengths, what about its shortcomings? Here we turn to Gradle, which is an improvement built on Maven's foundation. Its advantages show up mainly in the following areas:

Configuration language
Maven uses XML, and the limited expressiveness and inherent verbosity of XML make the pom.xml file look long and cumbersome. Gradle's configuration is a DSL defined in Groovy, concise and expressive. In Maven, any extension has to be implemented as a Maven plug-in, but a Gradle build file is itself a program that can depend directly on any Java library. You can define tasks directly in the build.gradle file as you would in Ant, with more expressive power than Ant (Ant build files are also XML).
The project object and environment variables are directly accessible from the Gradle build file, which is useful for complex projects that need finer-grained, custom control of the build process.
Self-contained projects (provisioning the build environment)
A user who downloads a Maven project and has never used Maven has to download the Maven toolkit and learn about Maven first. Gradle, however, can generate a gradlew script for the project. The user runs gradlew directly; the script detects whether Gradle is available locally and downloads it from the network if it is not, transparently to the user (although on networks inside China it is best to download it in advance).
For repository configuration, Maven provides a local settings.xml file for settings that should not be committed to the source repository, such as private repositories and repository passwords. The inconvenience is that this information is then not self-contained in the project, so Gradle did away with the local configuration mechanism and keeps all definitions in the project. Settings such as private repository passwords can be placed in a gradle.properties file under the project that is not committed and is shared with team members in some other way. This has both advantages and disadvantages.
Task dependencies and execution mechanism
Every step of the Maven build life cycle is predefined (see above), and plug-in tasks can only hook into one of the predefined phases. The Maven life cycle phases are well thought out, but sometimes they do not meet the requirements. Maven executes tasks linearly from the start of the life cycle, whereas Gradle uses a directed acyclic graph to work out the dependencies between tasks and decide which tasks can be executed in parallel, which makes tasks more flexible to define and to execute.
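To show what scheduling from a directed acyclic graph buys, here is a small Python sketch built on the standard library's `graphlib` (Python 3.9+). It is not how Gradle is implemented; the `TASKS` graph is a simplified, hypothetical set of task dependencies and `execution_waves` is an invented helper.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each task maps to the set of tasks it depends on.
TASKS = {
    "compileJava": set(),
    "processResources": set(),
    "classes": {"compileJava", "processResources"},
    "test": {"classes"},
    "jar": {"classes"},
    "build": {"test", "jar"},
}


def execution_waves(tasks: dict) -> list:
    """Group tasks into waves; tasks in the same wave do not depend on each other."""
    sorter = TopologicalSorter(tasks)
    sorter.prepare()  # raises CycleError if the graph is not acyclic
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

Each wave contains tasks whose prerequisites are all finished, so the tasks within a wave are candidates for parallel execution; a strictly linear lifecycle cannot express that.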
More flexible dependency management
Maven is strict about dependency management: a dependency must be given as coordinates in a repository. A local path is supported through the system scope, but it has many inconveniences (system scope dependencies are not included when packaging). If every library in the world were published through Maven there would be no problem, but reality is often otherwise. A small gripe here: the SDKs and other libraries released by vendors in China almost never come with a repository address; you get an archive with a pile of JARs in it and are left to solve the dependency management problem yourself. Gradle is more flexible in this respect, supporting, for example:
```
compile fileTree(dir: 'libs', include: '*.jar')
```
and similar configuration rules.
In addition, because a Gradle build file is a program, dependencies can be managed programmatically. For example, if most sub-projects depend on a library, with a few exceptions, you can write:
```
configure(subprojects.findAll { it.name != 'xxx1' && it.name != 'xxx2' }) {
    dependencies {
        compile("com.google.guava:guava:18.0")
    }
}
```
Sub-projects and the dynamic dependency mechanism
Dynamic dependencies mainly address the case where several interdependent libraries are under rapid development at the same time. You do not want to release a new version for every change to a lower-level library and then update the dependency configuration of every library above it, so you need a way to depend dynamically on the latest version.
Maven's solution is the snapshot mechanism, and dependencies between sub-projects are implemented through it as well. The problems this causes were analyzed above.
Although Gradle is compatible with the snapshot mechanism of Maven repositories, it does not introduce snapshots into its own version management. Its dependencies support rules such as 4.x and 2.+ to enable dynamic dependencies (note: Maven also supports similar rules, see Dependency Version Requirement Specification). Dependencies between sub-projects use a special dependency notation that differs from the configuration rules for third-party libraries. It is written directly as:
compile project(':subproject-name')

With this notation there is no version number to configure, only the name of the sub-project, which avoids the version number problems caused by Maven's sub-project dependencies. Nor does a sub-project need to declare its parent explicitly; only the parent project needs a one-way list of its sub-projects.
Gradle's release mechanism is also more flexible. It supports publishing to various repositories (including Maven repositories), but it does not control release steps such as generating version numbers or modifying the source repository; these are left to the user to handle by hand, with CI tools, or with scripts.
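The dynamic version rules mentioned in this section, such as 2.+, boil down to picking the highest published version that matches a prefix. Here is a minimal Python sketch of that selection. It handles only the trailing `+` form and assumes purely numeric, dot-separated versions; the function names are invented, and real resolvers support richer version syntax.

```python
def version_key(version: str) -> tuple:
    """Turn '2.10.1' into (2, 10, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def pick_version(rule: str, available: list) -> str:
    """Pick the highest available version matching a rule such as '2.+'."""
    if not rule.endswith("+"):
        return rule  # a fixed version is used as written
    prefix = rule[:-1]  # '2.+' -> '2.'
    matches = [v for v in available if v.startswith(prefix)]
    if not matches:
        raise LookupError(f"no version matches {rule}")
    return max(matches, key=version_key)
```

The numeric comparison matters: compared as plain strings, 2.9 would sort above 2.10.1.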

Those are the main improvements of Gradle over Maven. For the rest, see Gradle's official comparison table, Maven_vs_gradle; they are not covered in detail here.

Multi-project and dependency management issues in the Go language

Finally, let's talk about multi-project builds and dependency management in Go. The Go team sets no conventions and provides no tools for either, so everyone has to solve them on their own. Multi-project setups usually fall back to a Makefile plus scripts, as Kubernetes does. For dependency management, the open source community mostly uses Godeps, and so does Kubernetes. Godeps identifies a library's coordinates by its source repository path and source tag. It manages dependencies only, a bit like Ivy, and does not concern itself with the build process. Godeps also writes the dependencies of the libraries you depend on into the current project's dependency configuration, rather than resolving transitive dependencies dynamically. It has no scopes and does not distinguish whether something is a unit test dependency. A repository supports only one configuration and there is no sub-project concept, so larger projects are cumbersome to manage. In addition, it does not currently solve the problems of transitive dependencies and version conflicts (there are some related issues open).
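To make the version conflict problem concrete: once every transitive dependency is written into one flat list, the same library can show up pinned at two different revisions. The Python sketch below shows the kind of check a tool could run over such a list. It does not reproduce Godeps' file format or behavior; the import paths, revisions, and the `find_conflicts` function are all hypothetical.

```python
from collections import defaultdict

# Hypothetical flattened list: (import path, pinned revision, required by).
FLATTENED = [
    ("github.com/example/router", "a1b2c3", "app"),
    ("github.com/example/yaml", "d4e5f6", "app"),
    ("github.com/example/yaml", "0789ab", "github.com/example/router"),
]


def find_conflicts(deps: list) -> dict:
    """Return import paths that are requested at more than one revision."""
    revisions = defaultdict(set)
    for import_path, revision, _required_by in deps:
        revisions[import_path].add(revision)
    return {path: sorted(revs) for path, revs in revisions.items() if len(revs) > 1}
```

Detecting the conflict is the easy half; deciding which revision wins is the policy question that dependency managers answer differently.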

A language's multi-project and dependency management solutions have a great influence on the development of its ecosystem; just look at how far Java has come with Maven and Gradle. So I feel the Go team should do something in both areas. As for why Go has been slow to provide a dependency management tool, I personally think there are a few considerations:

Go has not yet settled on a mechanism for dynamic libraries. For a compiled language, it is best to depend on binaries rather than on source. This speeds up compilation, protects source code, and makes distribution and proxy caching convenient, which broadens the range of uses for the language; many commercial libraries cannot conveniently provide source code. So implementing a dependency management tool requires a dynamic library mechanism to be in place. I think the reason it has not been settled is that Go does not want to introduce the format compatibility problems of binary dynamic libraries, and depending on source was the most convenient approach at the start.
Let the community test the waters first, then look at the results and feedback.

No language that develops to a certain stage can get around the problem of dependency management. A while ago I read an article about Go that mocked Java's Maven for practically downloading half the Internet to build a project, and what came to mind was the Elder's classic quote, "too young, too simple." Go is not running into these problems yet because Go is still relatively young, its libraries are not yet rich enough, and many Go projects are not yet complex enough. A project like Kubernetes, which currently has 226 dependencies, downloads half of GitHub when it builds. So I personally feel that the Go community badly needs a tool similar to Gradle to solve dependency management, building, multi-project management, and related problems.
