Building an abstract Java API for regular expressions

When you use regular expressions in Java, it is often unwise to depend on one specific regexp library. An abstraction layer lets you switch between regexp libraries, reduces the coupling between your code and any one of them, and lets you choose the library that best fits your needs. If you are considering a Java regexp library for your next project, software developer Jose San Leandro Armendariz shows you how to make your code independent of the library you select. He also takes a closer look at regexp and how it works, and then provides some practice.

Brief introduction

Although writing a Java application that parses text may look like a simple task, like many things it can quickly become complex. That was certainly my experience when I wrote code to parse HTML pages. At first, I used Perl 5 regular expressions (regexp) only occasionally. Later, for reasons I will come to, I used them often.

Background knowledge

In my experience, most Java developers need to parse some kind of text. Typically, they start by spending some time with Java string methods such as indexOf or substring, hoping that the input format never changes. When the format does change, the code that reads the new format becomes more complex and harder to maintain. Eventually, the code may also need to handle line wrapping, case insensitivity, and so on.

As the logic grows more complex, maintenance becomes very difficult. Any change can have side effects that cause other parts of the text parser to stop working, so developers spend time fixing these minor errors.

Developers with some Perl experience have probably used regular expressions as well. If such a developer is lucky (or good), he or she can convince the rest of the team (or at least the team leader) to use the technology. The new approach replaces many lines of code that call String methods: the core of the parser logic is delegated to a regexp library.
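The difference between the two approaches can be illustrated with a minimal sketch. It is not taken from the original article: the class name, the sample HTML, and the choice of java.util.regex (the package bundled with the JDK since 1.4) are all assumptions made for illustration. It extracts a page title first with indexOf and substring, then with a regular expression that tolerates upper-case tags and line breaks.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class; names are illustrative only.
class TitleExtraction {

    // Hand-rolled approach: only finds the exact lower-case tag form.
    static String titleWithIndexOf(String html) {
        int start = html.indexOf("<title>");
        int end = html.indexOf("</title>");
        if (start < 0 || end < 0 || end < start) {
            return null;
        }
        return html.substring(start + "<title>".length(), end).trim();
    }

    // Regexp approach: CASE_INSENSITIVE accepts <TITLE>, and DOTALL lets
    // the dot match line breaks so the title may span several lines.
    private static final Pattern TITLE = Pattern.compile(
            "<title>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static String titleWithRegex(String html) {
        Matcher m = TITLE.matcher(html);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String html = "<HTML><TITLE>\n  Regexp in Java\n</TITLE></HTML>";
        System.out.println(titleWithIndexOf(html)); // prints null
        System.out.println(titleWithRegex(html));   // prints Regexp in Java
    }
}
```

Making the indexOf version handle upper-case tags would mean more branches and more String calls, while the regexp version absorbs the same requirement with two flags.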

Having accepted the advice of the developer with Perl 5 experience, the team must choose the regexp implementation that best suits the project. Then they need to learn how to use it.

After a brief look at the many alternatives found on the Internet, suppose the team decides on one of the better-known libraries, such as ORO from the Jakarta project. The parser is then refactored, or almost rewritten, so that it uses ORO classes such as Perl5Compiler, Perl5Matcher, and so on.

The consequences of this decision are obvious:

The code is tightly coupled to the Jakarta ORO classes.

The team takes a risk because it does not know whether non-functional requirements, such as performance or the threading model, will be met.

The team has spent time and money learning the library and rewriting the code to use it. If the decision proves wrong and a new library is selected, the cost will be much the same, because the code must be rewritten again.

And even if the library works properly, what happens if the team later decides to migrate to a new one (for example, the library included in JDK 1.4)?

The benefits of decoupling

Is there a way for the team to find out which implementation is best for them, not only now but also in the future? Let's try to find the answer.

Avoid relying on any particular implementation

The scenario above is very common in software engineering. In some cases it leads to larger investments and longer delays. This typically happens when decisions are made without knowing all their consequences, and the decision makers are unlucky or lack the necessary experience.
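One way to avoid depending on a particular implementation is to hide the library behind a pair of small interfaces. The sketch below is an assumption-laden illustration, not the author's actual API: RegexpCompiler, RegexpPattern, JdkRegexpCompiler, and LinkParser are hypothetical names, and the only adapter shown wraps java.util.regex. An adapter for another library, such as Jakarta ORO, would implement the same two interfaces.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Library-neutral contracts (hypothetical names, not part of any real library).
interface RegexpPattern {
    boolean contains(String input);
    // Returns the given capture group of the first match, or null if none.
    String firstGroup(String input, int group);
}

interface RegexpCompiler {
    RegexpPattern compile(String regexp, boolean caseInsensitive);
}

// One adapter per library; this one delegates to java.util.regex.
class JdkRegexpCompiler implements RegexpCompiler {
    @Override
    public RegexpPattern compile(String regexp, boolean caseInsensitive) {
        final Pattern pattern = Pattern.compile(
                regexp, caseInsensitive ? Pattern.CASE_INSENSITIVE : 0);
        return new RegexpPattern() {
            @Override
            public boolean contains(String input) {
                return pattern.matcher(input).find();
            }

            @Override
            public String firstGroup(String input, int group) {
                Matcher m = pattern.matcher(input);
                return m.find() ? m.group(group) : null;
            }
        };
    }
}

// Client code sees only the interfaces, never the underlying library.
class LinkParser {
    private final RegexpPattern href;

    LinkParser(RegexpCompiler compiler) {
        this.href = compiler.compile("<a\\s+href=\"([^\"]*)\"", true);
    }

    String firstLink(String html) {
        return href.firstGroup(html, 1);
    }
}

class RegexpAbstractionDemo {
    public static void main(String[] args) {
        // Swapping libraries means changing only this constructor argument.
        LinkParser parser = new LinkParser(new JdkRegexpCompiler());
        System.out.println(parser.firstLink("<A HREF=\"http://example.com/\">home</A>"));
    }
}
```

With this arrangement the parser never imports a regexp library directly, so trying a different implementation costs one new adapter class rather than a rewrite of the parser.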
