Tutorials for creating declarative mini languages with Python

When most programmers think about programming, they imagine imperative styles and techniques for writing applications. The most popular general-purpose programming languages (including Python and other object-oriented languages) are predominantly imperative in style. On the other hand, there are also many programming languages that are declarative in style, including both functional and logic languages, and both general-purpose and specialized ones.





Let me list a few languages that fall into each category. Many readers have used many of these tools without necessarily thinking about the categorical differences among them. Python, C, C++, Java, Perl, Ruby, Smalltalk, Fortran, Basic, and xBase are all straightforwardly imperative programming languages. Some of them are object-oriented, but that is simply a matter of organizing code and data, not of the fundamental programming style. In these languages, you command the program to carry out a sequence of instructions: put some data in a variable; fetch the data back out of the variable; loop through a block of instructions until some condition is satisfied; do something if a proposition is true. One nice thing about all these languages is that they are easy to think about with familiar metaphors from everyday life. Ordinary life consists of doing one thing, making a choice, then doing another thing, perhaps using some tools along the way. It is easy to imagine the computer that runs a program as a cook, a bricklayer, or an automobile driver.



Languages like Prolog, Mercury, SQL, XSLT, EBNF grammars, and indeed configuration files in various formats, all declare that something is the case, or that certain constraints apply. The functional languages (such as Haskell, ML, Dylan, OCaml, and Scheme) are similar, but they put more emphasis on stating the internal (functional) relationships between programming objects (recursion, lists, and so on). Our ordinary life, at least in its narrative quality, provides no direct analog for the programming constructs of these languages. For those problems you can naturally describe in these languages, however, declarative descriptions are far more concise and far less error-prone than imperative solutions. For example, consider the following set of linear equations:
Listing 1. Linear equation system sample





10x + 5y - 7z + 1 = 0
17x + 5y - 10z + 3 = 0
5x - 4y + 3z - 6 = 0





This is a rather elegant shorthand for stating several relationships among objects (x, y, and z). You might come across these facts in various ways in real life, but actually solving for x with pencil and paper is tedious and error-prone. Writing out the solution steps in Python is probably even worse from a debugging perspective.
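To make the contrast concrete, here is a minimal sketch of one imperative route to the answer: Cramer's rule with exact fractions. The helper names det3 and solve3 are illustrative only. Note how much procedural detail is needed compared with the three declarative lines of Listing 1.

```python
from fractions import Fraction

# Coefficients of Listing 1, rewritten as A*(x, y, z) = b by moving the
# constant terms to the right-hand side.
A = [[10, 5, -7],
     [17, 5, -10],
     [5, -4, 3]]
b = [-1, -3, 6]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(A, b):
    """Solve a 3x3 system by Cramer's rule, using exact fractions."""
    d = det3(A)
    if d == 0:
        raise ValueError("system has no unique solution")
    solution = []
    for col in range(3):
        # Replace one column of A with b, then take the determinant ratio.
        replaced = [row[:col] + [b[i]] + row[col + 1:]
                    for i, row in enumerate(A)]
        solution.append(Fraction(det3(replaced), d))
    return solution

x, y, z = solve3(A, b)
print(x, y, z)   # 1 2 3
```

Every index and sign in det3 is a place where a bug can hide, which is exactly the debugging burden the declarative form avoids.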



Prolog is a language that comes close to logic or mathematics. In it, you simply write statements you know to be true, then ask the application to derive consequences for you. Statements are written in no particular order (just as linear equations have no order), and you, the programmer or user, have no real idea what steps are taken to derive the results. For example:
Listing 2. family.pro Prolog sample





/* Adapted from sample at:
This app can answer questions about sisterhood & love, e.g.:
# Is alice a sister of harry?
?-sisterof( alice, harry )
# Which of alice's sisters love wine?
?-sisterof( X, alice ), loves( X, wine)
*/
sisterof( X, Y ) :- parents( X, M, F ),
          female( X ),
          parents( Y, M, F ).
parents( edward, victoria, albert ).
parents( harry, victoria, albert ).
parents( alice, victoria, albert ).
female( alice ).
loves( harry, wine ).
loves( alice, wine ).
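
For comparison, here is a rough Python sketch of the same facts and rule. It is not how a Prolog engine works: the facts are stored as plain data, and the search that Prolog performs for you has to be spelled out by hand. The function names sisters_of and sisters_who_love are made up for this illustration.

```python
# Facts from Listing 2, stored as plain data.
parents = {
    "edward": ("victoria", "albert"),
    "harry": ("victoria", "albert"),
    "alice": ("victoria", "albert"),
}
female = {"alice"}
loves = {("harry", "wine"), ("alice", "wine")}

def sisterof(x, y):
    """x is a sister of y when x is female and both share the same parents."""
    return x in female and x in parents and parents.get(y) == parents[x]

def sisters_of(y):
    """Enumerate every X for which sisterof(X, y) holds."""
    return sorted(x for x in parents if sisterof(x, y))

def sisters_who_love(y, thing):
    """Answer the conjunction sisterof(X, y), loves(X, thing)."""
    return [x for x in sisters_of(y) if (x, thing) in loves]

print(sisterof("alice", "harry"))         # True
print(sisters_who_love("alice", "wine"))  # ['alice']
```

As in the Prolog rule, nothing says that X and Y must differ, so alice counts as her own sister in the second query.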





Similar in spirit, though not quite the same, is an EBNF (Extended Backus-Naur Form) grammar declaration. You might write declarations like the following:
Listing 3. EBNF sample





word    := alphanums, (wordpunct, alphanums)*, contraction?
alphanums  := [a-zA-Z0-9]+
wordpunct  := [-_]
contraction := "'", ("clock"/"d"/"ll"/"m"/"re"/"s"/"t"/"ve")





This is a compact way of stating what a word would look like if you were to encounter one, without actually giving sequential instructions on how to recognize one. A regular expression is similar (and in fact suffices for this particular grammar production).
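
As a sketch of that last point, the word production of Listing 3 can be written as a single regular expression with Python's standard re module. The helper name is_word is an assumption made for this example.

```python
import re

# Each production of Listing 3 becomes a fragment of one pattern.
ALPHANUMS = r"[a-zA-Z0-9]+"
WORDPUNCT = r"[-_]"
CONTRACTION = r"'(?:clock|d|ll|m|re|s|t|ve)"
WORD = re.compile(
    rf"{ALPHANUMS}(?:{WORDPUNCT}{ALPHANUMS})*(?:{CONTRACTION})?"
)

def is_word(text):
    """True when the whole string matches the `word` production."""
    return WORD.fullmatch(text) is not None

print(is_word("o'clock"))      # True
print(is_word("well-formed"))  # True
print(is_word("-dash"))        # False
```

The pattern, like the grammar, only states what a word looks like; the matching steps are left to the regular expression engine.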



For yet another declarative example, consider a document type declaration that describes a dialect of valid XML documents:
Listing 4. XML Document Type Declaration


As with the other examples, the DTD language does not contain any instructions about what to do to recognize or create a valid XML document. It merely describes what such a document would be like if it were to exist. There is a subjunctive mood to declarative languages.

Python as an interpreter vs Python as an environment



Python libraries can utilize declarative languages in one of two fairly distinct ways. Perhaps the more common technique is to parse and process non-Python declarative languages as data. An application or a library can read in an external source (or a string defined internally, but used only as a "blob"), then figure out a set of imperative steps to carry out that conform in some way with those external declarations. In essence, these types of libraries are "data-driven" systems; there is a conceptual and category gap between the declarative language and what a Python application does to carry out or utilize its declarations. In fact, quite commonly, libraries that process the identical declarations are also implemented in other programming languages.



All of the examples given above fall under this first technique. The library PyLog is a Python implementation of a Prolog system. It reads a Prolog data file like the sample, then creates Python objects to model the Prolog declarations. The EBNF sample uses the specialized variant of SimpleParse, a Python library that transforms these declarations into state tables that can be used by mx.TextTools. mx.TextTools itself is an extension library for Python that uses an underlying C engine to run code stored in Python data structures, but that has little to do with Python per se. Python is excellent glue for these tasks, but the languages glued together are very different from Python. Most Prolog implementations, moreover, are written in languages other than Python, as are most EBNF parsers.



A DTD is similar to the other examples. If you use a validating parser like xmlproc, you can utilize a DTD to verify the dialect of an XML document. But the language of a DTD is not Python, and xmlproc just uses it as data that needs to be parsed. Moreover, XML validating parsers have been written in many programming languages. An XSLT transformation is similar again: it is not Python-specific, and a module like ft.4xslt just uses Python as "glue."



While there is nothing wrong with the above approaches and the tools mentioned (I use them all the time), it might be more elegant, and in some ways more expressive, if Python itself could be the declarative language. If nothing else, libraries that facilitate this do not require programmers to think about two (or more) languages when writing one application. At times, it is natural and powerful to lean on Python's introspective capabilities to implement "native" declarations.



The magic of introspection



The parsers Spark and PLY let users declare Python values in Python, then use some magic to let the Python runtime environment act as the configuration of parsing. For example, let's look at the PLY equivalent of the prior SimpleParse grammar. Spark is similar to the example:
Listing 5. PLY sample





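tokens = ('ALPHANUMS','WORDPUNCT','CONTRACTION','WHITESPACE')
t_ALPHANUMS = r"[a-zA-Z0-9]+"
t_WORDPUNCT = r"[-_]"
t_CONTRACTION = r"'(clock|d|ll|m|re|s|t|ve)"
def t_WHITESPACE(t):
  r"\s+"
  t.value = " "    # collapse any run of whitespace to one space
  return t
import ply.lex as lex   # a bare "import lex" in early PLY releases
lexer = lex.lex()       # builds the lexer from the declarations above
lexer.input(sometext)
while True:
  t = lexer.token()
  if not t: break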





I have written about PLY in my forthcoming book, Text Processing in Python, and have written about Spark in this column (see Resources for links). Without going into the details of the libraries, what you should notice here is that it is the Python bindings themselves that configure the parsing (actually lexing/tokenizing in this example). The PLY module just happens to know enough about the Python environment it is running in to act on these pattern declarations.



Just how PLY knows what it does involves some pretty fancy Python programming. At a first level, an intermediate programmer will realize that one can probe the contents of the globals() and locals() dictionaries. That would be fine if the declaration style were slightly different. For example, imagine the code were more like this:
Listing 6. Using the imported module namespace





import basic_lex as _
_.tokens = ('ALPHANUMS','WORDPUNCT','CONTRACTION')
_.ALPHANUMS = r"[a-zA-Z0-9]+"
_.WORDPUNCT = r"[-_]"
_.CONTRACTION = r"'(clock|d|ll|m|re|s|t|ve)"
_.lex()

This style would not be any less declarative, and the basic_lex module could hypothetically contain something simple like:
Listing 7. basic_lex.py

def lex():
  # The names were assigned as attributes of this module, so they
  # are found in this module's own globals()
  for t in tokens:
    print(t, '=', globals()[t])

This would produce:

% python basic_app.py
ALPHANUMS = [a-zA-Z0-9]+
WORDPUNCT = [-_]
CONTRACTION = '(clock|d|ll|m|re|s|t|ve)





PLY manages to gain insight into the importing module's namespace by using stack frame information. For example:
Listing 8. magic_lex.py





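import sys
try: raise RuntimeError
except RuntimeError:
  # The traceback's frame is this module's own frame; its f_back
  # chain leads to whoever imported us. Python 3 runs imports
  # through importlib, so step over those frames.
  frame = sys.exc_info()[2].tb_frame.f_back
  while 'importlib' in frame.f_code.co_filename:
    frame = frame.f_back
  caller_dict = frame.f_globals
def lex():
  for tok in caller_dict['tokens']:
    print(tok, '=', caller_dict['t_'+tok])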





This produces the same output as the basic_app.py sample, but with declarations written in the earlier t_TOKEN style.
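
The caller's namespace can also be looked up at call time rather than at import time. The following is a minimal sketch of that variation, not PLY's actual source. It assumes CPython's sys._getframe(), and it returns the declarations instead of printing them so that the result can be inspected.

```python
import sys

def lex():
    """Return (name, pattern) pairs declared in the caller's globals."""
    # sys._getframe(1) is the frame of whoever called lex(). It is a
    # CPython implementation detail, not a guaranteed language feature.
    caller_dict = sys._getframe(1).f_globals
    return [(name, caller_dict['t_' + name])
            for name in caller_dict['tokens']]

# Declarations in the "application" namespace, in the t_TOKEN style.
tokens = ('ALPHANUMS', 'WORDPUNCT', 'CONTRACTION')
t_ALPHANUMS = r"[a-zA-Z0-9]+"
t_WORDPUNCT = r"[-_]"
t_CONTRACTION = r"'(clock|d|ll|m|re|s|t|ve)"

declared = lex()
for name, pattern in declared:
    print(name, '=', pattern)
```

Looking up the frame inside lex() means the support code does not care how or when it was imported, only who is calling it.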



The real PLY module gets fancier than this. We saw that the tokens named in the pattern t_TOKEN can actually be either strings containing regular expressions, or functions that contain both regular expression docstrings and action code. Some type checking allows for polymorphic behavior:
Listing 9. polymorphic_lex





# ...determine caller_dict using RuntimeError...
from types import FunctionType
def lex():
  for t in caller_dict['tokens']:
    t_obj = caller_dict['t_'+t]
    if type(t_obj) is FunctionType:
      # A function rule keeps its pattern in the docstring
      print(t, '=', t_obj.__doc__)
    else:
      print(t, '=', t_obj)





Obviously, the real PLY module does something more interesting with these declared patterns than the toy examples do, but they demonstrate some of the techniques involved.
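
To give a flavor of "something more interesting," here is a hedged sketch that turns such declarations into a working tokenizer by joining them into one master regular expression. It is an illustration under simplifying assumptions, not PLY's implementation: build_lexer is a made-up name, the declarations are passed in as a dictionary, and function rules receive the matched text rather than a token object.

```python
import re
from types import FunctionType

def build_lexer(namespace):
    """Compile the t_NAME declarations in `namespace` into a tokenizer."""
    parts, actions = [], {}
    for name in namespace['tokens']:
        rule = namespace['t_' + name]
        if isinstance(rule, FunctionType):
            # Function rule: docstring is the pattern, body post-processes.
            pattern, actions[name] = rule.__doc__, rule
        else:
            pattern = rule
        parts.append(f"(?P<{name}>{pattern})")
    master = re.compile("|".join(parts))

    def tokenize(text):
        pos, result = 0, []
        while pos < len(text):
            m = master.match(text, pos)
            if m is None:
                raise ValueError(f"illegal character {text[pos]!r} at {pos}")
            name, value = m.lastgroup, m.group()
            if name in actions:
                value = actions[name](value)
            result.append((name, value))
            pos = m.end()
        return result
    return tokenize

def t_WHITESPACE(value):
    r"\s+"
    return " "   # normalize any run of whitespace to one space

declarations = {
    'tokens': ('ALPHANUMS', 'WORDPUNCT', 'CONTRACTION', 'WHITESPACE'),
    't_ALPHANUMS': r"[a-zA-Z0-9]+",
    't_WORDPUNCT': r"[-_]",
    't_CONTRACTION': r"'(?:clock|d|ll|m|re|s|t|ve)",
    't_WHITESPACE': t_WHITESPACE,
}

tokenize = build_lexer(declarations)
print(tokenize("don't  stop"))
```

The application side stays purely declarative; all of the looping and matching lives in build_lexer.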



The magic of inheritance



Letting a support library poke around in and manipulate an application's namespace can enable an elegant declarative style. But often, utilizing inheritance structures together with introspection allows even greater flexibility.



The module gnosis.xml.validity is a framework for creating classes that map directly to DTD productions. Any gnosis.xml.validity class can only be instantiated with arguments that obey the validity constraints of the XML dialect. Actually, that is not quite true: the module will also infer the proper types from simpler arguments when there is only one unambiguous way of "lifting" the arguments to the correct types.



Since I wrote the gnosis.xml.validity module, I am biased toward thinking that its purpose is itself interesting. But for this article, I just want to look at the declarative style in which validity classes are created. A set of rules/classes matching the prior DTD sample consists of:
Listing 10. gnosis.xml.validity rule declarations





from gnosis.xml.validity import *
class figure(EMPTY):   pass
class _mixedpara(Or):   _disjoins = (PCDATA, figure)
class paragraph(Some):  _type = _mixedpara
class title(PCDATA):   pass
class _paras(Some):    _type = paragraph
class chapter(Seq):    _order = (title, _paras)
class dissertation(Some): _type = chapter





You might create instances out of these declarations using:





ch1 = LiftSeq(chapter, ("1st Title","Validity is important"))
ch2 = LiftSeq(chapter, ("2nd Title","Declaration is fun"))
diss = dissertation([ch1, ch2])
print(diss)





Notice how closely the classes match the prior DTD. The mapping is basically one-to-one, except that intermediate classes are needed for the quantification and alternation of nested tags (intermediate names are marked by a leading underscore).



Notice also that these classes, while created using standard Python syntax, are unusual (and more concise) in having no methods or instance data. Classes are defined solely to inherit from some framework class, where that framework is narrowed by a single class attribute. For example, a <chapter> is a sequence of other tags, namely a <title> followed by one or more <paragraph> tags. But all we need to do to assure that the constraint is obeyed in instances is declare the chapter class in this simple manner.



The main "trick" involved in programming parent classes like gnosis.xml.validity.Seq is to look at the .__class__ attribute of an instance during initialization. The class chapter does not have its own initialization, so its parent's __init__() method is called. But the self passed to the parent __init__() is an instance of chapter, and it knows it. To illustrate, this is part of the implementation of gnosis.xml.validity.Seq:
Listing 11. Class gnosis.xml.validity.Seq


class Seq(tuple):
  def __init__(self, inittup):
    if not hasattr(self.__class__, '_order'):
      raise NotImplementedError(
        "Child of Abstract Class Seq must specify order")
    if not isinstance(self._order, tuple):
      raise ValidityError("Seq must have tuple as order")
    self.validate()
    self._tag = self.__class__.__name__


Once an application programmer tries to create a chapter instance, the instantiation code checks that chapter was declared with the required ._order class attribute, and that this attribute is the needed tuple object. The method .validate() performs some further checks to make sure that the objects the instance was initialized with belong to the corresponding classes specified in ._order.
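
The pattern can be tried out in isolation. The sketch below is a self-contained approximation, not the gnosis.xml.validity source: ValidityError, the .validate() body, and the body class are simplified stand-ins written for this illustration.

```python
class ValidityError(Exception):
    """Raised when an instance breaks its declared constraints."""

class Seq(tuple):
    """Abstract parent: children declare `_order`, a tuple of required types."""
    def __init__(self, inittup):
        cls = self.__class__          # the child class, e.g. chapter
        if not hasattr(cls, '_order'):
            raise NotImplementedError(
                "Child of Abstract Class Seq must specify order")
        if not isinstance(cls._order, tuple):
            raise ValidityError("Seq must have tuple as order")
        self.validate()
        self._tag = cls.__name__

    def validate(self):
        # Simplified check: one item per declared type, in order.
        if len(self) != len(self._order):
            raise ValidityError(
                "wrong number of items for " + self.__class__.__name__)
        for item, expected in zip(self, self._order):
            if not isinstance(item, expected):
                raise ValidityError(
                    f"{item!r} is not a {expected.__name__}")

# Purely declarative child classes: no methods, no instance data.
class title(str): pass
class body(str): pass
class chapter(Seq): _order = (title, body)

ch = chapter((title("1st Title"), body("Validity is important")))
print(ch._tag, len(ch))   # chapter 2
```

Passing plain strings, as in chapter(("1st Title", "text")), raises ValidityError in this sketch because it performs no "lifting" of simpler arguments.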



When to declare



A declarative programming style is almost always a more direct way of stating constraints than is an imperative or procedural one. Of course, not all programming problems are about constraints, or at least that is not always the natural formulation. But problems of rule-based systems, such as grammars and inference systems, are much easier to manage if they can be described declaratively. Imperative verification of grammaticality quickly turns into hard-to-understand "spaghetti code" that is difficult to debug. Statements of patterns and rules can remain much simpler.



Of course, at least in Python, the verification and enforcement of declared rules will always boil down to procedural checks. But the right place for such procedural checks is in well-tested library code. Individual applications should rely on the simpler declarative interfaces provided by libraries like Spark, PLY, or gnosis.xml.validity. Other libraries like xmlproc, SimpleParse, or ft.4xslt also enable declarative styles, although the declarations are not written in Python (which is appropriate for their domains, of course).

