Tuesday, May 13, 2008

Building a better Pyrex - part 1

With our recent decision to drop the Pyrex codebase and start from scratch, we find ourselves repeating a process from two years ago.

When PySoy first forked off of Soya, we had a small list of things that needed to be fixed based on our negative experiences working with that project. The API needed to be cleaned up, pydocs added, Shape needed to be mutable and implemented with vertex arrays or VBOs rather than display lists, etc.

As we worked on it, that list grew longer and came to include rewriting large parts of the codebase, until we reached a point where starting over would be more expedient. There's little doubt now that it was a good move: having learned from both Soya's mistakes and our own failed early attempts, we have a pretty rocking architecture.

The first and most foundational change we're making from Pyrex, as previously stated, is in the lexical scanner and parser. Both Pyrex and Cython are based on Plex, which was Greg's answer to processing a Python-like language.

In contrast, we'll be using Python's own tokenize module for our lexical scanner and an ASDL-based parser akin to the parser used in Python 3.0. We're extending Python.asdl with cdef, ctypedef, cimport, cinclude, etc.
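As a rough illustration of why this split is attractive, here is a minimal sketch using the standard library of a current Python 3 (not the 2008-era interpreter, and not PySoy's actual code). The helper names `scan` and `parses_as_python` are made up for this example. It shows that tokenize will happily scan a `cdef` line, since `cdef` is just another NAME token to it, while the stock Python grammar rejects the same line, which is where an extended grammar would come in.

```python
import ast
import io
import tokenize

def scan(source):
    """Return (token type name, token string) pairs, skipping layout tokens."""
    readline = io.StringIO(source).readline
    skipped = (tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER)
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
        if tok.type not in skipped
    ]

def parses_as_python(source):
    """Return True if the unmodified Python grammar accepts the source."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True

line = "cdef int count = 0\n"
tokens = scan(line)                # the lexer needs no changes for this line
accepted = parses_as_python(line)  # False: the grammar has no cdef rule
```

The scanner output is a flat list of ordinary NAME, OP and NUMBER tokens, so all of the Pyrex-specific work moves into the grammar.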

This way we have less code to debug and a solid foundation to start with. With lexing and parsing almost "for free", we'll be able to focus on the important bits and quickly get the platform to the point where it supports the kind of features we need.

We've also reached a rough consensus on the following syntax changes from Pyrex:
  • unicode, generators, decorators, etc. will be supported
  • all custom for syntaxes dropped in favor of Python's standard
  • with nogil: replaced with full with support
  • special methods will always have self as first argument (i.e., both __sub__ and __rsub__ are used)
  • cinclude added for direct C header parsing
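To make the special-methods point concrete, here is a small sketch of the standard Python convention that the list above refers to, written as ordinary Python rather than as extension-type code. The `Meters` class and the `_as_number` helper are hypothetical names used only for illustration. With both methods defined, self is always the first argument: `__sub__` handles the case where the instance is the left operand, and `__rsub__` handles the case where it is the right operand.

```python
def _as_number(x):
    """Unwrap a Meters instance to its plain numeric value."""
    return x.value if isinstance(x, Meters) else x

class Meters:
    def __init__(self, value):
        self.value = value

    def __sub__(self, other):
        # Called for `self - other`; self is the left operand.
        return Meters(self.value - _as_number(other))

    def __rsub__(self, other):
        # Called for `other - self` when `other` cannot handle the
        # subtraction itself; self is the right operand.
        return Meters(_as_number(other) - self.value)

left = Meters(10) - 3    # dispatches to Meters.__sub__
right = 10 - Meters(3)   # int cannot subtract a Meters, so __rsub__ runs
```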

There's more to come, which I'll post in additional installments as work progresses on this project.

3 comments:

vivaladav said...

You should code a new Gnu/Linux kernel too... I can't wait for that!

Three cheers for Arc!!!

Unknown said...

Yes, let's approach every ambitious project with sarcasm and doubt, great way to build community.

In all seriousness, language transformation is really not very complicated once you have the parser done.

Unknown said...

Sounds like great features to me. I've been meaning to get into these transformers lately. Hopefully, with statement support will be there when I finally do. :)