[Oil-dev] Brace Expansion

Andy Chu andychup at gmail.com
Mon Mar 13 00:26:17 PDT 2017


As mentioned toward the end of this blog post, I'm working on the word
evaluation pipeline:

http://www.oilshell.org/blog/2017/03/09.html

Brace expansion had tests but was never implemented, and I just started on
it.  It's a work in progress, but here is a somewhat interesting commit:

https://github.com/oilshell/oil/commit/43581dec1a85b748698d6aa592e268187f003b43

I greatly expanded the tests:

http://www.oilshell.org/git-branch/master/43581dec/andy-home/spec/brace-expansion.html

I found that dash, mksh, zsh, and bash implement 0, 1, 2, and 3 kinds of
brace expansion, respectively (see the commit message).

-----

And I found at least one bug in bash, and cases where zsh and bash
disagree.  For example, on negative steps like:

echo {1..5..2} # 1 3 5
echo {1..5..-2}  # 1 3 5 in bash, 5 3 1 in zsh

Bash ignores the sign while zsh inverts the order.
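
To make the difference concrete, here is a minimal Python sketch of the two
behaviors for the example above.  The function name and the negative_step
switch are made up for illustration, and treating a zero step as 1 is an
assumption; this is not code from the commit.

```python
def expand_range(start, end, step=1, negative_step="bash"):
    """Expand a numeric range like {start..end..step} into a list of ints.

    negative_step is a hypothetical switch: "bash" ignores the sign of the
    step, "zsh" produces the same numbers in reverse order.
    """
    if step == 0:
        step = 1  # assumption: a zero step behaves like a step of 1
    magnitude = abs(step)
    if start <= end:
        values = list(range(start, end + 1, magnitude))
    else:
        values = list(range(start, end - 1, -magnitude))
    if step < 0 and negative_step == "zsh":
        values.reverse()
    return values


print(expand_range(1, 5, 2))                        # [1, 3, 5]
print(expand_range(1, 5, -2, negative_step="bash"))  # [1, 3, 5]
print(expand_range(1, 5, -2, negative_step="zsh"))   # [5, 3, 1]
```

It only models the single case shown in this message, not every corner of
either shell.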

-----

Also, the very last test case shows that bash is the only shell that
implements brace expansion BEFORE variable/command/arith substitution:

http://www.oilshell.org/git-branch/master/43581dec/andy-home/spec/brace-expansion.test.html#L218

If you have a side effect like i++, it's done 3 times in bash, but once in
the other shells.
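
Here is a toy Python model of why the ordering matters.  It assumes a
hypothetical word like {a,b,c}$((i++)) rather than the exact spec test, and
all of the function names are invented for illustration.

```python
import itertools


def make_post_increment():
    """Return a function that mimics $((i++)): yield i, then bump it."""
    counter = itertools.count(0)
    return lambda: str(next(counter))


def braces_then_substitution(alternatives, substitute):
    # Brace expansion first: the word is copied once per alternative,
    # so the substitution (and its side effect) runs once per copy.
    return [alt + substitute() for alt in alternatives]


def substitution_then_braces(alternatives, substitute):
    # Substitution first: the side effect happens once, and every
    # alternative sees the same value.
    value = substitute()
    return [alt + value for alt in alternatives]


print(braces_then_substitution(["a", "b", "c"], make_post_increment()))
print(substitution_then_braces(["a", "b", "c"], make_post_increment()))
```

In the first ordering the increment runs three times; in the second it runs
once.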

-----

Since brace expansion isn't a POSIX feature, I guess I'm going with bash's
behavior, since it's the most popular shell.

However, I will be more strict about stuff like

$ echo {a,b}{
a{ b{

If braces are unbalanced, I think I will abort the whole thing, and it will
just print

{a,b}{

This is because I don't like the "punning" of "operator" braces and literal
braces.  Any of these will work, and are clearer:

$ echo {a,b}\{  # escaped
$ echo {a,b}'{'  # single quoted
$ echo {a,b}"{"  # double-quoted

It's possible this may break scripts, but the fix is easy, and the bonus is
that the documentation can actually be accurate.  You can't actually
document what bash does -- there are too many edge cases that are hard to
predict/describe, like {{a,b}.  Which one is a literal, which one is an
operator?  Or neither?
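
The strict rule could be checked with something like the Python sketch
below.  It is only an illustration: the function name is made up, the
quoting rules are simplified, and treating a stray closing brace as
"unbalanced" is an assumption, not something decided above.

```python
def has_balanced_braces(word):
    """Return True if every unquoted, unescaped brace in word is paired."""
    depth = 0
    quote = None  # the quote character we are inside, if any
    i = 0
    while i < len(word):
        ch = word[i]
        if quote:
            if ch == "\\" and quote == '"':
                i += 1  # skip the escaped character inside double quotes
            elif ch == quote:
                quote = None
        elif ch == "\\":
            i += 1  # skip the escaped character
        elif ch in "'\"":
            quote = ch
        elif ch == "{":
            depth += 1
        elif ch == "}":
            if depth == 0:
                return False  # assumption: a stray '}' counts as unbalanced
            depth -= 1
        i += 1
    return depth == 0


for word in ["{a,b}{", "{a,b}\\{", "{a,b}'{'", '{a,b}"{"', "{{a,b}"]:
    print(word, has_balanced_braces(word))
```

With this check, {a,b}{ and {{a,b} are rejected as a whole, while the three
escaped or quoted spellings pass.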

-----

Also, I think the algorithms I used are fairly clean -- I split it into
detection (converting a word into a tree of alternatives) and expansion
(one word to multiple words), whereas bash and mksh seem to do it all at
once.
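
To illustrate the split, here is a small self-contained Python sketch of the
two phases.  It is not the code in the commit: Lit and Alt are hypothetical
stand-ins for the real word parts, it works on plain strings, and it ignores
quoting, escapes, and numeric ranges.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Lit:
    text: str


@dataclass
class Alt:
    choices: list  # each choice is a list of Lit/Alt parts


def _parse_seq(word, i, in_alt):
    """Parse parts until end of word, or until ',' or '}' inside braces."""
    parts, buf = [], []
    while i < len(word):
        ch = word[i]
        if in_alt and ch in ",}":
            break
        if ch == "{":
            if buf:
                parts.append(Lit("".join(buf)))
                buf = []
            new_parts, i = _parse_alt(word, i + 1)
            parts.extend(new_parts)
            continue
        buf.append(ch)
        i += 1
    if buf:
        parts.append(Lit("".join(buf)))
    return parts, i


def _parse_alt(word, i):
    """Parse the choices after '{'; raise ValueError if '}' is missing."""
    choices = []
    while True:
        seq, i = _parse_seq(word, i, True)
        choices.append(seq)
        if i >= len(word):
            raise ValueError("unbalanced brace")
        if word[i] == "}":
            if len(choices) == 1:  # {a} has no comma, so keep it literal
                return [Lit("{")] + seq + [Lit("}")], i + 1
            return [Alt(choices)], i + 1
        i += 1  # skip ','


def detect(word):
    """Detection: convert a word into a tree of alternatives."""
    try:
        parts, _ = _parse_seq(word, 0, False)
    except ValueError:
        return [Lit(word)]  # strict rule: abort and keep the word as-is
    return parts


def expand(parts):
    """Expansion: turn one tree of parts into a list of words."""
    per_part = []
    for part in parts:
        if isinstance(part, Lit):
            per_part.append([part.text])
        else:
            per_part.append([w for c in part.choices for w in expand(c)])
    return ["".join(combo) for combo in product(*per_part)]


print(expand(detect("x{a,{b,c}}y")))  # ['xay', 'xby', 'xcy']
print(expand(detect("{a,b}{")))       # ['{a,b}{']
```

Note that expand() never looks at the original string again; it only walks
the tree that detect() built.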

ASDL continues to pay off -- I like how the AltPart and NumRangePart
variants fit into the existing schema.  (See the diff of osh.asdl.)

If anyone is interested I can explain the WordPart representation a little
more.  It's a little different than existing shells but I think very clean.

The next step will be actually integrating these two algorithms into the
code.  I guess that BraceDetect will go into osh/cmd_parse.py -- so that
the expanded tree is preserved in the LST.  We will need to convert brace
expansion to Oil.

BraceExpand could actually be a "compile stage" step in an optimized
interpreter/compiler, because it's INDEPENDENT of any input like
variables.  For the Python prototype, I'll just put it in core/cmd_exec.py.

I'll update this thread with more details as I implement them.  Hopefully
this is giving people a sense of how I'm doing things.  Feel free to ask
questions if things don't make sense.

Andy