[Oil-dev] Brace Expansion
Andy Chu
andychup at gmail.com
Mon Mar 13 00:26:17 PDT 2017
As mentioned toward the end of http://www.oilshell.org/blog/2017/03/09.html,
I'm working on the word evaluation pipeline.
Brace expansion had tests but was never implemented, and I've just started
on it. It's a work in progress, but here is a somewhat interesting commit:
https://github.com/oilshell/oil/commit/43581dec1a85b748698d6aa592e268187f003b43
I greatly expanded the tests:
http://www.oilshell.org/git-branch/master/43581dec/andy-home/spec/brace-expansion.html
I found that dash, mksh, zsh, and bash implement 0, 1, 2, and 3 kinds of
brace expansion, respectively (see the commit message).
-----
I also found at least one bug in bash, and cases where zsh and bash
disagree, for example on negative steps:
echo {1..5..2} # 1 3 5
echo {1..5..-2} # 1 3 5 in bash, 5 3 1 in zsh
Bash ignores the sign while zsh inverts the order.
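To make the two behaviors concrete, here is a minimal Python sketch. It is not the code from the commit; the function name expand_range and the negative_step parameter are made up for illustration, and it only models the two behaviors described above ("ignore" for bash, "reverse" for zsh).

```python
def expand_range(start, end, step=1, negative_step="ignore"):
    """Expand {start..end..step} into a list of strings.

    negative_step="ignore"  models the bash behavior described above.
    negative_step="reverse" models the zsh behavior described above.
    Both names are hypothetical, chosen only for this sketch.
    """
    if step == 0:
        # Rejecting a zero step is a choice made for this sketch only.
        raise ValueError("step must be nonzero in this sketch")
    magnitude = abs(step)
    if start <= end:
        values = list(range(start, end + 1, magnitude))
    else:
        values = list(range(start, end - 1, -magnitude))
    if step < 0 and negative_step == "reverse":
        values.reverse()
    return [str(v) for v in values]


print(" ".join(expand_range(1, 5, 2)))              # {1..5..2}
print(" ".join(expand_range(1, 5, -2, "ignore")))   # {1..5..-2}, sign ignored
print(" ".join(expand_range(1, 5, -2, "reverse")))  # {1..5..-2}, order inverted
```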
-----
Also, the very last test case shows that bash is the only shell that
implements brace expansion BEFORE variable/command/arith substitution:
http://www.oilshell.org/git-branch/master/43581dec/andy-home/spec/brace-expansion.test.html#L218
If you have a side effect like i++, it's done 3 times in bash, but once in
the other shells.
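The difference in ordering can be simulated in a few lines of Python. This is only a model of the two orderings, assuming a word shaped like {a,b,c} followed by a post-increment of i; the Counter class and both function names are hypothetical and are not taken from the test file.

```python
class Counter:
    """Stands in for a shell variable i with a side-effecting $((i++))."""

    def __init__(self):
        self.i = 0

    def post_increment(self):
        old = self.i
        self.i += 1
        return str(old)


def brace_first(alternatives, substitute):
    # Braces are expanded first, so each resulting word runs the
    # substitution again (the ordering attributed to bash above).
    return [alt + substitute() for alt in alternatives]


def substitution_first(alternatives, substitute):
    # The substitution runs once, and its result is shared by every
    # word produced by the braces (the ordering of the other shells).
    value = substitute()
    return [alt + value for alt in alternatives]


c1 = Counter()
print(brace_first(["a", "b", "c"], c1.post_increment), "i =", c1.i)
c2 = Counter()
print(substitution_first(["a", "b", "c"], c2.post_increment), "i =", c2.i)
```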
-----
Since brace expansion isn't a POSIX feature, I guess I'm going with bash
since it's the most popular shell.
However, I will be more strict about stuff like
$ echo {a,b}{
a{ b{
If braces are unbalanced, I think I will abort the whole thing, and it will
just print
{a,b}{
This is because I don't like the "punning" of "operator" braces and literal
braces. Any of these will work, and are clearer:
$ echo {a,b}\{ # escaped
$ echo {a,b}'{' # single quoted
$ echo {a,b}"{" # double-quoted
It's possible this may break scripts but the fix is easy... and the bonus
is that the documentation can actually be accurate. You can't actually
document what bash does -- there are too many edge cases that are hard to
predict/describe, like {{a,b} . Which one is a literal, which one is an
operator? Or neither?
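One way to picture the stricter rule is a balance check that runs before any expansion is attempted. The sketch below is an assumption about how such a check could look, not the logic in the commit; braces_balanced is a hypothetical name, and the quoting rules are simplified to backslash escapes, single quotes, and double quotes.

```python
def braces_balanced(word):
    """Return True if every unquoted, unescaped brace in word is paired."""
    depth = 0
    quote = None  # None, "'" or '"'
    i = 0
    while i < len(word):
        ch = word[i]
        if quote == "'":
            # Inside single quotes everything is literal until the closing quote.
            if ch == "'":
                quote = None
        elif quote == '"':
            if ch == "\\":
                i += 1  # skip the escaped character
            elif ch == '"':
                quote = None
        elif ch == "\\":
            i += 1  # an escaped brace is a literal, not an operator
        elif ch in ("'", '"'):
            quote = ch
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                return False
        i += 1
    return depth == 0


for w in ["{a,b}{", r"{a,b}\{", "{a,b}'{'", '{a,b}"{"', "{{a,b}"]:
    print(w, braces_balanced(w))
```

Under this rule a word that fails the check would be left alone and printed literally, while the escaped and quoted spellings pass the check and expand as usual.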
-----
Also, I think the algorithms I used are fairly clean -- I split it into
detection (converting a word into a tree of alternatives) and expansion
(one word to multiple words), whereas bash and mksh seem to do it all at
once.
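As a rough illustration of the expansion half, here is a Python sketch that takes an already-detected tree and produces the words. The Alt class and expand_parts function are hypothetical stand-ins invented for this example, not the types or functions in the commit, and the detection step (parsing the word into the tree) is left out.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Union


@dataclass
class Alt:
    # Each alternative is itself a list of parts, so alternatives can nest.
    alternatives: List[List["Part"]]


# A part is either a literal string or a set of alternatives.
Part = Union[str, Alt]


def expand_parts(parts):
    """Expand one word (a list of parts) into the list of words it denotes."""
    choices = []  # choices[i] holds every string that part i can become
    for part in parts:
        if isinstance(part, Alt):
            expanded = []
            for alternative in part.alternatives:
                expanded.extend(expand_parts(alternative))
            choices.append(expanded)
        else:
            choices.append([part])
    # The cartesian product varies the rightmost choice fastest.
    return ["".join(combo) for combo in product(*choices)]


# Hand-built tree for x{a,b}{1,2}y
flat = ["x", Alt([["a"], ["b"]]), Alt([["1"], ["2"]]), "y"]
print(expand_parts(flat))

# Hand-built tree for {a,b{c,d}}
nested = [Alt([["a"], ["b", Alt([["c"], ["d"]])]])]
print(expand_parts(nested))
```

Keeping the tree separate from the expansion means the expansion itself stays a small recursive function over data, with no parsing mixed in.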
ASDL continues to pay off -- I like how the AltPart and NumRangePart
variants fit into the existing schema. (See the diff of osh.asdl)
If anyone is interested I can explain the WordPart representation a little
more. It's a little different than existing shells but I think very clean.
The next step will be actually integrating these two algorithms into the
code. I guess that BraceDetect will go into osh/cmd_parse.py -- so that
the expanded tree is preserved in the LST. We will need to convert brace
expansion to Oil.
BraceExpand could actually be a "compile stage" step in an optimized
interpreter/compiler, because it's INDEPENDENT of any input like
variables. For the Python prototype, I'll just put it in core/cmd_exec.py.
I'll update this thread with more details as I implement them. Hopefully
this is giving people a sense of how I'm doing things. Feel free to ask
questions if things don't make sense.
Andy