This is the mail archive of the cygwin@sourceware.cygnus.com mailing list for the Cygwin project. See the Cygwin home page for more information.

Re: AW: how to use emacs in -batch mode from bash?



[added cygwin@sourceware.cygnus.com]

On Tue, 16 Feb 1999 17:42:20 +0000, "Dr Francis J. Wright" <F.J.Wright@qmw.ac.uk> said:
>OK.  Putting the pieces together, this works and appears to do what you
>want:
>
>bash-2.02$ hi=HO; emacs -batch --eval "(message \\\"$hi\\\")"
>HO
>
>But that leaves the question: why does it work?
>
>bash-2.02$ set -x
>bash-2.02$ emacs -batch --eval "(message \\\"hi\\\")"
>+ emacs -batch --eval '(message \"hi\")'
>hi
>
>Hence, this is equivalent to my previous suggestion after variable
>interpolation.  But I agree with you, Mike, that so many \s should not
>be necessary.
>
>Could it be that NTEmacs is parsing its command line based on an
>assumption that is wrong when the shell is bash?  It's probably using
>libraries that assume the shell is COMMAND or CMD, which have different
>quoting rules.  Hence, when using bash it is necessary to quote in a way
>that makes no sense from a UNIX/bash perspective.

That's pretty much right on the nose (except that command.com/cmd.exe
don't really have quoting rules; they are too dumb for that).

This is the old "Microsoft vs Cygnus" quoting rules problem, but in
reverse this time.

The basic problem is that Windows applications normally rely on the C
library startup code to construct the argv[] array (list of command line
arguments) by parsing the command line.  (On DOS/Windows, the command
line is passed as a single string and it is entirely up to the
application how it interprets that string.  On Unix, applications
receive a list of argument strings exactly as provided by the parent.
The C libraries for Windows compilers provide startup code to
reconstruct the list of argument strings to emulate the Unix
environment.)
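
As a rough illustration of what such startup code has to do, here is a
minimal Python sketch.  It assumes only the simplest rule (whitespace
separates arguments, double quotes group text into one argument) and
deliberately ignores embedded quotes.  The name split_basic is made up
for this example and is not taken from any real C library.

```python
def split_basic(cmdline):
    # Split one command-line string into a list of arguments.
    # Whitespace separates arguments; double quotes group text
    # (including whitespace) into a single argument.
    # Embedded quotes are NOT handled in this sketch.
    args = []
    current = []
    in_quotes = False
    have_arg = False  # lets an empty "" still count as an argument
    for ch in cmdline:
        if ch == '"':
            in_quotes = not in_quotes
            have_arg = True
        elif ch in " \t" and not in_quotes:
            if have_arg:
                args.append("".join(current))
                current = []
                have_arg = False
        else:
            current.append(ch)
            have_arg = True
    if have_arg:
        args.append("".join(current))
    return args


print(split_basic('emacs -batch --eval "(message 42)"'))
# ['emacs', '-batch', '--eval', '(message 42)']
```

On Unix no such step exists: the child simply receives the list.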

This technique of the startup code parsing the command line to construct
the argument list is perfectly reasonable, but Cygnus put a fly in the
ointment by using slightly incompatible rules for parsing the command
line.  The basic rule is the same for both: arguments are separated by
white space (which is discarded), so quotes must be put around arguments
that are intended to contain white space.  The rules diverge when
handling the case where a quote character itself appears in an argument
(an embedded quote), and must be escaped so it isn't misconstrued as the
end of the argument.
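
The two ways of writing an embedded quote can be sketched as follows.
This is a simplified model, not the actual library code: the function
names are hypothetical, the backslash form stands in for the Microsoft
convention, and the doubled form stands in for the Cygnus convention as
described in the PS of this message.  Real implementations also have to
deal with backslashes that precede a quote, which is ignored here.

```python
def quote_backslash(arg):
    # Backslash convention: an embedded quote is written as \"
    # (simplified: other backslashes are left untouched).
    return '"' + arg.replace('"', '\\"') + '"'


def quote_doubled(arg):
    # Doubling convention: an embedded quote is written as ""
    return '"' + arg.replace('"', '""') + '"'


arg = '(message "hi")'
print(quote_backslash(arg))  # "(message \"hi\")"
print(quote_doubled(arg))    # "(message ""hi"")"
```

For arguments without embedded quotes the two functions produce the
same string, which is why the incompatibility usually goes unnoticed.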

Now Emacs was made aware of the two quoting rules back in 19.34.6 days,
to solve the problem of constructing the command line for subprocesses
started from Emacs, so that the subprocess will "see" the list of
arguments that Emacs intends even if there are embedded quotes.  (Aside:
At the same time, I added some magic so that Emacs would detect
automatically which rules to use by looking at the application
executable, specifically to check whether it imports cygwin.dll.  That
has worked well, except that the magic broke with newer releases of the
Cygnus library when the dll name changed.  The next version of Emacs
will have better magic which works with all releases of the cygwin
library, and will hopefully continue to work with any future releases.)
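
For illustration only, a very crude stand-in for that kind of check is
sketched below in Python.  It does not read the executable's import
table and is not the method Emacs uses; it merely scans the file's bytes
for a DLL name beginning with "cygwin".  The pattern and the function
names are assumptions made for this example.

```python
import re

# Assumed pattern: matches names such as cygwin.dll or cygwin1.dll.
CYGWIN_DLL = re.compile(rb"cygwin[0-9a-z]*\.dll", re.IGNORECASE)


def data_mentions_cygwin_dll(data):
    # True if the raw bytes contain something that looks like a
    # cygwin DLL name.  A heuristic only: the name could appear in
    # the file for reasons other than an import.
    return CYGWIN_DLL.search(data) is not None


def file_mentions_cygwin_dll(path):
    # Apply the same heuristic to an executable on disk.
    with open(path, "rb") as f:
        return data_mentions_cygwin_dll(f.read())
```

Matching on a name prefix rather than one exact file name is what keeps
a check like this from breaking when the DLL is renamed.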

However, we are now seeing the same problem occurring, this time on the
Cygnus side.  The Cygnus port of bash applies the normal shell quoting
rules to parse the command line typed by the user (or entered in the
shell script), to construct the list of arguments to pass to Emacs.
But when bash invokes spawn() or exec() or some similar library
function to actually invoke Emacs, it has to flatten the argument list
into a single string.  Clearly, the library function that does that is
assuming the subprocess will use the Cygnus quoting rules to reconstruct
the list of arguments.  That fails when an argument contains an embedded
quote and the application doesn't use the Cygnus rules, which is the
situation here.
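
The mismatch can be reproduced with a small Python model.  It is a
sketch under stated assumptions, not the real cygwin or Microsoft code:
flatten_doubled is a hypothetical stand-in for the flattening done on
the Cygnus side (it doubles embedded quotes and, for simplicity, quotes
every argument), and parse_backslash is a simplified stand-in for a
startup parser that only understands backslash-escaped quotes.

```python
def flatten_doubled(argv):
    # Join arguments into one string, writing each embedded quote as "".
    return " ".join('"' + a.replace('"', '""') + '"' for a in argv)


def parse_backslash(cmdline):
    # Rebuild the argument list.  A backslash followed by a quote gives
    # a literal quote; any other quote just toggles quoting, so a
    # doubled quote has no special meaning.  Other backslashes are kept.
    args, current = [], []
    in_quotes = False
    have_arg = False
    i = 0
    while i < len(cmdline):
        ch = cmdline[i]
        if ch == "\\" and cmdline[i + 1:i + 2] == '"':
            current.append('"')
            have_arg = True
            i += 2
            continue
        if ch == '"':
            in_quotes = not in_quotes
            have_arg = True
        elif ch in " \t" and not in_quotes:
            if have_arg:
                args.append("".join(current))
                current, have_arg = [], False
        else:
            current.append(ch)
            have_arg = True
        i += 1
    if have_arg:
        args.append("".join(current))
    return args


# What bash hands over for: --eval "(message \"hi\")"
intended = ['emacs', '-batch', '--eval', '(message "hi")']
print(parse_backslash(flatten_doubled(intended))[-1])    # (message hi)

# What bash hands over for: --eval "(message \\\"hi\\\")"
workaround = ['emacs', '-batch', '--eval', '(message \\"hi\\")']
print(parse_backslash(flatten_doubled(workaround))[-1])  # (message "hi")
```

In this model the embedded quotes are lost in the first case, while the
extra backslashes of the workaround happen to survive the round trip,
which is consistent with the transcript quoted at the top.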

Note that this is a problem with bash that applies when it invokes any
application not compiled with the cygwin library, not just Emacs.

I see two possible solutions to this general problem:

 1. Change the cygwin spawn/exec/whatever library functions to use the
    Microsoft rules for escaping embedded quotes when running non-cygwin
    applications (I believe they already detect when they are spawning
    non-cygwin applications; if not, the method Emacs uses could be
    reused for this).

 2. Change the cygwin quoting rules to match the Microsoft ones.  This
    would apply to spawn/exec and the startup code, and would cause some
    breakage when mixing with applications compiled with old versions of
    cygwin.

Since cygwin-compiled applications tend to be recompiled when new
releases of the library come out, option (2) might actually be viable,
and would be the preferred solution since it would maximise the
interoperability between applications.  But even option (1) would be a
major improvement.

AndrewI

PS. There is a certain amount of irony in all this: the Microsoft
startup code looks like it was intended to support escaping embedded
quotes by doubling them (as Cygnus does), but the parsing code contains
a bug which prevents this from working.  If not for this bug, the
problem with bash invoking non-cygwin applications wouldn't arise.

