Discussion:
problems making a ccall to a varargs routine (v1.37.20)
Eric Mandel
2017-09-06 19:33:01 UTC
In v1.37.20, I don't seem to be able to use Module.ccall to invoke a
varargs routine, even if I specify the exact number of arguments for the
given invocation. Should this work?

Consider a cvals() routine that just prints out the double precision
varargs until the stop marker is found (or until we have gone too far):

#include <stdarg.h>
#include <stdio.h>

int nmax = 4;

int cvals(double a, ...){
    int i=0;
    double b;
    va_list args;
    // declared value
    va_start(args, a);
    fprintf(stdout, "in cvals: [declared: %f]\n", a);
    while( 1 ){
        // get next double precision value
        b = va_arg(args, double);
        // stop if we reached the end marker or go beyond max
        if( b < 0 ){
            fprintf(stdout, " found end marker\n");
            break;
        } else if( i > nmax ){
            fprintf(stdout, " went past max args (BAD): %d\n", nmax);
            break;
        } else {
            fprintf(stdout, " vararg %d: %f\n", i, b);
            i++;
        }
    }
    va_end(args);
    return i;
}


The expected result when calling this directly in C, e.g.:

cvals(100.0, 1.01, 2.02, 3.03, -1.0)


is:

in cvals: [declared: 100.000000]
vararg 0: 1.010000
vararg 1: 2.020000
vararg 2: 3.030000
found end marker

Using Module.ccall with a specific number of args gives a bogus result:

Module.ccall("cvals", "null", ["number", "number", "number", "number",
"number"], [100.0, 1.01, 2.02, 3.03, -1.0])
in cvals: [declared: 100.000000]
vararg 0: 0.000000
vararg 1: 0.000000
vararg 2: 0.000000
vararg 3: 0.000000
vararg 4: 0.000000
went past max args (BAD): 4


I see from a previous post (Binding varargs function, 9/15/13) that cwrap
does not (yet) support varargs. But should ccall with a specific number of
args work? If not, are there any suggested work-arounds, short of re-coding
the C varargs routine?

Thanks!

Eric
--
You received this message because you are subscribed to the Google Groups "emscripten-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to emscripten-discuss+***@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Alon Zakai
2017-09-07 01:11:17 UTC
Yeah, ccall doesn't currently support C varargs methods. Under the hood,
the ABI we use is to pass a pointer to the location of the arguments in
memory, so the function actually has 1 argument (that pointer). ccall isn't
aware of this, so it'll just pass the first argument you provide it as that
pointer, so nothing works.

As a workaround, you can write C wrapper functions for various fixed
numbers of arguments, something like that.
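
A minimal sketch of that workaround, assuming the variadic routine stops at a negative end marker as cvals() does in the original post. The wrapper names cvals_1, cvals_2 and cvals_3 are hypothetical. Each one has an ordinary fixed signature, so ccall can pass its arguments directly, and the wrapper appends the end marker itself. The cvals() below is a trimmed copy (no nmax guard) so that the block compiles on its own.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Trimmed copy of the variadic routine: prints doubles until a
   negative end marker is found and returns how many were seen. */
int cvals(double a, ...) {
    va_list args;
    int count = 0;

    va_start(args, a);
    printf("in cvals: [declared: %f]\n", a);
    for (;;) {
        double b = va_arg(args, double);
        if (b < 0) {
            break;                      /* end marker reached */
        }
        printf(" vararg %d: %f\n", count, b);
        count++;
    }
    va_end(args);
    return count;
}

/* Hypothetical fixed-arity wrappers. Values must be non-negative,
   otherwise cvals() would mistake them for the end marker. */
int cvals_1(double a, double v0) {
    assert(v0 >= 0);
    return cvals(a, v0, -1.0);
}

int cvals_2(double a, double v0, double v1) {
    assert(v0 >= 0 && v1 >= 0);
    return cvals(a, v0, v1, -1.0);
}

int cvals_3(double a, double v0, double v1, double v2) {
    assert(v0 >= 0 && v1 >= 0 && v2 >= 0);
    return cvals(a, v0, v1, v2, -1.0);
}
```

From JavaScript one would then call, for example, Module.ccall("cvals_3", "number", ["number", "number", "number", "number"], [100.0, 1.01, 2.02, 3.03]), assuming the wrappers are exported from the build (for instance through EXPORTED_FUNCTIONS). This only scales to a small, known set of argument counts.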

If someone wants to try, btw, it should be possible to add varargs support
to ccall. Basically if we tell ccall the target is varargs (we'd need to
add a way to do that) then it should allocate some stack space, write the
arguments, and call the method with a pointer to those arguments.
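
The "pointer to the arguments" idea can be modelled in plain C. This is only an illustration of the description above, not Emscripten's generated code: the names pack_doubles and count_until_marker are hypothetical, a byte buffer stands in for the stack space, and only double arguments are handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Caller side of the model: write the variadic values into a buffer.
   Returns the number of bytes written, or 0 if the buffer is too small. */
size_t pack_doubles(unsigned char *buf, size_t buflen,
                    const double *vals, size_t n) {
    assert(buf != NULL && vals != NULL);
    if (n * sizeof(double) > buflen) {
        return 0;
    }
    memcpy(buf, vals, n * sizeof(double));
    return n * sizeof(double);
}

/* Callee side of the model: one declared argument plus a single
   pointer to the packed values, read until a negative end marker. */
int count_until_marker(double declared, const unsigned char *packed) {
    int count = 0;

    (void)declared;                     /* unused in this sketch */
    for (;;) {
        double b;
        memcpy(&b, packed + (size_t)count * sizeof b, sizeof b);
        if (b < 0) {
            break;                      /* end marker reached */
        }
        count++;
    }
    return count;
}
```

Seen this way, passing five separate numbers through ccall cannot work: the callee reads its values through the pointer, and no such buffer was ever written.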
Jukka Jylänki
2017-09-07 07:56:48 UTC
Marked down https://github.com/kripken/emscripten/issues/5563 to
remember this for later. If you want to try to implement it yourself,
or a workaround, check out this snippet:
https://github.com/juj/emscripten/blob/multithreading/src/library_html5.js#L209,
which calls a vararg function from JS side.
Eric Mandel
2017-09-07 13:28:50 UTC
Thanks for creating an issue. I may try to pick at a solution in my
so-called free time, in which case we can continue this on GH/5563.

Perhaps a varargs-styled syntax extension like this would be natural:

Module.ccall("cvals", "null", ["number", "..."], [100.0, 1.01, 2.02, 3.03,
-1.0])


where "..." is only valid in the last "type position", and varargs types
after the explicitly typed arg(s) must be either "string" or "number",
determined by the implicit type of the arg itself. Though it looks like
this might cause complications in cwrap ...

I might be misunderstanding your code snippet: it looks more like optional
args (known number, but not all of them necessarily present) instead of
varargs (unknown number of args determined either by a terminating marker
or with the number of args itself passed in the call). Our varargs routines
are used to specify (for example) an annulus with an arbitrary number (tens
or even hundreds) of successive radii. If I've got your example wrong,
please let me know.

Thanks again,

Eric
Jukka Jylänki
2017-09-08 14:19:00 UTC
The code there is calling a varargs function, but only with a fixed
number of arguments. The same method applies though for calling an
arbitrary number of args, one will then use a for loop to populate the
parameters.

I think we'd want to avoid using an ellipsis string "..." as an
identifier, but preferably use some other kind of method to identify,
perhaps just the presence of a secondary array would denote varargs.
One thing that is important though is that there will need to be a
field that identifies the signature of the varargs, because it needs
to be possible to call both integer and float signatures. Non-default
C conversions can also have integers and floats of different size, so
we'll need to have a way to be forward compatible to those as well,
even if we did not implement them right away.
Eric Mandel
2017-09-08 14:51:42 UTC
I’ve written some exploratory code, using a cheap implementation of printf as the compiled C routine, just to see what is involved. Packing the varargs arguments into stack space is ugly ... perhaps that is what your code does ... but once it's done:

Module.ccall("miniprintf", "null", ["string", "..."], ["%s %f %s %f\n", "foo", 1.234, "goo", 2])
foo 1.23399999999999 goo 2


... although the ellipsis problem rears its ugly head immediately:

Module.ccall("miniprintf", "null", ["string", "..."], ["%s %f %s %d\n", "foo", 1.234, "goo", 2])
foo 1.23399999999999 goo 0

So, yes, we will need to integrate the signature into the varargs identifier. I think you are suggesting something like this:

Module.ccall("miniprintf", "null", ["string", "[f,s,i]"], ["%s %f %s %d\n", "foo", 1.234, "goo", 2])

which looks promising.
Eric Mandel
2017-09-08 20:06:33 UTC
I attach a working replacement of ccall (renamed to ccall_js, to avoid
Google attachment issues) that uses a varargs specification string of the
form "[d,i,s, ...]" to process varargs. So you can do this:

Module.ccall("miniprintf", "null", ["string", "[d,i,s,i,s,d]"], ["%f %d %s %d %s %f\n", 1.234, 2, "foo", 3, "goo", -100.100])
1.23399999999999 2 foo 3 goo -100.09999999999999


(miniprintf also attached in calljs.c)

The varargs spec repeats if there are extra arguments, which would be our
typical astrophysics case with hundreds of vararg doubles making annuli:

Module.ccall("miniprintf", "null", ["string", "[d]"], ["%f %f %f %f\n", 1.234, 2, 3.14, -100])
1.23399999999999 2 3.14 -100


but it repeats as a whole, so you can do this:

Module.ccall("miniprintf", "null", ["string", "[d i s]"], ["%f %d %s %f %d %s\n", 1.234, 2, "foo", 3.14, -100, "goo"])
1.23399999999999 2 foo 3.14 -100 goo
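
The repeat-as-a-whole rule can be sketched in C as follows. This is not the attached ccall_js code, which is not reproduced in the thread; parse_vararg_spec and vararg_type_at are hypothetical names, and the sketch assumes the spec is a bracketed list of single-letter types separated by commas or spaces.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Collects the type letters from a spec such as "[d,i,s]" or "[d i s]".
   Returns the number of letters found, or 0 if they do not fit. */
size_t parse_vararg_spec(const char *spec, char *types, size_t max_types) {
    assert(spec != NULL && types != NULL);
    size_t n = 0;

    for (const char *p = spec; *p != '\0'; p++) {
        if (isalpha((unsigned char)*p)) {
            if (n == max_types) {
                return 0;               /* too many letters for the buffer */
            }
            types[n++] = *p;
        }
    }
    return n;
}

/* Type of the variadic argument at arg_index: the spec repeats as a
   whole group once its letters are used up. */
char vararg_type_at(const char *types, size_t ntypes, size_t arg_index) {
    assert(ntypes > 0);
    return types[arg_index % ntypes];
}
```

With the spec "[d i s]" and six variadic arguments, indices 0 to 5 map to d, i, s, d, i, s, which matches the last example above.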


You will see that I speak Javascript with a heavy C accent, so I am not
suggesting it as a PR. Just let me know whether you want to pursue this
angle ...

Thanks,

Eric
Jukka Jylänki
2017-09-11 13:54:28 UTC
That looks very promising. I think we'd probably want to flag the
presence of varargs with a separate function signature altogether,
such as Module.ccall_vararg(), to avoid any overhead to non-vararg
function calls (Module.cwrap() and Module.ccall() can be on a very
performance-sensitive path).

In several different places in the Emscripten toolchain, there already
exist these kinds of "signature strings", where each character denotes
the type of the return value or of one parameter. See e.g.
https://github.com/kripken/emscripten/blob/master/src/library_gl.js#L1024.
For example, a signature string "vii" would be a function taking two
32-bit integers (or pointers), and returning void. I think the same
signature string style could be used here, except that the first
character would not denote a return value, and the signature string
would only specify the varargs portion of the parameters; the
non-varargs portion is not needed.

Perhaps regex/globbing style * and + characters could be used to
denote a variable number trail. E.g. a string "iiif*" would denote
that the varargs part would have three 32-bit integers first, and
after that, 0-N single-precision floats. A string "ffd+" would say two
single-precision floats, followed by 1-N double-precision floats. This
would avoid requiring one to craft strings that have as many
f's or d's as there are parameters in the input array. This way one
could be explicit about whether "iii" means exactly three, or 2-N, or
3-N, or 4-N.
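
A rough sketch of how such a trailing * or + might be expanded for a given argument count. Nothing here exists in Emscripten: expand_vararg_sig is a hypothetical name, and the sketch assumes the repeat character is only recognized as the last character of the string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Expands sig into one type character per argument, written to out as
   a NUL-terminated string. Returns 0 on success, -1 if sig cannot
   describe nargs arguments or out is too small. */
int expand_vararg_sig(const char *sig, size_t nargs, char *out, size_t outlen) {
    assert(sig != NULL && out != NULL);

    size_t len = strlen(sig);
    if (outlen < nargs + 1) {
        return -1;                      /* no room for result plus NUL */
    }

    char last = len > 0 ? sig[len - 1] : '\0';
    if (last == '*' || last == '+') {
        if (len < 2) {
            return -1;                  /* repeat marker with no type before it */
        }
        size_t fixed = len - 2;         /* types before the repeated one */
        size_t min_args = fixed + (last == '+' ? 1 : 0);
        if (nargs < min_args) {
            return -1;
        }
        memcpy(out, sig, fixed);
        memset(out + fixed, sig[len - 2], nargs - fixed);
    } else {
        if (nargs != len) {
            return -1;                  /* exact signature: counts must match */
        }
        memcpy(out, sig, len);
    }
    out[nargs] = '\0';
    return 0;
}
```

Under these assumptions "iiif*" with five arguments expands to "iiiff", while "ffd+" with only two arguments is rejected because at least one double is required.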

If you're interested in championing this further, it would be best to
continue in a GitHub PR with work towards tests and patches.
Eric Mandel
2017-09-11 16:01:22 UTC
Great, I'll try to clear some time in the Autumn to think about this
seriously -- this first offering was just a throw-away to see if there was
any general agreement/interest. A separate function signature is fine. And,
in principle, I agree that regex/globbing would be a preferred method of
defining a number trail. But it could get complicated when trying to
support the important use case of a trailing array of structs, where
something like "iii(didis)+" would be needed. That looks like a
(potentially slow) mess to parse and process, but we'll see.

BTW, I was not aware that floats (or chars or shorts) were allowed in the
va_arg() macro. Floats are promoted to double (and chars and shorts to
int), leading gcc and clang to issue a compiler warning if you try to do
something like: va_arg(argp, float). There may have to be an
emscripten-specific judgement call on how to deal with that.
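
A small C illustration of that promotion rule, which applies to the variadic arguments of any function, not only library ones. The function name sum_promoted is made up for this example.

```c
#include <assert.h>
#include <stdarg.h>

/* Sums `count` variadic floating-point values. A caller may pass a
   float, but it reaches the callee as a double, so the values are
   always fetched with va_arg(ap, double). */
double sum_promoted(int count, ...) {
    assert(count >= 0);
    va_list ap;
    double total = 0.0;

    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(ap, double);    /* va_arg(ap, float) would be wrong */
    }
    va_end(ap);
    return total;
}
```

So a signature string that distinguishes float from double in the varargs portion would describe the same thing on the callee side for a C function, since both are read as double.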
Jukka Jylänki
2017-09-13 16:09:54 UTC
Post by Eric Mandel
> I agree that regex/globbing would be a preferred method of
> defining a number trail. But it could get complicated when trying to support
> the important use case of a trailing array of structs, where something like
> "iii(didis)+" would be needed. That looks like a (potentially slow) mess to
> parse and process, but we'll see.

By regex, I only meant to support the characters + and * with the same
meaning as regex syntax has, I don't mean to support arbitrary regex
style string expansion. So we would not need parentheses or anything
like that, just a simple "if last char is a + or *, the preceding type
is multiplied 1-N or 0-N times". If someone has interest in supporting
full regex expansion like that, feel free, though I'd argue that
should only be available for cwrap() and not at all for ccall().

Post by Eric Mandel
> BTW, I was not aware that floats (or chars or shorts) were allowed in the
> va_arg() macro. They are promoted to double, leading gcc and clang to
> issue a compiler warning if you try to do something like: va_arg(argp,
> float). There may have to be an emscripten-specific judgement call on how to
> deal with that.

Hmm, that might be the case. I was under the impression that the
"standard promotion" only applied to C standard library functions, and
that arbitrary custom functions could do anything they wanted, but
perhaps the standard promotion applies to all types. In any case, good
to reuse the same signature string style so that it'll be ready for
extending to future uses, if needed/possible.
Eric Mandel
2017-09-13 16:48:05 UTC
Post by Jukka Jylänki
> By regex, I only meant to support the characters + and * with the same
> meaning as regex syntax has, I don't mean to support arbitrary regex
> style string expansion. So we would not need parentheses or anything
> like that, just a simple "if last char is a + or *, the preceding type
> is multiplied 1-N or 0-N times". If someone has interest in supporting
> full regex expansion like that, feel free, though I'd argue that
> should only be available for cwrap() and not at all for ccall().

Right, I was just trying to point out that restricting a repeating pattern
to the last variable does not satisfy an important use case, in which a
*group* of variables repeats:

weighted_centroid(x1, y1, n1, x2, y2, n2, x3, y3, n3, ..., xn, yn, nn, -1, -1, -1);


where x,y positions are double and counts are int, and all three repeat as
a group until the end marker is found. My throw-away dealt with that using
a quick mod, so that I could prove to myself that our particular needs
could, in principle, be met without much processing overhead (we can call
our varargs routine thousands of times over a 2D image).

If I interpret our combined comments correctly, it's that we need a
*familiar* way to specify repeat groups, e.g., pseudo-regexp syntax, even
if full regexp will not be implemented.
Eric Mandel
2017-11-06 15:45:34 UTC
As promised back in September, I made a PR a few weeks ago that implements
a varargs version of ccall. Are there any other actions I need to take in
order to help move this along?