Arcueid 0.0.8 is now available. For those of you just joining us, Arcueid is a C implementation of Arc. This new version is mostly a bugfix release, with no really big new features:
* More garbage collection bug fixes. Finally managed to track down an irritating bug that was causing the gc to release memory that it shouldn't
* Some further bug fixes in some I/O functions such as sread and write to conform to Arc 3.1's behavior
* Hashing now works for port and socket objects
* Other small bug fixes too numerous to mention
As one might gather from the very low version number, Arcueid is still pre-alpha, and there are even more bugs that still aren't fixed. Most of the core functionality is working well enough: most of the bugs have to do with the I/O functions, which have not seen as much testing. There are in particular some bigger bugs in the networking code that have been causing crashes when attempting to run srv.arc. Well, I've not programmed sockets in C without the aid of an event library in a very long time, so I suppose now's as good a time as any to do so...
Might the community be better served by a more traditional mailing list or Usenet newsgroup? comp.lang.arc maybe, and have a FAQ and all that. Just wondering.
Well, I don't have a great dislike for overloading operators per se. I don't mind the fact that you can add together arguments of many different numeric types and get a reasonable result (I was particularly irked by the +. operator in OCaml for example). What I don't particularly like is that the + operator is used as a concatenation operator as well as an addition operator. The two operations are not at all the same, and using one operator to do both things seems more troublesome than not. I even remember that Paul Graham once wrote that he felt that using + in that way was not such a good idea after all for many of the same reasons, but unfortunately it seems that that opinion didn't quite gain traction in the most recent versions of Arc that have come down to us.
Well, I suppose I'll just have to suck it up, my personal opinions aside. I set out to make with Arcueid a C implementation of Arc, not my own personal Arc dialect. The most recent git head now accepts this behavior.
"Well, I suppose I'll just have to suck it up, my personal opinions aside. I set out to make with Arcueid a C implementation of Arc, not my own personal Arc dialect."
Once you start down the Arc path, forever will it dominate your destiny. :-p Really though, I'm glad to hear you're seeing your goal through before you veer off in some other direction.
For a while Rainbow has been the fastest Arc implementation, by my measure. Now Rainbow.js and Nu have come along and challenge that, and with Arcueid there's a C competitor in the race as well. :)
"Now Rainbow.js and Nu have come along and challenge that"
While Nu is drastically faster in certain areas, like the + function, in general it's slower than Arc 3.1 (by my estimates, about 5-10%). Unfortunately any Arc implementation built on Racket will have a hard time beating Arc 3.1's speed, simply because Arc 3.1 already pushes Racket pretty far. To get faster, I think you'd either need to find some serious oversight in Arc 3.1, or use Racket's modules.
Oh, sorry. :) Because of your active work to make it fast, I figured it would challenge at least Arc 3.1, if not Rainbow, and I guess I got carried away. :-p
While you could probably trivially start from Arc 3.1 and streamline things here and there (like '+), you have a point about Nu adding other features.
That's right, and because the compiler is only a (small but important) part of Arc, I can only do so much. Arubic, however, can potentially be a lot faster than Arc 3.1, because it makes many more changes to the language. But that's a different topic.
Anyways, the only real reason Nu would be slower than Arc 3.1 is because it wraps every global Arc variable in a function. This is so useful that I consider it worth the 5-10% performance hit.
"Well, I suppose I'll just have to suck it up, my personal opinions aside. I set out to make with Arcueid a C implementation of Arc, not my own personal Arc dialect."
I too struggled with this. I wanted to change ar significantly in ways that I considered better. Then when I started the Nu project, it too made many incompatible changes to Arc 3.1. I still believe these ideas are good and improve Arc 3.1, but they're incompatible nonetheless.
But now my opinion is that any new language based on Arc should be cleanly separated. So I've started my work on Arubic (my own language based on Arc) as a separate project from Nu. To be more specific, it's one language implemented with Nu, the other one being Arc 3.1.
So now the Nu compiler should be very compatible with Arc 3.1, and any incompatible changes (like Arubic) will be implemented as something akin to a module or library. I think, ideally, an Arc implementation should be reasonably compatible with Arc 3.1, but still allow you to easily change it into a different language if you wish.
"What I don't particularly like is that the + operator is used as a concatenation operator as well as an addition operator. The two operations are not at all the same, and using one operator to do both things seems more troublesome than not."
Concatenation has an identity element, and it's associative, making it a monoid operation. I just think of '+ as a polymorphic monoid operation. When the monoid is a group, unary '- comes in to be an inverse operator. When the group is a field, '* comes in to be a multiplication operator, with unary '/ as its inverse.
So I don't have any mathematical objection to using '+ for concatenation, and I actually think it fits quite well. I've even thought about using '+ for function composition, since that's another monoid. (However, recently I've been more attracted to the idea of managing each monoid theory as a separate entity, rather than sniffing the arguments to figure out which operation to use. This is made easy by Haskell type classes, which essentially do the sniffing at compile time based on the static type system.)
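To make the "polymorphic monoid operation" reading concrete, here is a minimal Python sketch (not Arc, and not how any Arc implementation actually does it). The `MONOIDS` table and the `plus` function are hypothetical names made up for illustration: each entry pairs a type check with an identity element and an associative operation, and `plus` sniffs the first argument to decide which monoid to fold with.

```python
from functools import reduce

# Hypothetical table: (types accepted, identity element, associative operation).
MONOIDS = [
    ((int, float, complex), 0, lambda a, b: a + b),  # numeric addition
    (str, "", lambda a, b: a + b),                   # string concatenation
    (list, [], lambda a, b: a + b),                  # list concatenation
]

def plus(*args):
    """Pick a monoid by sniffing the first argument, then fold with it."""
    if not args:
        return 0  # assumption: no arguments falls back to the numeric identity
    for types, identity, op in MONOIDS:
        if isinstance(args[0], types):
            # Starting the fold from the identity is what makes one-argument
            # calls well defined for every registered monoid.
            return reduce(op, args, identity)
    raise TypeError(f"no monoid registered for {type(args[0]).__name__}")
```

Something like `plus(1, 2, 3)` and `plus("foo", "bar")` then go through the same code path. The sniffing on `args[0]` is exactly the part that a type-class approach would resolve statically instead.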
After a couple weeks of sometimes frustrating debugging, Arcueid 0.0.7 is finally released. For those of you just joining us, Arcueid is a C implementation of Arc. As the very low version number implies, it is still very much in a pre-alpha state and there's a lot of work to be done on it before it gets to being usable enough to be a replacement for the standard MzScheme/PLT Scheme/Racket-based runtime.
New in this release:
* Directory and file system calls
* Threading and synchronization
* Networking
* Major bug fixes in the garbage collector
The functions provided I think now cover most all of the functionality provided by Arc 3's ac.scm, although there are still a number of bugs in certain areas that prevent it from running news.arc properly. Threading is provided by a very basic green thread scheduler, and atomic-invoke is implemented by means of a global communication channel.
What I have done with Arcueid is to forcibly remove ownership of the atomic lock from any thread that is killed in that way. The code you have linked there has no problems on Arcueid for that reason. I tend to think of kill-thread as like doing a kill -9, and a thread that gets it will terminate immediately, and just as with kill -9 whoever does it gets to clean up the rest of the mess afterwards. All guarantees about atomicity get thrown out the window. If you wanted to preserve atomicity then you should have used break-thread or some other more pansy-sounding function to attempt to terminate it. :) Thus, break-thread would have the use case of running a computation for some time and then aborting it. The use case for kill-thread, I think, should be an attempt to stop some runaway computation immediately. If a thread is blocked inside an atomic you might want to stop it no matter what.
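As a rough illustration of the policy described above, here is a small Python sketch. It is not Arcueid's C code: `GreenThread`, `AtomicLock`, and `kill_thread` are hypothetical names, and the single lock merely stands in for whatever guards atomic-invoke. The only point is that killing a thread strips its ownership of the lock, so other threads are not left blocked forever.

```python
class GreenThread:
    def __init__(self, name):
        self.name = name
        self.alive = True

class AtomicLock:
    """One global lock standing in for whatever guards atomic-invoke."""
    def __init__(self):
        self.owner = None

    def acquire(self, thread):
        if self.owner is not None and self.owner is not thread:
            return False          # held by someone else; caller would have to wait
        self.owner = thread
        return True

    def release(self, thread):
        if self.owner is thread:
            self.owner = None

def kill_thread(thread, lock):
    """Terminate immediately, kill -9 style: no cleanup, atomicity not preserved."""
    thread.alive = False
    if lock.owner is thread:
        lock.owner = None         # forcibly remove ownership from the dead thread
```

A gentler break-thread would presumably go through `release` at a safe point instead of clearing `owner` behind the thread's back.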
I actually made use of this unorthodox method, which looks like it is courting fandango on core, in Arcueid's implementation of threading. I needed to find a way to interrupt execution of C code when it would have blocked on I/O, and make it resume at the same point after Arcueid's thread scheduler determines I/O can proceed. I've not figured out any other ways of doing that. Apparently setcontext and getcontext don't do this.
Changes in this version include:
- Optimization of compose, complement, and andf in a functional position
- Math functions, everything Arc3 has available plus quite a bit more (every math function that C99 defines, including trigonometry, hyperbolic functions, etc.). Most of them work for complex arguments just as well.
- Basic I/O functions (read, write, disp, etc.) cleaned up and implemented. Seems that call-w/stdin and similar had to be implemented like protect, which is annoying, but that had to be how it was done to get them to function in the face of continuations and exceptions.
"Seems that call-w/stdin and similar had to be implemented like protect, which is annoying, but that had to be how it was done to get them to function in the face of continuations and exceptions."
I, for one, appreciate that you're doing things the way they have to be done, lol.
Any plans to generalize 'call-w/stdout and 'call-w/stdin to 'parameterize? Now that I take a look at http://docs.racket-lang.org/reference/parameters.html, there's one more complication I hope is already on your mind: Threads.
I don't tend to care much about threads myself, but since they're on your roadmap, hopefully you have a good plan for them and the way they interact with 'call-w/std{out,in}. XD
If I understand the notion of parameters correctly from your and Pauan's mentions of them (I'd not heard of them before now), they are essentially a way to do dynamic binding in a language like Arc that normally uses static binding. This is in large part exactly what call-w/stdin does with stdin: it sets up a dynamic binding for that function. Apparently, in my attempts at implementing this functionality I've also independently kludged up a special-purpose version of what Racket calls a continuation mark, and obviously doing such a thing bothers me to no end.
Now that I think about it, implementing parameterize might actually not be that difficult, and Pauan's implicit parameters might actually be easier than explicit parameters that have to be applied in order to obtain their values. It would also get rid of the special-purpose "continuation mark" I created to support call-w/std(out|in) and replace it with a more general-purpose structure capable of storing other dynamic bindings as well.
Well, indeed threads are on my mind, but I will keep things simple for now, and make them green threads whose scheduling is controlled entirely by Arcueid's runtime. I had for a time considered using real POSIX or other OS-level threads to be able to take advantage of multiple cores but soon realized that this would introduce quite a bit of complication. Using real threads affects just about every aspect of the implementation. For instance, I am at the moment using an incremental garbage collection algorithm that ought to be amenable to multi-threaded operation in theory but in order to really use it in a multi-threaded context I'd also have to have a good multi-threaded memory allocator, and by the time I'd had a look at all the literature on such algorithms I realized that I was in way over my head.
Green threads simplify matters considerably. This means that call-w/std(out|in) and the more general notion of parameters can be handled without too much trouble. In the plan I have for Arcueid's green threads, a thread is basically a structure that contains everything that the virtual machine needs to run, including all continuations. The only thing directly shared by all threads is the global environment, and then I'd also have to make available a flattened version of the structures I used to store the call-w/std(out|in) bindings from the thread's creator, and the more general dynamic bindings created by parameterize as well.
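The plan above can be sketched in Python with generators standing in for the virtual machine state. This is only an illustration under stated assumptions, not Arcueid's design in C: `GreenThread`, `Scheduler`, `spawn`, and the `"stdout"` binding name are all hypothetical. Each thread carries its own execution state plus a flattened copy of its creator's dynamic bindings, and only `globals` is shared directly.

```python
from collections import deque

class GreenThread:
    def __init__(self, body, bindings):
        # Flattened snapshot of the creator's dynamic bindings at spawn time.
        self.bindings = dict(bindings)
        # The generator holds this thread's private execution state.
        self.gen = body(self)

class Scheduler:
    def __init__(self):
        self.globals = {}          # the one thing every thread shares directly
        self.run_queue = deque()

    def spawn(self, body, creator_bindings):
        t = GreenThread(body, creator_bindings)
        self.run_queue.append(t)
        return t

    def run(self):
        while self.run_queue:
            t = self.run_queue.popleft()
            try:
                next(t.gen)                # run until the thread yields
                self.run_queue.append(t)   # round-robin: back of the queue
            except StopIteration:
                pass                       # thread finished

sched = Scheduler()
sched.globals["log"] = []

def worker(name):
    def body(thread):
        for i in range(2):
            sched.globals["log"].append((name, thread.bindings["stdout"], i))
            yield                          # voluntary switch point
    return body

parent_bindings = {"stdout": "console"}
sched.spawn(worker("a"), parent_bindings)
parent_bindings["stdout"] = "file"         # rebinding after spawn leaves thread a alone
sched.spawn(worker("b"), parent_bindings)
sched.run()
```

Because the bindings are copied at spawn time, thread `a` keeps seeing `"console"` even after the parent rebinds `"stdout"`, while both threads append to the same shared log.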
And no, while Arcueid's main goal is to produce a version of Arc compatible with at least Paul Graham's Arc3.1, I am of course not above introducing improvements and extensions, provided that they do not also break compatibility. I'd like to be able to at least run news.arc unmodified before I release version 1.0.0. :)
"Apparently, in my attempts at implementing this functionality I've also independently kludged up a special-purpose version of what Racket calls a continuation mark, and obviously doing such a thing bothers me to no end."
Why? Are you worried your runtime will be exactly like Racket but less mature? When I looked at your call-w/std(out|in) commits, I liked your approach exactly because I noticed it was in the same vein as continuation marks. :-p
As you've noticed with complex numbers, Arc exposes lots of accidental complexity that it inherits from Racket. In fact, speaking of accidents, Arc 3.1 without modification exposes pretty much all of Racket: http://arclanguage.org/item?id=11838
When it comes to threads and parameters, I'd say Arc pretty much specifies nothing and leaves it up to Racket to provide the meaning and implementation. Arc literally defines 'call-w/stdin and 'call-w/stdout in terms of Racket's 'parameterize. If Arcueid doesn't end up with (internal) functionality equivalent to thread cells and continuation marks, there's a good chance it'll have certain corner-case inconsistencies with Arc, even if there aren't enough inconsistencies to break the programs we actually care about.
But even so, I wouldn't worry about it too much. I personally consider Arc to have shoddy support for threads (just exposing a tiny subset of Racket's functionality and imposing a GIL) and also for reentrant continuations (not defining 'dynamic-wind, implementing loops with mutation), so I don't really blame an Arc implementation for being incompatible in these areas. In some cases, full compatibility might be more harmful than not trying!
---
"I had for a time considered using real POSIX or other OS-level threads to be able to take advantage of multiple cores but soon realized that this would introduce quite a bit of complication. Using real threads affects just about every aspect of the implementation."
If you want to give an Arc program power to take advantage of those, but you're having trouble with multithreaded allocation, an alternate path might be to have the Arc namespace and most data structures be local to an OS thread but then to have other tools to write and read manually-managed shared memory. I dunno, maybe that's not very inspiring. :-p
---
"The only thing directly shared by all threads is the global environment, and then I'd also have to make available a flattened version of the structures I used to store the call-w/std(out|in) bindings from the thread's creator, and the more general dynamic bindings created by parameterize as well."
Er, local scopes and first-class data structures might need to be shared too, right?
(let foo (list nil nil)
  (for n 1 10
    (thread (push n foo.0) (push n foo.1)))
  (def get-foo ()
    foo))
"Why? Are you worried your runtime will be exactly like Racket but less mature?"
Not in the slightest. It just bothered me that I had to embed a special-purpose data structure inside Arcueid's continuations just to support one language feature. Now that I see that there is a natural generalization to this feature, that makes me feel a lot better. :)
'protect is implemented with 'dynamic-wind, so the only functionality we lose is the ability to specify a pre-thunk. Are there any areas where that would be useful?
Dynamic-wind gives us most of the ability to implement parameters ourselves. We just mutate a box upon every entry and exit of the expression. Unfortunately, it might take some crazy trampolining to get the last expression of (parameterize ...) in tail position. I'm not even sure if tail position is possible....
I think the last missing piece is thread-friendliness. In the face of threads, we'd need the box to be thread-local like Racket's parameters. But my point here is just that the pre-thunk is useful for something. ^_^
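Here is a minimal Python sketch of that idea, with loud caveats. Python has no re-entrant continuations, so `dynamic_wind` below is an escape-only stand-in where the pre-thunk runs once on entry and the post-thunk runs on every exit. `Param`, `parameterize`, and `dynamic_wind` are hypothetical names, and `threading.local` is just one way to get a per-thread box.

```python
import threading

def dynamic_wind(pre, body, post):
    """Escape-only stand-in for dynamic-wind: run pre on entry, post on any exit."""
    pre()
    try:
        return body()
    finally:
        post()

class Param:
    """A box holding the current value, with one box per thread."""
    def __init__(self, default):
        self._default = default
        self._local = threading.local()

    def get(self):
        return getattr(self._local, "value", self._default)

    def _set(self, value):
        self._local.value = value

def parameterize(param, value, body):
    saved = []
    def pre():
        saved.append(param.get())   # remember the outer value on entry
        param._set(value)           # mutate the box
    def post():
        param._set(saved.pop())     # restore the outer value on exit
    return dynamic_wind(pre, body, post)
```

Note that `body()` sits inside `try`/`finally`, so it is not in tail position, which is the same problem mentioned above. Also, a new thread here starts from the default value rather than inheriting its creator's current value, so this is weaker than what Racket's parameters provide.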
"[...] they are essentially a way to do dynamic binding in a language like Arc that normally uses static binding. This is in large part exactly what call-w/stdin does with stdin: it sets up a dynamic binding for that function."
That is correct. In fact, in Arc 3.1, std{in,out,err} are Racket parameters[1], and call-w/std{in,out} use Racket's parameterize. My point was merely that it is useful to provide parameters to Arc programmers so they can define their own parameters beyond just stdin/stdout/stderr.
* [1]: That's why you need to use (stdin), (stdout), and (stderr) rather than stdin, stdout, and stderr.
---
"And no, while Arcueid's main goal is to produce a version of Arc compatible with at least Paul Graham's Arc3.1, I am of course not above introducing improvements and extensions, provided that they do not also break compatibility."
Glad to hear it. I would just like to note that any changes whatsoever will break compatibility. For instance, if you provide a "parameterize" form, a library written in Arc might also define a "parameterize" global, etc. My feeling on such things is that there should be a social convention for specifying implementation-specific global variables.
Something like, "if a global variable starts with % it is implementation-defined, so portable Arc libraries shouldn't use or define global variables starting with %".
Then your implementation could provide "%parameterize" to Arc and there would be no problems, because Arc libraries aren't supposed to use variables starting with %, so there's no conflict.
This should be solely a social convention, not enforced by the compiler. I may want to write an Arc library that does use/define implementation-specific globals, while understanding that such a library won't be portable and may break in the future.
"Any plans to generalize 'call-w/stdout and 'call-w/stdin to 'parameterize?"
As a side note to this, I think it would be very preferable to have a `parameterize` form which `call-w/stdin` and `call-w/stdout` would call. It should behave similarly to Racket's parameterize.
This isn't necessary for an implementation of Arc 3.1, but it's very useful in practice: you could provide a way for users to create their own parameters and then call parameterize on them. This is what ar and Nu do, and it's incredibly convenient, especially when you provide a way to make the parameters implicit[1].
It really does depend on your goals, though. Do you intend for this to be just an implementation of Arc 3.1 and nothing more? Or do you intend to provide convenient features that Arc 3.1 doesn't have? Your work on numerical functions seems to suggest that you're not entirely against extending your Arc runtime to do things that Arc 3.1 doesn't.
---
* [1]: By "implicit parameters" I mean parameters that you don't have to call to extract their value. In other words, you can just say `stdin` rather than `(stdin)` for instance.
Thanks for that... The Scheme version has even more parentheses and was much harder to understand. I'm starting to see how the algorithm works, and I'll see if this comes out cleaner than the kludge I came up with and incorporated into Arcueid 0.0.5. What I plan to do is restore the very simple continuation invocation it used to have, then wrap it up that way. Exceptions are of course simple enough to implement by using ccc, and implementing them on top of the ccc that supports dynamic-wind should provide us with exceptions that support dynamic-wind as well.
By the way, I haven't seen the orig-cc:fn idiom before. So even a special form like fn works with ssyntax. So I suppose it would not do to just expand it into ((compose orig-cc fn) ...), and we have to actually make it a real function composition.
"So I suppose it would not do to just expand it into ((compose orig-cc fn) ...), and we have to actually make it a real function composition."
Not so. If you look at line 29 in ac.scm you'll see this:
; the next three clauses could be removed without changing semantics
; ... except that they work for macros (so prob should do this for
; every elt of s, not just the car)
((eq? (xcar (xcar s)) 'compose) (ac (decompose (cdar s) (cdr s)) env))
((eq? (xcar (xcar s)) 'complement)
 (ac (list 'no (cons (cadar s) (cdr s))) env))
((eq? (xcar (xcar s)) 'andf) (ac-andf s env))
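For those not familiar with the Arc compiler, what it's doing is basically these transformations:
((compose a b) x y) -> (a (b x y))
((complement a) x y) -> (no (a x y))
((andf a b) x y) -> (and (a x y) (b x y))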
If you wish for compose, complement, and andf to work on macros and special forms like fn, your compiler will need to do a similar transformation. The catch is that this transformation only works in functional position:
(map no:do (list 1 2 3 nil nil)) ;; doesn't work
It's all very hacky and whatnot; macros aren't very clean at all in Arc. The other catch is that it hardcodes the symbols 'compose, 'complement, and 'andf, but my Nu compiler fixes that.
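For anyone who wants to play with the rewrite outside of Scheme, here is a rough Python sketch that models s-expressions as nested lists with strings for symbols. `rewrite_call` and this `decompose` are hypothetical helpers written for illustration, not a port of ac.scm. In particular the andf case simply duplicates the argument expressions, which a real compiler would want to avoid by binding them once.

```python
def decompose(fns, args):
    """((compose f g h) a b) -> (f (g (h a b))); assumes fns is non-empty."""
    if len(fns) == 1:
        return [fns[0]] + args
    return [fns[0], decompose(fns[1:], args)]

def rewrite_call(expr):
    """Rewrite only when the head of the call is itself a
    compose/complement/andf form, i.e. in functional position."""
    if not (isinstance(expr, list) and expr
            and isinstance(expr[0], list) and expr[0]):
        return expr
    head, args = expr[0], expr[1:]
    if head[0] == "compose" and len(head) > 1:
        return decompose(head[1:], args)
    if head[0] == "complement" and len(head) == 2:
        return ["no", [head[1]] + args]
    if head[0] == "andf" and len(head) > 1:
        # Simplification: args are repeated for each function.
        return ["and"] + [[f] + args for f in head[1:]]
    return expr
```

With this, `[["compose", "no", "do"], 1, 2]` becomes `["no", ["do", 1, 2]]`, so the macro `do` ends up at the head of an ordinary call. By contrast `["map", ["compose", "no", "do"], "xs"]` comes back unchanged, because the compose form is an argument there rather than the head.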
Whew, lots of new features and fixes in this release! Highlights include:
- on-err, err, and details implemented
- ccc (call/cc) implemented (actually proved ridiculously simple, at least until protect was implemented anyway)
- protect implemented (it's astounding to see how much implementing this function complicated the implementation of ccc and exception handling)
- major bug fix: evaluation order is now left to right. Previous versions stupidly did right to left evaluation. We didn't do a lot of work with side effects before which is why it went unnoticed for so long.
I just have to wonder if there's a better way of implementing protect than the rather kludgy way that I wound up doing it.
"I just have to wonder if there's a better way of implementing protect than the rather kludgy way that I wound up doing it."
Based on a quick skim of https://github.com/dido/arcueid/commit/65a252a87fd817ec33f21..., it looks like you're doing it in a similar way as Rainbow, lol. You're collecting protect handlers on your way down and then pushing them all back onto the stack in a particular order.
I think a more natural way to do this might be to stop at the first protect handler, then enact instructions that accomplish "call this handler, pop the frame of (or otherwise exit) the protect body, then make the same continuation call again." In fact, I wonder why you and Conan Dalton didn't do this to begin with. :-p
Just to explore this a bit, to help both of us understand... if this approach were extended to dynamic-wind, if you encountered a dynamic-wind form on your way up the stack, you might stop there and enact instructions of the form "call this handler, push the frame of (or otherwise enter) the dynamic-wind body, then make the same continuation call again." Does this make sense? Part of my concern is to have clear semantics for what happens if a continuation call exits or enters a handler block.
Meanwhile, an alternate (but not necessarily better) way to do it is to define a core language without 'protect and then wrap that core in a standard library that hides the original version of 'ccc and exposes a version that consults a global stack of 'protect handlers. This Ruby library does that: https://github.com/mame/dynamicwind.
Another day, another release. Unit testing against arc.arc has uncovered more serious bugs and missing features in the code, and now we have version 0.0.3. Get it here:
Or clone the tag from the git repository if you prefer.
New in version 0.0.3:
- A lot of bug fixes, most notable is the fact that 'nil and nil were not the same in previous versions.
- Readline support in the REPL should be a bit cleaner
- Some feeble attempts at tail call optimization
- Some compatibility fixes for reference Arc, mainly in the behavior of the type function, e.g. (type 1+1i) => num, where Arcueid used to say (type 1+1i) => complex. The previous behavior (which I think might be more useful) is available in the atype function.
- Tracing support now disabled by default. Can be enabled with a command line switch.