The reason you want commonly used operators to be short is so that most of the information you see on your screen is the content of your program. It's the same point Tufte makes about data-ink ratio in graphs. Conversely, long names and required boilerplate are like the stuff he calls "chartjunk."
You see the same thing in math notation. Do you think mathematicians use an integral symbol instead of writing out the word "integral" because it makes them feel clever? I think it's because the resulting bloat would make math expressions harder to read, not easier.
I would agree that conciseness is a good reason to create a special symbol when it outweighs the cost of obfuscation. I would only suggest measuring the gain and weighing the two, instead of going with terseness by default.
For example, most equations have relatively few symbols, and writing out "integral" might typically double that number. If it could be shown that "mac" vs. "macro" or "car" vs. "first" halves your character count, then I think that would make a good case for their use. But if it only reduces the character count by a few percent or less, then AFAICS it's not a reasonable tradeoff, and I can only guess that readability isn't the underlying motivation.
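As a rough sketch of what "measuring it" could look like, here is a small Arc helper that computes the fraction of characters saved by the shorter spelling. The name savings and the sample expressions are made up for illustration. Comparing two whole source files the same way would give a much smaller figure than comparing a single call.

```arc
; Hypothetical helper: fraction of characters saved by writing
; `short` instead of `long` (both are strings of source text).
(def savings (short long)
  (/ (- (len long) (len short))
     (* 1.0 (len long))))

(savings "(car xs)" "(first xs)")   ; 8 vs. 10 characters, so 0.2
```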
Initially I tried to make operator/variable names descriptive, but as time went on I became tired of typing every descriptive name again and again so, obviously, I tried to establish a balance between readability and convenience... However, the more I code the more I notice the names become shorter, and I find it takes effort to come up with meaningful names. That balance is going to be different from person to person, but I will hazard a guess that the more you code the shorter the names become.
As an example (not that the function works):
(def fill (table-name with)
  (filltable table-name with))
(def fillcup (with size)
  (let cup size
    (fill cup with)))
(fillcup "coffee" 8oz*)
The above is ideally how I would choose names, but then I find myself trying to incorporate the data type into the name...
(def fillcup (with-string size-table)
  (let cup size-table
    (fill cup with-string)))
which I think is just as bad as:
(def fillcup (s tb)
  (let cup tb
    (fill cup s)))
This last one is bad as 'tb' may tell it's a table, but what is it conceptually? A size table...
So here are my options:
* the idealized version isn't descriptive enough
* if all my code were fully descriptive I would be worse off reading and typing everything.
* the short form is just as bad as both the idealized version and the descriptive version.
So it doesn't surprise me that people go with the short form, the lesser of evils, perhaps?
The worst part is that because all the options suck I end up doing things like:
(def fillcup (with-str size-tb)
  (let cup size-tb
    (fill cup with-str)))
By that point I hate my code and should have just stuck with the idealized or the short version.
I personally wish there were a means to quickly identify the data-type without having to incorporate it into the name, something like a hover tag. More often than not, this would let me afford meaningful names without being overly descriptive.
(I currently use textmate to write code and the terminal for the repl).
Just an opinion from a hobby programmer with only a few hundred hours of programming experience.
"The sticking point is compression-tolerance. As you write code through your career, especially if it's code spanning very different languages and problem domains, your tolerance for code compression increases. It's no different from the progression from reading children's books with giant text to increasingly complex novels with smaller text and bigger words."
If my understanding is vaguely correct, Yegge argues for compact code for the same reasons pg does.
"I personally wish there were a means to quickly identify the data-type"
Something that an editor for a dynamic language can give you is run-time analysis of your code - so, indeed, a mouse-over could in fact give you a list of the types assigned to any given symbol. This is something I'd love to get welder to do ... it's entirely possible, in theory :)
Alternatively, you could write your own function definition macro that lets you specify param types and checks them for you at runtime:
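A minimal sketch of such a macro, for illustration only: the name deft, the (name type) parameter syntax, and the describe-cup example are all made up here, and this is not necessarily what the poster had in mind. Each parameter is written as a pair of a name and the symbol that Arc's type function is expected to return for it, and the generated function checks every argument before running the body.

```arc
; Hypothetical macro: like def, but each parameter is written
; as (name type) and checked with isa when the function is called.
(mac deft (name parms . body)
  `(def ,name ,(map car parms)
     ,@(map (fn ((p ty))
              `(unless (isa ,p ',ty)
                 (err (string "expected " ',ty " for " ',p))))
            parms)
     ,@body))

(deft describe-cup ((contents string) (ounces int))
  (string ounces "oz of " contents))

(describe-cup "coffee" 8)    ; returns "8oz of coffee"
; (describe-cup 'coffee 8)   ; would signal an error: contents is a sym
```

The types stay out of the parameter names, at the cost of a check on every call.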
I enjoyed the first half - then he went on and on and on....
My goal is to try making code readable enough that documentation or 'metadata' is barely needed. I'm going to rework my code and start keeping my code idealized - not descriptive - not short.
I'm going to see how it works when I drop the data type tagging altogether, and hope that one day, when I am a little more knowledgeable, I will be able to craft a solution.
Maybe in a few years I'll let you know how it went :)
For now I'm going to leave Windows ports as something for other people to do downstream if they want to. It should be pretty easy, and I don't want to do it myself because I don't have access to a Windows machine or understand anything about the OS.
Currently, ensure-dir and date use system to run Unix commands. It would be much more portable if they used MzScheme operations instead. This has caused me trouble not only on Windows, but also on different versions of Linux. And I think make-temporary-file could replace /dev/urandom.
I changed date to get the date from Mzscheme, but it's not so easy to change ensure-dir. Mz's make-directory doesn't create intermediate directories like mkdir -p, and I don't want to get into trying to understand pathnames.
If anyone wants to take on being a distributor for a Windows port (or for different variants of Linux, for that matter), I do have a patch for date here: http://catdancer.github.com/date.html
Another option is the Anarki stable branch (http://github.com/nex3/arc/commits/stable/), which has most of the fixes necessary to make most of Arc work portably on Windows and other OSes. (And, being a bug-fix branch, the amount of other random material is limited.)
I agree with the option, but I also think these posts will fall off the deep end in about 2 months. By then new members may only discover the option after they've discovered the problem. So we're not really saving new members wasted efforts unless the install page guides people.
pg: Even if you'd rather not think about supporting Windows portability yourself, providing a link to the Anarki stable branch would help new Windows users. Finding this stuff on the forum after it's fallen off the top couple of pages is not very easy.
I have most of them already fixed myself, with the exception of pipe-from, but an average person (aka me) with Windows OS is going to spend time both trying to get Arc working and sifting through the forum, which isn't easily searchable (I had to sift through over 300 posts, with 80% of the content being way over my head, to determine pipe-from wasn't expected to be working for a specific reason.... had no clue what "dev/urandom" even was).
If I might suggest: if you're not going to fix these, that you start a known issues document to save people from the headaches and tail chasing. Even putting something on your install page noting that Windows isn't supported would be appreciated by newcomers.
[Edit] As an idea, maybe we could have one posting thread for OS-related issues that your install page links to?
The reason I didn't simply use Scheme's gensym is that it would have imported the concept of interning into Arc with it. That has always seemed rather a hack to me. It may turn out to be the best solution, but for now I want to take more time to think about it.
You could come up with a new "subtype" of symbols which is reserved for gensymed symbols -- then use some mapping of unique names for new such gensyms (say, a counter), and then use some new syntax like `__name' to evaluate to these things (assuming that you want code to be able to write such gensyms literally)...
But doing all of that is equivalent to just creating gensyms as done now, with a `__' prefix -- so if you really want that last property then nothing needs to be changed (perhaps only the `gs' prefix)...
It would be a mistake to take the atomic out of expansions of =.
Operators that modify things have to be atomic in the stretch between reading the value and writing the modified value.
Suppose x is initially 0 and you have two threads both evaluating (++ x). If the ++s aren't atomic, you could end up with this sequence:
thread 1 reads the current value of x, 0
thread 2 runs in its entirety, leaving x = 1
thread 1 resumes, setting x to 1 + the 0 read earlier, or 1
You could similarly have two threads evaluating pushes onto the same list that ended up losing one of the pushed values.
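The interleaving in the ++ example can be replayed by hand in a single thread, which makes the lost update easy to see. This is only a simulation of the sequence described above, not real threaded code, and the variable seen is made up to stand for the value thread 1 read before it was interrupted.

```arc
(= x 0)

(let seen x            ; thread 1 reads the current value of x, 0
  (= x (+ x 1))        ; thread 2 runs in its entirety, leaving x = 1
  (= x (+ seen 1)))    ; thread 1 resumes and writes 0 + 1

x   ; 1, although two increments ran
```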
I don't think you should expect to be able to throw control out of an atomic expression-- at least not short of some abort-as-disaster operator. That's the definition of atomicity: it all has to complete.
"It would be a mistake to take the atomic out of expansions of ="
Right, but it's also a mistake to leave them in. Too much locking is as bad as too little; e.g. (obj a (readfile "foo")) will hang my web server if reading foo happens to take a long time.
Here's my current chain of reasoning around locking...
Arc's approach to programming is exploratory, building larger programs out of small, composable parts.
Locking is a problem with exploratory programming because buggy locking code usually works most of the time, unlike bugs in functional code which are usually visible. With exploratory programming, you try things and see if they work. Sure, with functional code there are the corner cases that you miss and the occasional incorrect algorithm that happens to return the right value for the input you give it, but most of the time with exploratory programming you try things and you get to see that they don't work. But with locking, you throw together some locking code and try it out, and hey, your program runs and doesn't hang and gives the right answer. The bug, if there is one, only bites once in a blue moon when the different threads happen to hit the code in exactly the wrong way.
There's an interesting social aspect to this as well. I've noticed that if I tell someone about a bug in their code, it's less likely to get fixed if it's a threading bug. If their code returns the wrong value for an input, they say "oh my gosh!" and fix it right away :-). But if it's a threading bug, well, yes, it looks like a bug, but the program appears to run ok anyway, so there's little urgency, and how do we know if we've really fixed it or haven't added another threading problem?
The social aspect goes both ways. One of the things I find so delightful about Arc is that because of your work to write concisely, I can look at a function and say "oh, there's a bug". Or, at least that the function isn't doing what I want it to do. Which I can't do with most code, not as easily, because it is surrounded with so much cruft. But I don't have the same feeling of clarity when I look at Arc's locking code. I can read through the code and perhaps pick up on a locking bug or two (e.g. atomic-invoke), but overall, is everything locked that needs to be? Anything locked that shouldn't be? I can't tell. This part of Arc feels like regular software to me... complex enough so that I imagine there are probably bugs, and I don't expect to be able to get them all.
Composability with locking is a problem too. You have a couple of perfectly good expressions (obj a 1) and (readfile "foo") and you put them together and they break.
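One possible workaround, assuming (as described above) that the expansion of obj evaluates its value expressions while the atomic lock is held, is to do the slow read first and only then build the table. The helper name load-entry is made up for this sketch.

```arc
; Hypothetical helper: read the file before any table is built,
; so no lock is held while the I/O is in progress.
(def load-entry (path)
  (let contents (readfile path)
    ; only the already-computed value is stored here
    (obj a contents)))
```

This avoids the hang, but it also shows the composability problem: the two expressions can no longer simply be nested.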
My next thought in the chain is, so why use threading anyway? MzScheme only runs on one CPU, so what threading gives us is a) not having to call yield in a long CPU intensive calculation and b) having our program execution randomized for us so that our program doesn't return the same output for the same inputs... unless we very carefully add the right locking in the right places.
So my current inclination is to rewrite the web server to a single threaded event driven model.
Or maybe something like Erlang, where you're not stuck with a single thread, but you're also not trying to deal with sharing modifiable data between threads either.
Hmm, a different way of looking at the issue just occurred to me: pushing atomic inside of expansions of = may make Arc plus News shorter but Arc plus my program longer.
If ++ didn't do locking, then in places where you were using x from multiple threads you'd need to say (atomic (++ x)). Pushing atomic inside ++ makes this shorter because now you can just say (++ x). But pushing atomic into expansions of = also means that locking occurs at other times, and so I can't use it in places where I'm doing things like throw and readfile that break with locking, which makes my program longer.
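A small sketch of what the explicit style would look like, under the assumption being discussed here that ++ itself does no locking. The names bump-shared and bump-local are made up. The caller marks the one update that touches shared state and leaves everything else unlocked.

```arc
(= x 0)

; x is shared between threads, so the caller asks for atomicity.
(def bump-shared ()
  (atomic (++ x)))

; n is a private local, so no lock is wanted or taken.
(def bump-local (n)
  (++ n)
  n)
```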
Which leads to a fascinating idea, if we get a large enough body of open source Arc code that we can start optimizing for code size globally... :)
That would increase the conceptual load of programming in Arc a lot. It would make people have to think about the expansions of operators like ++ to know when to wrap things in atomic and when not to. You need to be able to treat built-in operators like that as black boxes. Once you start thinking about macroexpansions, it's as if you had to write them.
Hmm, well, I can only speak with any knowledge about my own conceptual load... I expect with your background (professor, Lisp book author, tutorial writer, mentor, etc.) you have a much better idea of what other programmers would find easy or difficult.
I know that some things should be atomic, such as accessing shared mutable data structures, and some things that I need to avoid being atomic, such as doing I/O.
I find Arc's making some operations atomic for me doesn't help me all that much, because without knowing the details of the macroexpansion, I don't know if everything that needs to be atomic has been made so. And I find it unhelpful in other cases, when I need an operation to not be atomic, and so I need to look at the macroexpansion of = to find out if that particular expansion is doing something that I need it to not do, or if it's ok.
On the other hand, I have no alternative to offer yet ^_^. I surmise that if I factored Arc + News + my code, perhaps I might come up with a useful suggestion to offer, and if I do, I'll certainly post about it!
Ok, this is now fixed. The Arc def of list now makes an explicit copy. Rtm has ambitions of one day making rest args ac-niltreed copies in ac.scm, whereupon we can return to the nice simple def of list.
(Yes, I realize I could just say (rem [is _ elt] seq). But since Arc basically uses tables as structs, I think this change would bite a lot of people the way it did me.)