Journal of my investigation into the development of a new programming language
Friday, January 21, 2005
Screenshot from Scatterplot
Wednesday, January 19, 2005
I have run across two representations of rectangles (actually Axis-Aligned Bounding Boxes, or AABBs). Mac OS represents them (or at least once did) as the four integers left, top, right, bottom; Windows represents them as the position point (generally if not always meaning the top left) and a two-dimensional Size structure. I think both of these are poorly factored representations.
Many of the things we do with rectangles have direct counterparts in a one-dimensional range. For example:
- check whether they contain points
- check for overlap with other rectangles
- expand them to encompass points
- intersect with other rectangles
- union with other rectangles
- test for emptiness
And having written these methods for range, if we represent a rectangle as a horizontal and vertical range, we can simply delegate most behavior to the ranges. Here are some examples.
. . low
. . length
. . high:
. . . . get: return low + length - 1
. . . . set: length = high - low + 1
. . empty: return length = 0
. . contains x:
. . . . x < low? . . return false
. . . . x > high?. . return false
. . . . return true
. . intersection that:
. . . . this.empty?. . return this
. . . . that.empty?. . return that
. . . . lo = max(low, that.low)
. . . . hi = min(high, that.high)
. . . . lo <= hi?
. . . . . . yes:. . return range(lo, hi - lo + 1)
. . . . . . no: . . return range(0, 0)
. . x
. . y
. . empty: return x.empty or y.empty
. . contains p:
. . . . return x.contains p.x and y.contains p.y
. . intersection that:
. . . . return rectangle(x.intersection that.x,
. . . . . . . . . . . . .y.intersection that.y)
I think that's pretty nice.
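For comparison, here is a minimal Python sketch of the same delegation idea. The class names `Range` and `Rectangle`, the use of dataclasses, and the two-argument `contains(px, py)` are choices made for this illustration, not part of the journal's language. The sketch also assumes that intersecting with an empty range yields an empty range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """A one-dimensional integer range stored as low + length."""
    low: int
    length: int

    @property
    def high(self) -> int:
        # Inclusive upper bound: low + length - 1.
        return self.low + self.length - 1

    @property
    def empty(self) -> bool:
        return self.length == 0

    def contains(self, x: int) -> bool:
        return self.low <= x <= self.high

    def intersection(self, that: "Range") -> "Range":
        lo = max(self.low, that.low)
        hi = min(self.high, that.high)
        # hi is inclusive, so lo == hi is still one element wide.
        return Range(lo, hi - lo + 1) if lo <= hi else Range(0, 0)


@dataclass(frozen=True)
class Rectangle:
    """An axis-aligned box: one horizontal Range and one vertical Range."""
    x: Range
    y: Range

    @property
    def empty(self) -> bool:
        return self.x.empty or self.y.empty

    def contains(self, px: int, py: int) -> bool:
        # Delegate each coordinate to its own range.
        return self.x.contains(px) and self.y.contains(py)

    def intersection(self, that: "Rectangle") -> "Rectangle":
        return Rectangle(self.x.intersection(that.x),
                         self.y.intersection(that.y))
```

The rectangle methods contain no coordinate arithmetic of their own; everything numeric lives in `Range`.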
Thursday, January 13, 2005
Testability of Procedures
I'm using the Pascal distinction of functions (which return a value) and procedures (which don't) here. Procedures in the C family are just functions with the void return type.
For zero button testing, I've come up with syntax that I'm very happy with: expect...returns. This allows tests to be written right in the function body. For instance:
. . expect 1 returns 2
. . return 2 * x
For test-driven development, this seems like a great way to go - you write the function name, then immediately the test(s). With zero button testing, the function name starts out yellow (it has no tests) then turns red as soon as the first test is written (and fails, because the function doesn't return anything yet). Then you write the function body, correctly, and it all turns its natural shade of white or perhaps gray. As you type.
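One rough way to approximate expect...returns in an existing language is a Python decorator that runs the inline test as soon as the function is defined. Everything here is hypothetical scaffolding for illustration: the names `expect`, `status`, and `double`, and the "yellow" / "red" / "ok" strings standing in for the editor colors.

```python
def expect(arg, returns):
    """Attach an inline test to a function and run it at definition time."""
    def decorate(func):
        results = getattr(func, "test_results", [])
        try:
            passed = func(arg) == returns
        except Exception:
            passed = False  # a crashing function counts as a failed test
        results.append(passed)
        func.test_results = results
        return func
    return decorate


def status(func):
    """'yellow' = no tests, 'red' = some test fails, 'ok' = all tests pass."""
    results = getattr(func, "test_results", [])
    if not results:
        return "yellow"
    return "ok" if all(results) else "red"


@expect(1, returns=2)
def double(x):
    return 2 * x


@expect(1, returns=2)
def not_written_yet(x):
    pass  # returns None, so the test fails


def untested(x):
    return x


print(status(untested), status(not_written_yet), status(double))
```

An editor would have to re-evaluate the definitions on each keystroke to get the as-you-type behavior; the sketch only shows how the three states could be derived.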
A little extension to this answers the immediate question of how to test member functions - in addition to expect, let's add archetype to our list of keywords. With that, we now have:
. . archetype none(0, 0)
. . archetype some(1, 2)
. . contains x:
. . . . expect none 1 returns false
. . . . expect some 1 returns true
. . . . x < low? return false
. . . . x > high? return false
. . . . return true
That is to say, we expect that when the empty range none is asked whether it contains 1, it should (and does) return false; when some is asked, it returns true. Some contains 1 and none doesn't.
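The archetype idea can be mimicked in Python with a table of named sample instances and a table of expectations. This is only a sketch: the `archetypes` and `expectations` structures and the `run_expectations` helper are invented for illustration, and `high` is assumed to be the inclusive bound low + length - 1.

```python
class Range:
    def __init__(self, low, length):
        self.low = low
        self.length = length

    @property
    def high(self):
        return self.low + self.length - 1

    def contains(self, x):
        return self.low <= x <= self.high


# Archetypes: named sample instances that tests can refer to.
archetypes = {
    "none": Range(0, 0),
    "some": Range(1, 2),
}

# Each entry mirrors `expect <archetype> <argument> returns <result>`.
expectations = [
    ("none", 1, False),
    ("some", 1, True),
]


def run_expectations(method_name):
    """Return (archetype, argument, passed) for each expectation."""
    report = []
    for name, arg, expected in expectations:
        actual = getattr(archetypes[name], method_name)(arg)
        report.append((name, arg, actual == expected))
    return report


print(run_expectations("contains"))
```

Because the archetypes are declared once per class, every method's tests can share them.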
That's all well and good - but when it comes to procedures, which don't return anything at all, I don't know how to express the tests so nicely. Invoking the routine and testing its effects are separate steps. So - a puzzle.
In short, expect...returns works well for functions, but not procedures.
One of the nice features of C# is properties; from the outside they behave like instance variables but internally they act more like functions. Each property has a getter and/or setter (named, reasonably enough, "get" and "set").
C#'s setters use the keyword "value" for the value being passed in. If I put properties into my language, I think I may instead use the name of the thing, as in this example, from the range class:
. . get: return low + length - 1
. . set: length = high - low + 1
Of course this is inconsistent - the meaning of high everywhere else in the program will be to invoke the getter or setter - but it seems very clear. Intention-revealing.
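Python's built-in `property` happens to allow the same naming, because the setter's parameter can be called anything. A small sketch, with a `Range` class written for this illustration:

```python
class Range:
    def __init__(self, low, length):
        self.low = low
        self.length = length

    @property
    def high(self):
        return self.low + self.length - 1

    @high.setter
    def high(self, high):
        # The incoming value is named after the property itself,
        # rather than a generic `value` as in C#.
        self.length = high - self.low + 1


r = Range(3, 4)
print(r.high)    # 6
r.high = 10
print(r.length)  # 8
```

Inside the setter, the bare name `high` is the new value and `self.high` would be the property, so the ambiguity the journal worries about does not arise in this version.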
I've been playing with C# and Python lately, working my way toward an editor for my language, and working on some of the language ideas as well. The picture below is a screenshot from my C# app, holding Python code. I like Python's indentation-based blocks, and plan to use the same approach in my own language.
I was focussing on the ability to highlight code that failed tests, as I did in my "zero button testing" prototypes, and realized that I wanted to have both syntax coloring (keywords in blue, method names in green, etc.) and correctness coloring, at the same time. In the ZBT prototypes, I used text forecolor to indicate correctness; for the new app, I want to reserve forecolor for syntax and use background shading to indicate correctness.
As I explored the ways to do background shading, it occurred to me that in addition to the very limited red-for-error, yellow-for-untested palette I was planning, it would also be possible to do a more nuanced coloring that showed nesting level. I like it; it makes it easy to focus attention on one particular block. Note that in addition to the line-by-line coloring, I'm also splitting individual lines at the colon (and, for my own language, at the question mark); this emphasizes the columnar nature of some sections of code; I think it might work well with SQL, which I've long preferred in a nice columnar format.
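The nesting-level computation behind this kind of shading can be sketched in a few lines of Python. This is not the C# app's code. The function names, the space-only indentation, the naive split at the first colon (which would misfire on colons inside strings or slices), and the "deeper is darker" gray scale are all assumptions made for the example.

```python
def nest_levels(source):
    """Return (depth, head, tail) for each non-blank line of indented code."""
    rows = []
    indents = [0]  # stack of indentation widths for the currently open blocks
    for line in source.splitlines():
        text = line.strip()
        if not text:
            continue
        width = len(line) - len(line.lstrip(" "))
        while width < indents[-1]:
            indents.pop()            # dedent: close blocks
        if width > indents[-1]:
            indents.append(width)    # indent: open a block
        head, colon, tail = text.partition(":")
        rows.append((len(indents) - 1, head + colon, tail.strip()))
    return rows


def shade(depth, base=255, step=12):
    """Map a depth to a gray background color; deeper blocks are darker."""
    level = max(base - step * depth, 0)
    return "#{0:02x}{0:02x}{0:02x}".format(level)


sample = "def f(x):\n    if x: return 1\n    return 2\n"
for depth, head, tail in nest_levels(sample):
    print(shade(depth), head, "|", tail)
```

Each row carries the depth used to pick the background, plus the two halves of the line so they can be laid out in columns.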
Example of nest shading
Wednesday, October 15, 2003
There are lots of ways to ask questions about ranges:
(1, 5) contains x?
x is_in (1, 5)
Today I was looking at some C code and saw this common pattern:
1 < x && x < 5
And, really, the natural way to represent this would be:
1 < x < 5
But that doesn't work in any language I'm familiar with, because their logic is more machine-focussed. In C, the expression groups as (1 < x) < 5, so the 0-or-1 result of the first comparison is what gets compared with 5. Or it just wouldn't be allowed.
I'd like a language that supported this syntax. It's inconsistent, looked at in conventional ways - because it's not generally meaningful:
1 < 3 > 4 == 2 >= 7
But we are inconsistent, and our natural languages reflect that. I think I'd like a programming language that did, too. There are - there must be - limits to this, and there are undoubtedly implementation problems. Problems, too, of ambiguity - too much idiosyncrasy heads in the direction of unreadable code. I'm interested in exploring those boundaries, and I think I'd probably put
1 < x < 3
1 < x <= 3
on the side of good.
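As a point of comparison, Python accepts this chained form: `a < b < c` is evaluated as `a < b and b < c`, with `b` evaluated only once. The helper names below are made up for the illustration; the second function spells out the C-style grouping to show how the two readings differ.

```python
def strictly_between(low, x, high):
    # Chained comparison: equivalent to (low < x) and (x < high).
    return low < x < high


def c_style_grouping(low, x, high):
    # What C does with the same text: compare the boolean result with high.
    return (low < x) < high


print(strictly_between(1, 3, 5))    # True
print(strictly_between(1, 10, 5))   # False
print(c_style_grouping(1, 10, 5))   # True, because True < 5
```

Python also accepts the odd-looking chain `1 < 3 > 4 == 2 >= 7`. It is simply the conjunction of each adjacent comparison, and is false here because 3 > 4 fails.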
Friday, April 11, 2003
Language Futures Link
Via Lambda the Ultimate, Paul Graham's The Hundred-Year Language says lots of interesting things. Like:
One way to design a language is to just write down the program you'd like to be able to write, regardless of whether there is a compiler that can translate it or hardware that can run it. When you do this you can assume unlimited resources. It seems like we ought to be able to imagine unlimited resources as well today as in a hundred years.
Maybe that's what I'm trying to do.
Tuesday, April 08, 2003
Guiding Principle - Clarity
Can there be emergent guiding principles? As I play with these ideas, I'm finding myself reacting to this feature or that one of one language or another, looking at that reaction, and trying to analyze what drives it.
So at some level, the primary basis for all of my decisions here is visceral. But what makes me react the way I do? I think there are genuine, consistent principles underlying my reactions, and I think I can discover them. Thus: emergent, guiding principles.
One thing I see in the examples I'm coming up with is brevity. In Adjectives, shorter code fragments were a compelling argument in favor of the addition. But brevity can make code harder to understand - I'm thinking here of Perl's regex features. And when it makes code harder to understand, I don't like it. So brevity isn't primary.
Well, then, how about clarity? Expression of intent? Yeah. How many ways are there to say this? Language is about communication. With people. Lots of programmers seem to think that programming language is about communicating with the computer, that as long as it works - or as long as it compiles - it's fine. I don't. I don't want the burden of unnecessary work of figuring out what I or some other programmer meant six months or six years - or six hours - ago. I don't want the added risk of introducing new bugs because I didn't understand what it was doing in the first place, or because I was confused about how to describe my changes.
A well-written program is easy to read; you look at it and know what it does. And if it's broken, or if it needs modification, you know how to go about changing it. So, naturally, a great programming language makes it easy to write programs that are easy to read.
So, clarity, or the facilitation of clarity, seems to be my first guiding principle. Stay tuned.
Monday, April 07, 2003
Part of the art of programming, in languages I've worked with at least, is balancing nouns and verbs. Before there were modules, data structures and objects, routines (verbs) had to bear all of the organizational burden of a program. Transferring some of that burden to data structures (nouns) can be a great simplification.
Human language, of course, has much more than nouns and verbs. And while I don't think human language represents a perfect model for computer languages, some of my favorite languages are English-like, so I'm willing to at least consider suggestions this analogy presents.
And I think a lot of the things I do in programming could be simpler if I had adjectives. The fields of a database record or a data structure, or the properties of objects, frequently serve the role that adjectives fill in speech. But asking simple questions about the properties can require more complexity than might be necessary.
Suppose a language had adjectives, binding a comparative test to terms for, at a minimum, the superlative and the comparative. I tend to think that an opposite, an anti-comparative and an anti-superlative are also appropriate.
Let's use width for our examples.
With adjectives                      Without
is r1 wider than r2?                 is r1.width > r2.width?
width of the widest of the_rects     max = VERY_NARROW
                                     foreach rect (the_rects)
                                         max = rect.width if rect.width > max
is any of the_rects wider than 3?    foreach rect (the_rects)
                                         succeed if rect.width > 3
Arguably, the two loops are a more accurate reflection of what's really going on in the computer - but for that matter, GOTO would be a more accurate reflection of what's going on in a loop. I think how we express things in English is a better reflection of how we think of them. So in terms of communicating intent, I think adjectives are an improvement.
SQL makes short work of questions like this; arguably it has better support for what I am calling adjectives than for nouns and verbs. But I'm after a language that falls more into the procedural family, I think, and should be useful outside of a database context.
Peeking at semantics for a moment, I think that the terms should be bound to a test, rather than a field as I've used in these examples. It would allow for more sophisticated comparisons.
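To make the test-bound idea concrete, here is one possible Python sketch in which an adjective wraps a two-argument comparison test and derives the comparative, superlative, and their opposites from it. The `Adjective` and `Rect` classes and the method names `more`, `less`, `most`, `least`, and `any_more_than` are all invented for this illustration. Because the test compares two items, "wider than 3" is replaced by a comparison against another rectangle.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Any, Callable, Iterable


@dataclass(frozen=True)
class Rect:
    width: int
    height: int


@dataclass(frozen=True)
class Adjective:
    """Binds a comparison test to comparative and superlative questions."""
    test: Callable[[Any, Any], bool]  # test(a, b): is a "more" than b?

    def more(self, a, b) -> bool:           # comparative: "wider"
        return self.test(a, b)

    def less(self, a, b) -> bool:           # anti-comparative: "narrower"
        return self.test(b, a)

    def most(self, items: Iterable):        # superlative: "widest"
        # Raises TypeError on an empty collection.
        return reduce(lambda best, item: item if self.test(item, best) else best, items)

    def least(self, items: Iterable):       # anti-superlative: "narrowest"
        return reduce(lambda best, item: item if self.test(best, item) else best, items)

    def any_more_than(self, items: Iterable, other) -> bool:
        return any(self.test(item, other) for item in items)


wide = Adjective(test=lambda a, b: a.width > b.width)

the_rects = [Rect(2, 5), Rect(7, 1), Rect(4, 4)]
print(wide.more(the_rects[1], the_rects[0]))
print(wide.most(the_rects).width)
```

Swapping in a different test, say one that compares area, changes every derived question at once, which is the flexibility the field-bound version lacks.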