Generalized Programming Language
Any programming language syntax can basically be decomposed into nested lists. Something like XML. The lists would include parameters, keywords, function calls, whatever. It's essentially taking every structure and ordering its terms in the same way and conveying every operator to hierarchical structure or keywords. The idea is that if we could make a generalized language grammar, somewhat like XML but easier to type/read and perhaps more rich with structures, we could express any programming language is this form. That way learning a new language would be much easier, because you don't have to learn a new syntax or grammar--merely its constructs and functions--and also you wouldn't have to put up with really ugly syntax.
It isn't necessarily that every new language would have to use this specification, but that people could write front-ends that can convert from this specification to given languages and back, preferably as IDE plugins.
How exactly this language should be designed is hypothetical--I could take a shot at it, but that doesn't mean that my suggestion for a universal language is inextricably linked to my particular idea of an implementation of it.
One thing that comes to mind is that, although every nested structure in the program could be nested in the universal language in the same way, that could make it much less readable.
take, for example: for(int x=0;x<=255 && !y;x++) {do_this(exp((x+1),2)+3); }
you could write it as
for:
initial:
declare int x 0
compare:
and:
le x 255
not: y
action:
inc x
do:
function do_this:
sum:
exponent:
sum: x 1
2
3
and the above works okay for control structures, but is horrible for and's and or's and math--basically any operators.
and on the other hand, you could do it like this:
for(int(x,0), and(le(x,255),not(y)), inc x, function(do_this, sum(exp(sum(x,1), 2), 3))))
which is a little better for operators, but isn't so good for control structures.
and, of course, you could simply allow arbitrary line breaks and do it like this
for(
int x 0,
and(le(x,255),not y),
inc x,
function(do_this, sum(exp(sum(x,1), 2), 3))
)
but that still could be made a little bit more elegant, by allowing two forms of nesting:
for:
int x 0
and(le(x,255),not y)
inc x
function(do_this, sum(exp(sum(x,1), 2), 3))
and even futher, we could be more kind to operators, and technically we wouldn't even be changing the definition of the universal language:
for:
int x 0
(x le 255) and not y)
inc x
function(do_this, ((x plus 1) exp 2) plus 3)
although it might do to make some standards about how things in lists are ordered, so for example, you can't have the function/operator name be the 4th element in the list unless there are only three elements in which case it's the first element but only on tuesdays and depending on the price of beans as declared earlier in the source.
one thing we should not allow, though, is inexplicit priority of operators. all nesting should be explicit, that way you don't have to worry about learning the order of precedence for the particular language or thinking about it when you interpret some source code. exceptions maybe should be made, though, for basic numerical operators. i.e., everyone learns in elementary school or junior high that it goes: explicit grouping, then ^, then * and /, then + and -. although it's still on the table whether or not symbolic operators should be allowed in the specification. in some cases it makes it more readable, in other cases words would make their meaning more obvious. one solution would be to allow only >, <, <=, >=, *, /, +, -, . (namespaces), and either <> or !=. ^ shouldn't be allowed since it means exponent in some languages and XOR in others. and % can mean percent, modulus, string interpolation, etc. i'm being strict about it to make it easier for those who haven't done any learning of the language, although it could, perhaps, be made a language intended for people who do a little bit of studying. but that could make it a little more concise but a little less 'accessible'..
while it's up to whomever to specify how a particular language is translated into the universal language, there should probably be some guidelines set to foster consistency at little cost. for example, for loops exist in most every language, and we could dictate that for loops should start with the name 'for' as the first item. which they would probably do anyway, but perhaps there are other cases that are less normative. and more than just the 'for' would be specified.
commen elements of a for loop include:
initialization
comparison
incrementation or whatever
variable name
list you're selecting from
what to do
different languages would use different items of that list. each item could be given an official name, and a language uses whichever items are appropriate. it would be somewhat like the first example of code in this text, rather than the later examples where i just allowed positions in the list to determine meanings.
obviously mechanisms for literal strings and also comments need to be included. i'm a fan of Python's flexibility when it comes to literals. for comments i like C, I think they visually stand out well as being extraneous to the code. even moreso if it's all //'s but then you need an editor that can block comment and uncomment.
you may have noticed that i pulled some tricks with being able to use spaces to separate list items in some cases and commas in others. basically i tried to allow as much flexibility for the programmer in that as possible while maintaining that it can be interpreted determinately. so the three levels of separators would be spaces, commas and newlines, but they can be shifted up or down at whim. and parentheses can help too.
i suppose other things that really demand symbols are dereferencers and subscripts. moreso dereferencers, because
a[10] can be handled as (a 10), a 10 or a(10), or even a sub 10, but dereferencers might be get tedious with having to type ptr ptr a, ptr ptr (ptr b), etc. however, instead of doing that we can do this: p2 a, p2(p b), etc.