Tales From The Code Front Stories in words and pictures

Well, here's another nice mess you've gotten me into!

10 dim a$(2)
15 def fn a(x) = x * 2
20 a$(0) = "abc"
30 input a$(1)
40 if a$(0)=a$(1)then print sin(1), fna(2)

This piece of code doesn't make any sense, but it's a nice test case for #Genesis64's pre-parser. And now you ask yourself: "Why a pre-parser (and what is it)?"
When I first started working on this, the Genesis64 parser was actually an interpreter: you entered #C64 #BASIC code and it interpreted it using a boatload of RegExes (because, hey, BASIC looks so easy that you could parse it with regular expressions alone. Boy, was I wrong).

Because I used RegExes to parse the code, it seemed like a good idea to make arrays easier to find by using the C-like [ and ] instead of ( and ). I applied the same reasoning to =, thinking that simple comparisons would be easier to find if they used == instead of =.

Here's the output of the (very first) pre-parser:

10 dim a$[2]
15 def fn a[x] = x * 2
20 a$[0] == "abc"
30 input a$[1]
40 if a$[0]==a$[1]then print sin[1], fna[2]

It used this nice RegEx to detect the "start" of an array in code:
/[a-z]+\d*[\%\$]*\s*\(/

It finds everything that looks like an array up to the opening parenthesis, so just run it over each line, find the matching closing parenthesis and replace ( and ) with [ and ].
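To make that step concrete, here is a minimal sketch of such a naive replacement pass. It is written in Python purely for illustration and is not the actual Genesis64 code; the names ARRAY_START, find_closing and naive_brackets are made up for this example, and the = to == conversion is left out.

```python
import re

# The RegEx from the post, written for Python's re module.
ARRAY_START = re.compile(r"[a-z]+\d*[%$]*\s*\(")


def find_closing(line, open_pos):
    """Return the index of the ')' matching the '(' at open_pos, or -1."""
    depth = 0
    in_string = False
    for i in range(open_pos, len(line)):
        ch = line[i]
        if ch == '"':
            in_string = not in_string  # skip parentheses inside string literals
        elif not in_string:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth == 0:
                    return i
    return -1


def naive_brackets(line):
    """Replace ( and ) with [ and ] for everything that looks like an array."""
    chars = list(line)
    pos = 0
    while True:
        match = ARRAY_START.search(line, pos)
        if match is None:
            break
        open_pos = match.end() - 1  # the '(' is the last matched character
        close_pos = find_closing(line, open_pos)
        if close_pos != -1:
            chars[open_pos] = "["
            chars[close_pos] = "]"
        pos = match.end()  # keep scanning inside the brackets for nested arrays
    return "".join(chars)


for source_line in ("10 dim a$(2)", "30 input a$(1)", "40 print sin(1), fna(2)"):
    print(naive_brackets(source_line))
```

Because the pass only looks at the text in front of the opening parenthesis, it has no way to tell an array from a function call or a DIM statement. It also happily matches array-like text inside string literals.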

Almost .... correct.
Let's start with line 10: obviously a$[2] is wrong, as we're not accessing the array but DIMing it. So the pre-parser needs to check whether the array variable is used in a DIM statement.
Line 15: wrong. DEF FN is followed by a function "name" rather than an array.
Line 20: wrong. It's an assignment, a LET command without the LET (we don't need it in BASIC v2), so the = must not become ==.
Line 30: wow, correct.
Line 40: almost correct, but SIN is a function, not an array, and thus has to keep its ( and ) intact, and FN A is a function as well.

So 3 out of 5 lines needed further testing for arrays alone, and I've not even really tackled comparison.
First check for DIM and DEF FN / FN, then filter out the BASIC functions (SIN, ABS, ...), oh, and deal with nested things like a$(b(2)) or SIN(a(ABS(3))) ...
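The extra checks could look roughly like the sketch below. Again, this is Python for illustration only, not the Genesis64 implementation; classify, classify_line and the FUNCTIONS tuple are hypothetical names. It assumes lowercase source with keywords separated by spaces (crunched code like printsin(1) is not handled) and no ":" inside string literals.

```python
import re

# The RegEx from the post, written for Python's re module.
ARRAY_START = re.compile(r"[a-z]+\d*[%$]*\s*\(")

# BASIC V2 functions that take their argument in parentheses.
FUNCTIONS = (
    "abs", "asc", "atn", "chr$", "cos", "exp", "fre", "int", "left$",
    "len", "log", "mid$", "peek", "pos", "right$", "rnd", "sgn", "sin",
    "spc", "sqr", "str$", "tab", "tan", "usr", "val",
)


def classify(line, match):
    """Decide what the name in front of a '(' really is."""
    name = match.group()[:-1].strip()  # drop the '(' and optional spaces
    before = line[:match.start()]
    # Text of the current statement, without line number and leading spaces.
    statement = before.split(":")[-1].lstrip("0123456789 ")
    if name.startswith("fn") or before.rstrip().endswith("fn"):
        return "fn"        # FN A(...) or DEF FN A(...)
    if name in FUNCTIONS:
        return "function"  # SIN(...), ABS(...), ...
    if statement.startswith("dim"):
        return "dim"       # DIM A$(2): a declaration, not an access
    return "array"


def classify_line(line):
    """Return (name, kind) for every array-like match in the line."""
    return [(m.group()[:-1].strip(), classify(line, m))
            for m in ARRAY_START.finditer(line)]


for source_line in ("10 dim a$(2)", "15 def fn a(x) = x * 2",
                    "40 if a$(0)=a$(1)then print sin(1), fna(2)"):
    print(source_line, "->", classify_line(source_line))
```

Only matches classified as "array" would get their parentheses replaced. Even this small sketch needs a handful of special cases, and every new BASIC construct adds another one.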

Nonetheless, I kept using that idea throughout three (or four?) iterations of the parser, adding conditions for DEF FN, functions and whatnot, to the point where it perfectly converted C64 BASIC to "G64"-BASIC. It just took a few hundred lines of code and conditions to get there. With the current rush of motivation (I just released v. 2.6.7.3 of our e-learning suite WideBight), I re-read the code for the pre-parser and decided to drop that idea and just make sure the parser can deal with BASIC the way it is written, instead of jumping through hoops to convert it to something else first.
