The operation and main routines of a minimal text interpreter - Part 3
This post is merely a description of the first implementation of the text interpreter, looking at the principal routines. It's so I can remember what I did in six months' time.
Currently only the basics have been implemented - by way of a proof of concept, and running on a 2K RAM Arduino. Later this will be ported to various ARM Cortex parts, the FPGA - softcore ZPUino and ultimately the J1 Forth processor.
There are probably many ways in which this could be implemented - some giving even more codespace and memory efficiency. As a rookie C programmer, I have stuck to really basic coding methods - that I understand. A more experienced programmer would probably find a neater solution using arrays, pointers and the strings library - but for the moment I have kept it simple.
The interpreter resides in a continuous while(1) loop and consists of the following routines:
txt_read
Reads the text from the UART into a 128 character buffer using u_getchar.
Checks that the character is printable - i.e. resides between space (32) and tilde ~ (126) in the ASCII table - and stores it in the buffer.
Keeps accepting text until it hits the buffer limit of 128 characters, or breaks out early if it sees a return (\r) or newline (\n) character.
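The read loop described above could look roughly like the following. This is a minimal sketch, not the actual code: the u_getchar here is a hypothetical stand-in that reads from a string so the example runs on a desktop, the names buf, input_src and BUF_SIZE are made up for illustration, and reserving the last buffer byte for a terminator is an assumption.

```c
#include <stddef.h>

#define BUF_SIZE 128                 /* buffer size stated in the post */

/* Hypothetical stand-in for u_getchar: on the Arduino this would block
   on the UART; here it reads from a string and returns '\n' at the end. */
static const char *input_src = "";
static char u_getchar(void)
{
    return *input_src ? *input_src++ : '\n';
}

static char buf[BUF_SIZE];

/* Read one line into buf, keeping only printable characters.
   Returns the number of characters stored. */
static size_t txt_read(void)
{
    size_t n = 0;
    while (n < BUF_SIZE - 1) {          /* assumption: keep room for '\0' */
        char c = u_getchar();
        if (c == '\r' || c == '\n')     /* end of line: stop reading */
            break;
        if (c >= ' ' && c <= '~')       /* printable ASCII, 32..126 */
            buf[n++] = c;               /* anything else is dropped */
    }
    buf[n] = '\0';
    return n;
}
```

Non-printable characters such as tabs are simply skipped, so the buffer only ever holds text that the later routines can treat as words.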
colon_check
This checks if the text starts with a colon, and so is going to be a new colon definition.
Sets the flag colon = 1.
Calls the build_buffer function.
word_scan
If the leading character is not a colon, this function determines that the word is either within the body of the definition, or it is for immediate execution. It calls build_buffer, but only builds the header to allow a word match. It should not add the word to the dictionary if it gets a match, because the word is already there.
build_buffer
This checks the first 3 characters of the word and puts them into a new header slot in the headers table.
It also calculates the word length by counting the characters as it stores them into the dictionary table, which it continues until it sees a terminating space character.
It increments the dictionary pointer ready for the next word
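As a rough illustration of the steps above, here is a minimal sketch of the colon-definition case. The layout is an assumption: each header is taken to be the first 3 characters plus a length byte, short words are padded with spaces, and the table sizes and the names headers, dictionary, header_count and dict_ptr are hypothetical.

```c
#include <stdint.h>

#define MAX_WORDS 32     /* assumed table sizes, not taken from the post */
#define DICT_SIZE 128

static uint8_t headers[MAX_WORDS][4];   /* assumed: 3 characters + length */
static char    dictionary[DICT_SIZE];
static uint8_t header_count = 0;
static uint8_t dict_ptr = 0;            /* next free dictionary position */

/* Build a header for the word at text and copy the word into the
   dictionary. Returns the header slot used, or -1 if nothing was added. */
static int build_buffer(const char *text)
{
    uint8_t *h;
    uint8_t len = 0;

    if (header_count >= MAX_WORDS)
        return -1;
    h = headers[header_count];
    h[0] = h[1] = h[2] = ' ';           /* pad words shorter than 3 chars */

    /* Copy until the terminating space (or the end of the text). */
    while (text[len] != ' ' && text[len] != '\0') {
        if (dict_ptr >= DICT_SIZE)
            return -1;                  /* dictionary full */
        if (len < 3)
            h[len] = (uint8_t)text[len];
        dictionary[dict_ptr++] = text[len];
        len++;
    }
    if (len == 0)
        return -1;                      /* no word found */
    h[3] = len;                         /* word length counted while copying */
    return header_count++;
}
```

Because dict_ptr is advanced as each character is stored, it already points at the next free position when the function returns.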
word_match
This compares the 4 characters of the header of the newly input word with all the headers in the header table.
If all 4 characters match, it drops out with a match_address (for the jump address look-up table) and sets the match flag, match = 1.
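The header search can be sketched as a linear scan. This is only an illustration under assumptions: the header is treated as 4 bytes (3 characters plus a length byte), the table contents are invented sample data, and match_address is taken to be the index of the matching slot.

```c
#include <stdint.h>
#include <string.h>

#define MAX_WORDS 32     /* assumed table size */

/* Sample table for illustration only: 3 characters + length byte. */
static char headers[MAX_WORDS][4] = {
    {'d', 'u', 'p', 3},
    {'s', 'w', 'a', 4},
    {'b', 'l', 'i', 5}
};
static uint8_t header_count = 3;

static uint8_t match;           /* 1 if the last search found the word */
static uint8_t match_address;   /* slot index of the matching header */

/* Compare a 4-byte candidate header against every stored header.
   Returns the match flag and also leaves it in the global. */
static int word_match(const char *candidate)
{
    uint8_t i;

    match = 0;
    for (i = 0; i < header_count; i++) {
        if (memcmp(headers[i], candidate, 4) == 0) {
            match_address = i;
            match = 1;
            break;              /* drop out on the first match */
        }
    }
    return match;
}
```

With this sample table, a candidate of "swa" with length 4 matches slot 1, while "swa" with length 5 does not match anything, which is how two words sharing their first three letters can still be told apart if their lengths differ.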
header_list
This is a utility routine which prints out a list of all the headers in the order they are stored in the headers table.
dictionary_list
This is a utility routine which prints out a list of all the words in the dictionary in the order they were stored in the dictionary table.
txt_eval
This is the main character interpretation function which implements the SIMPL language core
is_a_word
Not yet implemented. It will return true if it finds a word, and will invoke build_buffer and word_match.
is_a_num
Not yet implemented. It will convert the ASCII text to a signed integer and store it in a parameter table.
Might possibly use a non-printable marker byte, such as 0x7F (DEL) or 0x80 (the first value beyond 7-bit ASCII), to signify to the header builder that the following bytes are a number. Will need a conversion routine to go between printable and internal storage formats.
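Since is_a_num is not yet implemented, the following is only one possible shape for it: a minimal sketch that accepts an optionally negative decimal number ending at a space or the end of the text, and stores it in a small table. The 16-bit range, the table size and the names params and param_count are assumptions.

```c
#include <stdint.h>

#define MAX_PARAMS 16    /* assumed size of the parameter table */

static int16_t params[MAX_PARAMS];
static uint8_t param_count = 0;

/* Returns 1 and stores the value if text starts with a decimal number
   that fits in 16 bits; returns 0 otherwise and stores nothing. */
static int is_a_num(const char *text)
{
    int32_t value = 0;
    int negative = 0;
    int digits = 0;

    if (*text == '-') {
        negative = 1;
        text++;
    }
    while (*text >= '0' && *text <= '9') {
        value = value * 10 + (*text - '0');
        if (value > 32768L)
            return 0;                    /* too large for 16 bits */
        text++;
        digits++;
    }
    if (digits == 0 || (*text != ' ' && *text != '\0'))
        return 0;                        /* not a clean number */
    if (negative)
        value = -value;
    if (value > 32767L || param_count >= MAX_PARAMS)
        return 0;
    params[param_count++] = (int16_t)value;
    return 1;
}
```

Rejecting text such as "12a" outright means it can still be handed to the word routines instead of being half-parsed as a number.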
UART Routines
These provide getchar and putchar support directly to the ATmega328 UART. This saves a huge amount of codespace compared to Serial.print etc.
uart_init
Initialises the ATmega328 UART to the correct baudrate and format.
u_putchar
Waits until the Tx register is empty and then transmits the next character
u_getchar
Waits until a character is present in the UART receive register and returns with it
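For reference, polled UART routines of this kind on the ATmega328 usually look something like the sketch below. It is not the code from this project: the 16 MHz clock, the 9600 baud rate, the 8N1 format and the helper uart_ubrr are assumptions, and the register-level part only compiles for AVR targets.

```c
#include <stdint.h>

/* Baud rate divisor for normal-speed mode, rounded to the nearest value:
   UBRR = f_cpu / (16 * baud) - 1. Hypothetical helper for illustration. */
static uint16_t uart_ubrr(unsigned long f_cpu, unsigned long baud)
{
    return (uint16_t)((f_cpu + 8UL * baud) / (16UL * baud) - 1UL);
}

#ifdef __AVR__
#include <avr/io.h>

#define F_CPU_HZ  16000000UL   /* assumed: standard 16 MHz Arduino clock */
#define BAUD_RATE 9600UL       /* assumed baud rate */

static void uart_init(void)
{
    uint16_t ubrr = uart_ubrr(F_CPU_HZ, BAUD_RATE);
    UBRR0H = (uint8_t)(ubrr >> 8);
    UBRR0L = (uint8_t)ubrr;
    UCSR0B = (1 << RXEN0) | (1 << TXEN0);     /* enable receiver, transmitter */
    UCSR0C = (1 << UCSZ01) | (1 << UCSZ00);   /* 8 data bits, no parity, 1 stop */
}

static void u_putchar(char c)
{
    while (!(UCSR0A & (1 << UDRE0))) { }      /* wait for empty Tx register */
    UDR0 = (uint8_t)c;
}

static char u_getchar(void)
{
    while (!(UCSR0A & (1 << RXC0))) { }       /* wait for a received character */
    return (char)UDR0;
}
#endif
```

Both u_putchar and u_getchar busy-wait, which is fine for a simple interpreter loop but means nothing else runs while waiting for a keypress.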
Printing Routines
Having banished Serial.print, I had to implement some really basic printing functions.
printnum()
Sends a 16 bit integer to the UART for serial output
printlong()
Sends a 32 bit integer to the UART for serial output
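One way such routines can be written without any library support is to peel off decimal digits and send them in reverse. The sketch below is an assumption about the approach, not the actual code: u_putchar is a hypothetical stand-in that collects output in a buffer so the example runs on a desktop, and having printnum delegate to printlong is a simplification.

```c
#include <stdint.h>

/* Hypothetical stand-in for u_putchar: collects output in a buffer
   instead of writing to the UART. */
static char out[32];
static uint8_t out_len = 0;
static void u_putchar(char c)
{
    if (out_len < sizeof out - 1) {
        out[out_len++] = c;
        out[out_len] = '\0';
    }
}

/* Send a signed 32-bit integer as decimal text. */
static void printlong(int32_t n)
{
    char digits[10];                    /* 2^31 has 10 decimal digits */
    uint8_t i = 0;
    /* Work on the magnitude as unsigned so the most negative value is safe. */
    uint32_t u = (n < 0) ? (uint32_t)0 - (uint32_t)n : (uint32_t)n;

    if (n < 0)
        u_putchar('-');
    do {                                /* digits come out least significant first */
        digits[i++] = (char)('0' + u % 10);
        u /= 10;
    } while (u > 0);
    while (i > 0)                       /* send them most significant first */
        u_putchar(digits[--i]);
}

/* Send a signed 16-bit integer; simply widens and reuses printlong. */
static void printnum(int16_t n)
{
    printlong(n);
}
```

On an 8-bit AVR, 32-bit division is relatively slow, so a separate 16-bit version of the digit loop for printnum may be worth the extra code if speed matters.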