Features

  • Free and Open Source (distributed under MIT License)

  • Easy to use and simple enough for an 8-year-old child to understand

  • Fully-developed procedural programming language

  • IDE for Windows and Linux (GTK+2)

  • Multiplatform. Runs on 32-bit and 64-bit Linux/Windows/MacOS

  • Built-in help

  • Documented (English and Italian Guides)

  • Examples include Tetris, Mine Hunter, Breakout, Calculator, TicTacToe

  • Tiny version is suitable for embedded systems

Download nuBASIC

https://chocolatey.org/packages/nubasic
https://github.com/eantcal/nubasic/releases/

You can find nuBASIC samples at the following link:

https://github.com/eantcal/nubasic/tree/master/examples

Build and run nuBASIC

nuBASIC is written in C++11 and compiles on several operating systems, including Windows, Linux and MacOS.

To compile nuBASIC you can create a Visual Studio console application, or build it with GCC >= 4.8 or MinGW >= 4.8 (both Visual Studio project files and autoconf/automake scripts are provided).

Building nuBASIC by using cmake

Note: to build the 'tiny' version of nuBASIC, replace the cmake command with the following one:

cmake .. -DWITH_X11=OFF -DWITH_IDE=OFF

Building nuBASIC by using maiken

You can also build nuBASIC with maiken, a C++14 cross-platform, YAML-based build tool for GCC/CLANG/ICC/MSVC/NVCC (see also https://maiken.github.io).

Maiken uses a specific make-script file. You can download the nuBASIC maiken script file from https://github.com/eantcal/nubasic/blob/master/mkn.yaml (it will be included in new versions of the nubasic source code).

Once you have installed maiken and obtained the script file, copy the file into the nuBASIC source code directory and execute the following commands from there:

mkn clean -p nubasic-release

mkn build -p nubasic-release

At the end of the build stage you will find the executable file nubasic in <source-code-dir>/bin/nubasic-release.

Getting the latest released code

You can download the latest released version from GitHub.

The source code is managed using the git version control system.

To get your own copy of the project sources, use the following command.

git clone https://github.com/eantcal/nubasic.git

To build nubasic you need the g++ compiler and the X11 development libraries.

First install the GNU GCC compiler and development environment.

For example, on Debian/Ubuntu distributions, open a terminal and run the following apt-get command:

sudo apt-get install build-essential

Unless you configure nuBASIC to create the "tiny" version (./configure --enable-tinyver), you have to install the following additional packages: libx11-dev, libsdl2-dev, xmessage, xterm

Install script for development libs:

sudo apt-get -y install libx11-dev

sudo apt-get install libsdl2-dev

sudo apt-get install xterm

sudo apt-get install xmessage




Build and run nuBASIC on iOS devices

If you want to run the text version of nuBASIC on iOS devices you can use iSH (https://ish.app).

  • Download iSH from the App Store and install it on your device.

  • Run iSH and type the following commands in the shell:

apk add g++

apk add make

apk add cmake

apk add git

git clone https://github.com/eantcal/nubasic.git

cd nubasic

mkdir build

cd build

cmake .. && make

./nubasic

Making the interpreter

It was quite fun to write a BASIC interpreter in modern C++. BASIC itself is a simple language that recalls the early days of personal computing, when each computer, such as the glorious Commodore 64, had a BASIC interpreter built in.

Main interpreter components

To write the nuBASIC interpreter I followed an approach that mainly consists of parsing the BASIC source into a parse tree and then executing it. I wrote the following main components, although they often do not have a one-to-one correspondence with a single class or other C++ element:

  • Tokenizer: breaks language string into tokens

  • Expression Parser: transforms mathematical/logical/string expression into an internal evaluable tree object

  • Expression Evaluator: evaluates expression tree object

  • Statement Parser: creates an executable syntax tree

  • Statement Executor: executes a BASIC statement (which may or may not contain one or more BASIC expressions).

  • Static program context: collects and handles program meta-data (needed to handle and execute multi-line BASIC constructs).

  • Run-time program context: handles objects such as variable state, procedure stack, function return value and so on, dynamically generated during program execution.

  • Command Line Interface (CLI): a console-oriented interface which is the main user interface of the interpreter. It has a line-oriented editor which allows the programmer to insert or modify a program on the fly.

  • Interpreter: executes CLI commands and runs programs, handling program data and meta-data, including debugging support.

  • Built-in function library: implements predefined functions used within expressions and statements.

  • Language statement objects: the implementation of language statements such as control structures (e.g. If-Then-Else, For-Next, Do-Loop-While, etc.), I/O built-in procedures (Print, Input, Open, etc.) and so on.

  • Syntax and run-time error handlers: helper classes used to handle the C++ exceptions which implement syntax and run-time language errors.

  • Built-in help: an interactive guide containing information about interpreter commands and language statements that a user can query via specific CLI commands.

A line-oriented language interpreter

BASIC is a line-oriented language, and this is one of the key differences between BASIC and other programming languages, where the position of hard line breaks in the source code is irrelevant.

Each code line in a BASIC program forms a self-contained unit.

For this reason the nuBASIC interpreter is itself line-oriented: program source text is split into lines which are owned by the Interpreter class (source lines are stored in a map of pairs <line-number, text-line>).
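A minimal sketch of such a line store, assuming nothing about the real Interpreter class: the name source_store_t and the rule that an empty text deletes a line (borrowed from classic BASIC editors) are illustrative assumptions, not nuBASIC code. An ordered map keeps the listing sorted by line number regardless of the order in which lines are typed.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for the interpreter's source storage:
// line number -> source text, kept sorted by line number.
class source_store_t {
public:
    // Inserts a new line or overwrites an existing one.
    // Assumption: an empty text deletes the line.
    void set_line(int line_no, const std::string& text) {
        assert(line_no > 0);
        if (text.empty())
            lines_.erase(line_no);
        else
            lines_[line_no] = text;
    }

    // Returns the listing in ascending line-number order.
    std::string list() const {
        std::string out;
        for (const auto& entry : lines_)
            out += std::to_string(entry.first) + " " + entry.second + "\n";
        return out;
    }

    std::size_t size() const { return lines_.size(); }

private:
    std::map<int, std::string> lines_;
};
```

Entering line 20 before line 10 still lists line 10 first, because std::map iterates in key order.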

Each code line is parsed into a self-contained execution unit.

The interpreter builds a static program context which represents the glue among program lines (and statements).

Indeed, the Statement Parser recognizes complex language constructs even when they are split across different lines, and builds meta-data which refers to them. Each line of a control structure can also contain more than one statement.

Handling the tokens

The parsing of each line is preceded by a separate lexical analysis provided by the Tokenizer, which creates tokens from the source text.

A token is implemented simply as a class which contains the original source text, the token position within the text, its length, its type (one of 'identifier', 'operator', 'literal string', 'integer', etc.) and other data.

For token representation I did not use a pure OOP approach, which generally classifies token types by building a class hierarchy. I decided to use a flat representation of the token type, which is just an “enum class” attribute of the token object. The rationale is that homogeneous token objects are easier to collect and handle.

The parser uses the token type attribute to recognize and validate the language syntax and to build statement objects and the meta-data that the interpreter needs at run-time in order to execute a program.

Token list container

To reduce parser complexity, a Token List container class, wrapped around a standard deque, has been provided.

The Token List class adds some facilities designed to simplify the handling of token lists and reduce the complexity of the parser implementation.
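The flat token and the deque wrapper can be sketched as follows. All names here (token_kind, token, token_list, skip_blanks, accept) are hypothetical and chosen for illustration; the real nuBASIC classes carry more data and helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>

// Flat representation: the kind is an attribute, not a subclass.
enum class token_kind { blank, identifier, integer, op, string_literal };

struct token {
    std::string text;      // original source text of the token
    std::size_t position;  // offset within the source line
    token_kind kind;       // flat type tag

    std::size_t length() const { return text.size(); }
};

// Thin wrapper around std::deque with helpers a parser typically wants.
class token_list {
public:
    void push_back(token t) { data_.push_back(std::move(t)); }
    bool empty() const { return data_.empty(); }

    const token& front() const {
        assert(!data_.empty());
        return data_.front();
    }

    // Removes and returns the first token.
    token pop_front() {
        assert(!data_.empty());
        token t = std::move(data_.front());
        data_.pop_front();
        return t;
    }

    // Drops leading blanks so the parser never has to special-case them.
    void skip_blanks() {
        while (!data_.empty() && data_.front().kind == token_kind::blank)
            data_.pop_front();
    }

    // Consumes the first token only if it has the expected kind and text.
    bool accept(token_kind kind, const std::string& text) {
        if (data_.empty() || data_.front().kind != kind ||
            data_.front().text != text)
            return false;
        data_.pop_front();
        return true;
    }

private:
    std::deque<token> data_;
};
```

Because every token has the same C++ type, the container needs no pointers or downcasts; the parser simply switches on the kind attribute.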

Parsing the code

While there is a single Tokenizer, more than one parser has been implemented:

  • Expression Parser: analyzes an expression (such as “2+2*3/Sin(PI()/2)”) and produces an executable object instance.

  • Statement Parser: analyzes each source code line, generating an executable statement object. To complete this job the Statement Parser invokes the Expression Parser for each expression encountered in the source text in order to obtain its executable equivalent representation.

  • CLI Parser: the simplest of the three parsers, it is part of the Interpreter and is responsible for validating and executing commands such as List, Run, Load, Save, and so on, which are not statements of the language but just interpreter environment commands.

One of the main difficulties in parsing the language comes from expressions.

For instance, for an expression such as “2+4*17”, the resulting syntax tree depends on the whole expression: knowing only its beginning, “2+4”, is not enough to build the syntax tree correctly.

The Expression Parser has to recognize operator precedence and rearrange the previous expression into something like “(2 + (4 * 17))” in order to build a well-formed syntax tree, as shown below:

        '+'
       /   \
     '2'   '*'
          /   \
        '4'   '17'

Expression parsing is done in several steps:

  • The Tokenizer breaks an expression (a string) down into tokens held by a token list object instance.

  • The token list is rearranged (special 'begin' and 'end' sub-expression markers are inserted into the list) to obtain the following operator precedence order:

    • Unary identity and negation (+, –).

    • Exponentiation (^), Multiplication and floating-point division (*, /), Integer division (\, Div), Modulus arithmetic (Mod).

    • Addition and subtraction (+, –).

    • Comparison operators (=, <>, <, <=, >, >=).

    • Logical conjunction (And), inclusive disjunction (Or), exclusive disjunction (Xor), bit-wise operators.

  • The Expression Parser builds an evaluable syntax tree where each node is an instance of a class belonging to the hierarchy of classes derived from expr_any_t, which represent: empty expressions, binary expressions, functions, variables and literal constants.
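The effect of the rearrangement step can be illustrated with a different, well-known technique, precedence climbing. This is not the marker-insertion algorithm nuBASIC uses; it is only a sketch, limited to unsigned integers and the four operators + - * /, that makes the grouping explicit. The function names are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Binding power of a binary operator; 0 means "not an operator".
inline int precedence(char op) {
    switch (op) {
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    default: return 0;
    }
}

// Reads an unsigned integer literal starting at pos.
inline std::string parse_number(const std::string& s, std::size_t& pos) {
    std::string digits;
    while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos])))
        digits += s[pos++];
    return digits;
}

// Precedence climbing: returns the expression with explicit parentheses.
// Input is assumed to be well formed, with no blanks or parentheses.
inline std::string parenthesize(const std::string& s, std::size_t& pos,
                                int min_prec = 1) {
    std::string lhs = parse_number(s, pos);
    while (pos < s.size() && precedence(s[pos]) >= min_prec) {
        const char op = s[pos++];
        // Operators binding tighter than op are grouped into the right operand.
        const std::string rhs = parenthesize(s, pos, precedence(op) + 1);
        lhs = "(" + lhs + op + rhs + ")";
    }
    return lhs;
}

// Convenience overload that starts at the beginning of the string.
inline std::string parenthesize(const std::string& s) {
    std::size_t pos = 0;
    return parenthesize(s, pos);
}
```

Calling parenthesize("2+4*17") yields "(2+(4*17))", the grouping described above, while "8-3-2" becomes "((8-3)-2)" because operators of equal precedence associate to the left.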

For example, the mathematical expression “2+4*17” becomes something like the following object tree:

      binary_expression
            (sum)
           /     \
     literal     binary_expression
    (integer)     (multiplication)
        2            /        \
                literal      literal
               (integer)    (integer)
                   4            17

The abstract class expr_any_t declares the pure virtual method eval(), which subclasses implement.

The prototype of the eval() method is the following:

virtual variant_t eval(rt_prog_ctx_t & ctx) const = 0;

The method accepts a reference to the run-time program (execution) context and returns a variant_t object, which is discussed in the next section.
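A reduced sketch of such a hierarchy, under simplifying assumptions: eval() returns a double instead of a variant_t and takes no context, and the class names (expr, literal_expr, binary_expr) are hypothetical. It builds by hand the object tree for 2+4*17 shown above.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Simplified stand-in for expr_any_t: evaluates to double, takes no context.
struct expr {
    virtual ~expr() = default;
    virtual double eval() const = 0;
};

struct literal_expr : expr {
    explicit literal_expr(double v) : value(v) {}
    double eval() const override { return value; }
    double value;
};

struct binary_expr : expr {
    using op_t = std::function<double(double, double)>;

    binary_expr(op_t f, std::shared_ptr<expr> l, std::shared_ptr<expr> r)
        : op(std::move(f)), lhs(std::move(l)), rhs(std::move(r)) {}

    // The left operand is evaluated first, then the right one.
    double eval() const override {
        const double a = lhs->eval();
        const double b = rhs->eval();
        return op(a, b);
    }

    op_t op;
    std::shared_ptr<expr> lhs, rhs;
};

// Builds the tree for 2+4*17: a sum whose right child is a multiplication.
inline std::shared_ptr<expr> make_sample_tree() {
    auto mul = std::make_shared<binary_expr>(
        std::multiplies<double>(),
        std::make_shared<literal_expr>(4),
        std::make_shared<literal_expr>(17));
    return std::make_shared<binary_expr>(
        std::plus<double>(), std::make_shared<literal_expr>(2), mul);
}
```

Evaluating the root recursively evaluates the multiplication node first (68) and then the sum, giving 70.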

Variant

The Variant class is provided to manipulate several distinct BASIC language types in a uniform manner.

This drastically reduces the complexity of the evaluator.

I did not use a C++ union because a union conveniently handles only plain old data types and is not well suited to non-trivial complex types.
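To give an idea of what "uniform manner" means, here is a deliberately tiny sketch of a variant that holds an integer, a real or a string behind one interface. The class mini_variant and its members are hypothetical; the real variant_t supports more types (including vectors) and more operations.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical, much reduced variant: a type tag plus one slot per kind.
class mini_variant {
public:
    enum class type_t { integer, real, string };

    mini_variant(long long v) : type_(type_t::integer), int_(v) {}
    mini_variant(int v) : mini_variant(static_cast<long long>(v)) {}
    mini_variant(double v) : type_(type_t::real), real_(v) {}
    mini_variant(const std::string& v) : type_(type_t::string), str_(v) {}

    type_t type() const { return type_; }

    // Numeric view of the value; strings are rejected.
    double to_double() const {
        if (type_ == type_t::integer) return static_cast<double>(int_);
        if (type_ == type_t::real) return real_;
        throw std::runtime_error("type mismatch: number expected");
    }

    std::string to_str() const {
        if (type_ == type_t::integer) return std::to_string(int_);
        if (type_ == type_t::real) return std::to_string(real_);
        return str_;
    }

    // One operator serves every type combination the evaluator may meet:
    // string + string concatenates, integer + integer stays integer,
    // any other numeric mix is promoted to real.
    friend mini_variant operator+(const mini_variant& a, const mini_variant& b) {
        if (a.type_ == type_t::string && b.type_ == type_t::string)
            return mini_variant(a.str_ + b.str_);
        if (a.type_ == type_t::integer && b.type_ == type_t::integer)
            return mini_variant(a.int_ + b.int_);
        return mini_variant(a.to_double() + b.to_double());
    }

private:
    type_t type_;
    long long int_ = 0;
    double real_ = 0.0;
    std::string str_;
};
```

With a type like this, an evaluator node for "+" contains a single line, lhs + rhs, and the type rules live in one place.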

Tracing execution of a simple program

Let us consider the following simple BASIC program, containing a single line with a single statement:

10 PRINT 2+4*17

Suppose you have already entered the program, so you just have to type “RUN” to execute it.

First the nuBASIC_console() function gets the command string “RUN” from standard input, then invokes exec_command(), a helper function that calls the related interpreter exec_command() method, catching any exceptions.

This method parses a command in order to recognize it and perform the required action.

In this case it calls the rebuild() interpreter method, which clears the static and run-time context objects (removing both dynamic data and meta-data), then calls the update_program() method for each source line. This method creates a Tokenizer object and calls the compile_line() method of the Statement Parser object, held as an attribute of the Interpreter object.

The compile_line() method uses the Tokenizer to break down the source line into a language token list like the following (for simplicity, not all token fields are shown):

{ (“PRINT”, identifier), (“ ”, blank), (“2”, integer), (“+”, operator), (“4”, integer), (“*”, operator), (“17”, integer) }

Each line of code is treated as a “block”, which is a container of statements. Thus the method parse_block() is called first. This method iterates while the token list is not empty, calling parse_stmt() on each iteration; parse_stmt() recognizes the statement in order to select the specific parse_xxx() method.

In our example, it recognizes the token “PRINT” (the first token of the token list) and calls the specific parse_print() method, which builds a stmt_print_t object holding an expression object instance. The parse_print() method calls the template function parse_arg_list(), which builds an expression list for the PRINT statement. Each item of the list is an expression object built via the Expression Parser, ready to be evaluated by its eval() method against a run-time program (execution) context.

Finally, the parse_print() method returns to the calling parse_block(), which in turn returns a handle to the statement block object by means of a (standard smart) shared pointer to the class object.

The statement handle is stored in a map (prog_line_t) where the key is the number of the line being processed.

After building the program, the interpreter calls the run() method, which creates a program_t object instance, passing it the program line and program context objects produced by the previous building phase.

The program object executes each code line (represented by a block statement object, as discussed before) by calling the related run() virtual method. In our example the only program line (a block statement object) contains the print statement object. Calling its run() method finally invokes the run() method of the print statement.

The stmt_print_t run() method evaluates each argument of its argument list. The argument list is just a collection of expression objects which expose the eval() method. The eval() method returns a variant object that can be printed to standard output.
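The run phase traced above can be condensed into a small sketch. It is an illustration under simplifying assumptions, not the nuBASIC implementation: expressions are plain callables returning double, there is no program context, and control flow (jumps, loops) is ignored. The names stmt, print_stmt, block_stmt, prog_lines and run_program are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Stand-in for an expression object: anything that evaluates to a number.
using expr_fn = std::function<double()>;

struct stmt {
    virtual ~stmt() = default;
    virtual void run(std::ostream& out) const = 0;
};

// PRINT evaluates each argument in order and writes the results.
struct print_stmt : stmt {
    explicit print_stmt(std::vector<expr_fn> a) : args(std::move(a)) {}
    void run(std::ostream& out) const override {
        for (const auto& arg : args) out << arg();
        out << '\n';
    }
    std::vector<expr_fn> args;
};

// One source line is a block: a sequence of statements.
struct block_stmt : stmt {
    void run(std::ostream& out) const override {
        for (const auto& s : stmts) s->run(out);
    }
    std::vector<std::shared_ptr<stmt>> stmts;
};

// Line number -> compiled block, executed in ascending order.
using prog_lines = std::map<int, std::shared_ptr<stmt>>;

inline std::string run_program(const prog_lines& prog) {
    std::ostringstream out;
    for (const auto& line : prog) line.second->run(out);
    return out.str();
}

// Builds the equivalent of: 10 PRINT 2+4*17
inline prog_lines make_sample_program() {
    auto block = std::make_shared<block_stmt>();
    block->stmts.push_back(std::make_shared<print_stmt>(
        std::vector<expr_fn>{ [] { return 2.0 + 4.0 * 17.0; } }));
    prog_lines prog;
    prog[10] = block;
    return prog;
}
```

Running the sample program produces the text "70" followed by a newline, which is what the single PRINT statement is expected to write.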

How to extend the built-in function set

We are going to show, step by step, how to add a new built-in function to the nuBASIC native API.

As an example, let us assume we want to add a 1-D convolution math function which returns the convolution of two vectors. If they are vectors of polynomial coefficients, convolving them is equivalent to multiplying the two polynomials.

For example, let's imagine we have two vectors u=(1,0,1) and v=(2,7) representing the coefficients of the polynomials x^2+1 and 2x+7. To convolve them we want to use a nuBASIC program which employs a new function Conv, as shown in the following example:

Dim u(3) as Double

Dim v(2) as Double

Dim w(4) as Double

u(0)=1

u(1)=0

u(2)=1

v(0)=2

v(1)=7

w = Conv(u,v)

For i=0 To 3

Print w(i);" ";

Next i

and the expected output is the following:

2 7 2 7

which means the vector w contains the coefficients of the polynomial 2x^3+7x^2+2x+7.
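The expected output can be checked by hand. The short derivation below assumes the coefficients are stored from the highest degree to the lowest, as in the example, and uses the usual discrete convolution sum:

```latex
\begin{aligned}
(x^2 + 1)(2x + 7) &= 2x^3 + 7x^2 + 2x + 7, \\
w_k &= \sum_{j} u_j \, v_{k-j}, \qquad u = (1, 0, 1),\; v = (2, 7), \\
w_0 &= u_0 v_0 = 1 \cdot 2 = 2, \\
w_1 &= u_0 v_1 + u_1 v_0 = 7 + 0 = 7, \\
w_2 &= u_1 v_1 + u_2 v_0 = 0 + 2 = 2, \\
w_3 &= u_2 v_1 = 1 \cdot 7 = 7.
\end{aligned}
```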

nuBASIC Conv prototype

The Conv function has the following prototype:

function conv( v1(n) as Double, v2(m) as Double) as Double(n+m-1)

We could also add two optional parameters giving the sizes of the vectors, so that the same array can be reused to represent different vectors.

In other words, we could support a prototype like the following:

function conv( v1(arraySize1) as Double, v2(arraySize2) as Double, n as Integer, m as Integer) as Double(n+m-1)

Where parameters arraySize1>=n>0 and arraySize2>=m>0.

Implementation of conv function in C++

A possible implementation in C++ of conv function could be the following:

template<typename T>
std::vector<T> conv(const std::vector<T> &v1, const std::vector<T> &v2) {
    const int n = int(v1.size());
    const int m = int(v2.size());
    const int k = n + m - 1; // size of the result (inputs must be non-empty)

    std::vector<T> w(k, T());

    for (auto i = 0; i < k; ++i) {
        // Range of j for which both v1[j] and v2[i - j] are valid
        const int jmn = (i >= m - 1) ? i - (m - 1) : 0;
        const int jmx = (i < n - 1) ? i : n - 1;

        for (auto j = jmn; j <= jmx; ++j) {
            w[i] += (v1[j] * v2[i - j]);
        }
    }

    return w;
}

We don't need to analyse this function in detail. It simply returns a new vector which is the convolution of the two given vectors v1 and v2.

We can just assume it works fine for our purpose.
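That assumption is easy to cross-check. The sketch below is not part of nuBASIC: it is an equivalent, more direct formulation (the name conv_reference is hypothetical) which accumulates every product v1[i]*v2[j] into w[i+j]. Its results can be compared with those of conv on a few inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reference convolution: each pair (i, j) contributes to output index i + j.
// Returns an empty vector if either input is empty.
template <typename T>
std::vector<T> conv_reference(const std::vector<T>& v1,
                              const std::vector<T>& v2) {
    if (v1.empty() || v2.empty())
        return std::vector<T>();

    std::vector<T> w(v1.size() + v2.size() - 1, T());
    for (std::size_t i = 0; i < v1.size(); ++i)
        for (std::size_t j = 0; j < v2.size(); ++j)
            w[i + j] += v1[i] * v2[j];
    return w;
}
```

For u=(1,0,1) and v=(2,7) it returns (2,7,2,7), matching the expected output of the BASIC example.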

Extending the global function set

In general, to add a function to the existing built-in API set we need to modify lib/nu_global_function_tbl.cc, and more precisely the following static function of the class global_function_tbl_t:

global_function_tbl_t& global_function_tbl_t::get_instance()

The first time the static class function get_instance() is executed, it populates a global map which associates each nuBASIC function name with a C++ functor (in other words, a callback function) having the following prototype:

variant_t functor_name( rt_prog_ctx_t& ctx, const std::string& name, const nu::func_args_t& args );

Such functor takes as parameters:

  • a runtime context (ctx) as defined in include/nu_rt_prog_ctx.h

  • a function name (take into account that multiple names could be mapped to the same functor) and

  • a vector of arguments, each of which is a nuBASIC expression.

A generic implementation of such a functor has to:

  • Get the number of input arguments (args size) and validate it.

  • Convert and validate the input args into equivalent C++ objects.

  • Call the C++ function that does the actual job (which could be implemented inline as well).

  • Convert the result into a nu::variant_t object, which is how the nuBASIC interpreter ultimately represents its data.

  • Return the result.
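The table-of-functors idea and the steps listed above can be sketched with stand-in types. Everything here is hypothetical and simplified: arguments arrive already evaluated as doubles, there is no run-time context, and the names function_table, sqrt_functor, "sqr" and "sqrt" are only illustrative.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-ins: real functors receive a context and unevaluated expressions.
using value_t = double;
using args_t = std::vector<value_t>;
using functor_t =
    std::function<value_t(const std::string& name, const args_t& args)>;

class function_table {
public:
    // The table is populated once, on first use (function-local static).
    static function_table& get_instance() {
        static function_table instance;
        return instance;
    }

    // Looks the name up and forwards the call to the mapped functor.
    value_t call(const std::string& name, const args_t& args) const {
        const auto it = fmap_.find(name);
        if (it == fmap_.end())
            throw std::runtime_error("unknown function: " + name);
        return it->second(name, args);
    }

private:
    function_table() {
        // Two names mapped to the same functor.
        fmap_["sqrt"] = sqrt_functor;
        fmap_["sqr"] = sqrt_functor;
    }

    static value_t sqrt_functor(const std::string& name, const args_t& args) {
        // Step 1: get and validate the number of arguments.
        if (args.size() != 1)
            throw std::invalid_argument(name + ": expected 1 argument");
        // Steps 2-4: convert the input, compute, wrap the result
        // (all trivial here because value_t is already a double).
        return std::sqrt(args[0]);
    }

    std::map<std::string, functor_t> fmap_;
};
```

Passing the function name to the functor is what lets a single callback report errors under whichever alias the program actually used.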

Defining new functor conv

To create conv_functor we need to modify the file lib/nu_global_function_tbl.cc. A skeleton of conv_functor could be the following:

variant_t conv_functor(

rt_prog_ctx_t& ctx,

const std::string& name,

const nu::func_args_t& args)

{

// Get number of arguments

// TODO

// Validate the number of arguments

// TODO

// Process and validate the arguments

// TODO

// Convert the input parameters into C++ parameters

// TODO

// Compute the result

// TODO

// Convert the result into nu::variant_t object

// TODO

return result;

}

Get and validate the number of input arguments

According to the two nuBASIC prototypes of Conv, we want to implement a function which accepts either 2 or 4 arguments.

So, the first implementation step is to get the number of arguments, which means we need to inspect the size of the args parameter.

We also need to check whether that size is valid.

Since args is a vector of nuBASIC expressions (see also include/nu_expr_any.h), its size() method returns the number of arguments of the nuBASIC function.

variant_t conv_functor(

rt_prog_ctx_t& ctx,

const std::string& name,

const nu::func_args_t& args)

{

// Get number of arguments

const auto args_num = args.size();

// Validate the number of arguments

// TODO

// Process and validate the arguments

// TODO

// Convert the input parameters into C++ parameters

// TODO

// Compute the result

// TODO

// Convert the result into nu::variant_t object

// TODO

return result;

}

To validate the number of arguments we have to check if it is 2 or 4, otherwise we have to generate a run-time error.

For generating an error we can call the method throw_if() of the rt_error_code_t singleton object which generates an error depending on a specific error code.

variant_t conv_functor(

rt_prog_ctx_t& ctx,

const std::string& name,

const nu::func_args_t& args)

{

// Get number of arguments

const auto args_num = args.size();

// Validate the number of arguments

rt_error_code_t::get_instance().throw_if(

args_num != 4 && args_num != 2, 0, rt_error_code_t::E_INVALID_ARGS, "");

// Process and validate the arguments

// TODO

// Convert the input parameters into C++ parameters

// TODO

// Compute the result

// TODO

// Convert the result into nu::variant_t object

// TODO

return result;

}

In the previous source code throw_if() generates a run-time error by throwing an exception for rt_error_code_t::E_INVALID_ARGS whenever the predicate args_num != 4 && args_num != 2 is true.

See the rt_error_code_t class implementation (lib/nu_error_codes.cc) for more information on run-time error codes and related messages.
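The "throw only if the condition holds" pattern is simple to reproduce in isolation. The sketch below is a hypothetical, self-contained imitation: the enum, the exception type and the free function throw_if are assumptions made for illustration and do not match the signature of the rt_error_code_t method.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical error codes; the real ones are defined by rt_error_code_t.
enum class error_code { invalid_args, invalid_vector_size };

// Exception carrying the error code alongside the message.
struct runtime_error_t : std::runtime_error {
    runtime_error_t(error_code c, const std::string& what)
        : std::runtime_error(what), code(c) {}
    error_code code;
};

// Throws only when cond is true, so each check stays one statement long.
inline void throw_if(bool cond, error_code code, const std::string& detail) {
    if (cond)
        throw runtime_error_t(code, "runtime error: " + detail);
}

// The argument-count rule for Conv: exactly 2 or exactly 4 arguments.
inline void check_conv_arg_count(std::size_t args_num) {
    throw_if(args_num != 4 && args_num != 2, error_code::invalid_args, "conv");
}
```

Writing the validation as a predicate passed to a helper keeps the functor body linear: there are no nested if blocks, and every failed check leaves the function through the same exception path.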

Converting the input args into C++ equivalent parameters

Because we have validated the number of nuBASIC function arguments, which is at least 2, we can now pick up the first two of them.

To do that we have to get the first two elements of the vector args (which are shared pointers to instances of nuBASIC expressions) and call their eval() method against ctx, the run-time context of the executing nuBASIC function, as shown in the following source code:

variant_t conv_functor(

rt_prog_ctx_t& ctx,

const std::string& name,

const nu::func_args_t& args)

{

// Get the number of input arguments

const auto args_num = args.size();

// Validate the input arguments (args).

rt_error_code_t::get_instance().throw_if(

args_num != 4 && args_num != 2, 0, rt_error_code_t::E_INVALID_ARGS, "");

// Process and validate the arguments

auto variant_v1 = args[0]->eval(ctx);

auto variant_v2 = args[1]->eval(ctx);

// ...

// Call the C++ function conv for processing the input data and doing the actual job.

// TODO

// Get the result and convert it into nu::variant_t which is how internally the result is represented.
// ...

return result;

}

Note that the execution order matters in order to respect the original semantics, which evaluates expression arguments from left to right. The arguments are listed from left to right, from the lowest index 0 to the highest index, which is the number of arguments - 1. We have to evaluate the arguments in that order, otherwise we could violate the semantics and introduce unexpected behaviour. Each expression could, in fact, execute functions with side effects on the global context, and the order of such side effects must be preserved.
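This point deserves care in C++ specifically, because the language does not guarantee left-to-right evaluation of the arguments of a function call such as f(a->eval(ctx), b->eval(ctx)). Evaluating each argument in its own statement, or in a loop over increasing indexes, fixes the order. A minimal sketch with a hypothetical helper eval_all and callables standing in for expressions:

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Evaluates the arguments strictly from index 0 upwards.
// Each callable stands in for an expression that may have side effects.
inline std::vector<int> eval_all(const std::vector<std::function<int()>>& args) {
    std::vector<int> values;
    values.reserve(args.size());
    for (const auto& arg : args)
        values.push_back(arg()); // one evaluation per iteration, in order
    return values;
}
```

If every argument increments a shared counter and returns its previous value, the result is always 0, 1, 2, ... in argument order, which is the behaviour a BASIC programmer would expect.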

As a result of the two eval() calls above we get two variant objects that we can assume to be vectors of numbers (no matter whether Integer, Float, Double, etc., because all of them have a safe conversion to Double).

By checking the vector_size() method of a variant object we can verify whether they actually represent vectors.

Processing and validation of all the input arguments

Now we can proceed to check the actual vector sizes. If we have just two parameters we can assume that each vector size coincides with the size of its array.

If the number of arguments is 4, we have to process the two additional arguments, assuming they are integer numbers.

We have to generate a run-time error whenever the first two arguments are not vectors, or when either of the two additional arguments is invalid because it is less than 1 or greater than the array size.

In other words, we can implement something like the following statements:

variant_t conv_functor(

rt_prog_ctx_t& ctx,

const std::string& name,

const nu::func_args_t& args)

{

// Get number of arguments

const auto args_num = args.size();


// Validate the arguments

rt_error_code_t::get_instance().throw_if(

args_num != 4 && args_num != 2, 0, rt_error_code_t::E_INVALID_ARGS, "");


auto variant_v1 = args[0]->eval(ctx);

auto variant_v2 = args[1]->eval(ctx);

const auto actual_v1_size = variant_v1.vector_size();

const auto actual_v2_size = variant_v2.vector_size();

const size_t size_v1 =

args_num == 4 ? size_t(args[2]->eval(ctx).to_long64()) : actual_v1_size;

const size_t size_v2 =

args_num == 4 ? size_t(args[3]->eval(ctx).to_long64()) : actual_v2_size;

rt_error_code_t::get_instance().throw_if(

size_v1 > actual_v1_size || size_v1<1, 0, rt_error_code_t::E_INV_VECT_SIZE, args[0]->name());

rt_error_code_t::get_instance().throw_if(

size_v2 > actual_v2_size || size_v2<1, 0, rt_error_code_t::E_INV_VECT_SIZE, args[1]->name());

// Convert the input parameters into C++ parameters

// TODO

// Compute the result

// TODO

// Convert the result into nu::variant_t object

// TODO

return result;

}

Calculate the result and convert it into nu::variant_t which is how internally the result is represented.

At this point, in absence of exceptions, we will have two variant objects, the input vectors, and two respective sizes: size_v1 and size_v2.

We need to transform them into std::vector<double>, and we can proceed as follows:

std::vector<double> v1(size_v1);

bool ok = variant_v1.copy_vector_content(v1);

rt_error_code_t::get_instance().throw_if(

!ok, 0, rt_error_code_t::E_INV_VECT_SIZE, args[0]->name());

std::vector<double> v2(size_v2);

ok = variant_v2.copy_vector_content(v2);

rt_error_code_t::get_instance().throw_if(

!ok, 0, rt_error_code_t::E_INV_VECT_SIZE, args[1]->name());

v1 and v2 are C++ vectors of double (initialised to 0).

The method copy_vector_content() copies the content of the variant into a vector of double-precision floating-point numbers.

If the conversion is impossible the method fails and a run-time error is generated as a result.

If we don't have any error at this point, we can call the C++ function conv which does the actual work:

auto vr = conv(v1, v2);

Then we can convert the result into a variant (internally typed as a vector of Double):

nu::variant_t result(std::move(vr));

Finally, we can return the result:

return result;

Putting it all together:

variant_t conv_functor(
    rt_prog_ctx_t& ctx,
    const std::string& name,
    const nu::func_args_t& args)
{
    // Get number of arguments
    const auto args_num = args.size();

    // Validate the number of arguments
    rt_error_code_t::get_instance().throw_if(
        args_num != 4 && args_num != 2, 0, rt_error_code_t::E_INVALID_ARGS, "");

    // Process and validate the arguments
    auto variant_v1 = args[0]->eval(ctx);
    auto variant_v2 = args[1]->eval(ctx);

    const auto actual_v1_size = variant_v1.vector_size();
    const auto actual_v2_size = variant_v2.vector_size();

    const size_t size_v1 =
        args_num == 4 ? size_t(args[2]->eval(ctx).to_long64()) : actual_v1_size;

    const size_t size_v2 =
        args_num == 4 ? size_t(args[3]->eval(ctx).to_long64()) : actual_v2_size;

    rt_error_code_t::get_instance().throw_if(
        size_v1 > actual_v1_size || size_v1 < 1, 0,
        rt_error_code_t::E_INV_VECT_SIZE, args[0]->name());

    rt_error_code_t::get_instance().throw_if(
        size_v2 > actual_v2_size || size_v2 < 1, 0,
        rt_error_code_t::E_INV_VECT_SIZE, args[1]->name());

    // Convert the input parameters into C++ parameters
    std::vector<double> v1(size_v1);
    std::vector<double> v2(size_v2);

    bool ok = variant_v1.copy_vector_content(v1);

    rt_error_code_t::get_instance().throw_if(
        !ok, 0, rt_error_code_t::E_INV_VECT_SIZE, args[0]->name());

    ok = variant_v2.copy_vector_content(v2);

    rt_error_code_t::get_instance().throw_if(
        !ok, 0, rt_error_code_t::E_INV_VECT_SIZE, args[1]->name());

    // Compute the result
    auto vr = conv(v1, v2);

    // Convert the result into nu::variant_t object
    nu::variant_t result(std::move(vr));

    return result;
}

We still have to register the function by inserting the functor into the fmap defined inside the function global_function_tbl_t::get_instance(), in the file lib/nu_global_function_tbl.cc:

fmap["conv"] = conv_functor;

To complete our integration we can extend the console inline help and the list of reserved words.

To extend the inline help we need to add a specific entry to the _help_content collection, as follows:

static help_content_t _help_content[] = {

// ...

{ lang_item_t::FUNCTION, "conv",

"Returns a vector of Double as the result of the convolution of 2 given vectors of numbers",

"Conv( v1, v2 [, count1, count2 ] )" },

// ...

};

That allows the help and apropos commands of the nuBASIC console to provide information about the new function conv.

The final step is to extend the list of reserved words by modifying the file nu_reserved_keywords.cc, which defines the set of strings named reserved_keywords_t::list:

const std::set<std::string> reserved_keywords_t::list = {

// ...

"conv",

//

};

nuBASIC IDE

nuBASIC IDE (Integrated Development Environment, for Windows and Linux/GTK+) includes a syntax-highlighting editor and a debugger. It can be used as an alternative to the inline editor of the CLI.

The IDE provides comprehensive facilities for software development, such as syntax highlighting, which is the ability to recognize keywords and display them in different colors.

The debugger lets you place breakpoints in your source code, add field watches, step through your code, step into procedures, take snapshots and monitor execution as it occurs.

This is particularly useful when writing non-trivial programs.