Ameba Essence

"Amoebas at the start were not complex -They tore themselves apart and started sex."- Arthur Guiterman

This article is about a system called Ameba, which is a programmable calculator with a metalanguage (see: "http://en.wikipedia.org/wiki/Metalanguage") that is context-sensitive (see: "http://en.wikipedia.org/wiki/Context-sensitive_language"), extensible (see: "http://en.wikipedia.org/wiki/Extensible"), homoiconic (see: "http://en.wikipedia.org/wiki/Homoiconic"), and expression-based. Ameba operators (i.e., built-ins and macros) combine lexical scanning, syntax analysis, and semantic processing. It does not have monolithic processes for lexical scanning and syntax analysis; they are distributed among the operators.

The Ameba calculator has an indirect accumulator and a memory organized as a symbol table, called the Directory, which is a vector of Entries. Each Entry contains a Symbol and a Definition, and is identified by a subscript, called a Code, that is used as an instruction code by the Ameba interpreter. Symbols are variable-length strings that contain Codes (i.e., Code strings). Definitions can be Code strings or other data. Entries, Symbols, Definitions and Codes can be read and written using built-ins; thus, Ameba is extensible and homoiconic.

Given C is a Code of an Entry, E, and S is the Symbol contained in E, then C and S are aliases and interchangeable within a Code string. Both lexical scanning and syntax analysis make Symbols by combining Codes. Lexical analysis only works with ASCII characters to make lexemes. The characters occupy the first 128 Entries. Syntax analysis makes expression Symbols from all Codes, but mainly lexemes and expressions.
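
The Directory described above can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the names Entry, Directory, code_of and define are hypothetical, Symbols are held as plain Python strings rather than Code strings, and None stands in for a missing Definition.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Entry:
    symbol: str              # the Symbol (a plain string in this sketch)
    definition: Any = None   # the Definition (None = no Definition)


class Directory:
    """Hypothetical model of the Ameba Directory: a vector of Entries."""

    def __init__(self) -> None:
        # The first 128 Entries hold the ASCII characters, so that
        # character code == Entry subscript == Code.
        self.entries = [Entry(chr(c)) for c in range(128)]
        self._index = {e.symbol: code for code, e in enumerate(self.entries)}

    def code_of(self, symbol: str) -> Optional[int]:
        """Convert a Symbol into its Code, or None if it is not defined."""
        return self._index.get(symbol)

    def define(self, symbol: str, definition: Any = None) -> int:
        """Create (or update) an Entry and return its Code."""
        code = self.code_of(symbol)
        if code is None:
            code = len(self.entries)
            self.entries.append(Entry(symbol, definition))
            self._index[symbol] = code
        else:
            self.entries[code].definition = definition
        return code


directory = Directory()
five = directory.define("five", 5)   # first Entry after the ASCII characters
```

In this sketch the Code of "A" is 65, the same as its ASCII character code, and the first user Entry receives Code 128.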

Each ASCII character code equals the subscript of the Entry for that character; thus, each ASCII character code is also a Code. Each ASCII character has a built-in Definition that is called by the interpreter when it encounters the character in source code. The Ameba metalanguage is described by the following sets:

Ƨ = (Σ ∪ l ∪ e)               /* is Symbols in Entries, where */
Σ = {\x00,\x01,\x02,...,\x7F} /* the ASCII character set are the terminal symbols */
/* NOTE: ASCII character code = Entry subscript (i.e., Code) for all characters! */
E ⊆ Σ*                        /* expressions are ASCII strings (Kleene star) */
L ⊆ E                         /* lexemes are ASCII strings */
l ⊆ L                         /* l are lexemes defined by Ameba developers and users. */
e ⊆ E                         /* e are expressions defined by Ameba developers and users */
/* ASCII NUL = empty ASCIIZ string (""). */

A call to a character built-in constructs a lexeme from the characters that follow it in the source code. Lexemes are character strings that are stored as the Symbol of an Entry, and they may or may not have a Definition; for example, a comment does not have a Definition. A string of lexemes can be data or an expression, but in general an expression can contain subexpressions, lexemes or characters.

Expressions are written in an infix style and are composed of operators and literals. Infix operators combine reverse Polish notation (like Forth) and Polish notation (like Lisp with fewer parentheses). Thus, preceding arguments are on the accumulator stack and succeeding arguments follow the operator in the call-expression. An operator may use preceding arguments only, succeeding arguments only, both preceding and succeeding arguments, or no arguments.

In Ameba, an operator that uses N arguments does not need parentheses for scoping, since they would be redundant information. For example, the add operator uses one preceding and one succeeding argument; thus, parentheses are redundant. However, the number N may be variable, because it can depend on run time state. If an operator cannot otherwise determine N, parentheses scoping is necessary. Each operator can determine whether it needs parentheses scoping or not. In addition, other Symbols can be used for argument scoping, for example if_then_else_fi.

Entries are productions that associate a Symbol with a Definition. A Symbol may be a terminal, a non-terminal, or a string of terminals and non-terminals. And a Definition may be a built-in, a non-terminal, or a string of terminals and non-terminals. The list below characterizes typical Entries, and uses the abbreviations NT for non-terminal and T for terminal:

   Symbol            Definition
   (T)  character    a built-in
   (NT) whitespace   void
   (NT) comment      void
   (NT) literal      data (i.e., binary integer or quoted string)
   (NT) identifier   vector, array or structure
   (NT) identifier   alias to another Symbol that is a literal or identifier
   (NT) identifier   alias to another Symbol that is an expression or object
   (NT) expression   alias to another Symbol that is an optimized expression

In the Author's opinion, Kuroda Normal Form (see: "http://en.wikipedia.org/wiki/Kuroda_normal_form") describes Ameba productions. This opinion should be verified by an expert. In any case, Ameba syntax for operators can depend both on nearby Symbols and on run time state. Of course, context-free operators can also be defined.

The information provided herein is not theoretical; context-sensitive languages are already known to exist. This article is about making a practical language, with a style that is easy to implement and understand, yet extensible and powerful. It describes the functions and features that are necessary to convey the essence of Ameba, and omits many things that are required for a full language implementation but are not necessary to understand the essence.

Additional Details

Paragraphs below discuss types, memory, classes, defines, lexemes, operators, expressions and arguments. The types, classes and operators are limited to a few of each, to minimize the size of this article. The types are void and class. The classes include integer and string. The salient operators are substring, append, add, subtract, equal, greater, conditional and define. Ameba operators also process arguments, lexemes and syntax, which in most languages are hidden in the interpreter or compiler.

Types

There are two types, void and class. Types derived from void are built-in and plug-in. All types are built-in; thus, a type cannot be defined in Ameba. But, classes can be coded using Ameba. The two kinds of built-in types are the following:

The type void has subtypes built-in and plug-in. The type void built-in requires an argument value; it sinks the value, and returns nothing.
The type class is used to define other classes. It has two built-in classes, integer and string. There may be any number of user defined classes.

Memory

Ameba memory is allocated when Symbols and Definitions are created, reallocated when they are changed, and deallocated when a Symbol or Definition is deleted. Both Symbol and Definition are properties of an Entry, which are accessed via its Code (subscript). Alternatively, an Entry can be accessed via its Symbol, which is converted into a Code. The syntax for accessing Symbol and Definition is described in Extended Backus-Naur Form (EBNF, see: "http://en.wikipedia.org/wiki/Ebnf") as follows:

   “Symbol.”, ?a Symbol? (* accesses the Symbol memory fragment *)
   “Definition.”, ?a Symbol? (* accesses the Definition memory fragment *)

The example below illustrates using Symbol.<symbol> and Definition.<symbol>; it also illustrates an effect of evaluating a value in the define !(Y=(41)).

   !(X=41) /* define of X as Symbol.41 */
   !(Y=(41)) /* define of Y as Definition.41 */
   Symbol.X = X /* use of Symbol. */
   Symbol.Y = Y
   Definition.X = alias 41 /* use of Definition. */
   Definition.Y = integer 41 
   Symbol.41 = 41 /* 41 is a literal Symbol */
   Definition.41 = integer 41 /* a binary number */
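
One way to read the example above is that parentheses on the right side of a define force evaluation, while a bare Symbol is stored as an alias. The Python sketch below models only that reading; the dict-based directory, the tuple tags "alias" and "integer", and the function define are hypothetical stand-ins, not part of Ameba.

```python
# Hypothetical Directory as a dict: Symbol -> Definition.
# ("alias", S) marks an alias to Symbol S; ("integer", n) marks a binary integer.
directory = {"41": ("integer", 41)}   # the literal Symbol 41 is self defining


def define(symbol: str, right: str) -> None:
    """Model !(symbol=right): parentheses force evaluation of the right side."""
    if right.startswith("(") and right.endswith(")"):
        # Evaluated: copy the Definition of the inner Symbol.
        directory[symbol] = directory[right[1:-1]]
    else:
        # Not evaluated: store an alias to the Symbol itself.
        directory[symbol] = ("alias", right)


define("X", "41")     # !(X=41)   -> Definition.X = alias 41
define("Y", "(41)")   # !(Y=(41)) -> Definition.Y = integer 41
```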

A memory fragment is an allocation of memory that varies in length depending on the amount needed for a given Symbol or Definition. Symbols always contain strings, but a Definition may contain any kind of data. Refer to Paragraph Substring [l,o,w], below.

Thus far, the size of an integer has not been discussed; it is important because it determines the maximum size of fragments, the maximum number of Entries, and the size of string elements. There are five useful sizes for a Code: 8-bits, 16-bits, 32-bits, 64-bits and variable word length. A large application may require millions of Entries, which requires Codes to be at least 32-bits. Saving space requires variable word length, but that is complicated. A simple implementation can be done using 32-bit integers.
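
The sizing argument can be checked with a little arithmetic. The Python lines below are a sketch; the threshold of one million Entries is an assumed stand-in for "millions of Entries", and the variable names are illustrative.

```python
FIXED_CODE_SIZES = (8, 16, 32, 64)   # bits, the fixed sizes listed above
ENTRIES_NEEDED = 1_000_000           # assumed threshold for a large application

# A Code of n bits can subscript at most 2**n Entries.
max_entries = {bits: 2 ** bits for bits in FIXED_CODE_SIZES}
smallest_fit = min(b for b in FIXED_CODE_SIZES if max_entries[b] >= ENTRIES_NEEDED)

for bits in FIXED_CODE_SIZES:
    print(f"{bits:2d}-bit Codes address up to {max_entries[bits]:,} Entries")
print("smallest listed size that fits:", smallest_fit)
```

A 16-bit Code addresses only 65,536 Entries, so 32 bits is the smallest of the listed fixed sizes that reaches the millions.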

Although it needs improvement, The Design of An Object Base (see: "http://computer.wikia.com/wiki/The_Design_of_An_Object_Base") discusses variable word length. The variable word length is too complex to discuss in this document. There are 16-bit character sets; thus, assume integers are 32-bits.

Classes

The two built-in classes are integer and vector. A class derived from vector is string. Code strings can be processed as trees, because each Code is an alias for a Symbol that is itself a Code string. Definitions can be Code strings, but a Definition Code string should be converted into another Symbol (S); whereupon, the Definition can be replaced with an alias to S. Code strings are complex structures comparable to Lisp lists. The classes for this initial version of Ameba are the following:

The class integer, abbreviated int, has derived classes character (abbr. char) and code.
The class vector has derived classes, including strings (abbr. str), arrays, and structures (abbr. struct).

Ameba Symbols are made from classes char, code and str. Thus, they are mentioned in this document more than the others.

See the Paragraph Defines below for an example class definition.

Integer

The class integer uses built-in Definitions for lexical scanning, syntax and semantics, and its Entries are created by initialization code in the interpreter. The Symbols defined for this class include characters, lexemes and methods as follows:

The integer class Symbols are the following: integer and int
The lexical scanning Symbols are the following: `-, `0, `1, `2, `3, `4, `5, `6, `7, `8, and `9.
The integer methods, add, subtract, equals and greater, are Symbols as follows: +, -, =, and >.

Note the Symbol 0 is a string (i.e., lexeme) and the Symbol `0 is a character. The built-in Definitions of the digits (character Symbols) convert a string of digits (i.e., a lexeme) into a binary integer value, and make a Symbol for the lexeme and a Definition for the integer value. For example, the integer literals 0 and 00 have equal binary values, but the Symbol `0 has a built-in Definition that does lexical scanning.
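
The work done by a digit built-in can be sketched in Python as follows. This is an illustration, not the Ameba implementation: the function scan_integer and the dict used as a Directory are hypothetical, and the sketch handles unsigned digit strings only.

```python
DIGITS = "0123456789"


def scan_integer(source: str, pos: int, directory: dict) -> int:
    """Scan the run of digits that starts at source[pos].

    Stores the lexeme as a Symbol with its binary value as the Definition
    and returns the position just past the lexeme.
    """
    end = pos + 1                      # source[pos] is the digit that triggered the call
    while end < len(source) and source[end] in DIGITS:
        end += 1
    lexeme = source[pos:end]
    directory[lexeme] = int(lexeme)    # Symbol -> binary integer Definition
    return end


directory = {}
text = "0 00 223"
after_first = scan_integer(text, 0, directory)
scan_integer(text, 2, directory)
scan_integer(text, 5, directory)
```

The Symbols 0 and 00 become separate Entries in this sketch, yet both carry the same binary Definition.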

Char

The class char uses built-in Definitions for lexical scanning, syntax and semantics, and its Entries are created by initialization code in the interpreter. The Symbols defined for this class include characters, lexemes and methods as follows:

The character class Symbols are the following: character and char.
The lexical scanning Symbol is the following: ``.
The character methods, append, equals and greater, are Symbols as follows: +, =, and >. The accumulator may contain a string, a Code or character. Append char appends a succeeding character to the accumulator, with the result being a string. The equals and greater operators test an accumulator character with a succeeding argument character.

Code

The class code uses built-in Definitions for lexical scanning, syntax and semantics, and its Entries are created by initialization code in the interpreter. The Symbols defined for this class include characters, lexemes and methods as follows:

The code class Symbol is the following: code.
The lexical scanning Symbol for a Code is the exclamation mark, which must be followed by a number (i.e., !N), where N is a decimal integer (e.g., !471) or a hexadecimal number (e.g., !x038F).
The code methods, append, equals and greater, are Symbols as follows: +, =, and >. Append code appends a succeeding Code to the accumulator, with the result being a string. The equals and greater operators test an accumulator Code with a succeeding argument Code.

Vector

The class vector uses built-in Definitions for syntax and semantics, and its Entries are created by initialization code in the interpreter. There is no literal for a vector; thus, no lexical scanning. The Symbols defined for this class include characters and methods as follows:

The vector class Symbol is the following: vector.
The vector methods, add, subtract, equals and substring, are Symbols as follows: +, -, = and [l,o,w]. Add and subtract vector take the sum or difference of two vectors, element by element. The equals operator tests two vectors for equality, element by element, one in the accumulator and one succeeding argument. The substring operator selects part of a vector with length l offset o, and width w; see the paragraph titled Substring, below.
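
The element-by-element vector methods can be sketched in Python with lists standing in for vectors. The function names are hypothetical, and the requirement that both vectors have equal length for add and subtract is an assumption of this sketch.

```python
def _same_length(a, b):
    # Assumption of this sketch: add and subtract need equal lengths.
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")


def vector_add(a, b):
    """Element-by-element sum of two vectors."""
    _same_length(a, b)
    return [x + y for x, y in zip(a, b)]


def vector_subtract(a, b):
    """Element-by-element difference of two vectors."""
    _same_length(a, b)
    return [x - y for x, y in zip(a, b)]


def vector_equal(a, b):
    """True when both vectors have the same length and equal elements."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))
```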

String

The class string uses built-in Definitions for lexical scanning, syntax and semantics, and its Entries are created by initialization code in the interpreter. The Symbols defined for this class include characters, lexemes and methods as follows:

The string class Symbol is the following: string.
The lexical scanning Symbol is the quote character, which must be terminated by a quote character. The lexeme may contain 0 or more characters between the two quote characters.
The string methods, append, equals, greater and substring (see: Paragraph Substring below), are Symbols as follows: +, =, > and [l,o,w]. Append string appends a succeeding string to the accumulator, with the result being a string. The equals and greater operators test an accumulator string with a succeeding argument string. The substring operator selects part of a string with length l, offset o, and width w; see the paragraph titled Substring, below.

Array

The class array is defined using Ameba without special built-ins. The Symbols defined for this class include characters, lexemes and methods as follows:

The array class Symbol is the following: array[x,y], where different values for x and y make different array classes.
The array methods, add, subtract, equals and substring, are Symbols as follows: +, -, = and [l,o,w]. Add and subtract array take the sum or difference of two arrays, element by element. The equals operator tests two arrays for equality, element by element, one in the accumulator and one succeeding argument. The substring operator selects part of an array with length l, offset o, and width w; see the paragraph titled Substring, below.

A strategy for implementing operators such as +, - and = to operate on arrays requires changing the name of the built-in operators and redefining them to work with arrays and built-ins. And, a general overloading facility should be developed to simplify overloading operators for other classes. Subsequently, the overloading facility can be used to overload = for struct.

Struct

The class struct is defined using Ameba without special built-ins. The Symbols defined for this class include characters and methods as follows:

The struct class Symbol is the following: struct., where each struct x is a different class struct. Elements of a struct may be integer, vector, and string. Integers are specified by a word length of 1, 2, 4 or 8 bytes.
The struct methods, equals and substring, are Symbols as follows: = and [l,o,w]. The equals operator tests two structs for equality, element by element, one in the accumulator and one succeeding argument. The substring operator selects part of a struct with length l offset o, and width w; see the paragraph titled Substring, below.

Defines

Evaluating a define creates an Entry. An Entry has two parts, a Symbol and Definition, because Ameba defines have the same two parts. The EBNF that describes a define syntax follows:

   define = “!(“, ?a Symbol?, “=”, ?a Definition?, “)”

A few annotated examples follow:

   !(five = 5) /* is an alias since the right is a Symbol instead of an expression. */
   !(++ = +1) /* is an operator that adds one to the accumulator */ 
   !(foo = 
      !(bar = 1) /* a property */
      !(plus = +) /* a method */
      class)
   foo X /* make an instance of foo named X */
   X.bar plus 1 /* add 1 to bar in the instance X */
   delete X.bar /* delete the bar property in X */
   delete X /* delete the object X and all its contents */ 

Symbols may contain any Code, including ASCII characters except NUL. The common Symbols, a subset of Symbols, are made from printable ASCII characters except “=”, and may contain spaces but cannot begin or end with a space. The remaining Symbols, called forced Symbols, are made of any Code, including ASCII characters except NUL. However, forced Symbols require use of a forcing character, “\”, to specify Codes, non-printable ASCII characters and “=”.

Both Codes and non-printable ASCII characters must be coded as either decimal or hexadecimal numbers; for example, \32 and \x20 are both the space character. The “=” must be preceded by the forcing character; in other words, “\=” is a forced “=”. The forcing sequences specified in EBNF are the following:

   “\”, ?digits?
   “\x”, ?hexadecimal digits?
   “\=” (* a forced equals *)
   “\\” (* a forced forcing character *)

Whenever the “\” character occurs in a Symbol, it is a common printable character, except when it occurs in a forcing sequence. To overcome a forcing sequence (e.g., to make the three-character Symbol “\32” rather than a space), use “\\”. Thus, “\\32” is the forced Symbol “\32” that contains three characters.
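
The forcing sequences can be decoded with a few lines of Python. This is a sketch under assumptions: the function name decode_symbol is hypothetical, digit runs are taken greedily (the article does not say where a number ends), and the exclusion of NUL is not enforced.

```python
import re

# One alternative per forcing sequence from the EBNF above.
# Assumption: digit runs are matched greedily.
FORCING = re.compile(r"\\(?:x([0-9A-Fa-f]+)|([0-9]+)|(=)|(\\))")


def decode_symbol(text: str) -> list[int]:
    """Translate a written Symbol into a list of Codes."""
    codes = []
    pos = 0
    while pos < len(text):
        match = FORCING.match(text, pos)
        if match is None:
            codes.append(ord(text[pos]))   # ordinary character, a lone "\" included
            pos += 1
            continue
        hex_digits, dec_digits, equals, backslash = match.groups()
        if hex_digits is not None:
            codes.append(int(hex_digits, 16))
        elif dec_digits is not None:
            codes.append(int(dec_digits))
        elif equals is not None:
            codes.append(ord("="))
        else:
            codes.append(ord("\\"))
        pos = match.end()
    return codes
```

With this sketch, both \32 and \x20 decode to the single Code 32, while \\32 decodes to the three Codes for the characters "\", "3" and "2".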

Defined Symbols are called identifiers.

Use recursion for loops.

Lexemes

Lexemes are Symbols made of character strings. There are literals, whitespace, comments and identifiers. The paragraph titled Defines describes how identifiers are created. This paragraph describes the remaining lexemes, which are self defining.

Literals are self defining lexemes, because a literal Symbol has enough information to calculate a value for its Definition. For example, the literal Symbol 223 can be converted into a binary integer Definition value 223. Literals are either integers or quoted character strings. For example the following:

   1944
   “December”

Whitespace and comments are self defining lexemes with a void Definition. Ameba does not throw away anything from a source program, because it may be important. For example, Python uses indenting to indicate scope, and specially formatted comments can be processed to produce documents. An example comment follows:

   /* comment */

Lexical scanning uses a context-free (see: "http://en.wikipedia.org/wiki/Context-free_grammar") grammar to identify literals, whitespace, and comments. However, it is possible for Ameba identifiers to create a language ambiguity. Thus, identifiers cannot be described by a context-free grammar. To disambiguate, Ameba searches for identifiers in the set of defined Symbols, and scans for self defining lexemes using a context-free grammar. If both the context-free scan and the search return a lexeme, maximal munch (see: "http://en.wikipedia.org/wiki/Maximal_munch") is used to select one of the two; in other words, the longest lexeme is selected.
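
The disambiguation step can be sketched in Python. The regular expression is an illustrative stand-in for the context-free scan of self defining lexemes, the set identifiers stands in for the defined Symbols, and the name next_lexeme is hypothetical; falling back to a single character when nothing matches is also an assumption.

```python
import re

# Illustrative patterns for self defining lexemes:
# integers, quoted strings, whitespace and /* comments */.
SELF_DEFINING = re.compile(r'[0-9]+|"[^"]*"|[ \t\n]+|/\*.*?\*/', re.DOTALL)


def next_lexeme(source: str, pos: int, identifiers: set[str]) -> str:
    """Return the longest lexeme starting at source[pos] (maximal munch)."""
    candidates = []
    match = SELF_DEFINING.match(source, pos)
    if match:
        candidates.append(match.group())
    # Search the defined Symbols for every identifier that matches here.
    candidates.extend(s for s in identifiers if source.startswith(s, pos))
    if not candidates:
        return source[pos]   # assumed fallback: a single character
    return max(candidates, key=len)


identifiers = {"five", "2nd", "++"}
```

For the source text 2nd place, the scan proposes the literal 2 and the search proposes the identifier 2nd; the longer lexeme, 2nd, is selected.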

Operators

The salient operators discussed in this article are define, add, append, subtract, equal, greater, substring, and conditional. The define operator has already been discussed. The others are discussed in subordinate paragraphs below.

Add +

The add operator (i.e., +) takes an integer succeeding argument that it adds to an integer accumulator.

Append +

The append operator (i.e., +) takes a string succeeding argument that it appends to a string accumulator.

Operator +

The operator + either adds or appends depending on its two arguments, which are the accumulator value and a succeeding value. If both arguments are integer, + adds the two integers and stores the result in the accumulator. If both arguments are string, + appends the succeeding argument to the accumulator and stores the result in the accumulator. If the two arguments are not both integers or strings, + posts an error.

A handy programmer interface for overloading operators based on type is not specified herein because it is not in the scope of this document. Assume the operator + contains conditional code to implement either string or integer operation.
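
The conditional code assumed for the operator + can be sketched in Python, with Python int and str standing in for the Ameba classes integer and string. The function name plus is hypothetical, and raising TypeError stands in for posting an error.

```python
def plus(accumulator, succeeding):
    """Add two integers or append two strings; anything else is an error."""
    if isinstance(accumulator, int) and isinstance(succeeding, int):
        return accumulator + succeeding    # add
    if isinstance(accumulator, str) and isinstance(succeeding, str):
        return accumulator + succeeding    # append
    raise TypeError("+ needs two integers or two strings")
```

The returned value is what would be stored back in the accumulator.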

Subtract -

The subtract operator (i.e., -) subtracts a succeeding argument from the accumulator and stores the result in the accumulator.

Equal =

The operator = compares two arguments, accumulator value and succeeding value, which may be either integer or string. If the two values are equal, = sets a variable named Compare Status (abbr. CS) to -1 (true); otherwise, = sets CS to 0.

Greater >

The operator > compares two arguments, accumulator value and succeeding value, which may be either integer or string. If the accumulator is greater than the succeeding argument, > sets a variable named Compare Status (abbr. CS) to -1 (true); otherwise, > sets CS to 0.
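
The effect of = and > on Compare Status can be sketched in Python. The class Calculator and its method names are hypothetical; only the values -1 and 0 come from the text above.

```python
TRUE, FALSE = -1, 0   # the Compare Status values given above


class Calculator:
    """Hypothetical holder for the accumulator and Compare Status (CS)."""

    def __init__(self, accumulator):
        self.accumulator = accumulator
        self.cs = FALSE

    def equal(self, succeeding):
        """Model of the operator =."""
        self.cs = TRUE if self.accumulator == succeeding else FALSE

    def greater(self, succeeding):
        """Model of the operator >."""
        self.cs = TRUE if self.accumulator > succeeding else FALSE


calc = Calculator(5)
calc.greater(3)
cs_after_greater = calc.cs   # 5 > 3 sets CS to -1
calc.equal(3)
cs_after_equal = calc.cs     # 5 = 3 sets CS to 0
```

Neither operator changes the accumulator in this sketch; only CS is updated.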

Substring[l,o,w]

The substring operator has the most complex syntax and semantics of the salient operators. It accesses parts of vectors, strings, arrays, and structures, and it can transpose data for example from big to little endian.

There are two substring operators, symbol[l,o,w] (abbreviation sym[l,o,w]) and definition[l,o,w] (abbreviation def[l,o,w]).

Conditional if_then_else_fi

The Ameba conditional is standard if-then-else, closed with fi to minimize parentheses. Using special scoping syntax for different purposes makes source code a little easier to read compared to only parentheses. A variety of conditional statements follow:

   if =0 then 1 fi
   if =1 else 0 fi
   if =0 then 1 else 0 fi
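
The three forms above can be modeled in Python under two assumptions that the article does not state outright: that if tests the Compare Status left by the preceding comparison, and that a literal in a branch loads the accumulator. The function names compare_equal and if_fi are hypothetical.

```python
def compare_equal(accumulator, value):
    """Return Compare Status: -1 (true) if equal, otherwise 0."""
    return -1 if accumulator == value else 0


def if_fi(cs, accumulator, then_value=None, else_value=None):
    """Model if ... then ... else ... fi; a missing branch is None.

    Returns the new accumulator value. A missing branch leaves the
    accumulator unchanged (assumption of this sketch).
    """
    if cs != 0:
        return accumulator if then_value is None else then_value
    return accumulator if else_value is None else else_value


# if =0 then 1 fi, with 0 in the accumulator
only_then = if_fi(compare_equal(0, 0), 0, then_value=1)
# if =1 else 0 fi, with 7 in the accumulator
only_else = if_fi(compare_equal(7, 1), 7, else_value=0)
# if =0 then 1 else 0 fi, with 5 in the accumulator
both = if_fi(compare_equal(5, 0), 5, then_value=1, else_value=0)
```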

Expressions

Ameba is based on expressions and objects. Programs, systems and systems of systems are objects and expressions. Expressions are infix, with preceding arguments on the accumulator stack and succeeding arguments in code following an operator. Thus, a programmer may make operators infix, prefix or suffix.

Expressions evaluate left to right, except for scoping syntax such as parentheses. The accumulator, which is the top of the accumulator stack, is used during expression evaluation as a source of preceding arguments and to store the result of evaluating an expression.

The indirect accumulator is named A, which means A is an alias for another Symbol (e.g., X), a temporary accumulator. In other words, A references the accumulator X. The indirect accumulator A can also reference other Symbols (e.g., Y). To change the Symbol that A references, just mention the new reference Symbol in an expression. Examples illustrate use of the accumulator as follows:

   Y+1 /* which is like math notation Y=Y+1, since answers are stored in the new accumulator Y. */
   +Z /* does not change the accumulator because Z is an argument of +. */ 
   Y 1 /* is like Y=1, because entering a literal into a calculator loads the accumulator */

Assignments are implicit, as shown in the example above (Y 1). Thus, there is no need for either an assignment or a store operator. Since Ameba is a metalanguage, it would be possible to make a production that translates “Y=1” into “Y 1”, “Y=Y+1” into “Y+1”, etc.
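
The behavior of the indirect accumulator in these examples can be sketched in Python. The sketch works on pre-split tokens, knows only the operator +, and uses a dict named memory for Symbol values; the function names evaluate and value_of are hypothetical.

```python
def value_of(token, memory):
    """Value of a literal token or of a Symbol held in memory."""
    return int(token) if token.isdecimal() else memory[token]


def evaluate(tokens, memory, a=None):
    """Evaluate tokens left to right; a is the Symbol that A references.

    Returns the Symbol that A references when evaluation ends.
    """
    tokens = iter(tokens)
    for token in tokens:
        if token == "+":
            argument = next(tokens)           # succeeding argument of +
            memory[a] += value_of(argument, memory)
        elif token.isdecimal():
            memory[a] = int(token)            # a literal loads the accumulator
        else:
            a = token                         # mentioning a Symbol retargets A
    return a


memory = {"Y": 1, "Z": 10}
a = evaluate(["Y", "+", "1"], memory)         # Y+1
after_add = memory["Y"]
a = evaluate(["+", "Z"], memory, a)           # +Z, A still references Y
after_add_z = memory["Y"]
a = evaluate(["Y", "1"], memory, a)           # Y 1
```

Z is consumed as an argument of + and so never becomes the accumulator, while the final line stores 1 in Y without any assignment operator.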

Symbols contain objects; thus, A references an object. And, expressions that affect A will contain methods (operators) of the object.

The operators push and pop operate on the accumulator and stack, in typical fashion. Although, since the accumulator is an object, an instance of that object must be created by push and destroyed by pop.

Arguments

Arguments are gotten rather than being passed via a lambda expression.

There are two ways to get an argument, as a value or as a Code; although, getting a value also returns the Code of the Symbol that contains the value. There are no free values; they are bound to Symbols, even if it is necessary to create a temporary Symbol.

There are two positions for an argument, preceding or succeeding an operator. Arguments following an operator have not been processed semantically, but arguments preceding the operator have been processed semantically, and are either in the accumulator or on its stack.

Thus, there are four operators to get arguments, as follows:

The getNextCode (abbreviated gNC) operator has two meanings, depending on whether the current Code is a character code or otherwise. If the current code is a character, then Ameba is doing lexical scanning, and gNC returns character codes or NUL when it cannot get a character. Otherwise, Ameba is doing syntax analysis and semantic processing, and gNC gets the next Code.

The getNextValue (abbreviated gNV) operator evaluates the next Code to get a value.

The getPriorCode (abbreviated gPC) operator gets a Code, since lexical scanning, syntax analysis and semantic processing have already processed values in the accumulator or on its stack. If the Code is used, the accumulator should be popped.

The getPriorValue (abbreviated gPV) operator gets a value and Code from the accumulator. If the value is used, the accumulator should be popped.

As gNC and gNV get an argument, they increase the return address to always point past the last argument. Instead of peeking at the next argument, save the return address and restore it.
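
The handling of the return address can be sketched in Python. The class CodeStream and its method names are hypothetical, evaluation in gNV is reduced to a dictionary lookup, and returning 0 (NUL) at the end of the Code string in every mode is an assumption of this sketch.

```python
class CodeStream:
    """Hypothetical view of a Code string being interpreted."""

    def __init__(self, codes, definitions):
        self.codes = codes                # the Code string
        self.definitions = definitions    # Code -> value (stand-in Directory)
        self.return_address = 0           # index of the next unread Code

    def get_next_code(self):
        """gNC: return the next Code, or 0 (NUL) when none is left."""
        if self.return_address >= len(self.codes):
            return 0
        code = self.codes[self.return_address]
        self.return_address += 1          # always point past the last argument
        return code

    def get_next_value(self):
        """gNV: evaluate the next Code; here that is a plain lookup."""
        return self.definitions.get(self.get_next_code())

    def peek_code(self):
        """Look ahead by saving and restoring the return address."""
        saved = self.return_address
        code = self.get_next_code()
        self.return_address = saved
        return code


stream = CodeStream([200, 201], {200: 41, 201: "x"})
```

Calling stream.peek_code() leaves the return address where it was, so a following stream.get_next_value() still consumes the same Code.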

Ameba Conclusion

Why design and develop the Ameba system? First, Ameba has features that are not commonly found in computer languages (e.g., context sensitivity). My idea for Ameba has always been as a practical, rather than theoretical, language. Start with a small kernel to implement the calculator and metalanguage, which is extensible via Plug-ins and the reflexive metalanguage. The language internals are exposed and string expressions may be constructed and evaluated at run time. Using these facilities, evolve Ameba.

I, the Author, feel Ameba is a discovery more than an invention. It is the result of asking a question, “What does it take to make a computer language that is easy to start using, yet powerful, extensible, and capable of modeling natural language?” Very slowly parts of the puzzle became clear, and they fit together as snugly as a precision manufactured kit.

I have selected a few identifiers, but they can be changed like a coat of paint. Ameba has neither key words nor reserved punctuation. Codes are nearly immutable (garbage collection can change them), but Symbols are not. This is not a mathematically precise paper; that is beyond my capability. However, I have tried to feature the essence of Ameba with as few design features as possible.

I feel obligated to publish this discovery; although, my writing capabilities are meager. I am dyslexic and have autistic traits, including learning language late in infancy. Preparing this document has been very difficult, and I fear I have not presented my thoughts clearly and concisely. I am not a language expert, merely a self taught amateur. I have omitted much detail, and hope not too much.

Moreover, I am old and ill and do not have the time or energy to implement Ameba as it should be. This article may be the last I publish about Ameba. If my thoughts are clear and strong, someone will be interested enough to work on Ameba. I hope and believe that will be the case.

Computers are finally large and powerful enough for programmers to have super tools. And, there are many exceptional tools. But the last major development in programmer tools was the Integrated Development Environment (IDE). It is time for a paradigm shift to an Integrated Environment (IE). Ameba is a tool that can be configured in endless ways for any computing requirement.

An IE is a computer and its software. Assume an environment in which Ameba is the first language built for a programmable calculator. Electronic engineers can design a multiprocessor terabyte-calculator for the Ameba metalanguage, which would help early software development. ROM would contain the built-ins. As an operating system evolves, Ameba is its scripting language and assembler. The set of Codes define the machine language.

As soon as other languages are coded for the Ameba calculator, applications written in those languages can be compiled for the Ameba machine. The set of Entries contains both machine code and printable programs, forming an object base. The machine can act as a server or client for any application or meta-application, including AI that seems to need a context-free language.

There can be two IDEs for Ameba, one CUI and one GUI—character and graphical user interfaces. Both should be extensible, as Emacs (see: "http://en.wikipedia.org/wiki/Emacs") is extensible. Versions of Entries are possible by mangling Symbols. And, I hope that someone would integrate a regression test system into the IDEs, a part I have felt was missing from the traditional IDE.

Since the set of Entries contains both executable and printable forms of objects, many meta-tools can be made to use these Entries as an object base, including optimizers, data flow diagrams, CASE tools and theorem proving systems.

For this old programmer, Ameba is an archetype for computer and metalanguage that is easy to use as a calculator.

See also: http://lambda-the-ultimate.org/node/4108
