Leonardus
ARCHITECTURE

The coding and architectural documentation of the Leonardus project is split
into two parts: CODING, which covers more general aspects of programming that
could also be applied in other C++ projects, and these project-specific notes.

Manifesto

Development Process

Leonardus is an open source project.

There is an extensive catalog of automated tests. When making design
decisions, the testability of a possible solution is given high priority.

The code and documentation are refactored continuously.

A backlog is maintained with GitLab issue tickets.

No technical debt should be accumulated. Corresponding issue tickets
always have high priority.

When encountering incidental findings (e.g., potential bugs, suboptimal
patterns, or missing edge-case handling) during development, create a GitLab
issue or task tag immediately, even for minor observations. This ensures no
insight is lost and technical debt is preemptively addressed.

All artifacts can be built with a minimal toolset. Currently, these are
the GNU C++ compiler, make, flex, Bash and git together with the libraries
Boost, GMP and quadmath. The project is hosted on GitLab, and the
environment's features like CI/CD are used. However, a build of all
artifacts and tests must always be possible without GitLab.

Documentation is preferably embedded in the code and should be cleanly
extractable with Doxygen. All documentation is included in the git repository.

Code

Readability and clarity of the program source code take precedence over
optimization in the implementation.

The project documentation and the source code are written in English.
For mathematical operators, whose creation is a primary goal of the project,
multilingualism is in the project backlog.

The main branch of the code must always allow a build of the artifacts
at the turn of the day, and these must pass the automated tests.

LeoScript will not have a garbage collection, but a controlled tree of all
existing semantic objects.

The execution of LeoScripts should be as reproducible as possible in terms
of runtime behavior. This means that:

  • The entire system must be explicitly initialized.
  • Any caching must be controllable.
  • No asynchronous processes, such as garbage collection or other cleanup
    tasks, should be executed.

Compatibility Goals

LeoScript is designed to provide a conceptual and structural compatibility
with PostScript and Forth, though it does not aim for full syntactic or
functional equivalence.

PostScript

LeoScript adopts the interpreter architecture of PostScript, but deliberately
excludes all graphics-related functionality. This approach already enables
non-graphical PostScript programs to run under LeoScript with minimal or no
modification. The focus is on the stack-based execution model and the language
semantics for algorithmic and computational tasks.

A further extension — currently in the project backlog — plans to introduce
dummy routines for PostScript’s graphical operations. This would allow
PostScript programs to be syntactically executable in LeoScript, even if
graphical output is not supported.

Forth

LeoScript does not aim for full Forth compatibility. However, it supports the
idiomatic style of Forth programming and allows small, typical Forth
algorithms to run unchanged in LeoScript. This is achieved by leveraging the
Leonardus vocabulary concept, a system extension mechanism that allows
Forth’s "words" to be defined one-to-one in LeoScript.

Build

With git clone and make

IDEA: text

With the leobuild Docker Image

IDEA: text

Semantic Object Types

This is a table of the SOs with details.

Every object is either executable or non-executable (literal).
For name objects and array objects, this status is reflected in
distinct OTCodes.

For simple objects a duplicate of an object duplicates the value of the object.

For composite objects a duplicate of an object shares its value with the
original object.
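
The difference between the two kinds of duplication can be sketched in C++ as follows. The types SimpleInt and CompositeString are hypothetical stand-ins, not the project's SOI and SOS classes; the sketch only assumes that a composite object holds its value through shared ownership.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical simple object: the value lives inside the object,
// so a duplicate gets its own independent copy.
struct SimpleInt {
    long value;
    SimpleInt dup() const { return SimpleInt{value}; }
};

// Hypothetical composite object: the value lives in shared storage,
// so a duplicate refers to the same value as the original.
struct CompositeString {
    std::shared_ptr<std::string> value;
    CompositeString dup() const {
        assert(value != nullptr);  // a composite always owns a value
        return CompositeString{value};
    }
};

// Returns true if changing a duplicate leaves a simple original untouched.
bool simple_dup_is_independent() {
    SimpleInt a{1};
    SimpleInt b = a.dup();
    b.value = 2;
    return a.value == 1;
}

// Returns true if changing a duplicate is visible through a composite original.
bool composite_dup_is_shared() {
    CompositeString s{std::make_shared<std::string>("abc")};
    CompositeString t = s.dup();
    (*t.value)[0] = 'x';
    return *s.value == "xbc";
}
```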

There's an enum class OTCode in the source to list all the codes.

The column type shows the result of the operator type which is used to
identify objects in LeoScript code.

The column class shows the C++ class name of the SOs.

The Table

objects    class  exe status  OTCode  type          description
---------  -----  ----------  ------  ------------  ---------------------------------------------------
simple     SOL                L       nulltype      The null object is used as placeholder in arrays.
           SOB                B       booleantype   A boolean value.
           SOI                I       integertype   A 128-bit integer.
           SOM                M       marktype      A mark object.
           SON    literal     N       nametype      For non-executable name objects.
                  executable  Nx      nametype      For executable name objects.
           SOO                O       operatortype  A regular registered core code operator.
           SOo                o       operatortype  An unregistered core code operator.
           SOR                R       realtype      A 128-bit IEEE real number.
           SOQ                Q       rationaltype  A rational number with arbitrary precision.
composite  SOS                S       stringtype    A string.
           SOD                D       dicttype      A dictionary is a list of key-value pairs of SOs.
           SOA    literal     A       arraytype     For a non-executable array of any SOs in any order.
                  executable  Ax      arraytype     For an executable array aka procedure.
           SOK                K       stacktype     A stack of SOs.

Abbreviations

The system also makes use of abbreviations (pseudo OTCodes) to determine
groups of OTCodes.

abbreviation  description
Z             I or R or Q
X             any SO
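
How such a group check might look in C++ is sketched below. The enum is a hypothetical subset of the project's OTCode with assumed enumerator names and character values (the two-character codes Nx and Ax are left out), and the function matches() is invented for this illustration.

```cpp
#include <cassert>

// Hypothetical subset of the OTCodes; names and underlying values
// are assumptions made for this sketch.
enum class OTCode : char {
    B = 'B',  // booleantype
    I = 'I',  // integertype
    R = 'R',  // realtype
    Q = 'Q',  // rationaltype
    S = 'S'   // stringtype
};

// Returns true if `code` belongs to the group named by `abbreviation`:
// 'Z' stands for I or R or Q, 'X' stands for any SO.
// Any other character is compared directly with the OTCode.
constexpr bool matches(char abbreviation, OTCode code) {
    switch (abbreviation) {
        case 'X': return true;
        case 'Z': return code == OTCode::I || code == OTCode::R || code == OTCode::Q;
        default:  return static_cast<char>(code) == abbreviation;
    }
}

// Usage example.
void matches_example() {
    assert(matches('Z', OTCode::Q));   // a rational is a number
    assert(!matches('Z', OTCode::S));  // a string is not
    assert(matches('X', OTCode::B));   // X accepts everything
}
```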

The Test Approach

See TESTS

Tools Directory

A development tool overview will be generated by the build process
as TOOLS.md.

man Pages

Leonardus integrates the traditional UNIX man page subsystem.
Generated man pages are categorized under section 7 (Miscellaneous
Information Manual).

Doxygen generates man pages for the operators registered in:

  • systemdict,
  • leodict and
  • userdict.

The command lb -C manpage generates man pages for the operators from
vocabularies.

Note

The tool mandb must be accessible in the PATH to update the database for apropos.

The man pages are

  • copied to the Leonardus docker image and
  • they are available for the developer locally after make install.

Build Types

There are three so-called build types. They denote build and, in particular,
compilation settings for different environments.

DEVELOP

Compilation without time-consuming optimizations.
Used to develop and debug the code.

PRODUCTION

Compilation with good optimization. All debug-code is removed.
All asserts are removed.

PROFILE

Compilation without any optimization. No inlining of functions.
Compilation with instrumented code to generate profiling data.
Used to analyze function, line and branch test coverage.
make lcov only succeeds with this build type.

Notes

The GitLab pipeline uses the build types PRODUCTION and PROFILE.

The make utility can be directed by: BUILD_TYPE=PRODUCTION make leon or
as in the GitLab pipeline configuration by: make BUILD_TYPE=PRODUCTION leon
or by: export BUILD_TYPE=PRODUCTION.

If you change the build type, running make btclean is recommended.

The command line tools lc, lb, leon, and parser print the build type
with which they were generated as the first line in their help text.

Program Exit Codes

User Programs

These are: lb, lc, parser, leon.

EC_SUCCESS     = 0 ... Success.
EC_OPERATOR    = 1 ... leon pass: operator error.
EC_INTERPRETER = 2 ... leon pass: parser, semantics or internal error;
                       parser pass: parsing error;
                       Scripts: environment and setup issues.
EC_CMDLINE     = 3 ... Command line issues.
EC_PANIC       = 4 ... leon pass: panic exit due to memory restrictions.
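
As a minimal sketch, the user-program exit codes could be declared in C++ like this. The names and values are taken from the list above; grouping them in a single enum and the helper describe() are assumptions of this example, not necessarily how the project declares them.

```cpp
#include <cassert>
#include <string>

// Exit codes of the user programs (lb, lc, parser, leon).
// Declaring them as one enum is an assumption of this sketch.
enum ExitCode : int {
    EC_SUCCESS     = 0,
    EC_OPERATOR    = 1,
    EC_INTERPRETER = 2,
    EC_CMDLINE     = 3,
    EC_PANIC       = 4
};

// Hypothetical helper that maps an exit code to a short description.
std::string describe(ExitCode ec) {
    switch (ec) {
        case EC_SUCCESS:     return "success";
        case EC_OPERATOR:    return "operator error";
        case EC_INTERPRETER: return "parser, semantics or internal error";
        case EC_CMDLINE:     return "command line issues";
        case EC_PANIC:       return "panic exit due to memory restrictions";
    }
    return "unknown exit code";
}

// Usage example.
void describe_example() {
    assert(describe(EC_CMDLINE) == "command line issues");
}
```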

Test Programs

These are: e2e.sh, ctest.sh.

EC_SUCCESS  = 0 ... Success.
EC_TESTFAIL = 5 ... Errors from the test harness or found by the test harness.

Scripts in Tools/ Directory

EC_SUCCESS = 0 ... Success
EC_ERROR = 1 ... General error

Order of Include Files

The call and usage hierarchy (see hierarchy.drawio) results in the
segmentation of include statements into so-called IncBlocks.
The specific order of these blocks is as follows:

  • Inc Library
  • Inc HAA (for Helper, Adapter, Algorithms)
  • Inc Medium
  • Inc Rich

Release Strategy

Version Numbering Scheme

We use a two-part Version Number consisting of a major and minor
version, separated by a period. We started with "0.9".
Release Numbers are assigned to these version numbers by appending
a sequential number starting from 1. Therefore, the first release number
was 0.9.1. When the version number is increased, the last number is reset
to 1. Consequently, the first release of version 1.0 is 1.0.1.
Release numbers are stored in git as Release Tags prefixed with 'v'.
Thus, the first release tag is "v0.9.1". In git, the first commit intended
for a release is tagged with this release tag.
The last commit in the release cycle receives a Release Completion Tag
in git. This release completion tag in git has the form of the release tag
with the appended text "--release". Therefore, the first release
completion tag is named "v0.9.1--release".
For each release completion tag, a GitLab Release Object, i.e., a
release in the sense of GitLab, is created.
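
The numbering rules above can be expressed as a small C++ sketch. The struct ReleaseNumber and all function names are hypothetical and only illustrate how release numbers, release tags and release completion tags relate to each other.

```cpp
#include <cassert>
#include <string>

// Hypothetical value type for a release number "major.minor.sequence".
struct ReleaseNumber {
    int major_version;
    int minor_version;
    int sequence;  // sequential number, starts at 1 for every version
};

// "0.9.1"
std::string release_number(const ReleaseNumber& r) {
    assert(r.sequence >= 1);
    return std::to_string(r.major_version) + "." +
           std::to_string(r.minor_version) + "." +
           std::to_string(r.sequence);
}

// Release tag: the release number prefixed with 'v', e.g. "v0.9.1".
std::string release_tag(const ReleaseNumber& r) {
    return "v" + release_number(r);
}

// Release completion tag: the release tag with "--release" appended.
std::string completion_tag(const ReleaseNumber& r) {
    return release_tag(r) + "--release";
}

// Next release within the same version: only the last number grows.
ReleaseNumber next_release(const ReleaseNumber& r) {
    return ReleaseNumber{r.major_version, r.minor_version, r.sequence + 1};
}

// First release of a new version: the last number is reset to 1.
ReleaseNumber first_release(int major_version, int minor_version) {
    return ReleaseNumber{major_version, minor_version, 1};
}
```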

CHANGELOG

When a new release is created, a new section is opened in the
CHANGELOG.md for this release. The Release Date of the previous
release is also assigned then. This is the date the respective GitLab
release object was created.
The section of the previous release in the CHANGELOG is manually revised
when a new release is created. The revised content of the CHANGELOG section
for a release serves as the Release Notes.

For each commit, the corresponding commit message should be appended to
the CHANGELOG. This can be done automatically by installing a git hook
from TOOLS.

What is a Release?

A release consists of:

  • a release number
  • a release tag in Git
  • a release completion tag in Git
  • all associated commits
  • a section in the CHANGELOG
  • the revised release notes
  • a GitLab release object
  • a Docker image

Release Change Check List

  • build an up-to-date set of custom Docker images with customdocker.sh
  • check for FIXMEs in the task tags
  • check markdownlint output
  • check tmp/cloc_categorized.txt
  • check SAST artifact-report
  • assess the lcov HTML output
  • update CHANGELOG with release date
  • revise CHANGELOG section to create the release notes
  • last commit of remaining changes
  • git push
  • with current vx.y.z add vx.y.z--release tag to this last commit
  • git push origin tag vx.y.z--release
  • check pipelines in GitLab
  • create a GitLab release object
  • open new section in CHANGELOG with date "open"
  • first commit of new release cycle
  • git push
  • add vx.y.z+1 tag
  • git push origin tag vx.y.z+1
  • unset BUILD_TYPE
  • make clean
  • review make release-info output
  • make e2e
  • review ./leon -h release-info output

makefile and GitLab Integration

The available makefile targets can be displayed by running the
make command without options.

The makefile is triggered from the GitLab pipelines. All process
details are implemented in the makefile and its helper scripts.
Intentionally, no details are coded in the pipeline YAML configuration.

Integration Strategy

Both the makefile and the CI/CD pipeline are structured into the same five stages:

  • build,
  • test,
  • predeploy,
  • deploy, and
  • l2test

The makefile supports two different dependency mechanisms:

  • When used on the shell, the makefile enforces dependencies and builds them
  • When used in the pipeline, the makefile expects the artifacts to be copied
    by the CI/CD pipeline and does not build them itself

Build Performance

The pipeline configuration utilizes parallel processing in GitLab runners by
invoking make -j$(nproc).

To achieve the same locally, you can run:

export MAKEFLAGS="-j$(nproc)"

in-Files

The project uses *.in template files (e.g., HTML, Bash, JSON) with
@PLACEHOLDER@ tokens, replaced at build time via sed. This ensures
version-controlled templates remain environment-agnostic, while generated
files (e.g., file.html, script.sh, config.json) embed runtime-specific values
(paths, versions, etc.). The approach is extensible. No hardcoded values are
committed.
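
The project performs this substitution with sed. Purely to illustrate the token replacement itself, here is a C++ sketch; the function expand_placeholders and the placeholder name VERSION are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Replaces every @NAME@ token in `text` whose NAME is a key of `values`.
// Unknown tokens and stray '@' characters are left untouched.
std::string expand_placeholders(const std::string& text,
                                const std::map<std::string, std::string>& values) {
    std::string out;
    std::size_t pos = 0;
    while (pos < text.size()) {
        const std::size_t open = text.find('@', pos);
        const std::size_t close =
            (open == std::string::npos) ? std::string::npos : text.find('@', open + 1);
        if (close == std::string::npos) {
            out.append(text, pos, std::string::npos);  // no complete token left
            break;
        }
        const auto it = values.find(text.substr(open + 1, close - open - 1));
        if (it != values.end()) {
            out.append(text, pos, open - pos);  // text before the token
            out += it->second;                  // the replacement value
            pos = close + 1;
        } else {
            // Not a known placeholder: keep the text up to the second '@'
            // and rescan from there, since it may open a real token.
            out.append(text, pos, close - pos);
            pos = close;
        }
    }
    return out;
}

// Usage example with a hypothetical VERSION placeholder.
void expand_placeholders_example() {
    const std::map<std::string, std::string> values{{"VERSION", "1.0.1"}};
    assert(expand_placeholders("leon @VERSION@", values) == "leon 1.0.1");
}
```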

Interpreter Pass 2 Execution Model

main() loop

Main interpreter loop

  • Reads a leon-format line and calls Interpreter::leonline().
  • Then processes the execution stack until it is empty:
    • SOOs are executed.
    • Executable SONs call their SON::load_exec().
    • All other SOs are pushed onto the operand stack.
  • Repeat.
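
The processing of the execution stack can be modelled in a few lines of C++. This is a heavily simplified sketch under stated assumptions: Obj, Machine, run() and the helper functions are hypothetical and do not mirror the project's classes, a single map stands in for the dictionary stack, and the name lookup simply pushes the found object back onto the execution stack instead of calling a dedicated method.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Machine;

// Hypothetical, heavily simplified semantic object.
struct Obj {
    enum class Kind { Integer, ExecName, Operator };
    Kind kind = Kind::Integer;
    long number = 0;                     // payload of an Integer
    std::string name;                    // payload of an ExecName
    std::function<void(Machine&)> code;  // payload of an Operator
};

struct Machine {
    std::vector<Obj> operands;           // operand stack
    std::vector<Obj> execution;          // execution stack
    std::map<std::string, Obj> dict;     // stand-in for the dictionary stack
};

Obj make_integer(long n) { Obj o; o.number = n; return o; }
Obj make_name(const std::string& s) {
    Obj o; o.kind = Obj::Kind::ExecName; o.name = s; return o;
}
Obj make_operator(std::function<void(Machine&)> f) {
    Obj o; o.kind = Obj::Kind::Operator; o.code = std::move(f); return o;
}

// Processes the execution stack until it is empty.
void run(Machine& m) {
    while (!m.execution.empty()) {
        Obj top = m.execution.back();
        m.execution.pop_back();
        switch (top.kind) {
            case Obj::Kind::Operator:
                top.code(m);                                 // operators are executed
                break;
            case Obj::Kind::ExecName:
                m.execution.push_back(m.dict.at(top.name));  // simplified lookup
                break;
            default:
                m.operands.push_back(top);                   // all other objects
        }
    }
}

// Evaluates the equivalent of "2 three add" with three bound to 3.
long demo() {
    Machine m;
    m.dict["add"] = make_operator([](Machine& mm) {
        assert(mm.operands.size() >= 2);
        const long b = mm.operands.back().number; mm.operands.pop_back();
        const long a = mm.operands.back().number; mm.operands.pop_back();
        mm.operands.push_back(make_integer(a + b));
    });
    m.dict["three"] = make_integer(3);
    m.operands.push_back(make_integer(2));  // literals go to the operand stack
    // The execution stack is LIFO: "three" is processed before "add".
    m.execution = {make_name("add"), make_name("three")};
    run(m);
    return m.operands.back().number;
}
```

In this model an undefined name makes dict.at() throw std::out_of_range; how the real interpreter reports such errors is not covered here.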

Interpreter::leonline()

Processes one line of the leon-format input.

  • Executable names outside of procedures and {, } are pushed onto
    the execution stack.
  • Everything else is pushed onto the operand stack.

SON::load_exec()

Looks up a name and executes it.

  • Procedures will be unfolded onto the execution stack.
  • Operators will be executed.
  • Executable names will be called recursively.
  • Other SOs will be pushed to the operand stack.

SOO::exec()

Calls the C++ machine code associated with the SOO and SOo.

SOA::unfold2exec()

Unfolds duplicates of the array-content to the execution stack.

operator bind

Replaces executable names with operator objects, recursing into elements
that are SOAs. Further optimization is applied if Interpreter::odo_
is set to true.

operators begin and end

These dictionary stack manipulations influence what is found by
SON::load_exec().

operators exec

The exec operator pushes

onto the execution stack.

loop operators

The loop operators push

onto the execution stack.

Boost Library

Installation Note

The installation of the Boost library in a GitLab pipeline image requires
apt -y install gfortran- libboost-all-dev because of Fortran configuration
issues. The trailing hyphen in gfortran- tells apt to leave that package out.

The following Boost modules are in use:

Boost.Assert

BOOST_ASSERT, BOOST_ASSERT_MSG and BOOST_ASSERT_IS_VOID
are used as building blocks for
DBC_PRE, DBC_POST, DBC_INV, DBC_INV_CTRO, DBC_INV_RAII and DBC_IS_VOID.

Boost.Test

See TESTS

GMP Library

GNU Multiple Precision Arithmetic Library (GMP) is a free library for
arbitrary-precision arithmetic, operating on signed integers, rational
numbers, and floating-point numbers.

In this project, we use the GMP library for the storage and algorithms of
rational numbers, specifically for the SOQ implementation. This is
complemented by an additional layer that provides adapters and
custom algorithms.

Quadmath Library

The quadmath library is a part of the GNU Compiler Collection (GCC) that
provides software implementations of mathematical functions for 128-bit
floating-point numbers (__float128). While GCC supports the __float128
and __int128 data types natively, many standard library functions are not
available for these extended-precision types. The quadmath library fills this
gap by offering optimized implementations of essential mathematical
operations for __float128.

In Leonardus, the decision to use 128-bit floats and integers as fundamental
data types drives the need for quadmath. However, since not all required
functions are available out-of-the-box, the project includes an adapter128
module. This module acts as "glue code," bridging the gap between the native
__float128/__int128 support in GCC and the missing standard library
functionality.
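
One example of such a gap: the standard library offers no std::to_string overload for __int128. The following sketch shows the kind of glue code this calls for. The function to_string128 is hypothetical and not taken from the adapter128 module; it requires GCC or Clang on a target that provides __int128.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical adapter: converts a 128-bit integer to its decimal text.
std::string to_string128(__int128 value) {
    const bool negative = value < 0;
    // Work on the unsigned magnitude so that the most negative value,
    // which has no positive counterpart, is handled as well.
    unsigned __int128 magnitude = negative
        ? -static_cast<unsigned __int128>(value)
        : static_cast<unsigned __int128>(value);
    std::string digits;
    do {
        digits.push_back(static_cast<char>('0' + static_cast<int>(magnitude % 10)));
        magnitude /= 10;
    } while (magnitude != 0);
    if (negative) digits.push_back('-');
    std::reverse(digits.begin(), digits.end());  // digits were collected backwards
    return digits;
}

// Usage example: 2^64 no longer fits into a 64-bit integer.
void to_string128_example() {
    const __int128 two_pow_64 = static_cast<__int128>(1) << 64;
    assert(to_string128(two_pow_64) == "18446744073709551616");
}
```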

GitLab Notes

The following GitLab features are used:

  • Labels to structure issue boards and label technical debt and
    feature branches
  • Issues, Incidents and Issue Boards for planning
  • Milestones for a thematic structuring of tickets
  • CI/CD Pipelines
    • with build, test, predeploy, deploy and l2test stages
    • CI/CD variables
    • artifacts
  • Releases
  • Container Registry to provide the latest Leonardus Docker image
  • Pages for hosting the program documentation
  • Tags to mark the beginning and the end of a release cycle.
  • Components to import ready to run CI/CD steps:
    • to-be-continuous/bash
    • components/sast
    • components/markdownlint
  • Badges as dashboard elements in the GitLab Project main page and on
    the Docker hub repository overview.

Additional features will be integrated over time.

KDevelop

KDevelop creates a *.kdev4 project file and a .kdev4 directory. These files
are integrated into the git repository to share them.

Visual Studio Code

VSC creates a .vscode directory. This directory is integrated into
the git repository to share it as reference.

The following helpful adjustments can be made from the example configuration files c_cpp_properties.json and settings.json:

  • defines of GITRELEASE and BUILD_TYPE
  • cppStandard
  • .leo as PostScript file
  • Doxygen section tags
  • the todo-tree configuration

The 'Todo Tree' extension can handle our task tags appropriately.
The 'Lex' extension improves syntax highlighting for the flex source.
The 'Docker' extension supports Dockerfile editing.

Docker

There are three types of Docker images:

  1. Leonardus images to provide ready-to-run Leonardus installations
    • leonardus
    • leojupyter
  2. Custom CI/CD images to drive the GitLab CI/CD pipeline
  3. A build environment with compiler, tools and a source code clone
    • leobuild

All images are stored on Docker hub (under the user hagenbund2).

The GitLab container registry is used exclusively to store the Leonardus
installation images.

The Dockerfiles to build the images are located in ./Docker.

Deployment to Docker Images

The makefile target dockerize builds Docker images with a Leonardus
installation and pushes them to the Docker hub.

The target dockerforward copies the Leonardus images to the GitLab
container registry.

Leonardus Image Tags

The Makefile generates two types of tags for installation images:

1. Mutable Tags

These tags are reassigned to new images when a new build is created:

Tag                       Description
latest                    Standard convention for the most recent build.
release-<release number>  The last build of the given release.

2. Immutable Tags

To ensure unique and immutable references, we use the Git commit hash in
the following formats:

Tag Format                      Description                                     Example
CI-<build-type>-<commit-hash>   Generated in a GitLab pipeline.                 CI-PRODUCTION-f3c904f
DEV-<build-type>-<commit-hash>  Generated from a local developer command line.  DEV-PROFILE-f3c904f
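
The tag format can be illustrated with a short C++ sketch. In the project the tags are generated by the Makefile; the function immutable_tag below is hypothetical and only shows how the three parts are combined.

```cpp
#include <cassert>
#include <string>

// Composes an immutable image tag of the form
// <origin>-<build-type>-<commit-hash>, where the origin is "CI" for
// pipeline builds and "DEV" for builds from a developer command line.
std::string immutable_tag(bool from_pipeline,
                          const std::string& build_type,
                          const std::string& commit_hash) {
    assert(!build_type.empty() && !commit_hash.empty());
    return std::string(from_pipeline ? "CI" : "DEV") + "-" + build_type + "-" + commit_hash;
}

// Usage example with the values from the table.
void immutable_tag_example() {
    assert(immutable_tag(true, "PRODUCTION", "f3c904f") == "CI-PRODUCTION-f3c904f");
    assert(immutable_tag(false, "PROFILE", "f3c904f") == "DEV-PROFILE-f3c904f");
}
```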

Build Custom CI/CD Images

We build custom CI/CD images with preinstalled software primarily to speed up
GitLab pipelines.

The build steps are automated in the script customdocker.sh.

Docker Hierarchy

Below is a dependency tree of the Docker images, showing their origin and
FROM statements:

Image Name                                    Origin
----------------------------------------------------------------------
ubuntu                                        pulled from Docker hub
├── hagenbund2/ubuntunoble_man                customdocker.sh
│   ├── hagenbund2/leonardus                  makefile
│   ├── hagenbund2/leobuild                   manual
│   └── hagenbund2/ubuntunoble_jupyter        customdocker.sh
│       └── hagenbund2/leojupyter             makefile
├── hagenbund2/ubuntunoble_gcc                customdocker.sh
└── hagenbund2/ubuntunoble_gcc_boost          customdocker.sh
gcc                                           pulled from Docker hub
└── hagenbund2/gcc_boost                      customdocker.sh
dockershelf/latex                             pulled from Docker hub
└── hagenbund2/latexfull_doxygen              customdocker.sh
rlespinasse/drawio-export                     pulled from Docker hub
docker                                        pulled from Docker hub

Docker Registries

Docker hub

The Docker hub is used to store all images built with customdocker.sh and
from the makefile.
Old images have to be cleaned up manually on a regular basis.
The Markdown-formatted "Overview" descriptions for Docker hub are stored as
templates in the directory ./AuxDoc.

GitLab Container Registry

The GitLab container registry is used exclusively to store images from the
makefile. It requires a slightly different naming scheme:

Docker hub             GitLab container registry
hagenbund2/leonardus   hagenbund/leonardus/leonardus
hagenbund2/leojupyter  hagenbund/leonardus/leojupyter

Further Notes and Details

To run Docker in Docker on a workstation:

docker run --privileged docker

To run Docker in Docker on GitLab we use in the .gitlab-ci.yml:

image: docker
services:
- docker:dind

$DOCKER_HUB_USER and $DOCKER_HUB_PASSWORD are stored as CI/CD variables in
the GitLab project.

$CI_REGISTRY_USER and $CI_REGISTRY_PASSWORD are provided automatically by GitLab.

To run Docker commands as a non-root user, add your user to the "docker" group.
This is particularly necessary for integration with the makefile to trigger
the "dockerize" target locally during development. Use the following command:

sudo usermod -aG docker ${USER}

Drawio

Diagrams.net aka draw.io aka drawio is used for documentation.
This versatile tool allows users to create diagrams and export them to PDF.
For simplicity, the term "drawio" is used throughout the project to refer to
the program, file extension, directory, and related elements.

We use the DEB package v28.1.2 from GitHub jgraph/drawio-desktop/releases
to install drawio on development machines.
In GitLab CI/CD, the Docker image rlespinasse/drawio-export is used.

Additionally, there is a valuable VS Code extension for drawio files
by publisher Henning Dieterichs.

Project Jupyter

Leonardus integrates with Jupyter, because LeoScript can be configured as a
Jupyter kernel.

The files kernel.json.in, kernel.py and logo-64x64.png in the directory
./Jupyter are the building blocks for a Jupyter kernel. They are installed
by make install to $HOME/.local or, in the case of the Docker image by
leojupyter.Dockerfile to /usr/local/share/jupyter/kernels.

kernel.py acts as an adapter, wrapping the lb interpreter to provide the
required kernel interface for JupyterLab.

Local JupyterLab

The make install creates the necessary configuration to use Leonardus with
an existing JupyterLab installation.

An even tighter integration can be achieved using the Tools/udocker.sh
script. The script runs the Docker processes as the current login user,
enabling data sharing with Docker volumes.

You can open and work with *.ipynb notebooks from the project's ./Jupyter
directory using any Jupyter client, regardless of whether a kernel
configuration exists.

Docker JupyterLab

The leojupyter Docker image described by leojupyter.Dockerfile
contains all components needed to run JupyterLab directly from the image.

To start the JupyterLab X application, use the command provided in the
Docker hub repository's "Overview" section. It integrates with a
$HOME/leojupiter directory to provide persistent storage. It further
fine-tunes the startup of the X application by setting certain Docker security
options.

Ghostscript

Ghostscript can be used as a reference implementation of the PostScript language.

Hint: To invoke gs without rendering, you can set the environment
variable GS_DEVICE, e.g. with export GS_DEVICE=nullpage.