Scripting

Wikis > SuanShu > Scripting
In a manner similar to using Matlab/Octave/R/Scilab, we may use SuanShu in an interactive or a scripting environment. This is most appropriate for research, data analysis and rapid prototyping. Often using a script language, the numerical problems may be expressed and solved in a reduced number of code lines as compared to using Java. Also, the users can get immediate feedback for each line execution, without having to first finish the whole program.

Here is a screenshot of using SuanShu in Groovy.

groovy_matrix_example

Tools

There are a number of ways to use SuanShu in a scripting environment. We recommend the following.

Advantages

There are a number of advantages to using SuanShu over other mathematics scripting languages such as Matlab/Octave/R/Scilab. In the following comparisons, we are illustrating the pros and cons using R, but the arguments probably carry over to the other mathematics scripting languages.

No Language Pollution

There is no need to learn and use a very specialized and proprietary language. In general, the proliferation (pollution) of languages causes confusion. There are very established programming languages that junior students and professional programmers already know. The most notable examples are Java, C# and C++. They are general enough to solve any computationally feasible problems. It makes no sense to invent a new language just for an application. The consequences of language pollution are:

  1. the users need to learn a new language just to use a new application, hence a barrier of entry; e.g., the R language to use R;
  2. the heterogeneity of languages introduces unnecessary challenges when combining them to build one integrated system; the more the worse;
  3. debugging a variety of languages is difficult;
  4. the hosting applications are almost never written in Matlab/Octave/R/Scilab; we would thus have to pass a large amount of data back and forth between the applications (e.g., in Java) and the data processing code (e.g., R); this is a severe performance issue;
  5. the specialized language has no use outside its environment; e.g., no one uses the R language outside the R computing environment.

Most programmers already know Java (or can convert from C# and C++ very easily). As SuanShu is written entirely in Java, there is no need to learn a new language to use it.

Better Code

The “real” programming languages (e.g., Java, Groovy, Scala) are better designed than the specialized mathematics languages. They are invented by scientists who really understand programming languages. The modern languages support (or shall I say, demand) object-oriented code, inheritance, interface, anonymous class, closure, functional programming, an advanced memory model and management, automatic garbage collection and etc. In contrast, the popular mathematics languages have few, if not none, of these features. For example, the R language is a by-and-large procedural imperative C-like language. (Its object-oriented feature is nothing more like a struct in C. It should not be confused with the real object-oriented support in e.g., Java.) It might be acceptable a few decades ago when computer science was at its dawn but certainly not after modern programming languages have come a long way.

In a nutshell, the users of modern programming languages can write better code, in terms of correctness, readability, easiness to debug (maintainability), modularity (object-oriented).

Let’s illustrate our opinion by numerically evaluating this integral:

$\int_0^1 \! x^2 \, \mathrm{d}x$

In Matlab/Octave/R/Scilab, we typically would write something like this.

This would be a piece of perfectly acceptable code many decades ago when C was the most popular programming language. Judging from the modern software engineering perspective, we make one criticism: the mathematics concepts involved are not modular; they cannot be separated out. All this snippet does is to describe one particular procedure to integrate one particular function. There is no reusable code. For example,

  • if you want to integrate another function, you would need to copy and paste this snippet then modify (*);
  • the integrand cannot be manipulated (as a first class citizen); it cannot be evaluated without making a copy outside the integration for-loop; it cannot be passed to another operation, e.g., differentiation, or another way of doing integration, without copying and pasting the same code.

Here is a modern object-oriented way of doing integration using SuanShu in Groovy. Note that the integration algorithm is now independent of the integrand. The integrator can take different integrands, and the integrand is no longer tied to the integrator. This is not to mention that the code is a lot shorter. One statement is to define a Riemann integrator; another statement is to define the integrand. The two objects are independent.

Software Engineering Process

Not only does the modern languages like Java allow the users to produce more elegant code, they also come with a suite of software engineering tools and paradigms to improve the coding experience as well as the quality of end products.

First, the computer science community has escaped from the procedural imperative programming mindset to object-oriented programming. We have designed many modern programming techniques, most notably design patterns for effective programming. However, it is not clear if the legacy mathematics languages support inheritance and interface to the extent that we can code patterns using them.

The modern languages come with tools to ensure code correctness. For example, for each mathematics concept, SuanShu has a corresponding class(es). The JUnit tool allows us to systematically and automatically do regression tests on all the source code regularly and when there are code changes. The same goes true for Groovy, Scala and etc. In contrast, suppose we have 100s of R scripts that depend on each other. When we make code changes in one of them, it is not clear how to systematically and automatically make sure that the changes are correct for this particular R file and that all other scripts will work after the changes. Specifically, suppose we make a fundamental change in the way the coefficients in an ARMA model are estimated, will all other functions that depend on this ARMA fitting still work? Theoretically maybe yes, but practically often no (unless you always write bug free code).

Below is a screenshot of the result of SuanShu unit testing.

junit_example