Tcl Data Analysis


Tcl is a programming language where “everything is a string”; that is, every datatype has a string representation. So while Tcl does have internal datatypes (e.g. list and float), it is effectively typeless. Is the string “1 2 3” just a string value, a list, or a matrix? You can parse “1 2 3” as a string with 5 characters, as a list with three elements, or as a 3×1 matrix. All are valid interpretations of the same value, so it is the Tcl command operating on a value that provides the context for the data.
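As a small illustration using only built-in commands, the same value can be read three ways, depending on which command receives it:

```tcl
set value "1 2 3"

# Read as a string: five characters, counting the two spaces.
puts [string length $value]   ;# 5

# Read as a list: three elements.
puts [llength $value]         ;# 3

# Read as a 3x1 matrix (a list of one-element rows): row 2, column 0.
puts [lindex $value 2 0]      ;# 3
```

Nothing about the value changes between the three calls; only the command does.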

This makes data analysis in Tcl a challenge. You can’t tell the dimensions of a matrix just by looking at it, because the same value has an unbounded number of possible interpretations. It could be a scalar, a vector, a matrix, a 3D tensor, a 4D tensor, and so on. All of these interpretations are valid when everything is a string.
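To make the ambiguity concrete, here is a minimal sketch of a shape query. The proc name “shapeOf” is hypothetical (it is not part of Tcl or Tda), and it only works because the caller states the rank up front:

```tcl
# Hypothetical helper: report the dimensions of a nested list,
# given the rank (number of dimensions) that the caller assumes.
proc shapeOf {value rank} {
    set shape {}
    for {set i 0} {$i < $rank} {incr i} {
        lappend shape [llength $value]
        # Descend into the first element to measure the next dimension.
        set value [lindex $value 0]
    }
    return $shape
}

puts [shapeOf "1 2 3" 0]   ;# (empty: a scalar has no dimensions)
puts [shapeOf "1 2 3" 1]   ;# 3
puts [shapeOf "1 2 3" 2]   ;# 3 1
puts [shapeOf "1 2 3" 3]   ;# 3 1 1
```

Every rank gives a sensible answer for the same value, which is why the rank has to come from the command rather than from the data.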

I believe that this typelessness of Tcl is the main reason why there is no native support for math operations on vectors and matrices in Tcl. Math has to go through the “expr” command, and with the exception of the “in” and “ni” operators, the “expr” command only deals with scalars. The “lmap” command, added in Tcl 8.6, can be used in conjunction with “expr” to perform operations over lists, but this is still cumbersome, and a single “lmap” only handles 1D arrays. As it stands, there are no built-in commands for matrix or higher-rank tensor manipulation in Tcl.
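For reference, this is roughly what the built-in route looks like in Tcl 8.6. It is a sketch with plain “lmap” and “expr”, and the variable names are only for illustration:

```tcl
# The "in" operator is one of the few places where expr accepts a list.
set v {1 2 3}
puts [expr {2 in $v}]   ;# 1

# Element-wise math on a 1D list: one lmap wrapped around expr.
set half [lmap x $v {expr {$x / 2.0}}]
puts $half              ;# 0.5 1.0 1.5

# A matrix (nested list) needs one lmap per dimension.
set A {{1 2 3} {4 5 6} {7 8 9}}
set halfA [lmap row $A {lmap x $row {expr {$x / 2.0}}}]
puts $halfA             ;# {0.5 1.0 1.5} {2.0 2.5 3.0} {3.5 4.0 4.5}
```

Each extra dimension adds another level of nesting to the script, which is where this approach starts to get unwieldy.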

Existing Tcl packages for Matrices

The standard Tcl library, Tcllib, does have a few packages for matrix manipulation, and they have their use cases, but, in my opinion, they fall short.

The math::linearalgebra package in Tcllib provides linear algebra routines in native Tcl. Its representation of matrices is a nested Tcl list, i.e. a list of row vectors. From what I have seen, this is the most common way to represent matrices, as it is fully compatible with the built-in Tcl list commands. The main strength of this package is in commands such as “solveGauss”, which handles the difficult task of solving linear systems of equations with float numbers. However, the commands it provides for basic matrix access/modification are, in my opinion, lacking, and it does not support tensors, which limits its application.
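The compatibility with built-in list commands can be seen in a short sketch. This uses only core Tcl commands, not math::linearalgebra itself:

```tcl
# A 3x3 matrix as a nested list: a list of row vectors.
set A {{1 2 3} {4 5 6} {7 8 9}}

puts [lindex $A 1 2]                 ;# 6 (row 1, column 2)
puts [llength $A]                    ;# 3 rows
puts [llength [lindex $A 0]]         ;# 3 columns

# Modify a single element in place.
lset A 0 0 10
puts [lindex $A 0]                   ;# 10 2 3

# Rows are easy to get, but a column needs a loop over the rows.
puts [lmap row $A {lindex $row 1}]   ;# 2 5 8
```

Single-element access is straightforward, while anything involving ranges or columns has to be written by hand.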

The struct::matrix package in Tcllib is an object-oriented approach to matrix manipulation, and the matrix objects can be linked to Tcl arrays. The matrix values can be “serialized” into a three element list, with the first two elements being the matrix dimension, and the third element being the nested list representation used by math::linearalgebra. The access and modification procedures are convenient in this package, but it provides no methods for performing math operations on the values, and as such is not very useful for mathematics.
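Going only by the description above, the serialized form can be sketched by hand with core Tcl commands. This does not call struct::matrix, and the order of the two dimensions (rows first, then columns) is an assumption:

```tcl
# Nested-list matrix with 2 rows and 3 columns.
set A {{1 2 3} {4 5 6}}

# Three-element list: two dimensions, then the nested list of rows.
# Assumption: the dimensions are ordered as rows, then columns.
set serialized [list [llength $A] [llength [lindex $A 0]] $A]
puts $serialized   ;# 2 3 {{1 2 3} {4 5 6}}

# Unpack it again.
lassign $serialized rows cols data
puts "$rows x $cols"   ;# 2 x 3
```

Storing the dimensions alongside the data removes the ambiguity about how the nested list should be interpreted.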

There are third-party packages for matrix manipulation, and the best I have seen is VecTcl. VecTcl is honestly really good. Its representation of matrices and higher-rank tensors is compatible with the “nested list” structure used by math::linearalgebra, and its features are, admittedly, very good. However, it has one fatal flaw in my opinion. Rather than building upon the existing Tcl syntax, it creates its own sub-language, interpreted by the command “vexpr”. In “vexpr”, arrays are referenced with “barewords” (e.g. “x”) rather than with variable substitution (e.g. “$x”), and it uses square brackets for indexing, which breaks the convention that square brackets evaluate commands. It doesn’t feel like Tcl, because it isn’t. It’s VecTcl. Nevertheless, it is fast, has great features, and certainly has its place. It was just not what I was looking for.

Ta-da!

So I made my own package: Tda (Tcl Data Analysis). Tda (pronounced “Ta-da”) provides a pure Tcl implementation of N-dimensional arrays and column-oriented tables. It also provides data type conversions between tables and matrices, file import and export utilities, and basic data visualization tools.

Here are a few examples of what you can do with Tda:

N-Dimensional array manipulation

The “ndlist” module adds ND array support, with familiar index notation.

> set A {{1 2 3} {4 5 6} {7 8 9}}
{1 2 3} {4 5 6} {7 8 9}
> mget $A 0:1 0:1
{1 2} {4 5}
> mget $A 0* :
1 2 3
> mexpr x $A {$x / 2.0}
{0.5 1.0 1.5} {2.0 2.5 3.0} {3.5 4.0 4.5}
> cmap max $A
7 8 9
> rmap max $A
3 6 9
> mreplace $A : 0 ""
{2 3} {5 6} {8 9}
> mop $A .+ {.1 .2 .3}
{1.1 2.1 3.1} {4.2 5.2 6.2} {7.3 8.3 9.3}

Tabular data manipulation

The “tbl” module provides an object-oriented framework for tabular data.

> set tblObj [tbl new]
::oo::Obj34
> $tblObj define data {
    1 {x 3.44 y 7.11 z 8.67}
    2 {x 4.61 y 1.81 z 7.63}
    3 {x 8.25 y 7.56 z 3.84}
    4 {x 5.20 y 6.78 z 1.11}
    5 {x 3.26 y 9.92 z 4.56}
}
> set tblCopy [$tblObj copy]
::oo::Obj35
> set a 20.0
20.0
> $tblCopy expr {@x*2 + $a}
26.88 29.22 36.5 30.4 26.52
> $tblCopy fedit q {@x*2 + $a}
> $tblCopy cget q
26.88 29.22 36.5 30.4 26.52
> $tblCopy
keyname key fieldname field keys {1 2 3 4 5} fields {x y z q} data {1 {x 3.44 y 7.11 z 8.67 q 26.88} 2 {x 4.61 y 1.81 z 7.63 q 29.22} 3 {x 8.25 y 7.56 z 3.84 q 36.5} 4 {x 5.20 y 6.78 z 1.11 q 30.4} 5 {x 3.26 y 9.92 z 4.56 q 26.52}}

Data-type conversion and file utilities

Read from and write to text and CSV files, and convert easily to Tda data structures. The example files below are from https://github.com/maxogden/csv-spectrum.

> readMatrix comma_in_quotes.csv
{{first last address city zip} {John Doe {120 any st.} {Anytown, WW} 08123}}
> set empty [readFile empty.csv]
a,b,c
1,"",""
2,3,4
> csv2mat $empty
{{a b c} {1 {} {}} {2 3 4}}
> set escaped_quotes [readFile escaped_quotes.csv]
a,b
1,"ha ""ha"" ha"
3,4
> set tbl [csv2tbl $escaped_quotes]
::oo::Obj14
> $tbl properties
keyname a fieldname field keys {1 3} fields b data {1 {b {ha "ha" ha}} 3 {b 4}}
> $tbl get 1 b
ha "ha" ha
> tbl2mat $tbl
{a b} {1 {ha "ha" ha}} {3 4}

Data Visualization

The “vis” module uses the “wob” package (installed as a dependency) to create widgets for data visualization.

The “plotXY” command allows you to explore data interactively, with a slider bar.

namespace path ::tcl::mathfunc
set x [linsteps 0.01 -10 10]
set y1 [vmap sin $x]
set y2 [vmap cos $x]
plotXY $x $y1 $y2
::wob::mainLoop

The “viewMatrix” and “viewTable” commands allow you to copy data directly from Tda matrices and tables.

set x [range 100]
set y [vexpr xi $x {sin($xi)}]
viewMatrix [augment $x $y]
::wob::mainLoop

How to get Tda

If you want to use Tda, just install Tin (https://github.com/ambaker1/Tin) and run the following code, which will install and import the commands.

package require tin
tin import tda

I had a lot of fun putting this package together, and I hope it can be of use to the Tcl and OpenSees communities. If you have any suggestions or find bugs, feel free to raise an issue on GitHub (https://github.com/ambaker1/Tda), or fork the repository and submit a pull request.

Documentation

