Over the past months I’ve been studying how to improve performance of the Common Subexpression Elimination (CSE) routine of SymPy, so it can be used in SymPyBotics with acceptable computing times.
This resulted in pull request #2355, which had already been merged into SymPy master branch (for release in version 0.7.4).
Here is an example comparing both new and previous CSE implementations when applied to the expressions of the ”generic torque”, computed with SymPyBotics, for the first 3 joints of a 7-DOF WAM Arm robot.
Cache has influence in times, but we can notice an average performance improvement of about 25x when external (pre and post) optimizations are used. When no external optimizations are used, the performance has an average improvement of 90x. With the new `order=’none’` option, the improvement rises to 500x for the non cached case, and to 1000x for the cached one!
For this particular case, the CSE is less optimized when external optimizations are done (output has more operations) than when they are not.
How it works now
First, two remarks:
- expressions are not trees but rather directed acyclic graphs (DAG). E.g., in the expression sin(x+1)+cos(x+1), the arguments of sin and cos are the same, x+1; indeed the node x+1 has two parents;
- SymPy (sub)expressions are nicely and fastly hashable, thus great to use in sets and dictionaries.
The CSE core/raw algorithm:
- The core of the new CSE parses the expression adding each seen subexpression to the seen set. If a subexpression was already seen, it is added to the repeated set and its children nodes are not parsed (there is no need to).
- After knowing the repeated subexpressions (nodes with more than one parent), the core CSE rebuilds the whole tree using intermediate variables in place of repeated subexpressions.
The internal optimizations:
- Before the core CSE algorithm is performed the expression is parsed to find optimization opportunities; when an optimizable subexpression is found it is added to the opt_subs substitutions dictionary.
- When the core algorithm parses a subexpressions it looks for it in the opt_subs dictionary, if it is there is parses the substitution instead.
- The currently implemented internal optimizations are the following:
- negative signs are striped out from multiplications, e.g., -2*x is substituted by -1*(2*x)
- negative signs are striped out from exponents, e.g., x**(-2*y) is substituted by (x**(2*y))**-1
- common Add and Mul terms are grouped, e.g., in cos(a+b+c)+sin(a+b+d)), a+b+c is substituted by (a+b)+c and a+b+d is substituted by (a+b)+d, so that a+b is a single and repeated node
In my opinion three things could further improve CSE:
- add support for MatrixExprs;
- use replacement Symbols which could somehow clone the assumptions of the subexpressions they represent;
- implement an optimal Mul/Add term matching system (maybe using Matthew Rocklin’s logpy package).
Filed under: PhD, SymPyBotics, Work | 4 Comments
Tags: python, subexpression elimination, sympy
This is a python toolbox to generate and manipulate robot dynamic equations. It depends on the great SymPy library and on SymCode, another library I’m developing.
Filed under: PhD, SymPyBotics, Work | Leave a Comment
Tags: git, python, sympy
I’m starting (1 year already done) my PhD so I have to make a thesis proposal (they call it thesis project) for the university.
As a fan of typography, I produce almost all my documents using LaTeX type system. And when they are long I put the source code into git versioning. This time, and following Will Robertson‘s idea, I decided to place the code in a public repository: https://gitorious.org/csousa-phd-thesis-project.
I’m not completely sure if this is a good (secure) idea, but quoting Will:
I also believe that all academic research should be made more open. This is a small step towards such a philosophy.
Notice, however, that the work is copyrighted and no rights are given to reproduce or modify it.
Anyway, I don’t expect the content of my thesis proposal to have value for outside. Instead, I’m making it public because of Latex source structure.
The key points of this document latex source are:
- the use of memoir class, with a separated package for customization;
- the use of glossaries package for acronym list;
- the use of biblatex (yes, not bibtex) for bibliography (configured to use biber);
- the use of microtypography;
- the use of latexmk, with a local rc file for glossaries files; and
- being split across several .tex and .sty (package) files.
For this document pdflatex is used. For others, when I want to play with fonts, I use lualatex with Will’s fontspec package.
As a note, I use bibliography .bib file from Mendeley. I’ve configured it to generate one file per collection, so that for each document I write I create a dedicated collection and copy relevant references to it.
Filed under: PhD | Leave a Comment
I’m starting this because of http://matt.might.net/articles/how-to-blog-as-an-academic/.
Filed under: zOther | Leave a Comment