Updated: Apr 20
This is an interlude from our usual topics. This post goes one level inside quantitative model development, one level below using a given programming language or framework to automate asset management. We will discuss large-scale program architecture and, with a small example, a method to document it minimally, with the least possible impact on implementation.
At Ostirion, we believe in, and apply whenever possible, the systems engineering principles described in ISO/IEC/IEEE 15288, sometimes to the letter, sometimes with flexibility. We sincerely believe that a good architecture can make a defective implementation eventually work, and that it is challenging to make any implementation work (with acceptable efficacy) inside a bad architecture. This causes tension with other, more agile approaches, and we have witnessed and suffered tensions between 15288 and Scrum approaches first-hand. We will cover this tension (and how to relax it) in another interlude. For today, we cover the automatic generation of UML diagrams for Python code: an effective method and tool to generate the minimum diagramming needed for documentation or for personal "help" calls.
The reasons for seeking automation of UML diagrams are, in our experience, these three:
To document an older codebase to be used as a baseline for derivative products.
To reduce the documentation effort of ongoing code generation work.
To ensure that the UML documentation is aligned with the codebase.
Multiple commercial tools help with the management of code documentation using UML diagrams. These commercial packages are relatively expensive and present their code-to-UML or UML-to-code functionalities as a secondary tool. A prime example is the widely used Enterprise Architect, where the "code round-trip" functionality is placed in the middle of the commercial prospectus. It could well be the first feature listed and, if we go radical, the only one truly needed for effective development. If you cannot see architectural elements reflected in the code implementation, they should not exist. If the code does not exhibit all the architectural elements, something is wrong. The round trip of code to architecture is a necessity.
The divide between architecture and code is created by the large difference in skill sets between the professionals defining the system architecture and the professionals implementing the system. This gap makes mixing abstract diagramming and concrete code implementation unappealing, and often two distinct and opposing camps appear during a project. It is difficult for high-level-view systems engineers to understand the code, and difficult for developers to keep up with system definitions and documentation requirements. Whatever framework or approach is used, systems management is always at risk of dissociation from code solutions: changes go undocumented, systemic indiscipline emerges, and quality and functionality drop, with the consequent loss of value.
If price is a problem, as it usually is, there are functional open-source code-to-UML tools. One such tool is pyreverse, which ships with the Python code linter pylint. We will use a Jupyter Notebook for this demonstration. If you are using a cloud session that allows the installation of modules, install pylint:
!pip install --q pylint
The module is small enough not to require the full pip log. The "--q" option, an abbreviation of "--quiet" (short form "-q"), reduces the verbosity of pip installs. If you want to check which packages your system has installed, you can always run:
This produces a list of all installed modules, similar to this:
And to check the built-in modules:
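Both checks can also be done from Python itself. The following is a minimal sketch using only the standard library, assuming Python 3.8 or later; the helper names `installed_distributions` and `standard_modules` are ours, and this is one possible approach rather than the only one:

```python
import sys
from importlib import metadata


def installed_distributions():
    """Return sorted (name, version) pairs for the installed distributions."""
    pairs = set()
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip entries with broken or missing metadata
            pairs.add((name, dist.version))
    return sorted(pairs, key=lambda pair: pair[0].lower())


def standard_modules():
    """Return the sorted names of the modules shipped with the interpreter."""
    # sys.stdlib_module_names exists on Python 3.10+; on older versions we
    # fall back to the (much shorter) list of modules compiled into the
    # interpreter binary.
    names = getattr(sys, "stdlib_module_names", sys.builtin_module_names)
    return sorted(names)


# Show only the first few entries of each list to keep the output short.
for name, version in installed_distributions()[:5]:
    print(f"{name}=={version}")
print(standard_modules()[:10])
```

On Python 3.10 or later the second list should include `datetime`, the module examined next.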
We will check one of the built-in modules using pyreverse: the datetime module, for example, as it is widely used in time series analysis and is simple enough, yet complex enough, to illustrate the type of diagrams we can automatically obtain with default options:
!pyreverse -o png datetime
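Outside a notebook, the same call can be scripted. The sketch below is a hedged example, not part of pyreverse itself: the helper names `pyreverse_command` and `run_pyreverse` are hypothetical, and it assumes pyreverse is on the PATH. It uses only the `-o` (output format) and `-p` (project name) options. Note that image formats such as png require Graphviz to be installed, while the `dot` format is written directly.

```python
import shutil
import subprocess


def pyreverse_command(target, output_format="png", project=None):
    """Build the pyreverse argument list for a module or package."""
    cmd = ["pyreverse", "-o", output_format]
    if project:
        # -p sets the project name, which is used in the output file names.
        cmd += ["-p", project]
    cmd.append(target)
    return cmd


def run_pyreverse(target, output_format="dot", project=None):
    """Run pyreverse if it is installed; return None when it is not."""
    if shutil.which("pyreverse") is None:
        return None
    cmd = pyreverse_command(target, output_format, project)
    # Diagrams are written to the current working directory.
    return subprocess.run(cmd, capture_output=True, text=True, check=False)


# Equivalent of the notebook command above, shown as an argument list.
print(pyreverse_command("datetime"))
```

Calling `run_pyreverse("datetime", "dot", project="datetime")` would typically leave a `classes_datetime.dot` file in the working directory, which can be rendered later on a machine that has Graphviz.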
The resulting diagram is simple, austere, and fits our minimum documentation needs:
We see the often-used datetime, time, and timedelta classes together with their inheritance and composition relationships. The simple, austere format provides a side benefit: it takes a lot of work to fit properly into a word-processor or presentation document. It looks so uncool that we are not tempted to build stand-alone documents and deviate from a Model-Based Systems Engineering approach. The territory becomes the map, so no change to the territory can make the map obsolete.
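The inheritance part of such a diagram can be cross-checked directly against the live module. This is a minimal sketch using the standard `inspect` module; the function name `inheritance_edges` is ours, and it covers only inheritance, not the composition links that pyreverse infers from attributes:

```python
import datetime
import inspect


def inheritance_edges(module):
    """Return sorted (child, parent) pairs for a module's public classes."""
    edges = []
    for name, cls in inspect.getmembers(module, inspect.isclass):
        if name.startswith("_"):
            continue  # ignore private helper classes
        for base in cls.__bases__:
            if base is not object:  # every class inherits from object
                edges.append((name, base.__name__))
    return sorted(edges)


# Print each relationship in a UML-like "child --|> parent" notation.
for child, parent in inheritance_edges(datetime):
    print(f"{child} --|> {parent}")
```

For the datetime module, the output should include `datetime --|> date` and `timezone --|> tzinfo`, matching the generalization arrows in the class diagram.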
This is just an elementary toy example. We can extend it to underdocumented modules that you may be working with, where it can, in a pinch, act as a backup help generator. Avoid producing these UML diagrams using tools that do not include, or are disconnected from, the code. Integrated tools, both commercial and open, are readily available for you. Prevent the collapse of your bridges.
If you require quantitative model development, deployment, verification, or validation, do not hesitate to contact us. We will also be glad to help you with your systems engineering and code production processes.