SWAG Tools
Over the years, members of SWAG have produced a considerable number of tools to aid them in their research. The tools are varied in nature, but fall into several broad categories:
Pipelines. While all SWAG tools are useful on their own, they truly shine when combined into "pipelines". A pipeline, as the name suggests, is a collection of tools designed to work together to achieve a final result, with some extra code to integrate them. Following the pipeline philosophy, the output of one tool becomes the input of the next, with each tool in the sequence performing a specific task. The two pipelines currently available both aim to make the extraction and analysis of software easier for the beginning researcher. They provide a way to perform a complete extraction, analysis and visualization of a piece of software with as little user input as possible.
Just like individual tools, the pipelines are freely available for download. Installing a pipeline is usually easier than installing all the individual tools that make up that pipeline, and lets you use all the included tools both individually and in the pipeline.
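The pipeline idea described above, in which each tool's output becomes the next tool's input, can be sketched in a few lines of Python. This is only an illustration: the `extract`, `analyze` and `visualize` functions are hypothetical stand-ins invented for this example and do not reproduce the behaviour or interfaces of any SWAG tool.

```python
from functools import reduce

def run_pipeline(stages, initial):
    """Feed `initial` to the first stage, then pass each result to the next stage."""
    return reduce(lambda data, stage: stage(data), stages, initial)

# Hypothetical stages; real extractors, analyzers and visualizers do far more.
def extract(source_files):
    # Toy "fact extraction": pretend each file depends on the one listed after it.
    return {(a, b) for a, b in zip(source_files, source_files[1:])}

def analyze(facts):
    # Toy "analysis": drop facts whose source is a test file.
    return {(a, b) for a, b in facts if not a.startswith("test_")}

def visualize(facts):
    # Toy "visualization": render each fact as a text arrow, in sorted order.
    return sorted(f"{a} -> {b}" for a, b in facts)

result = run_pipeline([extract, analyze, visualize],
                      ["main.c", "parse.c", "util.c"])
print(result)
```

Because every stage only needs to agree with its neighbours on the data it receives and returns, a stage can be swapped for another (for example, a different extractor) without touching the rest of the chain.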
Below you will find the listing of the tools currently available, with links to their short descriptions.
CPPX is a free C++ compiler which produces a fact base instead of producing executable code. For design-recovery tool interoperability to become a reality, we need common software to extract facts from code bases, in a common format, and according to a common schema. CPPX extracts facts from C++ source from the highest semantic level (classes and global data and functions) down to the lowest code level of individual statements and expressions. The output format is based on that of Bell Canada's Datrix project. CPPX is available as free software (and in fact depends on GCC for semantic analysis).
BFX and LDX are two complementary binary code extractors targeted at object modules, executables, and dynamic libraries. Normally, they are integrated with the actual software build process to carry out fact extraction. Together they produce information on function calls, variable accesses, and build dependencies between object modules. Their output is less detailed than that of CPPX and an order of magnitude smaller. The two extractors offer an extremely simple and supremely reliable way to derive a system model.
ASX is a fact extraction tool that extracts source information from C, C++, assembler, object files, libraries, dynamic libraries and executables, in a format that can be visualized immediately using LSEdit.
The CLone Interpretation and Classification System (CLICS) is a tool that extends the work of CCFinder/Gemini by trying to improve the scalability of the clone visualization problem. It automatically filters common types of false matches and classifies clones based on a taxonomy of clones. Users can navigate clones based on this taxonomy, remove clones from the result set, edit the list of files to be included in the analysis without changing the detection results, and visualize the cloning relationships in the software system using LSEdit.
Grok is a programming language for manipulating binary relations. Grok has an interpreter, which can be considered to be a relational calculator. This interpreter has been used extensively for analyzing factbases produced by parsers that extract information from source programs.
The initial version of Grok was created by the author in 1995. It has evolved to become a language for manipulating factbases. Grok operates at the level of a relational database, in that operators generally apply across entire relations, and not just to single entities. The Grok interpreter has been optimized to handle large factbases (up to several hundred thousand facts). It keeps all of its data structures in memory. It is written in the Turing language.
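To give a feel for what it means for operators to apply across entire relations, here is a minimal Python sketch that treats a binary relation as a set of pairs. It is not Grok syntax and does not use the Grok interpreter; the function names and the sample `calls` relation are assumptions made up for this illustration.

```python
def compose(r, s):
    """Relational composition: (a, c) whenever (a, b) is in r and (b, c) is in s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def inverse(r):
    """Swap the two sides of every pair in the relation."""
    return {(b, a) for (a, b) in r}

def transitive_closure(r):
    """Repeatedly add composed pairs until no new pair appears."""
    closure = set(r)
    while True:
        extra = compose(closure, closure) - closure
        if not extra:
            return closure
        closure |= extra

# Hypothetical factbase: which function calls which.
calls = {("main", "parse"), ("parse", "lex"), ("lex", "getc")}

# Everything reachable through one or more calls, computed on the whole relation.
reachable = transitive_closure(calls)
print(sorted(reachable))
```

Each operation takes whole relations and returns a whole relation, which is the "relational calculator" style of working: the user never loops over individual functions by hand.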
jGrok is a re-implementation and extension of the original Grok, written in Java, completed and maintained by Jingwei Wu. jGrok adds many new features and commands to the set that original Grok provides. While sacrificing some speed, jGrok gains portability: being a Java program, it is executable on any platform that has a Java Virtual Machine available. Like Grok, jGrok is optimized for operating on large fact bases, and has been used to operate on collections of up to a million facts.
LSEdit (the name stands for Landscape Editor) is a nested-box-and-arrow graph visualizer. While it is primarily used to display graphs representing software architectures ("landscapes"), it is not limited to visualizing software, and is general enough to display any graph that can be visualized as a collection of nested boxes connected by arrows. LSEdit offers advanced graph layout and editing capabilities, advanced query, elision and search functions, and support for graphs in excess of 300,000 nodes. LSEdit is written in Java and runs equally well on Windows, Unix, Mac OS and any other platform that has a Java virtual machine.
Named after the ship upon which Charles Darwin served as a naturalist, Beagle is a tool that aims to help developers gain a better understanding of the software evolution process. It provides a framework that allows users to query, visualize, and navigate through a system's history, and to build persistent, annotated models of how structural changes have impacted the design of the system.
Spectrograph provides a metrics-based method to characterize the evolution of a spectrum of closely related components. There are five terms that need to be clarified in the spectrograph modelling of software evolution: time, spectrum, measurement, snapshot, and thread.
SWAGKit is an architecture extraction and analysis toolkit developed by the Software Architecture Group at the University of Waterloo. It comprises the CPPX fact extractor, the Grok fact manipulation engine with several fact manipulation scripts, and the LSEdit graph visualization software.
SWAGKit can be used to extract, abstract and present software architectures. Currently SWAGKit supports the extraction of facts from C/C++ code, their abstraction to the architectural level, and their presentation in landscape form. SWAGKit has been used to analyze and visualize many complex software systems, including the Linux operating system kernel and the VIM editor.
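One common way to abstract low-level facts to the architectural level is to "lift" each function-level dependency to the subsystems that contain its two endpoints. The Python sketch below illustrates that idea only; it is an assumption about the general technique, not a description of the scripts shipped with the toolkit, and the `parent_of` mapping and function names are invented for the example.

```python
def lift(edges, parent_of):
    """Map each low-level edge to an edge between the containers of its endpoints.

    Edges whose endpoints share a container are internal to it and are dropped.
    """
    lifted = set()
    for src, dst in edges:
        a, b = parent_of[src], parent_of[dst]
        if a != b:
            lifted.add((a, b))
    return lifted

# Hypothetical containment: which subsystem each function belongs to.
parent_of = {"main": "app", "parse": "parser", "lex": "parser", "getc": "io"}

# Hypothetical extracted facts: function-level calls.
calls = {("main", "parse"), ("parse", "lex"), ("lex", "getc")}

# Subsystem-level dependencies, suitable for drawing as boxes and arrows.
module_deps = lift(calls, parent_of)
print(sorted(module_deps))
```

The call from `parse` to `lex` disappears at this level because both functions live in the same subsystem, which is what keeps an architectural view much smaller than the raw factbase.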
The LDX/BFX family of pipelines was developed by Jingwei Wu while working at SWAG. The two related pipelines are made up of the LDX/BFX fact extraction utilities (from which they get their names), the jGrok fact manipulation and query engine, and the LSEdit graph visualization software. These pipelines are interesting because, unlike other fact extractors, they work alongside the compiler and linker, not instead of them. This means that the facts are extracted as the software system is built: the build process produces both the executable program and the information about that program.
The Software Bookshelf is a web-based paradigm for the presentation and navigation of information representing large software systems. The Portable Bookshelf (PBS) is one implementation of this concept. The PBS Toolkit is our set of tools for the generation of a PBS Bookshelf.
The PBS Toolkit is now retired: it and the CFX extractor it is based on are no longer supported. Please use SWAGKit or the LDX/BFX pipeline instead.