Thinking Craftsman's Tool Kit (TC Toolkit)

As a software development consultant, I needed a way to analyze code quickly and find the hotspots that need to be addressed first. The techniques I use differ from traditional 'static code analysis' (e.g. tools like lint, PMD, FindBugs). I use a mix of code metrics and visualizations to detect 'anomalies'.

The Thinking Craftsman Toolkit is a set of programs for quickly analyzing source code. The programs are written in Python. The source code is hosted on Google Code and published under the New BSD license.

  1. Code Duplication Detector (CDD):
    CDD analyzes the files in a directory tree and prints the duplicates it finds (ignoring comments). It is implemented in Python using the Rabin-Karp string matching algorithm and Pygments lexers.

  2. Token Tag Cloud (TTC):
    Some time back I read the blog article 'See How Noisy Your Code Is'. I developed a Python module for creating various tag clouds based on token types (e.g. keywords, names, class names).

  3. Treemap Visualization for SourceMonitor Metrics data (SMTreemap):
    SourceMonitor is an excellent tool for generating various metrics from source code (e.g. maximum complexity, average complexity, line count, block depth). However, it is difficult to analyse this data quickly for large code bases. Treemaps are excellent for visualizing hierarchical data in two dimensions (as size and color). This tool uses Tkinter to display the SourceMonitor data as a treemap. You have to export the SourceMonitor data as CSV or XML. You can then use this CSV or XML file as input to display the treemap.
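
The Rabin-Karp idea behind CDD (item 1) can be illustrated with a rough sketch. This is not the toolkit's actual code. It assumes whitespace-stripped lines with trailing '#' comments removed as a crude stand-in for the Pygments token stream, and the names `normalize`, `find_duplicates` and `window` are made up for illustration.

```python
import zlib
from collections import defaultdict

BASE = 257
MOD = (1 << 61) - 1


def normalize(line):
    """Drop a trailing '#' comment and all whitespace.

    Assumption: a crude stand-in for a real lexer; it will also cut
    a '#' that appears inside a string literal.
    """
    code = line.split("#", 1)[0]
    return "".join(code.split())


def find_duplicates(files, window=3):
    """Group locations whose `window` consecutive code lines are identical.

    `files` maps a file name to its source text. Each location is
    (file name, 1-based line number where the matching window starts).
    """
    high = pow(BASE, window - 1, MOD)  # weight of the line leaving the window
    buckets = defaultdict(list)        # rolling hash -> [(name, line, chunk)]

    for name, text in files.items():
        numbered = [(i, normalize(l)) for i, l in enumerate(text.splitlines(), 1)]
        lines = [(i, l) for i, l in numbered if l]  # skip blank/comment-only lines
        codes = [zlib.crc32(l.encode("utf-8")) for _, l in lines]
        if len(codes) < window:
            continue

        h = 0
        for c in codes[:window]:       # hash of the first window
            h = (h * BASE + c) % MOD
        for start in range(len(codes) - window + 1):
            if start > 0:
                # Slide by one line: remove the oldest code, append the newest.
                h = ((h - codes[start - 1] * high) * BASE
                     + codes[start + window - 1]) % MOD
            chunk = tuple(l for _, l in lines[start:start + window])
            buckets[h].append((name, lines[start][0], chunk))

    groups = []
    for hits in buckets.values():
        by_text = defaultdict(list)
        for name, line_no, chunk in hits:
            by_text[chunk].append((name, line_no))  # re-check text: hashes can collide
        groups.extend(locs for locs in by_text.values() if len(locs) > 1)
    return groups


if __name__ == "__main__":
    a = "x = 1\ny = 2  # note\nz = x + y\nprint(z)\n"
    b = "# header\nx=1\ny=2\nz=x+y\n"
    print(find_duplicates({"a.py": a, "b.py": b}, window=3))
```

The rolling hash lets each window be hashed in constant time after the first one, so only windows that land in the same bucket need a full text comparison.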
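
For the token tag cloud (item 2), the counting step might look roughly like the sketch below. It is an illustration only: it uses Python's standard `tokenize` module on Python source instead of the toolkit's own lexing, and the function names `token_counts` and `font_sizes` are assumptions, not TTC's API.

```python
import io
import keyword
import tokenize
from collections import Counter


def token_counts(source, kind="names"):
    """Count keyword tokens (kind="keywords") or other names (kind="names")."""
    counts = Counter()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.NAME:
            continue  # ignore operators, literals, comments, layout tokens
        is_keyword = keyword.iskeyword(tok.string)
        if (kind == "keywords") == is_keyword:
            counts[tok.string] += 1
    return counts


def font_sizes(counts, smallest=10, largest=40):
    """Scale counts linearly to integer font sizes for the cloud."""
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all counts are equal
    return {word: smallest + (largest - smallest) * (n - lo) // span
            for word, n in counts.items()}


if __name__ == "__main__":
    src = "def f(x):\n    return x + x\n"
    print(font_sizes(token_counts(src, kind="names")))
```

Separate clouds per token type make it easy to see, for example, whether a file is dominated by language keywords or by its own identifiers.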
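
The size-and-color mapping described for SMTreemap (item 3) can be sketched as follows. This is a minimal one-level layout, not the tool's implementation. The CSV column names below are hypothetical, so check the header of your own SourceMonitor export. The helper names `load_metrics`, `slice_layout` and `heat_color` are likewise made up.

```python
import csv
import io

# Hypothetical column names: adjust to match your exported CSV header.
NAME_COL = "File Name"
SIZE_COL = "Lines"
COLOR_COL = "Maximum Complexity"


def load_metrics(csv_text):
    """Return a list of (name, size, complexity) tuples from CSV text."""
    rows = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        rows.append((rec[NAME_COL], float(rec[SIZE_COL]), float(rec[COLOR_COL])))
    return rows


def slice_layout(rows, x, y, width, height):
    """Split one rectangle into vertical strips with area proportional to size.

    Returns (name, x, y, width, height, complexity) for each row.
    """
    total = sum(size for _, size, _ in rows)
    if total == 0:
        return []
    rects, left = [], x
    for name, size, complexity in rows:
        strip = width * size / total
        rects.append((name, left, y, strip, height, complexity))
        left += strip
    return rects


def heat_color(value, worst):
    """Map 0..worst onto green..red as a '#rrggbb' string."""
    t = min(max(value / worst, 0.0), 1.0) if worst else 0.0
    return "#%02x%02x00" % (int(255 * t), int(255 * (1 - t)))


if __name__ == "__main__":
    text = "File Name,Lines,Maximum Complexity\na.py,300,5\nb.py,100,20\n"
    for name, rx, ry, rw, rh, cx in slice_layout(load_metrics(text), 0, 0, 400, 200):
        print(name, rx, ry, rw, rh, heat_color(cx, 20))
```

Each rectangle could then be drawn on a Tkinter canvas with `Canvas.create_rectangle(x, y, x + w, y + h, fill=color)`. A full treemap would apply the same split recursively per directory, alternating between vertical and horizontal strips.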