For the results of CPPX fact extraction to be generally useful, the facts must not only be in a published format but must also conform to a documented and publicly available schema, so that the types of entities and relationships, and the constraints on them, are as expected.

The CPPX output graph conforms to the data model and schema of Bell Canada's Datrix project. More precisely, the Abstract Syntax Graph (ASG) for C++, which is documented in "The ASG Reference Manual (version 1.4)", serves as the reference description of the output graph of CPPX.

The conceptual model (or "graph model") that the Datrix project assumes is essentially the enhanced E/R graph model supported by TA. There are entities and relationships. Entities are represented by graph vertices (nodes). Relationships are binary and are represented by graph edges. Both vertices and edges can have attributes, which are (named) strings or numbers. Every vertex and edge belongs to exactly one type, which determines the attribute names it may carry and, for edges, the kinds of connection allowed. The types are organized into a simple (single) inheritance hierarchy.
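The graph model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Datrix or CPPX implementation: the class names (`Type`, `EdgeType`, `Graph`) and the example types (`Entity`, `Scope`, `Function`, `Variable`, `contains`) are hypothetical and are not taken from the ASG Reference Manual.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union

AttrValue = Union[str, int, float]  # attributes are named strings or numbers


@dataclass(eq=False)
class Type:
    """A vertex or edge type in a single-inheritance hierarchy (hypothetical)."""
    name: str
    parent: Optional["Type"] = None
    attrs: Dict[str, type] = field(default_factory=dict)  # attribute name -> value type

    def is_a(self, other: "Type") -> bool:
        # Walk up the single parent chain looking for `other`.
        t: Optional["Type"] = self
        while t is not None:
            if t is other:
                return True
            t = t.parent
        return False

    def allowed_attrs(self) -> Dict[str, type]:
        # A type may carry its own attributes plus those of its ancestors.
        inherited = self.parent.allowed_attrs() if self.parent else {}
        return {**inherited, **self.attrs}


@dataclass(eq=False)
class EdgeType(Type):
    # An edge type also constrains which vertex types it may connect.
    source: Optional[Type] = None
    target: Optional[Type] = None


@dataclass(eq=False)
class Node:
    id: str
    type: Type
    attrs: Dict[str, AttrValue] = field(default_factory=dict)


@dataclass(eq=False)
class Edge:
    type: EdgeType
    src: Node
    dst: Node
    attrs: Dict[str, AttrValue] = field(default_factory=dict)


def check_attrs(t: Type, attrs: Dict[str, AttrValue]) -> None:
    allowed = t.allowed_attrs()
    for name, value in attrs.items():
        if name not in allowed:
            raise ValueError(f"type {t.name} has no attribute {name!r}")
        if not isinstance(value, allowed[name]):
            raise TypeError(f"attribute {name!r} must be {allowed[name].__name__}")


class Graph:
    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}
        self.edges: List[Edge] = []

    def add_node(self, node_id: str, node_type: Type, **attrs: AttrValue) -> Node:
        check_attrs(node_type, attrs)
        node = Node(node_id, node_type, attrs)
        self.nodes[node_id] = node
        return node

    def add_edge(self, edge_type: EdgeType, src: Node, dst: Node, **attrs: AttrValue) -> Edge:
        # Relationships are binary: exactly one source and one target vertex.
        if not (src.type.is_a(edge_type.source) and dst.type.is_a(edge_type.target)):
            raise ValueError(f"edge type {edge_type.name} cannot connect "
                             f"{src.type.name} to {dst.type.name}")
        check_attrs(edge_type, attrs)
        edge = Edge(edge_type, src, dst, attrs)
        self.edges.append(edge)
        return edge


# Hypothetical schema and a two-vertex graph.
entity = Type("Entity", attrs={"name": str})
scope = Type("Scope", parent=entity)
function = Type("Function", parent=scope, attrs={"line": int})
variable = Type("Variable", parent=entity)
contains = EdgeType("contains", source=scope, target=entity)

g = Graph()
f = g.add_node("n1", function, name="main", line=3)
v = g.add_node("n2", variable, name="argc")
g.add_edge(contains, f, v)
```

In this sketch the schema does the checking: adding a `contains` edge from the variable to the function would raise `ValueError`, because `Variable` is not a kind of `Scope`.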

The graph model is a subset of that of the GXL graph exchange language project, and hence Datrix graphs can be represented in GXL. A brief on GXL for CPPX, with an example output, is provided, along with some further examples.
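To give a rough idea of how a typed, attributed graph maps onto GXL, here is a minimal sketch that writes one using Python's standard `xml.etree.ElementTree`. It is not CPPX output and not the CPPX implementation. The GXL element names (`gxl`, `graph`, `node`, `edge`, `type`, `attr`, `string`, `int`, `float`) follow the GXL format; the helper names, the type names (`Function`, `Variable`, `contains`) and the `#Name` form of the type references are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("xlink", XLINK)  # so the href is written as xlink:href


def add_typed(parent, tag, xml_attrs, type_name, attrs):
    """Append a <node> or <edge> with its <type> and <attr> children."""
    elem = ET.SubElement(parent, tag, xml_attrs)
    # Assumption: types are referenced by a local fragment such as "#Function".
    ET.SubElement(elem, "type", {"{%s}href" % XLINK: "#" + type_name})
    for name, value in attrs.items():
        holder = ET.SubElement(elem, "attr", {"name": name})
        if isinstance(value, int):
            kind = "int"
        elif isinstance(value, float):
            kind = "float"
        else:
            kind = "string"
        ET.SubElement(holder, kind).text = str(value)
    return elem


def to_gxl(graph_id, nodes, edges):
    """nodes: (id, type name, attrs); edges: (type name, from id, to id, attrs)."""
    root = ET.Element("gxl")
    graph = ET.SubElement(root, "graph", {"id": graph_id})
    for node_id, type_name, attrs in nodes:
        add_typed(graph, "node", {"id": node_id}, type_name, attrs)
    for type_name, src, dst, attrs in edges:
        add_typed(graph, "edge", {"from": src, "to": dst}, type_name, attrs)
    return ET.tostring(root, encoding="unicode")


# Hypothetical two-vertex graph: a function that contains a variable.
nodes = [
    ("n1", "Function", {"name": "main", "line": 3}),
    ("n2", "Variable", {"name": "argc"}),
]
edges = [("contains", "n1", "n2", {})]

gxl_text = to_gxl("example", nodes, edges)
print(gxl_text)
```

Each vertex becomes a `node` element and each binary relationship an `edge` element with `from` and `to` references, while the type and the named attributes are carried as child elements.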

last update 2001 May 9 by AJM