project : Bookshelf
program : CFX ( C Fact X-tractor )
Author  : Gerry Kovan, University of Toronto

Table of Contents
-----------------
How to read this document .......................... 1.0
Terms defined in glossary .......................... 2.0
Purpose of Program ................................. 3.0
Program Usage ...................................... 4.0
Files created by cfx ............................... 5.0
Background Information ............................. 6.0
Description of CFX Relations ....................... 7.0
Description of CFX ................................. 8.0
Description of program cfx_make_trans .............. 9.0
Appendix 1:
Appendix 2:
Appendix 3:
Glossary

1.0 How to read this document:
------------------------------
This document is a combination of a man page and a technical document.
Sections 3.0 to 5.0 contain information about the purpose and usage of the program.
Section 6.0 discusses the necessary background information about relationships within C programs.
Section 7.0 lists all the relations that cfx extracts. It is probably a good idea to read section 6.0 before looking at section 7.0.
Section 8.0 gives a brief description of how the cfx program is built.
Section 9.0 explains the purpose of the 'cfx_make_trans' program and gives a demonstration of its usage.

2.0 Terms defined in glossary
-----------------------------
1) internal linkage
2) external linkage

3.0 Purpose of Program
----------------------
This program extracts import/export relations directly from C source code.

4.0 Program Usage
-----------------
Because cfx uses the gcc compiler modules as a front end, the input to cfx is similar to the input to gcc.
The gcc options that are relevant to cfx are the preprocessor options. For example, the -D option defines symbols. It is used when the source code contains conditional compilation statements and you want to build an executable based on these statements.
The -D option is also very useful for cfx, as it solves the conditional compilation problem when extracting facts.

The -c option in gcc simply compiles the source file and suppresses the linking stage. In cfx this option works similarly: the file is compiled ( in cfx, compile means to extract facts for a single file ) but not linked ( in cfx, link means to combine the facts for all the source code files and resolve certain fact information ).

The -o option in cfx works similarly to gcc as well. The -o option description taken from the gcc man page is:

    -o file
        Place output in file file. This applies regardless to whatever sort of output GCC is producing, whether it be an executable file, an object file, an assembler file or preprocessed C code.

In cfx a prefix or a suffix is added to the name of the output file specified by the -o option. The prefix ".cfx." is prepended to the file name if the -c option is specified on the command line as well ( e.g. cfx_cc -o bob.o -c main.c creates .cfx.bob.o ). If the -c option is not specified then the suffix ".cfx.mr" is appended to the file name.

5.0 Files created by cfx
------------------------
The files created by cfx parallel those created by gcc.

Compiling :

example 1:
----------
    gcc -c main.c       <== gcc creates an object file main.o
    cfx_cc -c main.c    <== cfx creates a fact file called .cfx.main.o

example 2:
----------
    gcc -o bob.o -c main.c       <== gcc creates an object file called bob.o
    cfx_cc -o bob.o -c main.c    <== cfx creates a fact file called .cfx.bob.o

Linking :

example 1:
----------
In the Makefile listed in Appendix 1, three .c files are compiled and their object files are linked together.
In gcc the commands to compile and link the three object files are as follows:

    gcc -c main.c                            <=== creates main.o
    gcc -c add.c                             <=== creates add.o
    gcc -c globalVars.c                      <=== creates globalVars.o
    gcc -o math main.o add.o globalVars.o    <== links the object files to produce an executable called 'math'

In cfx the process is similar:

    cfx_cc -c main.c          <== creates .cfx.main.o
    cfx_cc -c add.c           <== creates .cfx.add.o
    cfx_cc -c globalVars.c    <== creates .cfx.globalVars.o

The cfx link stage combines the cfx compiled fact files to produce a single fact file for the complete program. cfx appends the suffix ".cfx.mr" to the name specified as the output file. In the command below the output file is specified to be 'math', so the resulting fact file is math.cfx.mr.

    cfx_cc -o math .cfx.main.o .cfx.add.o .cfx.globalVars.o    <== cfx link stage

note: cfx also creates ".ifc" files. These can be ignored as they are not used for anything right now.

6.0 Background Information
--------------------------
In C programs there are two types of file dependencies:

1) link-time -- external linkage ( i.e. linking files )
2) source    -- internal linkage ( i.e. including files )

The following is an example demonstrating link-time dependencies and source dependencies :

file main.c:
------------
    /* declaration of function 'add' */
    extern int add ( int x, int y );

    #include "multiply.c"

    void main()
    {
        int a;
        int b;

        a = add ( 3, 4 );     <==== link dependency between file main.c and add.c
                                    - reference to function add is resolved at link time
        b = mult (3 , 4 );    <=== source dependency between file main.c and multiply.c
                                    - reference to function mult is resolved at compile time
    }

file add.c:
-----------
    /* definition of function 'add' */
    extern int add ( int x, int y )
    {
        return x + y;
    }

file multiply.c:
----------------
    /* declaration of function 'mult' */
    static int mult ( int x, int y );

    /* definition of function 'mult' */
    static int mult ( int x, int y )
    {
        return x * y;
    }

An executable of the above program is built using the following commands:

    gcc -c main.c                <== compile main.c to produce an object file called main.o
    gcc -c add.c                 <== compile add.c to produce an object file called add.o
    gcc -o math main.o add.o     <== link main.o and add.o to produce an executable called 'math'

In C, files can "import" and "export" in two ways: through "linking" (link dependency) and through "including" (source dependency).

A more complex example :

file main.c:
------------
    #include "globalVars.h"
    #include "add.h"
    #include "multiply.c"
    #include "staticVars.c"
    #include "common.h"

    void main()
    {
        extern float e;

        a = add ( 3, 4 );
        b = mult (3 , 4 );
        c = add ( 5 , 6 );
        d = mult ( 5 , 6 );
        e = div ( 7 , 8 );

        {
            BOOLEAN f;
            struct complex vector;
            enum colour paint;
            union typeOf h;

            f = TRUE;
            vector.real = 3.14;
            vector.imag = 6.28;
            paint = red;
        }
    }

file globalVars.h - interface to file 'globalVars.c'
----------------------------------------------------
    /* declarations of integer variables 'a' and 'b' */
    extern int a;
    extern int b;

file globalVars.c
-----------------
    /* definitions of integer variables 'a' and 'b' */
    int a;
    int b;

file staticVars.c
-----------------
    /* definitions of static integer variables 'c' and 'd' */
    static int c;
    static int d;

file add.h: - interface to file add.c
-------------------------------------
    /* declaration of function 'add' */
    extern int add ( int x, int y );

file add.c:
-----------
    /* definition of function 'add' */
    extern int add ( int x, int y )
    {
        return x + y;
    }

file multiply.c:
----------------
    /* declaration of function 'mult' */
    static int mult ( int x, int y );

    /* definition of function 'mult' */
    static int mult ( int x, int y )
    {
        return x * y;
    }

file divide.c:
--------------
    /* definition of variable e; by default it has external linkage */
    float e;

    /* definition of function 'div'; by default it has external linkage */
    float div ( int a, int b )
    {
        return a / b;
    }

file common.h:
--------------
    #define TRUE 1

    typedef int BOOLEAN;

    struct complex
    {
        float real;
        float imag;
    };

    enum colour { red , white , blue };

    union typeOf
    {
        int x;
        float y;
    };

See Appendix 1 for a listing of the Makefile used to build this program.
See Appendix 2 for the cfx Makefile created by the program 'cfx_make_trans'.

LINKING:
Files can export variables and functions with external linkage.
They can also import such variables and functions. An imported variable requires a declaration in the importing source file (the file that does the including), whereas an imported function may or may not have a declaration there.

export:
    - variables with external linkage definition
          file globalVars.c exports variable definitions 'a' and 'b'
    - functions with external linkage definition
          file add.c exports the definition of function 'add'

import:
    - variables with external linkage usage
          file main.c imports variables 'a' and 'b' since these variables are used/referenced in main.c
    - functions with external linkage usage
          file main.c imports function 'add' as it is used/called in the file

INCLUDING:

export:
    - variables with internal linkage definition
          file staticVars.c exports the variable definitions of 'c' and 'd' as staticVars.c is included by another file ( main.c )
    - functions with internal linkage definition
          file multiply.c exports the definition of function 'mult' since "multiply.c" is included by another file ( main.c )
    - variables with external linkage declaration
          file globalVars.h exports the declaration of variables 'a' and 'b' as globalVars.h is included in file main.c
    - functions with external linkage declaration
          file add.h exports the declaration of function 'add' as add.h is included in main.c

import:
    - variables with internal linkage usage
          file main.c imports 'c' and 'd' as these variables are used/referenced in main.c
    - functions with internal linkage usage
          file main.c imports 'mult' as this function is used/called in main.c
    - variables with external linkage usage
          file main.c imports 'e' as this variable is used/referenced in main.c
    - functions with external linkage usage
          file main.c imports 'div' as this function is used/called in main.c
    - code & variables

import/export:
    - typedefs
          file main.c imports type 'BOOLEAN' as it is used
          file common.h exports the definition of 'BOOLEAN'
    - macros
          file main.c imports macro 'TRUE' as it is used
          file common.h exports the definition of 'TRUE'
    - enums
          file main.c imports enum 'colour'
          file common.h exports the definition of 'colour'
    - enum tags
          file main.c imports enum tag 'red' as it is the only one used
          file common.h exports enum tags 'red', 'white' and 'blue'
    - structs
          file main.c imports struct 'complex'
          file common.h exports the definition of 'complex'
    - unions
          file main.c imports union 'typeOf'
          file common.h exports the definition of 'typeOf'
    - labels

See Appendix 3 for the CFX output for the above program.

7.0 Description of CFX Relations
--------------------------------
 0 EXPORT_PUBLIC_FCN_DEF,        /* definition of functions with external linkage */
 1 EXPORT_PUBLIC_VAR_DEF,        /* definition of variables with external linkage */
 2 EXPORT_STATIC_FCN_DEF,        /* definition of functions with internal linkage */
 3 EXPORT_STATIC_VAR_DEF,        /* definition of variables with internal linkage */
 4 EXPORT_PUBLIC_FCN_DECL,       /* declaration of functions with external linkage */
 5 EXPORT_PUBLIC_VAR_DECL,       /* declaration of variables with external linkage */
 6 EXPORT_STATIC_FCN_DECL,       /* declaration of functions with internal linkage */
 7 EXPORT_MACRO,                 /* macros (#define) */
 8 EXPORT_TYPE,                  /* typedefs */
 9 EXPORT_ENUM_TAG,              /* enumerated type tag */
10 EXPORT_ENUM_ITEM,             /* enumerated type item */
11 EXPORT_STRUCT,                /* struct name */
12 EXPORT_UNION,                 /* union name */
13 IMPORT_EXPORT_DIVIDER,
14 INCLUDE_FILE,                 /* file included (#include) */
15 IMPORT_IMPLEMENT_PUBLIC_FCN,  /* a definition implements a declaration */
16 IMPORT_IMPLEMENT_PUBLIC_VAR,  /* a definition implements a declaration */
17 IMPORT_IMPLEMENT_STATIC_FCN,  /* a definition implements a declaration */
18 IMPORT_IMPLEMENT_STATIC_VAR,  /* a definition implements a declaration */
19 IMPORT_PUBLIC_FCN_DEF,        /* use a definition in another file */
20 IMPORT_PUBLIC_VAR_DEF,        /* use a definition in another file */
21 IMPORT_PUBLIC_FCN_DECL,       /* use a declaration in another file */
22 IMPORT_PUBLIC_VAR_DECL,       /* use a declaration in another file */
23 IMPORT_STATIC_FCN_DEF,        /* use a static definition in another file */
24 IMPORT_STATIC_VAR_DEF,        /* use a static definition in another file */
25 IMPORT_LINK_FCN,              /* link time importation of functions */
26 IMPORT_LINK_VAR,              /* link time importation of variables */
27 IMPORT_EXTERN_FCN,            /* fcn not defined in analyzed code */
28 IMPORT_EXTERN_VAR,            /* var not defined in analyzed code */
29 IMPORT_CODE_AND_VARS,         /* use code and local vars in another file */
30 IMPORT_MACRO,                 /* macros (#define) */
31 IMPORT_TYPE,                  /* typedefs */
32 IMPORT_ENUM_TAG,              /* enumerated type tag */
33 IMPORT_ENUM_ITEM,             /* enumerated type item */
34 IMPORT_STRUCT,                /* struct name */
35 IMPORT_UNION,                 /* union name */

/* the following relations are temporary and are resolved into other
   relations during various stages of the cfx compilation process. */

36 EXPORT_PUBLIC_FCN_DEF_END,    /* end of EXPORT_PUBLIC_FCN_DEF */ /*SS*/
37 EXPORT_STATIC_FCN_DEF_END,    /* end of EXPORT_STATIC_FCN_DEF */ /*SS*/
38 EXPORT_ENUM_TAG_END,          /* end of enumerated type tag */ /*SS*/
39 EXPORT_STRUCT_END,            /* end of struct definition */ /*SS*/
40 IMPORT_FCN_REF_TEMP,          /* resolved to IMPORT_LINK_FCN_TEMP */
41 IMPORT_VAR_REF_TEMP,          /* resolved to IMPORT_LINK_VAR_TEMP */
42 IMPORT_LINK_FCN_TEMP,         /* resolved to IMPORT_LINK_FCN or IMPORT_EXTERN_FCN */
43 IMPORT_LINK_VAR_TEMP,         /* resolved to IMPORT_LINK_VAR or IMPORT_EXTERN_VAR */
44 NO_KIND

8.0 Description of CFX
----------------------
CFX is modeled after the GCC compiler. In fact CFX uses the GCC front end ( i.e. GCC modules are used for parsing the C code ) in order to extract the facts.

GCC executables              CFX executables
---------------              ---------------
cc  - driver of compiler     cfx_cc  - driver
cpp - preprocessor           cfx_cpp - extracts preprocessor stage info ( i.e. MACROS )
cc1 - semantic analysis      cfx_cc1 - does most of the work
      and codegen
ld  - linker                 cfx_ld  - resolves external var refs and fcn calls
                             cfx_make_trans - translates a make file into a CFX make file

The following is a list of the relations that each of the executables extracts:

cfx_cpp - extracts EXPORT_MACRO, IMPORT_MACRO, INCLUDE_FILE

cfx_cc1 - extracts IMPORT_ENUM_ITEM, IMPORT_TYPE,
                   IMPORT_VAR_REF_TEMP, IMPORT_FCN_REF_TEMP,
                   IMPORT_ENUM_TAG, IMPORT_STRUCT, IMPORT_UNION,
                   IMPORT_IMPLEMENT_PUBLIC_FCN, IMPORT_IMPLEMENT_PUBLIC_VAR,
                   IMPORT_IMPLEMENT_STATIC_FCN, IMPORT_IMPLEMENT_STATIC_VAR,
                   IMPORT_PUBLIC_FCN_DEF, IMPORT_PUBLIC_VAR_DEF,
                   IMPORT_STATIC_FCN_DEF, IMPORT_STATIC_VAR_DEF,
                   IMPORT_PUBLIC_FCN_DECL, IMPORT_PUBLIC_VAR_DECL,

                   IMPORT_FCN_REF_TEMP gets resolved to IMPORT_LINK_FCN_TEMP,
                   IMPORT_VAR_REF_TEMP gets resolved to IMPORT_LINK_VAR_TEMP

                   EXPORT_PUBLIC_FCN_DECL, EXPORT_STATIC_FCN_DECL,
                   EXPORT_PUBLIC_FCN_DEF, EXPORT_STATIC_FCN_DEF,
                   EXPORT_PUBLIC_VAR_DEF, EXPORT_STATIC_VAR_DEF,
                   EXPORT_STRUCT, EXPORT_UNION, EXPORT_TYPE,
                   EXPORT_ENUM_ITEM, EXPORT_ENUM_TAG

                   EXPORT_PUBLIC_FCN_DEF_END, EXPORT_STATIC_FCN_DEF_END,
                   EXPORT_ENUM_TAG_END, EXPORT_STRUCT_END, EXPORT_UNION_END

cfx_ld  - resolves the temporary link relations:

                   IMPORT_LINK_FCN_TEMP extracted from the cfx_cc1 stage is resolved into one of:
                       1) IMPORT_LINK_FCN
                       2) IMPORT_EXTERN_FCN

                   IMPORT_LINK_VAR_TEMP extracted from the cfx_cc1 stage is resolved into one of:
                       1) IMPORT_LINK_VAR
                       2) IMPORT_EXTERN_VAR

9.0 Description of program cfx_make_trans
------------------------------------------
This program translates a make file to a cfx type make file.

Usage : cfx_make_trans

The output of this program is .cfx

Note: Usually some hand editing is required to get the .cfx to work properly.

example Makefile:

    GNUDIR = /homes/d4/west/gnu
    CFLAGS = -c
    LFLAGS = -o
    CC = gcc

    a.out: main.o add.o
            $(CC) $(LFLAGS) a.out main.o add.o

    main.o: main.c
            $(CC) $(CFLAGS) main.c

    add.o: add.c add.h
            $(CC) $(CFLAGS) add.c

Makefile.cfx that was created by cfx_make_trans :

    # Added by cfx_make_trans

    # Added to autocreate .cfx files from .c files
    .cfx.%.o: %.c
            $(COMPILE.c) $< $(OUTPUT_OPTION)

    # Added to override cc, ld and ar
    CC = cfx_cc
    LD = cfx_ld
    AR = cfx_ar

    # End of additions made by cfx_make_trans

    GNUDIR = /homes/d4/west/gnu
    CFLAGS = -c
    LFLAGS = -o
    CC = gcc                              <====================== this should be edited out

    a.out: .cfx.main.o .cfx.add.o         <======= linking stage
            $(CC) $(LFLAGS) a.out .cfx.main.o .cfx.add.o

    .cfx.main.o: main.c                   <===== compilation
            $(CC) $(CFLAGS) main.c

    .cfx.add.o: add.c add.h               <===== compilation
            $(CC) $(CFLAGS) add.c

This cfx make file will create one fact file for each of the compilations, i.e. .cfx.main.o for main.c and .cfx.add.o for add.c. It will also create a file called 'a.out.cfx.mr' during the linking stage ( this file name is determined by taking the file name of the executable that would be created, i.e. 'a.out', and adding a '.cfx.mr' extension ).
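
The file naming rules from sections 4.0 and 5.0 and from the example above can be summarized in a few lines of C. This is only an illustrative sketch, not code taken from cfx: the function name cfx_output_name and its parameters are hypothetical, and it assumes that the output name is a plain file name without a directory part.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of cfx): build the fact file name that
 * this document describes for a gcc-style output name.
 *   compile_only != 0 (-c given) : ".cfx." is put in front of the name
 *   compile_only == 0 (linking)  : ".cfx.mr" is added after the name
 * Assumes 'name' has no directory component.
 * Returns 0 on success, -1 if 'buf' is too small for the result. */
int cfx_output_name(const char *name, int compile_only, char *buf, size_t size)
{
    const char *prefix = compile_only ? ".cfx." : "";
    const char *suffix = compile_only ? "" : ".cfx.mr";

    assert(name != NULL && buf != NULL);

    /* +1 leaves room for the terminating NUL character */
    if (strlen(prefix) + strlen(name) + strlen(suffix) + 1 > size)
        return -1;

    snprintf(buf, size, "%s%s%s", prefix, name, suffix);
    return 0;
}
```

With this sketch, cfx_output_name("bob.o", 1, buf, sizeof buf) fills buf with ".cfx.bob.o", and cfx_output_name("a.out", 0, buf, sizeof buf) fills it with "a.out.cfx.mr", which matches the file names shown in this document.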

Appendix 1: simplified Makefile
--------------------------------
    CC = gcc
    CFLAGS = -c

    math: main.o divide.o add.o globalVars.o
            $(CC) -o math main.o divide.o add.o globalVars.o

    main.o: main.c
            $(CC) $(CFLAGS) main.c

    divide.o: divide.c
            $(CC) $(CFLAGS) divide.c

    add.o: add.c
            $(CC) $(CFLAGS) add.c

    globalVars.o: globalVars.c
            $(CC) $(CFLAGS) globalVars.c

Appendix 2: CFX Makefile - created by cfx_make_trans
------------------------------------------------------
    # Added by cfx_make_trans

    # Added to autocreate .cfx files from .c files
    .cfx.%.o: %.c
            $(COMPILE.c) $< $(OUTPUT_OPTION)

    # Added to override cc, ld and ar
    CC = cfx_cc
    LD = cfx_ld
    AR = cfx_ar

    # End of additions made by cfx_make_trans

    CC = gcc            <======== this line was deleted
    CFLAGS = -c

    math: .cfx.main.o .cfx.divide.o .cfx.add.o .cfx.globalVars.o
            $(CC) -o math .cfx.main.o .cfx.divide.o .cfx.add.o .cfx.globalVars.o

    .cfx.main.o: main.c
            $(CC) $(CFLAGS) main.c

    .cfx.divide.o: divide.c
            $(CC) $(CFLAGS) divide.c

    .cfx.add.o: add.c
            $(CC) $(CFLAGS) add.c

    .cfx.globalVars.o: globalVars.c
            $(CC) $(CFLAGS) globalVars.c

Appendix 3: file math.cfx.mr - this is the output of CFX
----------------------------------------------------------
# File initialized at Thu Nov 20 12:58:47 1997
#
# The File Table
11
main.c main.c
*Initialization* *Initialization*
globalVars.h globalVars.h
add.h add.h
multiply.c multiply.c
staticVars.c staticVars.c
common.h common.h
add.c add.c
globalVars.c globalVars.c
divide.c divide.c
External-Library External-Library
#
# Imports/Exports from main.c
#
#      File  Subject      Object
#Kind  Attr  File Line    File Line    Scope Name  Item Name
#----  ----  ---- ------  ---- ------  ----------  ---------
main.c main.c 24
0      1     1    8       -1   -1      -           main
36     1     1    29      -1   -1      -           main
14     1     1    1       3    -1      -           globalVars.h
22     1     1    -1      3    3       -           a
22     1     1    -1      3    4       -           b
14     1     1    2       4    -1      -           add.h
21     1     1    -1      4    3       -           add
14     1     1    3       5    -1      -           multiply.c
23     1     1    -1      5    8       main        mult
14     1     1    4       6    -1      -           staticVars.c
24     1     1    -1      6    3       main        c
24     1     1    -1      6    4       main        d
14     1     1    5       7    -1      -           common.h
30     1     1    -1      7    1       -           TRUE
31     1     1    -1      7    3       main        BOOLEAN
32     1     1    -1      7    11      main        colour
33     1     1    -1      7    11      main        red
34     1     1    -1      7    6       main        complex
35     1     1    -1      7    16      main        typeOf
25     1     1    -1      8    4       main        add
26     1     1    -1      9    3       main        a
26     1     1    -1      9    4       main        b
25     1     1    -1      10   8       main        div
26     1     1    -1      10   3       main        e
#
# Imports/Exports from *Initialization*
#
#      File  Subject      Object
#Kind  Attr  File Line    File Line    Scope Name  Item Name
#----  ----  ---- ------  ---- ------  ----------  ---------
*Initialization* *Initialization* 12
7      2     2    1       -1   -1      -           __GCC_NEW_VARARGS__
7      2     2    1       -1   -1      -           __GNUC_MINOR__
7      2     2    1       -1   -1      -           __GNUC__
7      2     2    1       -1   -1      -           __sparc
7      2     2    1       -1   -1      -           __sparc__
7      2     2    1       -1   -1      -           __sun
7      2     2    1       -1   -1      -           __sun__
7      2     2    1       -1   -1      -           __unix
7      2     2    1       -1   -1      -           __unix__
7      2     2    1       -1   -1      -           sparc
7      2     2    1       -1   -1      -           sun
7      2     2    1       -1   -1      -           unix
#
# Imports/Exports from globalVars.h
#
#      File  Subject      Object
#Kind  Attr  File Line    File Line    Scope Name  Item Name
#----  ----  ---- ------  ---- ------  ----------  ---------
globalVars.h globalVars.h 2
5      3     3    3       -1   -1      -           a
5      3     3    4       -1   -1      -           b
#
# Imports/Exports from add.h
#
#      File  Subject      Object
#Kind  Attr  File Line    File Line    Scope Name  Item Name
#----  ----  ---- ------  ---- ------  ----------  ---------
add.h add.h 1
4      4     4    3       -1   -1      -           add
#
# Imports/Exports from multiply.c
#
#      File  Subject      Object
#Kind  Attr  File Line    File Line    Scope Name  Item Name
#----  ----  ---- ------  ---- ------  ----------  ---------
multiply.c multiply.c 3
2      5     5    8       -1   -1      -           mult
6      5     5    3       -1   -1      -           mult
37     5     5    10      -1   -1      -           mult
#
# Imports/Exports from staticVars.c
#
#      File  Subject      Object
#Kind  Attr  File Line    File Line    Scope Name  Item Name
#----  ----  ---- ------  ---- ------  ----------  ---------
staticVars.c staticVars.c 2
3      6     6    3       -1   -1      -           c
3      6     6    4       -1   -1      -           d
#
# Imports/Exports from common.h
#
#      File  Subject      Object
#Kind  Attr  File Line    File Line    Scope Name  Item Name
#----  ----  ---- ------  ---- ------  ----------  ---------
common.h common.h 11
7      7     7    1       -1   -1      -           TRUE
8      7     7    3       -1   -1      -           BOOLEAN
9      7     7    11      -1   -1      -           colour
10     7     7    13      -1   -1      -           blue
10     7     7    11      -1   -1      -           red
10     7     7    12      -1   -1      -           white
11     7     7    6       -1   -1      -           complex
12     7     7    16      -1   -1      -           typeOf
38     7     7    13      -1   -1      -           colour
39     7     7    9       -1   -1      -           complex
40     7     7    19      -1   -1      -           typeOf
#
# Imports/Exports from add.c
#
#      File  Subject      Object
#Kind  Attr  File Line    File Line    Scope Name  Item Name
#----  ----  ---- ------  ---- ------  ----------  ---------
add.c add.c 2
0      8     8    4       -1   -1      -           add
36     8     8    6       -1   -1      -           add
#
# Imports/Exports from globalVars.c
#
#      File  Subject      Object
#Kind  Attr  File Line    File Line    Scope Name  Item Name
#----  ----  ---- ------  ---- ------  ----------  ---------
globalVars.c globalVars.c 2
1      9     9    3       -1   -1      -           a
1      9     9    4       -1   -1      -           b
#
# Imports/Exports from divide.c
#
#      File  Subject      Object
#Kind  Attr  File Line    File Line    Scope Name  Item Name
#----  ----  ---- ------  ---- ------  ----------  ---------
divide.c divide.c 3
0      10    10   8       -1   -1      -           div
1      10    10   3       -1   -1      -           e
36     10    10   10      -1   -1      -           div

Glossary
--------
The following terms are used throughout the document and are explained here more thoroughly.

internal linkage
    - procedures and variables declared as 'static', which means they are local to the file they are in or are included in.
    - internal linkages are resolved during the compilation stage

    e.g.

    file main.c:
    ------------
        static int add ( int a, int b );

        #include "add.c"

        void main()
        {
            int sum;

            x = 4;
            sum = add ( 4,3 );
        }

    file add.c:
    -----------
        static int x;

        static int add ( int a, int b )
        {
            return ( a+b );
        }

    The above would be compiled and linked as follows:

        gcc -o add main.c    <== compiles and builds an executable called 'add'

    The references to variable 'x' and function 'add' located in main.c are resolved during the compilation stage, therefore the references to their definitions have 'internal linkage'.

external linkage
    - procedures and variables declared as 'extern' and used in other files that are linked together
    - external linkages are resolved during the linking stage

    e.g.

    file main.c:
    ------------
        extern int x;
        extern int add ( int x, int y);

        void main()
        {
            int sum;

            x = 4;
            sum = add ( x,4 );
        }

    file add.c:
    -----------
        int x;

        int add ( int x, int y )
        {
            return ( x+y );
        }

    The above program would be compiled and linked as follows:

        gcc -c main.c               <== compiles main.c to create an object file main.o
        gcc -c add.c                <== compiles add.c to create an object file add.o
        gcc -o add main.o add.o     <== links main.o and add.o to create an executable called 'add'

    The reference to variable 'x' and the call to function 'add' in main.c are resolved during the linking stage, therefore the references to their definitions have 'external linkage'.
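
The link-stage resolution that section 8.0 attributes to cfx_ld can be pictured with a small C sketch. It assumes, in line with the relation comments in section 7.0, that a reference becomes IMPORT_LINK_FCN or IMPORT_LINK_VAR when one of the analyzed files exports a definition with external linkage of the same name, and IMPORT_EXTERN_FCN or IMPORT_EXTERN_VAR otherwise. The type export_def and the function resolve_link_ref are hypothetical names used only for illustration; the actual cfx_ld implementation may work differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Relation kinds, numbered as in the list in section 7.0. */
enum {
    EXPORT_PUBLIC_FCN_DEF = 0,
    EXPORT_PUBLIC_VAR_DEF = 1,
    IMPORT_LINK_FCN       = 25,
    IMPORT_LINK_VAR       = 26,
    IMPORT_EXTERN_FCN     = 27,
    IMPORT_EXTERN_VAR     = 28
};

/* Hypothetical record: one exported definition found in the analyzed files. */
struct export_def {
    int         kind;   /* EXPORT_PUBLIC_FCN_DEF or EXPORT_PUBLIC_VAR_DEF */
    const char *name;   /* name of the function or variable */
};

/* Hypothetical helper: resolve a temporary link reference to 'name'.
 * is_fcn != 0 means the reference is a function call, otherwise a
 * variable reference. Returns IMPORT_LINK_* if a matching definition
 * with external linkage is among 'exports', else IMPORT_EXTERN_*. */
int resolve_link_ref(const struct export_def *exports, size_t count,
                     const char *name, int is_fcn)
{
    int wanted = is_fcn ? EXPORT_PUBLIC_FCN_DEF : EXPORT_PUBLIC_VAR_DEF;
    size_t i;

    assert(name != NULL);

    for (i = 0; i < count; i++) {
        if (exports[i].kind == wanted && strcmp(exports[i].name, name) == 0)
            return is_fcn ? IMPORT_LINK_FCN : IMPORT_LINK_VAR;
    }
    /* not defined in the analyzed code */
    return is_fcn ? IMPORT_EXTERN_FCN : IMPORT_EXTERN_VAR;
}
```

Applied to the external linkage example above, where add.c exports the definitions of 'x' and 'add', the call to 'add' in main.c would resolve to IMPORT_LINK_FCN, while a call to a function that none of the analyzed files defines would resolve to IMPORT_EXTERN_FCN.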