The Jack Compiler Compiles a subset of C to the Jackal 3.0 ISA Written in January-April 2005 (C) 2005 Nick Johnson I would like to give special thanks to Murali Nethi, Deborah Park and David Arlen for their help in testing jackcc. Also, special thanks is given to Dr. John Lach of the University of Virginia for using jackcc to create test cases in his class ECE 436 Advanced Digital Design. Table of Contents 0. License, Disclaimer, etc. 1. How to compile 2. Command line usage 3. Subset of C 4. Language Extensions 5. Multiplication 6. About generated code 7. Assembler 8. To Do List 9. How to report a bug 10. Known bugs 11. Example glue script 0. License, Disclaimer, etc. The Jack C Compiler is copyright (C) 2005 by Nick Johnson. This software is released under the GNU GPL. The license document is available at http://www.gnu.org/licenses/gpl.txt To obtain a copy of the program source code, go to http://jackcc.sourceforge.net/ Bug reports and suggestions are welcomed and encouraged. Report these bugs on the Sourceforge webpage. DISCLAIMER: This software has no warranty at all. Even if you heard me claim that it works. This software might destroy your files, render your computer unbootable and insult your mother. I will not be held responsible for any use of this software. WARNING: Read over the assembly language output that this generates. Some of the code generation features are still being debugged (especially support for recursive functions). I especially encourage you to read the section on multiplication. 1. How to Compile: UNIX: Simply type: make You need to have lex and yacc installed on the compiling machine. I think the scripts are compatible with all versions of these tools, but I have not checked (known to work with GNU Flex 2.5.4, and GNU Bison 1.875d). If you do not have those programs, then use the copies included in the distribution: cp dist-y.tab.h y.tab.h cp dist-y.tab.c y.tab.c cp dist-lex.yy.c lex.yy.c make Some versions of yacc do not like the -y flag. If this is the case, remove the -y flag within makefile. Do note that the purpose of the -y flag is to coerce yacc into outputting a file named y.tab.c instead of parser.y.c. Similarly, some versions of lex do not like the -I flag (SunOS 5.9 comes to mind). Finally, as mentioned in the Known Bugs section, jackcc will compile but fail to operate properly on 64-bit architectures (I made an assumption somewhere that sizeof(int)==sizeof(void*) ). Compilation should produce an executable called 'jackcc'. You can run a regression test of the built executable against developer test cases by running: make test Which will produce a test-results file. This file should basically say "Test Passed" for each test case. WINDOWS: I don't know anything about windows. Perhaps someone in the class would be gracious enough to try to compile this for windows. MACINTOSH: David Arlen reports that jackcc will compile on MacOS X by removing the #include "malloc.h" lines. 2. Command Line Usage: jackcc [options] outfile.asm Reads a program written in a subset of C (described below), and writes out an assembly language listing to actualize that specification. a. Code Generator Options: -sasm Use the SASM assembler instead of the default JackAS assembler. Jack can produce better code if it's using a better assembler, but SASM isn't smart enough to do a few things. Check it out on the sourceforge page. -recurse Write code that allows for recursive functions (i.e. stack frame). There is a big performance hit for enabling this. Do you *really* need recursive functions? Opt for iterative alternatives! The compiler will tell you if it thinks you need this. -r n Specify that the ISA includes n general purpose registers (default 12). We can make better code if we have more registers. -mul Specify that there is a multiply instruction. -jreg Specify that there is a jump to register instruction. This can simplify many constructs produced by the compiler. -one n Specify that special purpose register n always holds the constant 1. The constant 1 is generally the most common constant used by any program. -zero n Similar to -one, but to specify a register to hold the constant 0. b. Optimization Options: (Note that for each of these, there is also a negative command. i.e. the opposite of -inline is -noinline) -noopts Turn off all optimizations. Individual optimizations can be turned on by subsequent flags. -max All optimizations (except -cr). This is a bare minimum that you should use and is enabled by default. -v2r Attempt to put as many local variables and function formals into a register as is possible. This is a huge performance booster. This is on by default. -nov2r Turn off the -v2r flag. Instead, each local and formal will be stored into a static memory location or frame location. -cr Try to improve register usage by allocating constants to registers. I.E. if jackcc compiles a program, and then realizes that it never used some of the registers, then it will automatically re-compile it, and put common constants or offsets into those spare registers. It makes compilation slower, but not noticeably so. This is on by default. -inline Enable function inlining, where appropriate. -cf Optimize using constant folding. Any value that can be reasonably determined at compile time will be calculated and converted to a constant. Additionally, this option enables a lot of algebraic simplifications. -csee Optimize using common subexpression elimination. Values that are calculated multiple times in a small area are saved so that they are only computed one. For example, array indeces. -ris Optimize using reduction in strength. Expensive operations such as multiply are replaced with inexpensive operations such as bit-shift. -dce Optimize using dead code elimination. The compiler tries to find and remove unreachable code. -peep Optimize using a peephole optimizer. This will make one additional final sweep to look for minor optimizations. This makes a big improvement. c. Other Options: -html Write compiler output in HTML. It looks ugly, but will allow this program to be hosted on a web site. You'd still need to write a script to accept/sanitize an uploaded program. -sz n Set the size of the symbol table (default 4096). You shouldn't have to change this. I was lazy. With extraordinarily large programs (those too large to fit in Jackal's address space) you might get a compiler error, asking you to increase this. -slop Report type mismatch errors as warnings. -noisy Print a lot of debugging information to stderr.. -help Display usage information. d. Debugging options: These are, for the most part, only useful for the compiler writer. If you're kind of geeky, you might enjoy the -st -dag -cg flags used in conjunction. Personally, I prefer the -max -quads flags. -st Print the symbol table as a dot vector format drawing. -dag Print the directed acyclic graph as a dot vector format drawing. -cg Print the call graph as a dot vector format drawing. -histo Print a symbol usage histogram. Mostly for geeky interest. -quads Print the quad representation of the program. Quads are cool; it's like register transfer notation, and you pretend you have an infinite number of registers. Algorithms become nicely visible. -offsets Print out parameter, formal offsets, as well as structure alignment information. 3. A Subset of C? What do I lose? First remember, this is C and not C++. Because of that (not a limitation of the compiler) there are no classes, overloaded functions, templates, references, etc., etc. The following things are not included: * No support for char, short, long, float or double types (the ISA doesn't support them anyway). Additionally, const, register, automatic, static, signed, unsigned and volatile modifiers aren't included. * void is actually a synonym for int. Though this seems really strange, it doesn't change your code at all :) * No strings, not even the "blah blah" syntax. * No preprocessor, though these are standard and easy to find. The #line directive is recognized. As a result, #include, #define, comments, and other such things will not work. Just pass your code through an existing C preprocessor for the same effect. See the section on glue scripts. * No switch statement. Use if-else if-else... * No goto statement or labels. This will never be supported as far as I'm concerned. * No enumerations. Enumerations are easily replaced with #define directions (see note about preprocessor). * No standard library. You can't use printf() or malloc() or sin()... See the additional features section about the heap variable. * No Separate compilation. Everything must be in a single file when it gets to jack. * While you can perform bitshift by a constant number of bits, shifting by an arbitrary number of bits is not yet implemented. E.g., int a, b; a <<= 1; // ok a <<= b; // not ok. * No ?: operator. E.g., int a,b; a = (b>10) ? b : 10; // not ok if( b > 10 ) // ok a = b; else a = 10; * No typedefs. Just write it out the long way. * No function pointers. There is very little hope that I will ever implement this. * No variadic argument functions. I doubt you are crying too much about this. If this list is incomplete, please tell me. 4. Have you added anything to C? Yes. a. Additional Keywords Since there is no standard library for Jackal, it is hard to achieve dynamic memory. Thus, I have added the builtin 'heap' variable. The heap variable is a global integer variable. The special property of the heap variable is that the compiler places it LAST in memory. Thus, the programmer can be assured that memory after the address of the heap variable is unused and can be written to freely. An example which uses the heap variable for dynamic memory can be found in testcases/test-heap.jack. b. Operators to Leverage Special Instructions Jackal is a slow, minimal instruction set. While jackcc tries to be ANSI compliant, I also realize that there is room for additional operators that let the programmer use as few instructions as possible. b.1. NAND -- Bitwise inverse of And Since the Jackal ISA supports a bitwise NAND instruction, it seems like a waste to not give the programmer direct access to it. Thus, * The ~& operator calculates the bitwise NAND of two values. It's precedence is equal to that of & or |, and is left-associative. * The ~&= operator is just like += or *=, but with the bitwise NAND operation. It's predecence is equal to that of += or *=, and is right-associative. b.2. SRA -- Shift Right Arithmetic Since the Jackal ISA has an arithmetic right shift instruction but not a logical right shift, and as the C standard states that >> is logical shift right, I have added an arithmetic shift right operator. * The >>> operator calculates the arithmetic shift right of two values. Its precedence is equal to >> and is left-associative. * The >>>= operator is just like >>=, except with arithmetic right shift. It's precedence is equal to that of += or >>=, and it is right-associative. NOTE: The >> operator requires four instructions: the SRA instruction, a LIL/LIH pair for the mask, and then an AND instruction against that mask. The >>> operator requires only the SRA instruction. Thus, >>> is faster. Use >>> when you can instead of >>. 5. What about multiplication and the like? The Jackal 3.0 ISA does not include multiply, divide, or modulo instructions. You can use the mutiplication, division and modulo operators however much you like. The compiler will automatically replace these with function calls to multiply_operator(), division_operator() and modulo_operator() (which you must write yourself). For example: int m, n; int a,b,c; a = m * n; b = m / n; c = m % n; Is almost equivalent to: int m, n; int a,b,c; a = multiply_operator(m,n); b = divide_operator(m,n); c = modulo_operator(m,n); The only difference is that the compiler may be able to perform additional optimization to remove the multiplication, division or modulo from your code if you use the operators. For example, if -ris (reduction in strength) is enabled, multiplication of a variable by a constant is replaced by a seqence of bit-shifts and additions automatically for you. If -cf (constant folding) is enabled, then any multiplication of two constants can be replaced at compile time. The -cf and -ris flags are enabled by default. You must define your own version of the multiply_operator(), divide_operator() and modulo_operator() functions. Here is an example: int multiply_operator(int a, int b) { int sum; for(sum=0; b && a; a<<=1, b>>>=1) if( b & 1 ) sum += a; return sum; } Or, if you have implemented a multiply instruction, use the -mul flag. 6. A note on the code Jack generates: It is very important that you read over all code that Jack generates; I think it will produce correct code, but I wouldn't bet $10 on it. Comments with line numbers are inserted directly into the code to ease auditing the code. Jack tries to create correct and optimized code. If this is not the case, you should contact me so I can fix it. a. Loops: In all loops, the conditional expression is moved to the bottom of the loop. For example: for(i=0; i<10; i++) body; becomes: i=0; goto enter_100 top_100: body continue_100: i++ enter_100: if( i<10 ) goto top_100 break_100: Break and continue statements in the loop body are then turned into gotos to the respective break or continue labels. There is good reason behind this. If we had kept the conditional at the top of the loop, then every loop execution would have, at a minimum, two jumps: the conditional jump near the top and an unconditional at the bottom returning control flow to the conditional. Under Jack's model, we eliminate one of these jumps in the loop body. This can drastically reduce pipeline stalls, assuming that the loop executes many times. b. Function calls: Jackcc will try to put local variables and formal parameters into registers to decrease memory traffic. However, there are still memory locations created for these. For example, if a function foo() has a formal x and a local named y, then there are memory locations formal_x_foo and local_y_foo in the data area. These locations are used to save and restore values before or after a function call, and to spill to if there aren't enough registers. The functions return value is stored in register R0. The recursive code generator is still experimental. It builds a stack frame, using register R1 as the stack pointer and R2 as the frame pointer. Parameters and locals are stored in the stack frame, as well as the return address. c. Function Returns: Since there is no stack and no return instruction, the return address is stored in a static location raddr_foo when using non-recursive code generation. Further, because there is no indirect jump instruction, instructions are added to determine the appropriate jump destination. Self-modifying code is another alternative, but can cause problems if the modified instruction is already in the processor pipeline, and thus is not default. Additionally, if there is a jump indirect instruction, then jack can use that. In other words, function calls are very expensive on the Jackal ISA. Jackcc will try to inline functions when appropriate. 7. Where can I get an assembler? I've written an assembler, jackas, as an alternative to SASM. It's better, and jackcc can work with it's advanced features to write better code. Further, jackas is a command line utility in both assembler and simulator mode, so it can be scripted effectively. This assembler is available from the jackcc website, http://jackcc.sourceforge.net. There is a link on Jack's website (http://jackcc.sourceforge.net) to an assembler that runs in the .NET environment. 8. Todo List The following features will be added soon (in roughly this order): * Allow passing a formal to an inlined function by register instead of by memory. * Write a guide document about how to modify jackcc / jackas. * Only write a storage location for a variable if it is used. * Allow independant storage locations to share space (for example, if foo() and bar() are never active at the same time, then the locals from foo() can overlap the locals from bar() ). This will apply to spill variables, formals, locals and return addresses. Thus, it might be worth creating a linked list of all storage locations used by a function. * Shift by an arbitrary amount (i.e. a << b). * Function returns via self modifying code. * Array construction syntax. I.E. int *p = {1, 2, 3, 4}; * Better error reports. * Better code generation, optimization. * Clean up the internals: - Turn BST type into AVL type. 9. How to report a bug. Go to Jack's sourceforge page (http://www.sf.net/projects/jackcc) and enter a bug into the bug tracker. Please include the source file, a list of the command line options, and a description of the bug (i.e. it segfaults, it produces incorrect code, etc). 10. Known bugs. * Jackcc does not work correctly when compiled on 64-bit architectures. * Code generated with the -recursive option probably does not work. * Pointer addition is sometimes funny... * Register colorer fails whenever it needs to spill! 11. Example glue scripts The following script will take a C program from standard input, preprocess it, pass it through the compiler and then assemble it to an Intex Hex file, written to standard output. It assumes that the jackcc binary is located in the subdirectory jackcc/, and that the jackas binary is located in the subdirectory jackas/ #!/bin/bash # file: jack pp=$(mktemp) asm=$(mktemp) cpp >$pp jackcc/jackcc <$pp >$asm jackas/jackas <$asm rm $pp $asm This script file will read a C program from standard input, and then simulate it, showing only the contents of memory after the program halts: #!/bin/bash # file: jack-sim-dump pp=$(mktemp) asm=$(mktemp) cpp >$pp jackcc/jackcc <$pp >$asm jackas/jackas -sim <$asm |grep DUMP rm $pp $asm This script will compile the C program as above, but will tell you the final value of regster R0: #!/bin/bash # file: jack-sim-time pp=$(mktemp) asm=$(mktemp) cpp >$pp jackcc/jackcc <$pp >$asm jackas/jackas -sim <$asm |grep R0 |tail -n 1 rm $pp $asm This script will compile the C program as above, but will tell you how many instructions were executed before the program halted: #!/bin/bash # file: jack-sim-time pp=$(mktemp) asm=$(mktemp) cpp >$pp jackcc/jackcc <$pp >$asm jackas/jackas -sim <$asm |grep TIME |tail -n 1 rm $pp $asm This latter case is very useful for timing algorithms. For example, to determine the fastest multiplication algorithm, type at the prompt: for i in testcases/test-mul?.jack; do \ echo $i takes $( ./jack-sim-time <$i ) instructions; \ done