
Trying to write a C compiler...



Al6200
April 21st, 2007, 05:44 PM
Well, I'm trying to write a C Compiler without a few features like structs, and for the most part I'm failing, but I only started a few weeks ago, so I'm still in the somewhat early stages.

Basically the idea is that I have something of a tree. At the top level I have a class called namespaces. At the beginning the program finds how many namespaces there are, and calls the namespaces constructor once for each namespace. The constructor's argument is a string, and the constructor basically takes apart that string and looks for while loops and for loops. It calls the constructor for the Loop object, and for any statements, etc. that are in the namespace. The constructor of the Loop object calls the constructor of the conditional class, the constructor of the conditional class calls the constructors of the statement classes, etc.
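A minimal C++ sketch of that constructor-calls-constructor idea is below. It is not Al6200's code: the Node type, its fields, and the trim helper are hypothetical names, and it assumes braces are balanced and that anything followed by a { ... } body is a loop. The constructor splits its string into statements and loops and constructs a child Node for each loop body.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: strips leading and trailing whitespace.
static std::string trim(const std::string& s) {
    std::size_t b = 0, e = s.size();
    while (b < e && std::isspace(static_cast<unsigned char>(s[b]))) ++b;
    while (e > b && std::isspace(static_cast<unsigned char>(s[e - 1]))) --e;
    return s.substr(b, e - b);
}

// Hypothetical tree node: a plain statement, or a loop header with a body.
struct Node {
    std::string text;        // statement text, or a header such as "while (x)"
    std::vector<Node> body;  // children; empty for a plain statement

    Node() = default;

    // Takes the source string apart: ';' ends a statement (outside
    // parentheses), '{' starts a loop body that is parsed recursively.
    explicit Node(const std::string& src) {
        std::size_t start = 0;
        int parens = 0;
        for (std::size_t i = 0; i < src.size(); ++i) {
            char c = src[i];
            if (c == '(') {
                ++parens;
            } else if (c == ')') {
                --parens;
            } else if (c == ';' && parens == 0) {
                Node stmt;
                stmt.text = trim(src.substr(start, i - start));
                body.push_back(stmt);
                start = i + 1;
            } else if (c == '{') {
                // Scan forward to the matching closing brace.
                std::size_t depth = 1, j = i + 1;
                while (j < src.size() && depth > 0) {
                    if (src[j] == '{') ++depth;
                    else if (src[j] == '}') --depth;
                    ++j;
                }
                std::size_t close = (depth == 0) ? j - 1 : j;
                Node loop(src.substr(i + 1, close - (i + 1)));  // recursive call
                loop.text = trim(src.substr(start, i - start));
                body.push_back(loop);
                i = j - 1;  // continue after the closing brace
                start = j;
            }
        }
    }
};
```

Constructing Node from the string "int x = 3; while (x) { x = x - 1; } int y = x;" gives a root with three children, the second of which is the loop with one statement in its body. A real front end would tokenize first instead of scanning characters, but the recursion has the same shape.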

Then there is a class which breaks apart that object, in a repetitive, iterative, and sometimes recursive set of operations, and converts it into binary.

I am having a few problems though:


Basically the code works by converting

int x = 3;
int y = x + 2;
int* z = &y;

to

;given n is defined beforehand as a memory location

mov dx, n
mov [dx], 3
mov eax, [dx]
mov [dx + 1], eax
add [dx + 1], 2
mov eax, dx
add eax, 1
mov [dx + 2], eax

But my code doesn't work. WHY!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
WHAT DO YOU MEAN INVALID OPERAND SIZE?!?
Clearly, if this C compiler is ever going to defeat Visual Studio, I'm going to need to sharpen my assembly skills. But that's why I'm writing a compiler.

Write a compiler -> learn assembly -> write a compiler

Checkmate!

MTsoul
April 21st, 2007, 10:15 PM
What is the exact error you are getting?

TheColonial
April 21st, 2007, 11:02 PM
You're mixing register sizes. EDX is 32-bit, DX is 16-bit. You should be referencing EDX, not DX, because everything else looks to be using 32-bit registers (EAX), not 16-bit ones (AX):

mov edx, n
mov [edx], 3
mov eax, [edx]
mov [edx + 1], eax
add [edx + 1], 2
mov eax, edx
add eax, 1
mov [edx + 2], eax
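One way a code generator could catch this kind of mix-up before the assembler does is to look up the width of each register it emits. The sketch below is only an illustration: registerBits and sameWidth are made-up names, and the table covers just a handful of general-purpose registers.

```cpp
#include <string>
#include <unordered_map>

// Returns the width in bits of a general-purpose x86 register name,
// or 0 if the name is not in this deliberately small table.
int registerBits(const std::string& reg) {
    static const std::unordered_map<std::string, int> widths = {
        {"eax", 32}, {"ebx", 32}, {"ecx", 32}, {"edx", 32},
        {"ax", 16},  {"bx", 16},  {"cx", 16},  {"dx", 16},
        {"al", 8},   {"bl", 8},   {"cl", 8},   {"dl", 8},
    };
    auto it = widths.find(reg);
    return it == widths.end() ? 0 : it->second;
}

// True when both names are known registers of the same width.
bool sameWidth(const std::string& a, const std::string& b) {
    int wa = registerBits(a);
    return wa != 0 && wa == registerBits(b);
}
```

With this, sameWidth("eax", "edx") is true and sameWidth("eax", "dx") is false, so the generator could refuse to pair EAX with DX in the same instruction.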

You should also make sure that:
- all of your variables are next to each other in memory if you're going to offset from EDX for each of your variables
- ints and pointers are all 32 bit on a 32 bit machine, which means you're not offsetting enough from EDX. EDX + 1 is one byte past EDX, which is only 1/4 of the offset required. You should be offsetting 32 bits (4 bytes) for each variable:

mov edx, n
mov [edx], 3
mov eax, [edx]
mov [edx + 4], eax
add [edx + 4], 2
mov eax, edx
add eax, 4 ;y lives at [edx + 4], so &y is edx + 4
mov [edx + 8], eax

Your code may still have issues depending on which instructions allow direct manipulation of values stored in memory addresses rather than registers, but you'll find that out as you go ;)
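To make the 4-byte spacing concrete, here is a small C++ sketch of how a compiler might hand out offsets from EDX. It is an illustration under stated assumptions, not part of anyone's actual compiler: Frame, declare, operand, and kSlotBytes are hypothetical names, and every variable is assumed to take exactly 4 bytes.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Assumption: every int and pointer occupies 4 bytes on the 32-bit target.
constexpr std::size_t kSlotBytes = 4;

// Hypothetical symbol table mapping variable names to byte offsets from
// the base address held in EDX.
class Frame {
public:
    // Reserves the next 4-byte slot for name and returns its byte offset.
    std::size_t declare(const std::string& name) {
        auto inserted = offsets_.emplace(name, next_);
        if (!inserted.second) {
            throw std::invalid_argument("redeclared: " + name);
        }
        next_ += kSlotBytes;
        return inserted.first->second;
    }

    // Formats the memory operand for a declared variable, e.g. "[edx + 4]".
    // Throws std::out_of_range if the variable was never declared.
    std::string operand(const std::string& name) const {
        std::size_t off = offsets_.at(name);
        if (off == 0) {
            return std::string("[edx]");
        }
        return "[edx + " + std::to_string(off) + "]";
    }

private:
    std::unordered_map<std::string, std::size_t> offsets_;
    std::size_t next_ = 0;
};
```

Declaring x, y, and z in that order yields offsets 0, 4, and 8, matching the [edx], [edx + 4], and [edx + 8] operands in the listing above.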

I'd recommend reading the Intel instruction set manual (http://www.intel.com/design/intarch/manuals/243191.htm) before you continue. Writing a compiler is no small job, and understanding the instruction set is key, as is a good understanding of the PE file format (http://msdn.microsoft.com/msdnmag/issues/02/03/PE2/).

Hope that helps, and good luck!
OJ

Ben H
May 10th, 2007, 02:41 AM
Oi! Beat me to it!

zellers
May 21st, 2007, 01:47 AM
I don't think u will ever beat visual studio but u could always try...:smirk:

Al6200
May 21st, 2007, 07:05 AM
I don't think u will ever beat visual studio but u could always try...:smirk:

Well, we'll just see about that. I'm removing all of the pesky and pointless features from C++ to exponentializisize the learning curve like classes, string, chars, structs, complex data types, pointers.

Let's see MS try and do that.

hybrid101
May 21st, 2007, 07:14 AM
haha, that would be so cool:D
got an example of the compiler dude? don't forget the documentation:D

Al6200
May 21st, 2007, 07:35 AM
I think documentation stifles the user's creativity. If they see something from a programmer as brilliant as myself, they'll think "I can never be as great as him". And they never will be.

zellers
May 23rd, 2007, 01:20 AM
lol so how are we gonna do much with it if it has no support for classes, chars, strings and structs and whatnot?

Al6200
May 23rd, 2007, 07:21 AM
Strings were never anything more than a flashy marketing point. Real applications use 100% 32 bit integers.

wo1olf
May 25th, 2007, 01:32 PM
Don't C compilers already exist? To me this sounds like re-inventing the wheel...:tb: :tb: