Thursday, December 18, 2008

Re-creating source code, Part 1

As I talked about last time, I've started re-creating the source to DeSmet C version 2.51 (aka PCC 1.2d, and the version I use the most).

How do you re-create the source to a program you don't have? Well, if you think about it, you have the program -- sort of.

You have the executable.

When the compiler source is run through a compiler, you get an executable program. So, there is a mapping from the source code to the executable. What you want to do is reverse that, and go from the executable back to the source.

There are two levels to doing this: disassembly and decompilation.

Disassembly is fairly straightforward: it translates the object code into assembly statements. For an x86 program there can be a little ambiguity, because code and data share the same memory space (von Neumann architecture), so you need some algorithm to decide whether the byte you are looking at is code or data. This is a minor obstacle, though. (Many embedded processors use a Harvard architecture, with separate code and data memories, so the problem does not arise there.)
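To make the code-versus-data question concrete, here is a minimal sketch of one common approach, recursive traversal: start at the entry point, decode instructions, follow jumps and calls, and treat any byte that is never reached as data. This is only an illustration, not necessarily the algorithm a real disassembler (or this project) uses. The names OPCODES, signed and find_code are made up for the example, and the table covers just a handful of 16-bit x86 opcodes.

```python
# Toy recursive-traversal pass. The opcode table is deliberately tiny and
# hypothetical in scope; the encodings themselves are real 16-bit x86.
# opcode -> (instruction length in bytes, kind)
OPCODES = {
    0x90: (1, "plain"),   # NOP
    0xB8: (3, "plain"),   # MOV AX, imm16
    0xC3: (1, "stop"),    # RET
    0xEB: (2, "jump8"),   # JMP short rel8
    0x74: (2, "cond8"),   # JZ rel8
    0xE8: (3, "call16"),  # CALL near rel16
}


def signed(value, bits):
    """Interpret an unsigned value as a two's-complement integer."""
    return value - (1 << bits) if value >= 1 << (bits - 1) else value


def find_code(image, entry=0):
    """Return the set of offsets in image that are reachable as code."""
    code = set()
    work = [entry]                       # offsets still to be explored
    while work:
        pc = work.pop()
        while 0 <= pc < len(image) and pc not in code:
            op = image[pc]
            if op not in OPCODES:
                break                    # unknown opcode: give up on this path
            length, kind = OPCODES[op]
            if pc + length > len(image):
                break                    # instruction would run off the end
            code.update(range(pc, pc + length))
            nxt = pc + length
            if kind in ("jump8", "cond8"):
                # branch target is relative to the next instruction
                work.append(nxt + signed(image[pc + 1], 8))
            elif kind == "call16":
                rel = image[pc + 1] | (image[pc + 2] << 8)
                work.append((nxt + signed(rel, 16)) & 0xFFFF)
            if kind in ("stop", "jump8"):
                break                    # no fall-through after RET or JMP
            pc = nxt
    return code


# JMP over two bytes of data that happen to look like NOP and RET.
image = bytes([0xEB, 0x02,              # 0: JMP short to offset 4
               0x90, 0xC3,              # 2: data (never reached)
               0xB8, 0x34, 0x12,        # 4: MOV AX, 1234h
               0xC3])                   # 7: RET
code = find_code(image)
data = [i for i in range(len(image)) if i not in code]
print(sorted(code), data)
```

In the sample image, the bytes at offsets 2 and 3 decode as perfectly valid instructions, but nothing ever jumps or falls through to them, so the traversal leaves them classified as data. A straight linear sweep from offset 0 would have called them code.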

I'll talk more about disassembly in my next post.

1 comment:

Shin said...

What is sad today is that if you ran half if not all of this jargon past some of my friends studying programming they wouldn't want nor need to know what any of it means/meant.

I can only presume they know how to use a mouse, and occasionally a keyboard.

What is to become of this generation for which everything has been done for them?

Their argument is that they don't "need" to know.

The problem being that they really don't "need" to know anything to do their "jobs."

At this point it seems it is monkeys training monkeys.

"You don't need to know any more than that."
They say.

The problem: They don't want to know anything.