• About Me
  • Blog
  • Home

Eric Hokanson

~ E's little space in cyberspace

Eric Hokanson

Tag Archives: C

Quine Quandary

23 Monday Dec 2013

Posted by Eric Hokanson in Computer Science, Philosophy, Quine programs

≈ 1 Comment

Tags

C, Computer science, programming, Python, Quine

While at a Christmas party, I met a Computer Science undergrad currently earning his degree at UNM.  He was having trouble grappling Quine programs.  A Quine program is a computer program that takes in no input, and produces a copy of its own source code as the only output.  That is — self-replicating code.  It is not as easy to explain as one might think, and my poor crude attempts at doing so only confused this poor CS student further.  To my defense, I used examples you can find online, but many of these examples are not very intuitive to understand.  Take for example this Quine written in Python:

s='s=%r;print s%%s,';print s%s,

Fire up a Python interpreter and try it out. It simply prints that exact line you typed in.  Unless you are very familiar with Python strings and the string formatting codes, it is hard to see how this program works.  But it is essentially, defining a string s, then using that string itself to replicate.  Here is a Quine written in the old Kernigan and Ritchie C style:

main(){char*s="main(){char*s=%c%s%c;printf(s,34,s,34);}";printf(s,34,s,34);}

I don’t believe it will compile with today’s C99 or better compilers, which don’t allow programmers to implicitly call the printf function with out using:

#include <stdio.h>

Again, the C program is not any more intuitive to understand than the Python example.  So today, my goal is to write a Quine program in C that is easier for me to explain by example.  Instead of using printf format strings, I am going to leverage the power of a computer’s ability to represent C source code as data.  The goal is to take the data representation of the source code and print it, then translate the data into the ASCII high-level source code and print that.  After some trial and error, here is what I came up with:

#include <stdio.h>

int
main (void)
{
    unsigned int i;

    printf("const unsigned char data[] = {");
    for (i = 0; i<sizeof(data); i++)
    {
        if (i%8 == 0)
            printf("\n");
        printf("%0#4x,", data[i]);    
    }
    printf("\n};\n\n");
    for (i = 0; i<sizeof(data); i++)
        putchar(data[i]);
    return 0;
}

The above is my partial program so far.  There are two for-loops.  The first loop will iterate through a byte array called data, which I have not defined yet because this data array will contain the hexadecimal representation of this ASCII code (i.e. everything from the #include to the last right curly-brace at the bottom).

The second for-loop takes that same data array and uses the C library putchar that will translate the hexadecimal representation into an ASCII character, which will give us all that C code from the #include to the last right curly-brace at the bottom.  In other words: this source code will print an exact copy of the data array and our source code when executed.

Next, we need to translate our C code into the byte array data and place it at the top of our program (before the #include statement in the source above).  It should look like this:

const unsigned char data[] = {
0x23,0x69,0x6e,0x63,0x6c,0x75,0x64,0x65,
0x20,0x3c,0x73,0x74,0x64,0x69,0x6f,0x2e,
0x68,0x3e,0x0a,0x0a,0x69,0x6e,0x74,0x0a,
0x6d,0x61,0x69,0x6e,0x20,0x28,0x76,0x6f,
0x69,0x64,0x29,0x0a,0x7b,0x0a,0x20,0x20,
0x20,0x20,0x75,0x6e,0x73,0x69,0x67,0x6e,
0x65,0x64,0x20,0x69,0x6e,0x74,0x20,0x69,
0x3b,0x0a,0x20,0x20,0x20,0x20,0x0a,0x20,
0x20,0x20,0x20,0x70,0x72,0x69,0x6e,0x74,
0x66,0x28,0x22,0x63,0x6f,0x6e,0x73,0x74,
0x20,0x75,0x6e,0x73,0x69,0x67,0x6e,0x65,
0x64,0x20,0x63,0x68,0x61,0x72,0x20,0x64,
0x61,0x74,0x61,0x5b,0x5d,0x20,0x3d,0x20,
0x7b,0x22,0x29,0x3b,0x0a,0x20,0x20,0x20,
0x20,0x66,0x6f,0x72,0x20,0x28,0x69,0x20,
0x3d,0x20,0x30,0x3b,0x20,0x69,0x3c,0x73,
0x69,0x7a,0x65,0x6f,0x66,0x28,0x64,0x61,
0x74,0x61,0x29,0x3b,0x20,0x69,0x2b,0x2b,
0x29,0x0a,0x20,0x20,0x20,0x20,0x7b,0x0a,
0x20,0x20,0x20,0x20,0x20,0x20,0x20,0x20,
0x69,0x66,0x20,0x28,0x69,0x25,0x38,0x20,
0x3d,0x3d,0x20,0x30,0x29,0x0a,0x20,0x20,
0x20,0x20,0x20,0x20,0x20,0x20,0x20,0x20,
0x20,0x20,0x70,0x72,0x69,0x6e,0x74,0x66,
0x28,0x22,0x5c,0x6e,0x22,0x29,0x3b,0x0a,
0x20,0x20,0x20,0x20,0x20,0x20,0x20,0x20,
0x70,0x72,0x69,0x6e,0x74,0x66,0x28,0x22,
0x25,0x30,0x23,0x34,0x78,0x2c,0x22,0x20,
0x64,0x61,0x74,0x61,0x5b,0x69,0x5d,0x29,
0x3b,0x20,0x20,0x20,0x20,0x0a,0x20,0x20,
0x20,0x20,0x7d,0x0a,0x20,0x20,0x20,0x20,
0x70,0x72,0x69,0x6e,0x74,0x66,0x28,0x22,
0x5c,0x6e,0x22,0x7d,0x3b,0x5c,0x6e,0x5c,
0x6e,0x22,0x29,0x3b,0x0a,0x20,0x20,0x20,
0x20,0x66,0x6f,0x72,0x20,0x28,0x69,0x20,
0x3d,0x20,0x30,0x3b,0x20,0x69,0x3c,0x73,
0x69,0x7a,0x65,0x6f,0x66,0x28,0x64,0x61,
0x74,0x61,0x29,0x3b,0x20,0x69,0x2b,0x2b,
0x29,0x0a,0x20,0x20,0x20,0x20,0x20,0x20,
0x20,0x20,0x70,0x75,0x74,0x63,0x68,0x61,
0x72,0x28,0x64,0x61,0x74,0x61,0x5b,0x69,
0x5d,0x29,0x3b,0x0a,0x20,0x20,0x20,0x20,
0x72,0x65,0x74,0x75,0x72,0x6e,0x20,0x30,
0x3b,0x0a,0x7d,0x0a,
};

Now, how did I get the above?  Well, we can use an ASCII table and transcribe our C source above by hand… # = 0x23, i = 0x69, n = 0x6E, c = 0x63, l = 0x6C, u = 0x75, d = 0x64, e = 0x65, spaces = 0x20, newlines = 0x0A, etc.  That is a lot of work.  Being a lazy computer scientist, I wrote the following python script to do it for me:

import sys

f = open(sys.argv[1], 'r')

s = ''
for line in f:
    for l in line:
        s += '0x%02x' % ord(l)
        s += ','

s += '0x0a,'
step = 40
for i in range(0, len(s), step):
    line = s[i:i+step]
    print line

In the above, you pass in the C file of our first code listing above, and it prints to the screen:

0x23,0x69,0x6e,0x63,0x6c,0x75,0x64,0x65,
0x20,0x3c,0x73,0x74,0x64,0x69,0x6f,0x2e,
0x68,0x3e,0x0a,0x0a,0x69,0x6e,0x74,0x0a,
0x6d,0x61,0x69,0x6e,0x20,0x28,0x76,0x6f,
0x69,0x64,0x29,0x0a,0x7b,0x0a,0x20,0x20,
0x20,0x20,0x75,0x6e,0x73,0x69,0x67,0x6e,
0x65,0x64,0x20,0x69,0x6e,0x74,0x20,0x69,
0x3b,0x0a,0x20,0x20,0x20,0x20,0x0a,0x20,

<-------snipped----------------------->

0x28,0x64,0x61,0x74,0x61,0x29,0x3b,0x20,
0x69,0x2b,0x2b,0x29,0x0a,0x20,0x20,0x20,
0x20,0x20,0x20,0x20,0x20,0x70,0x75,0x74,
0x63,0x68,0x61,0x72,0x28,0x64,0x61,0x74,
0x61,0x5b,0x69,0x5d,0x29,0x3b,0x0a,0x20,
0x20,0x20,0x20,0x72,0x65,0x74,0x75,0x72,
0x6e,0x20,0x30,0x3b,0x0a,0x7d,0x0a,

You can copy and paste that into your

const unsigned char data[] = {

},

block and put that above the #include of the first code listing above.  I used Eclipse-C++ to code and run my program.  Upon execution the program should print to the console:

const unsigned char data[] = {
0x23,0x69,0x6e,0x63,0x6c,0x75,0x64,0x65,
0x20,0x3c,0x73,0x74,0x64,0x69,0x6f,0x2e,
0x68,0x3e,0x0a,0x0a,0x69,0x6e,0x74,0x0a,
0x6d,0x61,0x69,0x6e,0x20,0x28,0x76,0x6f,
0x69,0x64,0x29,0x0a,0x7b,0x0a,0x20,0x20,
0x20,0x20,0x75,0x6e,0x73,0x69,0x67,0x6e,
0x65,0x64,0x20,0x69,0x6e,0x74,0x20,0x69,
0x3b,0x0a,0x20,0x20,0x20,0x20,0x0a,0x20,
0x20,0x20,0x20,0x70,0x72,0x69,0x6e,0x74,
0x66,0x28,0x22,0x63,0x6f,0x6e,0x73,0x74,
0x20,0x75,0x6e,0x73,0x69,0x67,0x6e,0x65,
0x64,0x20,0x63,0x68,0x61,0x72,0x20,0x64,
0x61,0x74,0x61,0x5b,0x5d,0x20,0x3d,0x20,
0x7b,0x22,0x29,0x3b,0x0a,0x20,0x20,0x20,
0x20,0x66,0x6f,0x72,0x20,0x28,0x69,0x20,
0x3d,0x20,0x30,0x3b,0x20,0x69,0x3c,0x73,
0x69,0x7a,0x65,0x6f,0x66,0x28,0x64,0x61,
0x74,0x61,0x29,0x3b,0x20,0x69,0x2b,0x2b,
0x29,0x0a,0x20,0x20,0x20,0x20,0x7b,0x0a,
0x20,0x20,0x20,0x20,0x20,0x20,0x20,0x20,
0x69,0x66,0x20,0x28,0x69,0x25,0x38,0x20,
0x3d,0x3d,0x20,0x30,0x29,0x0a,0x20,0x20,
0x20,0x20,0x20,0x20,0x20,0x20,0x20,0x20,
0x20,0x20,0x70,0x72,0x69,0x6e,0x74,0x66,
0x28,0x22,0x5c,0x6e,0x22,0x29,0x3b,0x0a,
0x20,0x20,0x20,0x20,0x20,0x20,0x20,0x20,
0x70,0x72,0x69,0x6e,0x74,0x66,0x28,0x22,
0x25,0x30,0x23,0x34,0x78,0x2c,0x22,0x20,
0x64,0x61,0x74,0x61,0x5b,0x69,0x5d,0x29,
0x3b,0x20,0x20,0x20,0x20,0x0a,0x20,0x20,
0x20,0x20,0x7d,0x0a,0x20,0x20,0x20,0x20,
0x70,0x72,0x69,0x6e,0x74,0x66,0x28,0x22,
0x5c,0x6e,0x22,0x7d,0x3b,0x5c,0x6e,0x5c,
0x6e,0x22,0x29,0x3b,0x0a,0x20,0x20,0x20,
0x20,0x66,0x6f,0x72,0x20,0x28,0x69,0x20,
0x3d,0x20,0x30,0x3b,0x20,0x69,0x3c,0x73,
0x69,0x7a,0x65,0x6f,0x66,0x28,0x64,0x61,
0x74,0x61,0x29,0x3b,0x20,0x69,0x2b,0x2b,
0x29,0x0a,0x20,0x20,0x20,0x20,0x20,0x20,
0x20,0x20,0x70,0x75,0x74,0x63,0x68,0x61,
0x72,0x28,0x64,0x61,0x74,0x61,0x5b,0x69,
0x5d,0x29,0x3b,0x0a,0x20,0x20,0x20,0x20,
0x72,0x65,0x74,0x75,0x72,0x6e,0x20,0x30,
0x3b,0x0a,0x7d,0x0a,
};

#include 

int
main (void)
{
unsigned int i;printf("const unsigned char data[] = {");
for (i = 0; i<sizeof(data); i++)
    {
        if (i%8 == 0)
            printf("\n");
        printf("%0#4x," data[i]);    
    }
    printf("\n"};\n\n");
    for (i = 0; i<sizeof(data); i++)
        putchar(data[i]);
    return 0;
}

You should be able to copy and paste the print out into an IDE, compile and run it.  Quines are pretty neat.  I enjoyed this little exercise.  It made me think about levels of meaning; about values and their representations.  Computers and programs can’t differentiate between data and code.  It is the context that determines when data is meant to be interpreted as code instead of data, and when code is meant to be data (e.g. when downloading a binary file from the Internet).  This is an important concept in computer security.  It is the fact that we can use data as code that makes shellcode exploits possible.

“Yields falsehood when preceded by its quotation” yields falsehood when preceded by its quotation.

–Quine’s paradox

Subscribe

  • Entries (RSS)
  • Comments (RSS)

Archives

  • May 2016
  • May 2015
  • April 2015
  • March 2015
  • September 2014
  • August 2014
  • June 2014
  • May 2014
  • April 2014
  • March 2014
  • February 2014
  • January 2014
  • December 2013
  • November 2013

Categories

  • Alan Turing
  • Algorithms
  • Apollo 17
  • C Programming
  • Christmas
  • Computer Programming
  • Computer Science
  • Computer Security
  • Current Events
  • Cyber Security Research
  • Education
  • Freedom of choice
  • Freewill
  • Hacking
  • Holidaze
  • Learning
  • Malware RE
  • Math
  • NASA
  • Pen-testing
  • Philosophy
  • Pi Day
  • procrastination
  • Programming
  • Python
  • Quine programs
  • Quotes
  • Random Stuff
  • Research
  • Reverse Engineering
  • Shopping
  • Smithsonian National Air and Space Museum
  • Software Development
  • Star Wars
  • Success
  • Uncategorized

Meta

  • Register
  • Log in

Blog at WordPress.com.

  • Follow Following
    • Eric Hokanson
    • Join 44 other followers
    • Already have a WordPress.com account? Log in now.
    • Eric Hokanson
    • Customize
    • Follow Following
    • Sign up
    • Log in
    • Report this content
    • View site in Reader
    • Manage subscriptions
    • Collapse this bar