<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>endeavormac</title>
	<atom:link href="http://myw3b.net/blog/index.php/feed/" rel="self" type="application/rss+xml" />
	<link>http://myw3b.net/blog</link>
	<description>Code, Thoughts, Stuff</description>
	<lastBuildDate>Fri, 18 Jun 2010 14:49:55 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Ordering Password Dictionaries</title>
		<link>http://myw3b.net/blog/index.php/2010/06/ordering-password-dictionaries/</link>
		<comments>http://myw3b.net/blog/index.php/2010/06/ordering-password-dictionaries/#comments</comments>
		<pubDate>Fri, 18 Jun 2010 14:49:55 +0000</pubDate>
		<dc:creator>endeavormac</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://myw3b.net/blog/?p=172</guid>
		<description><![CDATA[Recently I found myself coming across numerous WPA-PSK encrypted wireless APs, with a desire to access them. I think it&#8217;s safe to say we all know about the WPA CoWPAtty tables, but these APs did not have names that were in the tables. To make matters worse, at the time the only available hardware I [...]]]></description>
			<content:encoded><![CDATA[<p>Recently I found myself coming across numerous WPA-PSK encrypted wireless APs, with a desire to access them. I think it&#8217;s safe to say we all know about the <a href="http://www.renderlab.net/projects/WPA-tables/">WPA CoWPAtty tables</a>, but these APs did not have names that were in the tables. To make matters worse, the only hardware I had available at the time was my netbook, which managed only ~280 keys a second. With massive dictionaries holding millions and millions of real-world passwords to attempt, it was going to take hours for each AP.</p>
<p>I decided I needed to order my password dictionaries in a manner that would bring the more likely passwords to the beginning, and the least likely passwords to the end. I wasn&#8217;t aware of any program that did this, and figured I would write my own.</p>
<p>A few hundred lines of C later and I have a very fast password dictionary ordering program. It is far from perfect, but much better than nothing. It loads the entire dictionary into memory, gathers all the information necessary for Markov chains, and then uses this information to score each individual password. The passwords are then ordered by their score.</p>
<p>You can find the code in the <a href="http://code.google.com/p/rainbowsandpwnies/source/browse/trunk/dictsort/">rainbowsandpwnies googlecode svn repo</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://myw3b.net/blog/index.php/2010/06/ordering-password-dictionaries/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tech Stocks</title>
		<link>http://myw3b.net/blog/index.php/2010/05/tech-stocks/</link>
		<comments>http://myw3b.net/blog/index.php/2010/05/tech-stocks/#comments</comments>
		<pubDate>Tue, 04 May 2010 04:14:55 +0000</pubDate>
		<dc:creator>endeavormac</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://myw3b.net/blog/?p=167</guid>
		<description><![CDATA[Only about 1/4 of my projects ever make it to any sort of announcement or release. Usually, those releases are silent through svn somewhere. However, I&#8217;ve finally reached the time to take a look at the stock market, and have figured, naturally, my best bet is with tech stocks. In any case, I thought tonight [...]]]></description>
			<content:encoded><![CDATA[<p>Only about 1/4 of my projects ever make it to any sort of announcement or release. Usually, those releases are silent through <a href="http://code.google.com/p/endeavormac/source/checkout">svn</a> <a href="http://code.google.com/p/rainbowsandpwnies/">somewhere</a>. However, I&#8217;ve finally reached the time to take a look at the stock market, and have figured, naturally, my best bet is with tech stocks.</p>
<p>In any case, I thought tonight would be a good chance to share one of those many projects that never see foreign eyes. I thereby present you with <a href="http://myw3b.net/~endeavormac/stocks/index.html">my custom spin on what a tech stock tracker should look like</a>. I&#8217;m not trading minute to minute, or hour to hour, so I don&#8217;t care about real-time stock quotes. What I do care about are new announcements that may tip a stock one way or the other, or stocks that look like they may be headed towards rapid change.</p>
<p>Right now, you&#8217;re looking at a few hours of Python, on a 10 minute cron. It grabs information, parses it, and generates static html. I have plans yet to continue to improve this project. More specifically, I&#8217;d like to delve deeper into the news articles, counting keywords that may indicate the direction a stock is about to head. I am also planning on grabbing historical stock information and running some <a href="http://en.wikipedia.org/wiki/Regression_analysis">regression stats</a> in an attempt to automate finding what stocks are very closely related to one another. If stocks A, B and C are all closely related, and news indicates good things for A and B, good things are most likely in store for C.</p>
]]></content:encoded>
			<wfw:commentRss>http://myw3b.net/blog/index.php/2010/05/tech-stocks/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Making Perfect Rainbow Tables Faster</title>
		<link>http://myw3b.net/blog/index.php/2010/03/making-perfect-rainbow-tables-faster/</link>
		<comments>http://myw3b.net/blog/index.php/2010/03/making-perfect-rainbow-tables-faster/#comments</comments>
		<pubDate>Sun, 28 Mar 2010 19:03:26 +0000</pubDate>
		<dc:creator>endeavormac</dc:creator>
				<category><![CDATA[Ideas]]></category>

		<guid isPermaLink="false">http://myw3b.net/blog/?p=130</guid>
		<description><![CDATA[A Quick Background on Perfect Rainbow Tables Rainbow tables allow us to quickly find plaintexts for cryptographic hashes. They are based on Martin Hellman&#8217;s &#8220;A Cryptanalytic Time &#8211; Memory Trade-Off&#8221;. By including the step in the reduction function, Philippe Oechslin was able to improve on Hellman&#8217;s method, and we came up with [...]]]></description>
			<content:encoded><![CDATA[<h3>A Quick Background on Perfect Rainbow Tables</h3>
<p>Rainbow tables allow us to quickly find plaintexts for cryptographic hashes. They are based on Martin Hellman&#8217;s &#8220;<a href="http://www-ee.stanford.edu/~hellman/publications/36.pdf">A Cryptanalytic Time &#8211; Memory Trade-Off</a>&#8221;. By including the step in the reduction function, Philippe Oechslin was able to improve on Hellman&#8217;s method, and we came up with Rainbow Tables, the &#8220;<a href="http://lasecwww.epfl.ch/~oechslin/publications/crypto03.pdf">Faster Cryptanalytic Time &#8211; Memory Trade-Off</a>&#8221;.</p>
<p>If you are unfamiliar with rainbow tables, it is suggested you <a href="http://kestas.kuliukas.com/RainbowTables/">become</a> <a href="http://en.wikipedia.org/wiki/Rainbow_table">familiar</a> before continuing.</p>
<p>Even rainbow tables are not perfect. They still merge, and merges mean wasted information, wasted space, wasted time. What we really want are perfect rainbow tables. In perfect rainbow tables, each chain has a unique endpoint. We have no merges. They give us nearly the same chance of finding a plaintext as non-perfect rainbow tables, but are much smaller.<br />
<span id="more-130"></span><br />
Perfect rainbow tables have always taken much longer to create than non-perfect rainbow tables. The two step process to create perfect rainbow tables has been:</p>
<ol>
<li>Create all chains for your rainbow tables</li>
<li>Eliminate duplicate chains, until only one instance for each endpoint remains</li>
</ol>
<p>Each time you eliminate a duplicate chain, all the time spent creating that chain is lost/wasted.</p>
<p>We must also understand that chains can merge at any step. If we have three chains which merge, they may merge like this:</p>
<pre>
hash(aaaaa) -> 13049 ... reduce(13049, 1) -> bcdef ... hash(bcdef) -> 30459 ... reduce(30459, 2) -> iekdj ... hash(iekdj) -> 69384
hash(bbbbb) -> 93024 ... reduce(93024, 1) -> bcdef ... hash(bcdef) -> 30459 ... reduce(30459, 2) -> iekdj ... hash(iekdj) -> 69384
hash(ccccc) -> 84626 ... reduce(84626, 1) -> dgrhf ... hash(dgrhf) -> 84532 ... reduce(84532, 2) -> iekdj ... hash(iekdj) -> 69384
</pre>
<h3>A Better Method for Creating Perfect Tables</h3>
<p>Creating our perfect tables faster is as simple as identifying the merges in our chains before we reach the end of the chain. Instead of creating each chain one at a time, and then checking for merges after the fact, we create all of our chains out to a certain &#8220;merge step&#8221;, eliminate merges, and then continue until the end of the chain. If we repeat this process several times, we can identify progressively more merges before reaching the end of the chain. As we eliminate more merges, the time to get to our next &#8220;merge step&#8221; is less than the one before it.</p>
<p>Here is an illustration of this process in pseudo-code:</p>
<pre>function extend_chains (chains, first_step, last_step)
chains = extend_chains(chains, 0, 99)
throwaway_merges(chains)
chains = extend_chains(chains, 100, 199)
throwaway_merges(chains)
...
chains = extend_chains(chains, 900, 999)
throwaway_merges(chains)</pre>
<p>I have implemented my throwaway_merges function in a <a href="http://en.wikipedia.org/wiki/Merge_sort">mergesort</a> (seems fitting) which automatically discards duplicate chains during the sort process.</p>
<h3>Performance</h3>
<p>The performance of this method depends on the number of merges that will occur in our chains. The more merges, the larger the difference in time between the new method and the old method. As long as the time required to sort the chains and eliminate merges is less than the time gained from eliminating those merges early on, we receive a net gain in speed. Profiling my code, my implementation of throwaway_merges comprises less than 1% of the total computation time.</p>
<p>Just to reiterate, creating merges <em>speeds up</em> this process of creating rainbow tables. Using increasingly longer chains, while still detrimental in the actual cracking process, is no longer <em>as</em> detrimental in the generation of perfect rainbow tables.</p>
<p>I have generated some output from my new method of finding perfect chains compared with an implementation of the original method for finding perfect chains.</p>
<p>All runs were done on a single core of a Q6600 processor in Ubuntu-9.10-amd64 (each process on its own core). Disk write times are not accounted for, but are assumed negligible. The hash function being used is the MD4 implementation available <a href="http://code.google.com/p/endeavormac/source/browse/trunk/md4/md4.h">here</a>. I am currently generating MD4 hashes for ease of debugging (I check against md4 hashes generated through python&#8217;s hashlib), but these will be NT hashes by release (a simple conversion).</p>
<p><b><a href="http://myw3b.net/~endeavormac/rmm/rmm.txt">First Run</a></b></p>
<p>In the first run, we generate 10,000 chains, each chain with a length of 2048, where hashes reduce to plaintexts a-z 4 characters long. This table will have many merges, and will accentuate the speed improvement of this new method.</p>
<p>Of the 10000 chains, 405 unique chains remain. <strong>That&#8217;s only 4.05% of our original chains</strong>. My method finished creating its perfect chains in ~1.25 seconds, while the original method finished in ~7.21 seconds. <strong>The new method is ~5.8 times as fast</strong>.</p>
<p><b><a href="http://myw3b.net/~endeavormac/rmm/rmm2.txt">Second Run</a></b></p>
<p>In the second run, we generate 50,000 chains, each chain with a length of 2048, where hashes reduce to plaintexts a-z 5 characters long. This table will have fewer merges, and the improvement will not be as great.</p>
<p>Of the 50000 chains, 9349 unique chains remain. <strong>That&#8217;s 18.7% of our original chains</strong>. My method finished creating its perfect chains in ~15.96 seconds, while the original method finished in ~38.11 seconds. <strong>The new method is ~2.4 times as fast</strong>.</p>
<p><b><a href="http://myw3b.net/~endeavormac/rmm/rmm3.txt">Third Run</a></b></p>
<p>In the third run, we generate 500,000 chains, each chain with a length of 10240, where the hashes reduce to plaintexts of a-z 7 characters long. This table will have fewer merges, and the improvement should be minimal.</p>
<p>Of the 500,000 chains, 313005 unique chains remain. <strong>That&#8217;s 62.6% of our original chains</strong>. My method finished creating its perfect chains in ~1433 seconds, while the original method finished in ~1792 seconds. <strong>The new method is ~1.25 times as fast</strong>.</p>
<p><b><a href="http://myw3b.net/~endeavormac/rmm/rmm3.txt">Fourth Run</a></b></p>
<p>In the fourth run, we generate 50,000 chains, each chain with a length of 2048, where the hashes reduce to plaintexts of a-z 7 characters long. This table will have almost no merges, and there should be no improvement with the new method.</p>
<p>Of the 50,000 chains, 49417 unique chains remain. <strong>That&#8217;s 98.8% of our original chains</strong>. My method finished creating its perfect chains in ~35.99 seconds, while the original method finished in ~36.00 seconds. <strong>The new method is ~1.00 times as fast</strong>.</p>
<h3>Looking Forward</h3>
<p>The entirety of this rainbow tables implementation was coded by me in C. The code requires some cleaning up, but once I am done I will be releasing a new, open source (exact license TBD) implementation of rainbow tables in C. The release will include both the original method of generating rainbow tables, and my new method for generating perfect rainbow tables.</p>
<p>Other things I would like to do:</p>
<ul>
<li>The chains are currently 20 bytes long. This can be brought down to 12 bytes, but this has not been implemented yet.</li>
<li>I would also like to experiment with a heap sort, where chains are added to the heap as soon as they are generated. The heap will automatically drop duplicate chains. When moving from one &#8220;merge step&#8221; to another, we just read from one heap and write to another. I am unsure how this will perform compared to my current implementation of merge sort. As the sorting algorithm accounts for only a small amount of time, this is low-priority.</li>
<li>Assuming this new method will lead to longer chains being generated, implementing &#8220;break points&#8221; in the chains may be beneficial. Given a chain of length 10,000, we may also store a value which will allow us to pick up our chain at step 5,000. We would call this value a break point. This way, when we are cracking against the table and find a possible collision at, say, position 8,000, we do not have to generate the chain starting from step 0. We can start from step 5,000. This is not my idea, but I do have a link to the original idea. More breakpoints can be added to speed up cracking, at a loss of disk space.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://myw3b.net/blog/index.php/2010/03/making-perfect-rainbow-tables-faster/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>x86 Assembly for C Programmers 1.1, Reddit Follow-up</title>
		<link>http://myw3b.net/blog/index.php/2009/10/x86-fcp-1-1-reddit-follow-up/</link>
		<comments>http://myw3b.net/blog/index.php/2009/10/x86-fcp-1-1-reddit-follow-up/#comments</comments>
		<pubDate>Sat, 24 Oct 2009 15:21:04 +0000</pubDate>
		<dc:creator>endeavormac</dc:creator>
				<category><![CDATA[Assembly Tutorials]]></category>

		<guid isPermaLink="false">http://myw3b.net/blog/?p=82</guid>
		<description><![CDATA[I posted x86 Assembly for C Programmers 1 to reddit and got some great feedback. There were a few things that were brought up, and I&#8217;m taking a minute to address some of them (I&#8217;m not addressing everything).  Thanks to everyone out there who took the time to point out mistakes and make suggestions. Intel [...]]]></description>
			<content:encoded><![CDATA[<p>I posted <a href="http://myw3b.net/blog/?p=7">x86 Assembly for C Programmers 1</a> to <a href="http://www.reddit.com/r/programming/comments/9wzut/assembly_for_c_programmers_1/">reddit</a> and got some great feedback. There were a few things that were brought up, and I&#8217;m taking a minute to address some of them (I&#8217;m not addressing everything).  Thanks to everyone out there who took the time to point out mistakes and make suggestions.<span id="more-82"></span></p>
<h3>Intel Assembly Comprehensive Cheat Sheet</h3>
<p><a href="http://www.reddit.com/user/danukeru">danukeru</a> from reddit posted a link to <a href="http://www.jegerlehner.ch/intel/opcode.html">this cheat sheet</a>, which lists x86 opcodes, what their abbreviations mean, and their Intel syntax. It is awesome, and I wish I had it earlier.</p>
<h3>A little more into lea</h3>
<p>This comes from <a href="http://www.reddit.com/user/jputnam">jputnam</a> at reddit as a suggestion.</p>
<p>A quick google search turns up <a href="http://www-scm.tees.ac.uk/users/u0000408/Instruct/_LEA.htm">a</a> <a href="http://www.intel.com/software/products/documentation/vlin/mergedprojects/analyzer_ec/mergedprojects/reference_olh/mergedProjects/instructions/instruct32_hh/vc150.htm">few</a> <a href="http://wiki.answers.com/Q/What_is_load_effective_address">articles</a> which are helpful. I especially like <a href="http://courses.ece.illinois.edu/ece390/archive/fall2001/books/labmanual/inst-ref-lea.html">this one</a> from the University of Illinois at Urbana-Champaign.</p>
<p>Basically, <strong>lea</strong> computes an address, usually a register plus an offset, and stores that address in a <em><span style="text-decoration: underline;">register</span></em> without touching memory. This is different from <strong>mov</strong> with the same bracketed operand, which would read the <em><span style="text-decoration: underline;">memory location pointed to by that address</span></em> and load the value stored there.</p>
<p>Let&#8217;s say we want to set our register <strong>ecx</strong> equal to <strong>ebp+0x12</strong>, because that&#8217;s where one of our stack variables is, and instead of using <strong>[ebp+0x12]</strong> every time we want to refer to that stack variable, we&#8217;re going to use <strong>ecx</strong> (I have no idea why we would ever want to do this, but we&#8217;ll pretend). We would use the <strong>lea</strong> instruction to do this all very neatly in one line.</p>
<pre lang="asm">lea ecx, [ebp+0x12]</pre>
<p>If you&#8217;re still not entirely clear on <strong>lea</strong>, visit some of the links above. They do an excellent job explaining this instruction.</p>
<h3>Explain why eax is set to 0 at the end of main</h3>
<p>Another one from <a href="http://www.reddit.com/user/jputnam">jputnam</a>.</p>
<p>It is a common convention to place a function&#8217;s return value in <strong>eax</strong>. Main returns 0. Therefore, we set <strong>eax</strong> to 0.</p>
<p>If you call a function and want to check its returned value, check <strong>eax</strong>.</p>
<h3>Spend a little more time explaining ebp</h3>
<p>This one comes from <a href="http://www.reddit.com/user/stevep98">stevep98</a> at reddit.</p>
<p>Given this C function:</p>
<pre lang="c">void ebp_explanation (int argument)
{
	int i;
}</pre>
<p>This is a simplistic example of how this function may look on the stack (<a href="http://ditaa.sourceforge.net/">ditaa</a> is cool):</p>
<p><em>Image has been accidentally deleted</em></p>
<p>0x74 is memory we reserve for int i. Normally, we reserve this space by setting <strong>esp</strong> somewhere beneath 0x74, so when we push things on to the stack they do not overwrite the memory at 0x74.</p>
<p>We then set <strong>ebp</strong> somewhere consistent on our stack, and we leave it there. Convention is to set <strong>ebp</strong> to where it is in the above illustration. From there, when we want to refer to certain pieces of memory (i.e. stack variables), we refer to them by an offset from <strong>ebp</strong>. If we want to add 1 to int i, it will look like this:</p>
<pre lang="asm">add DWORD PTR [ebp-0x4], 0x1</pre>
<p>Like storing the return value in eax, this is a convention, but not a rule. In fact, when we take our same program from the first tutorial, give gcc the -O3 flag, and look at the disassembled code, we will see gcc decides to refer to int i relative to <strong>esp</strong> instead of relative to <strong>ebp</strong>.</p>
<h3>Stack Alignment</h3>
<p><a href="http://pwnguin.net/">jldugger</a> commented on stack alignment.</p>
<p>If you want some more information on stack alignment, which I admittedly didn&#8217;t do a terrific job of explaining, a simple google search returns these two pages: ( <a href="http://www.fftw.org/doc/Stack-alignment-on-x86.html">one</a> | <a href="http://stackoverflow.com/questions/672461/what-is-stack-alignment">two</a> ). They should help clear things up. I&#8217;m going to leave it at that for stack alignment.</p>
<h3>What&#8217;s to come</h3>
<p>I won&#8217;t know 100% what will come next until it&#8217;s written, but these are the next three topics I would like to hit (remember, I&#8217;m learning as I&#8217;m writing):</p>
<ol>
<li>Take one.c (the example program from x86 Assembly for C Programmers 1) compiled with -O2 and -O3, and analyze why the compiler does what it does with different levels/kinds of optimization.</li>
<li>Begin looking at some more complicated examples, more of the instruction set, and expand our knowledge of the x86 instruction set.</li>
<li>We will take a C program, break it down into assembly, and begin optimizing it in assembly. This will be the first installment of writing/modifying the assembly code.</li>
</ol>
<p>Beyond that, we&#8217;ll find out when we get there.</p>
]]></content:encoded>
			<wfw:commentRss>http://myw3b.net/blog/index.php/2009/10/x86-fcp-1-1-reddit-follow-up/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>x86 Assembly for C Programmers 1</title>
		<link>http://myw3b.net/blog/index.php/2009/10/assembly-for-c-programmers-1/</link>
		<comments>http://myw3b.net/blog/index.php/2009/10/assembly-for-c-programmers-1/#comments</comments>
		<pubDate>Tue, 13 Oct 2009 12:00:30 +0000</pubDate>
		<dc:creator>endeavormac</dc:creator>
				<category><![CDATA[Assembly Tutorials]]></category>

		<guid isPermaLink="false">http://myw3b.net/blog/?p=7</guid>
		<description><![CDATA[Introduction I&#8217;m writing a series of tutorials on x86 assembly for C programmers who are already familiar with many of the basics of programming and computing. The assembly tutorials available online just aren&#8217;t doing it for me, and I need something organized the way I think, on the topics I&#8217;m interested in, presented in a [...]]]></description>
			<content:encoded><![CDATA[<h3>Introduction</h3>
<p>I&#8217;m writing a series of tutorials on x86 assembly for C programmers who are already familiar with many of the basics of programming and computing. The assembly tutorials available online just aren&#8217;t doing it for me, and I need something organized the way I think, on the topics I&#8217;m interested in, presented in a way which makes comprehensive understanding easy. I&#8217;ll do the work, go find the answers, and then drop everything here for you to enjoy.</p>
<p>Please note I do not claim to be an expert on the assembly language.</p>
<p>My interest in assembly is in optimizing C applications and in developing exploits for vulnerabilities in common applications, <em>not</em> in writing applications in assembly from scratch. I&#8217;m not interested in &#8220;good&#8221; examples of assembly, I&#8217;m interested in real examples. This will affect the assembly we look at. More specifically, I write the code in C, compile it with gcc, and what comes out is what we&#8217;ll be dissecting.</p>
<p>These tutorials use 32-bit x86 assembly. Everything is compiled, built, and disassembled on the latest stable release of Ubuntu.<span id="more-7"></span></p>
<h3>References</h3>
<p><a href="http://www.arl.wustl.edu/~lockwood/class/cs306/books/artofasm/toc.html">The Art of Assembly</a> is an excellent reference, and if you need clarification of any of the topics discussed, I recommend checking it out. <a href="http://www.arl.wustl.edu/~lockwood/class/cs306/books/artofasm/Chapter_6/CH06-1.html#top">Chapter six</a> covers all of the instructions, how they work, and what specifically they do.</p>
<h3>Thanks To:</h3>
<p>Bushmills from irc.freenode.net##asm for taking the time to explain to a noob why the first 7 lines of assembly were what they were.</p>
<h3>The Code</h3>
<p>Let&#8217;s take a look at a simple C application, and its disassembled assembly code.<br />
gcc one.c -o one</p>
<pre lang="c">#include <stdio.h>

int main (int argc, char * argv [])
{

	int i;

	argc++;

	for (i = 0; i < 10; i++)
		printf("%d\n", i);

	return 0;

}</pre>
<p>Disassembled counterpart (for main):<br />
objdump -d one -M intel</p>
<pre lang="asm">080483c4 &lt;main&gt;:
 80483c4:	8d 4c 24 04          	lea    ecx,[esp+0x4]
 80483c8:	83 e4 f0             	and    esp,0xfffffff0
 80483cb:	ff 71 fc             	push   DWORD PTR [ecx-0x4]
 80483ce:	55                   	push   ebp
 80483cf:	89 e5                	mov    ebp,esp
 80483d1:	51                   	push   ecx
 80483d2:	83 ec 24             	sub    esp,0x24
 80483d5:	83 01 01             	add    DWORD PTR [ecx],0x1
 80483d8:	c7 45 f8 00 00 00 00 	mov    DWORD PTR [ebp-0x8],0x0
 80483df:	eb 17                	jmp    80483f8
 80483e1:	8b 45 f8             	mov    eax,DWORD PTR [ebp-0x8]
 80483e4:	89 44 24 04          	mov    DWORD PTR [esp+0x4],eax
 80483e8:	c7 04 24 d0 84 04 08 	mov    DWORD PTR [esp],0x80484d0
 80483ef:	e8 04 ff ff ff       	call   80482f8

 80483f4:	83 45 f8 01          	add    DWORD PTR [ebp-0x8],0x1
 80483f8:	83 7d f8 09          	cmp    DWORD PTR [ebp-0x8],0x9
 80483fc:	7e e3                	jle    80483e1
 80483fe:	b8 00 00 00 00       	mov    eax,0x0
 8048403:	83 c4 24             	add    esp,0x24
 8048406:	59                   	pop    ecx
 8048407:	5d                   	pop    ebp
 8048408:	8d 61 fc             	lea    esp,[ecx-0x4]
 804840b:	c3                   	ret
 804840c:	90                   	nop
 804840d:	90                   	nop
 804840e:	90                   	nop
 804840f:	90                   	nop</pre>
<p>This is a list of the instructions that are used above. We'll explain what each of these instructions does as we come across them later:</p>
<ul>
<li><strong>lea</strong> - Load Effective Address</li>
<li><strong>and</strong> - logical AND</li>
<li><strong>push</strong> - PUSH data on to the stack</li>
<li><strong>mov</strong> - MOVe (copy) data between registers and memory, or load a constant</li>
<li><strong>sub</strong> - SUBtract</li>
<li><strong>jmp</strong> - JuMP</li>
<li><strong>call</strong> - CALL another function</li>
<li><strong>add</strong> - ADDition</li>
<li><strong>cmp</strong> - CoMPare</li>
<li><strong>pop</strong> - POP data off the stack</li>
<li><strong>ret</strong> - Return control to the parent function</li>
</ul>
<p>You'll notice we left off <strong>jle</strong>. <strong>jle</strong> means jump if less than or equal to, and is a variant of the <strong>jmp</strong> instruction. You can find all the variations with any assembly reference.</p>
<p>Now let's take a look at the registers used. (<a href="http://stackoverflow.com/questions/1395591/what-is-exactly-the-base-pointer-and-stack-pointer-to-what-do-they-point">ESP/EBP</a>)</p>
<ul>
<li><strong>esp</strong> - Stack Pointer (for the top of the stack).</li>
<li><strong>ecx</strong> - Counter (used for other purposes described later)</li>
<li><strong>ebp</strong> - Base Pointer</li>
<li><strong>eax</strong> - Accumulator Register (Arithmetic Operations)</li>
</ul>
<p>If you don't understand exactly what all these registers are, we'll describe them later, and you will see how they are used.</p>
<h3>Some Background:</h3>
<p>First, some vocabulary:</p>
<ul>
<li><strong>Stack</strong>: This is, surprise, an implementation of the data structure known as the <a href="http://en.wikipedia.org/wiki/Stack_(data_structure)">stack</a>. We use this stack to keep track of information about the program during the course of its running.</li>
<li><strong>Register</strong>: Think of <a href="http://en.wikipedia.org/wiki/Processor_register">registers</a> as our variables. When a register holds an address, think of it as a pointer, which we dereference by putting it in [].</li>
<li><strong>Instruction</strong>: An <a href="http://en.wikipedia.org/wiki/Instruction_(computer_science)">instruction</a> is an operation we want to run on the processor.</li>
<li><strong>Operand</strong>: Quite simply, an <a href="http://en.wikipedia.org/wiki/Operand#Computer_science">operand</a> is an argument to an instruction.</li>
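<li><strong>Word</strong>: For our purposes, every 4 bytes is considered a word. <a href="http://en.wikipedia.org/wiki/Word_(computing)">Wikipedia defines a word as the natural unit of data used by a particular processor design</a>. We're using a 32-bit operating system, so 32-bit words, 4-byte words. Be aware that Intel's own terminology, for historical reasons, calls 16 bits a WORD and 32 bits a DWORD, which is why the disassembly above says <strong>DWORD PTR</strong> for 4-byte values.</li>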
</ul>
<p><span style="text-decoration: underline;"><strong>The x86 Stack and esp</strong></span></p>
<p>The x86 stack is a LIFO mechanism we use to store information, LIFO being Last In, First Out. <strong>push</strong> puts data on the stack, <strong>pop</strong> takes data off the stack. <strong>push</strong> and <strong>pop</strong> manipulate data relative to <strong>esp</strong>, which is the stack pointer.</p>
<p>The stack grows down, meaning we start at higher memory addresses, and as the stack grows, we end up with lower memory addresses. <strong>esp</strong> is often referred to as pointing to the top of the stack, but in diagrams, the top of the stack is depicted as at the bottom (because we have higher addresses at the top, and lower addresses at the bottom).</p>
<p><strong>esp</strong> decrements before adding a value to the stack, not after, so <strong>esp</strong> will always point to the last element added to the stack.</p>
<p>This may be a bit confusing now, but by the end of the first 7 instructions, you should have a good handle on it.</p>
<p>When we call a function, the stack typically looks like this:</p>
<pre>------------------
| argument 1     |
------------------
| argument 0     |
------------------
| return address | &lt;- esp is here
------------------</pre>
<p>This is how the function will inherit the stack. In most simple tutorials, a few more instructions will be executed at the beginning of the function to give us a stack like this:</p>
<pre>------------------------
| argument 1           |
------------------------
| argument 0           |
------------------------
| return address       |
------------------------
| original ebp         | &lt;- ebp points here
------------------------
| stack data variables | &lt;- esp is here
------------------------</pre>
<p><span style="text-decoration: underline;"><strong>Aligning the Stack</strong></span></p>
<p>This stack, as you will see, is nothing more than a bunch of memory in relation to <strong>esp</strong>, and <strong>esp</strong> is the only way we can identify where we are in the stack. If we change esp, we change our location in the stack, without using push or pop.</p>
<p>We want the stack to be "aligned", meaning we want our stack variables to start at an address that is evenly divisible by 16 (in hex, an address whose last digit is 0). Aligned accesses can be faster for some operations, but more importantly, with the introduction of SSE instructions (which work on 128 bits at once), having your variables aligned improperly can lead to some spectacular failures.</p>
<p>It all has to do with memory segmentation <em>(Edit: Not really. See jldugger's comment. Processor design is important here. Visit the following link to learn more. I'm going to go ahead and mark this under not too terribly important to understand. Keep reading, you'll be fine, I promise.)</em> If you're really interested, <a href="http://en.wikipedia.org/wiki/X86_memory_segmentation">do some reading</a>. For now, just know we want our stack to be properly aligned, and that's what gcc is doing in the first seven instructions.</p>
<h3>The Assembly</h3>
<p>We're going to go instruction by instruction, explaining what's happening, and looking at the stack, along with where our registers are, each step along the way.</p>
<table border="0">
<tbody>
<tr>
<td valign="top"><p>The state of the stack when we enter main() can be found to the right.</p>
<p>As we go through the first seven instructions, the instruction and a description will be found on the left, while the state of the stack will be found on the right.</p></td>
<td valign="top" nowrap="nowrap">
<pre>     ------------------
0x80 | char * argv[]  |
     ------------------
0x7c |   int argc     |
     ------------------
0x78 |   ret addr     | &lt;- esp points here
     ------------------
0x74 |                |
     ------------------
0x70 |                |
     ------------------
0x6c |                |
     ------------------
0x68 |                |
     ------------------
0x64 |                |
     ------------------
     ~     ~   ~      ~
     ------------------
0x40 |                |
     ------------------</pre>
</td>
</tr>
</tbody>
</table>
<table border="0">
<tbody>
<tr>
<td valign="top">
<pre lang="asm">lea ecx,[esp+0x4]</pre>
<p>This is the Load Effective Address instruction.</p>
<p>Syntax of lea:</p>
<p><strong>lea</strong> dest, source</p>
<p>It computes the address described by the source operand and stores that address in the destination register, without reading the memory at that address. For us, it loads <strong>esp</strong>+0x4 into <strong>ecx</strong>, meaning <strong>ecx</strong> will point to the slot holding int argc, 4 bytes higher in memory than <strong>esp</strong> (one slot above it in the diagram). Our stack now looks like this:</p></td>
<td valign="top" nowrap="nowrap">
<pre>     ------------------
0x80 | char * argv[]  |
     ------------------
0x7c |   int argc     | &lt;- ecx points here now
     ------------------
0x78 |   ret addr     | &lt;- esp points here
     ------------------
0x74 |                |
     ------------------
0x70 |                |
     ------------------
0x6c |                |
     ------------------
0x68 |                |
     ------------------
0x64 |                |
     ------------------
     ~     ~   ~      ~
     ------------------
0x40 |                |
     ------------------</pre>
</td>
</tr>
</tbody>
</table>
<table border="0">
<tbody>
<tr>
<td valign="top">
<pre lang="asm">and esp,0xfffffff0</pre>
<p>This is the logical and instruction.</p>
<p>Syntax of and:</p>
<p><strong>and</strong> dest, source</p>
<p>It performs a bitwise AND between the destination and the source, and saves the result in the destination. If you're not familiar with binary operations, you should probably take some time to familiarize yourself with them immediately. Here's what wikipedia has to say on <a href="http://en.wikipedia.org/wiki/Binary_and">AND</a>.</p>
<p>This is where we align the stack. The mask 0xfffffff0 clears the low four bits of <strong>esp</strong>, which rounds it down to the nearest multiple of 16. In our example, 0x78 becomes 0x70.</p>
<p>Now our stack looks like this:</p></td>
<td valign="top" nowrap="nowrap">
<pre>     ------------------
0x80 | char * argv[]  |
     ------------------
0x7c |   int argc     | &lt;- ecx points here
     ------------------
0x78 |   ret addr     |
     ------------------
0x74 |                |
     ------------------
0x70 |                | &lt;- esp points here now
     ------------------
0x6c |                |
     ------------------
0x68 |                |
     ------------------
0x64 |                |
     ------------------
     ~     ~   ~      ~
     ------------------
0x40 |                |
     ------------------</pre>
</td>
</tr>
</tbody>
</table>
<table border="0">
<tbody>
<tr>
<td valign="top">
<pre lang="asm">push DWORD PTR [ecx-0x4]</pre>
<p>Push "pushes" an item on to the stack.</p>
<p>Syntax of push:</p>
<p><strong>push</strong> data</p>
<p>Let's break down what we are pushing on the stack.</p>
<p>The brackets mean we are referring to the contents of the memory pointed to by <strong>ecx</strong>-0x4. This is the return address. So <strong>ecx</strong>-0x4 is 0x78, but [<strong>ecx</strong>-0x4] is the return address stored there.</p>
<p>DWORD PTR means we are referring to a 32 bit value. WORD PTR is 16 bits, BYTE PTR is 8 bits. The processor knows ecx is a 32 bit value, but because we are pushing the value stored in memory at ecx-0x4, the processor needs to know how many bits, starting at that address, to push.</p>
<p>Once this is completed, the stack will look like this:</p></td>
<td valign="top" nowrap="nowrap">
<pre>     ------------------
0x80 | char * argv[]  |
     ------------------
0x7c |   int argc     | &lt;- ecx points here
     ------------------
0x78 |   ret addr     |
     ------------------
0x74 |                |
     ------------------
0x70 |                |
     ------------------
0x6c |   ret addr     | &lt;- esp points here now
     ------------------
0x68 |                |
     ------------------
0x64 |                |
     ------------------
     ~     ~   ~      ~
     ------------------
0x40 |                |
     ------------------</pre>
</td>
</tr>
</tbody>
</table>
<table border="0">
<tbody>
<tr>
<td valign="top">
<pre lang="asm">push ebp</pre>
<p>We're pushing <strong>ebp</strong> on to the stack. We do this so at the end of the function, we can restore ebp to its original state.</td>
<td valign="top" nowrap="nowrap">
<pre>     ------------------
0x80 | char * argv[]  |
     ------------------
0x7c |   int argc     | &lt;- ecx points here
     ------------------
0x78 |   ret addr     |
     ------------------
0x74 |                |
     ------------------
0x70 |                |
     ------------------
0x6c |   ret addr     |
     ------------------
0x68 | original ebp   | &lt;- esp points here now
     ------------------
0x64 |                |
     ------------------
     ~     ~   ~      ~
     ------------------
0x40 |                |
     ------------------</pre>
</td>
</tr>
</tbody>
</table>
<table border="0">
<tbody>
<tr>
<td valign="top">
<pre lang="asm">mov ebp,esp</pre>
<p>Mov copies a value from the source to the destination. Think of mov as "dest := source".</p>
<p>Syntax of mov:</p>
<p><strong>mov</strong> dest, source</p>
<p>Here, we are moving the value of the <strong>esp</strong> register in to the <strong>ebp</strong> register. If you understand the purpose of the <strong>ebp</strong> register, you know we use it to refer to variables on the stack. In our C application, int i; is a stack variable. Variables on the heap are generally variables for which we dynamically allocate memory, but for now this isn't important. Know that on our stack we are going to have room for the integer i.</p>
<p>We need a way to refer to this place on the stack consistently. To do this, we use the <strong>ebp</strong> register. This register points to the base of our stack in this function. Now if we want to refer to integer i, we refer to an offset of the stack relative to <strong>ebp</strong>. As we continue to go through the instructions, you will see <strong>[ebp-0x8]</strong>, which actually refers to integer i on the stack.</td>
<td valign="top" nowrap="nowrap">
<pre>     ------------------
0x80 | char * argv[]  |
     ------------------
0x7c |   int argc     | &lt;- ecx points here
     ------------------
0x78 |   ret addr     |
     ------------------
0x74 |                |
     ------------------
0x70 |                |
     ------------------
0x6c |   ret addr     |
     ------------------
0x68 | original ebp   | &lt;- esp and ebp point here
     ------------------
0x64 |                |
     ------------------
     ~     ~   ~      ~
     ------------------
0x40 |                |
     ------------------</pre>
</td>
</tr>
</tbody>
</table>
<table border="0">
<tbody>
<tr>
<td valign="top">
<pre lang="asm">push ecx</pre>
<p>Now we're pushing <strong>ecx</strong> on to the stack. The reason we are doing this can be found in the instructions at 0x8048406 and 0x8048408. We will use this saved <strong>ecx</strong> to return <strong>esp</strong> to its original state before executing the ret instruction at the end of this function.</p></td>
<td valign="top" nowrap="nowrap">
<pre>     ------------------
0x80 | char * argv[]  |
     ------------------
0x7c |   int argc     | &lt;- ecx points here
     ------------------
0x78 |   ret addr     |
     ------------------
0x74 |                |
     ------------------
0x70 |                |
     ------------------
0x6c |   ret addr     |
     ------------------
0x68 | original ebp   | &lt;- ebp points here
     ------------------
0x64 |      0x7c      | &lt;- esp points here now
     ------------------
     ~     ~   ~      ~
     ------------------
0x40 |                |
     ------------------</pre>
</td>
</tr>
</tbody>
</table>
<table border="0">
<tbody>
<tr>
<td valign="top">
<pre lang="asm">sub esp,0x24</pre>
<p>Sub is short for subtract, and it subtracts the value on the right from the value on the left.</p>
<p>Syntax of sub:</p>
<p><strong>sub</strong> dest, source</p>
<p>Think like this: "dest -= source"</p>
<p>Now we subtract 0x24 from <strong>esp</strong>. This gives us room for our stack variables. We only have one stack variable, and definitely do not need 36 bytes (nine 4-byte slots) of space on the stack to make room for an integer, which under normal circumstances should be just 4 bytes. However, because this code was not compiled with any optimization flags, this is how gcc pieced everything together.</p></td>
<td valign="top" nowrap="nowrap">
<pre>     ------------------
0x80 | char * argv[]  |
     ------------------
0x7c |   int argc     | &lt;- ecx points here
     ------------------
0x78 |   ret addr     |
     ------------------
0x74 |                |
     ------------------
0x70 |                |
     ------------------
0x6c |   ret addr     |
     ------------------
0x68 | original ebp   | &lt;- ebp points here
     ------------------
0x64 |      0x7c      |
     ------------------
     ~     ~   ~      ~
     ------------------
0x40 |                | &lt;- esp points here now
     ------------------</pre>
</td>
</tr>
</tbody>
</table>
<p>The first seven instructions are the most confusing, and things become much simpler from here. Hopefully you have become familiar with the working of the stack. I'm going to omit stack pictures from the remainder of this tutorial.</p>
<pre lang="asm">add DWORD PTR [ecx],0x1</pre>
<p>The add instruction works like the sub instruction, except instead of subtraction we are working with addition.</p>
<p>Because of the brackets, we are not adding 1 to <strong>ecx</strong>, but instead to the memory pointed to by <strong>ecx</strong>. If you remember from our stack, <strong>ecx</strong> points to the first argument we passed to main, or int argc. If you remember from our C code, after declaring int i, we incremented argc. Well, here's the assembly instruction for that line of code.</p>
<p>We use DWORD PTR because int argc is an integer, and integers here are 4 bytes.</p>
<pre lang="asm">mov DWORD PTR [ebp-0x8],0x0</pre>
<p>Now we are entering our for loop. The first thing our for loop does is set int i equal to 0. Well, we know int i is a stack variable. We also know the common convention is to refer to stack variables as an offset from <strong>ebp</strong>. So guess where int i is on the stack? That's right, it's at <strong>ebp</strong>-0x8. Here we are setting int i equal to 0, the first part of our for loop.</p>
<pre lang="asm">80483df:   eb 17   jmp   80483f8</pre>
<p>The jmp instruction is used to "JuMP" from one place in the code to another. I included the two bytes which form this instruction because I wanted to point something out. While we see "jmp 80483f8", which makes this instruction look absolute, it's actually relative. We are jumping 0x17 bytes ahead. 0xdf + 0x02 + 0x17 = 0xf8. Why add the 0x02? Because this jmp instruction is two bytes, and the jump starts after the jmp instruction.</p>
<p><strong><em><span style="text-decoration: underline;">We're going to do some skipping around now</span></em></strong>. Instead of following the assembly from first instruction to last, I'd instead like to go through the assembly in the order the instructions will be executed.</p>
<pre lang="asm">80483f8:   83 7d f8 09   cmp   DWORD PTR [ebp-0x8],0x9</pre>
<p>The CoMPare instruction compares two values, and sets the x86 flags register appropriately. Yes, there's an x86 flags register. No, we aren't that concerned with it right now. Just know that the cmp instruction sets flags which correspond to a comparison between its two operands.</p>
<pre lang="asm">80483fc:   7e e3   jle   80483e1</pre>
<p>The Jump if Less than or Equal to instruction will execute a jmp instruction if the x86 flags register has the appropriate flags set, indicating the previous cmp instruction compared one value that was less than or equal to a second value.</p>
<p>We're beginning to see exactly how our for loop executes on the processor. After setting the initial value, we jump immediately down to the comparison, which in our for () statement is "i &lt; 10". The comparison actually comes out to "i &lt;= 9", which is the same test for integers. If this condition holds true, we perform another jump to where the beginning of our for loop code would be.</p>
<pre lang="asm">80483e1:   8b 45 f8   mov   eax,DWORD PTR [ebp-0x8]</pre>
<p><strong>eax</strong> is one of our general purpose registers we haven't mentioned yet. Here, we are setting it equal to <strong>[ebp-0x8]</strong>, or int i.</p>
<pre lang="asm">80483e4:   89 44 24 04   mov   DWORD PTR [esp+0x4],eax</pre>
<p>We are preparing to call the function printf. Our printf call takes two arguments. Remember, arguments are laid out with the first argument closest to the top of the stack (at <strong>esp</strong>), and the last argument closest to the bottom. We are now positioning arguments on the stack. Int i is the second argument in our printf() function call, so we place it at <strong>[esp+0x4]</strong>, one slot further from the top of the stack.</p>
<pre lang="asm">80483e8:   c7 04 24 d0 84 04 08   mov   DWORD PTR [esp],0x80484d0</pre>
<p>Here we are moving the value 0x80484d0 in to the memory <strong>esp</strong> currently points to. We're placing this value on the stack without altering <strong>esp</strong>. You're probably wondering what is at memory address 0x80484d0. It's these 4 bytes:</p>
<p>0x25 0x64 0x0a 0x00</p>
<p>The C String equivalent would be "%d\n". I hope it looks familiar, because it's the first argument to our printf call.</p>
<pre lang="asm">80483ef:   e8 04 ff ff ff   call   80482f8</pre>
<p>And here we go ahead and make the printf call. The call instruction will do a few things. For simplicity's sake, we will say it pushes the address of the next instruction on to the stack (the return address for the next function/procedure), and then begins executing the assembly instruction at the specified location. <a href="http://www.arl.wustl.edu/~lockwood/class/cs306/books/artofasm/Chapter_6/CH06-5.html#HEADING5-98">If you absolutely must know...</a></p>
<p>We don't really need to worry too much about this now. Just know that 0x80483f4 just got pushed on to the stack, and the next instruction that will be executed is the one at 0x80482f8. When the procedure we call returns, its ret instruction will pop the return address off the stack, meaning the stack should be just as we left it before the call instruction.</p>
<pre lang="asm">80483f4:   83 45 f8 01   add   DWORD PTR [ebp-0x8],0x1</pre>
<p>After the printf(), and before we do our next comparison, we need to increment int i. This is where that tiny piece of magic happens.</p>
<p>After this add instruction, we're back to our cmp instruction. We've already covered this, so let's skip ahead to the remaining six instructions, starting at the memory location 0x80483fe.</p>
<pre lang="asm"> 80483fe:	b8 00 00 00 00       	mov    eax,0x0
 8048403:	83 c4 24             	add    esp,0x24
 8048406:	59                   	pop    ecx
 8048407:	5d                   	pop    ebp
 8048408:	8d 61 fc             	lea    esp,[ecx-0x4]
 804840b:	c3                   	ret</pre>
<p>You should be able to understand what's going on now in these last six lines. If not, here's a quick synopsis to help you on your way:</p>
<ul>
<li><strong>80483fe:</strong> Zero out eax... eax := 0</li>
<li><strong>8048403:</strong> Return esp to its original position, before we made room for stack variables. Need more help? Look at the instruction at 0x80483d2.</li>
<li><strong>8048406:</strong> Get ecx back off the stack.</li>
<li><strong>8048407:</strong> Set ebp to its original value before we entered the procedure. We're returning to our parent function, and it probably wants to know where its stack variables are.</li>
<li><strong>8048408:</strong> Set esp back to its original value when we entered the main() procedure.</li>
<li><strong>804840b:</strong> Return, which will pop the return address off the stack, and the next instruction executed will now be at that return address.</li>
</ul>
<p>Take another look at the assembly instructions. You should now understand all the basics of what is happening at the processor.</p>
<p>In the next tutorial, we'll take a better look at what exactly is happening, with a little less abstraction and a little more detail.</p>
]]></content:encoded>
			<wfw:commentRss>http://myw3b.net/blog/index.php/2009/10/assembly-for-c-programmers-1/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
		</item>
	</channel>
</rss>
