Thursday 28 June 2012

Microsoft Excel 2003 encryption scanner

(Edit 6/4/2013 - You will want the latest version at https://github.com/GavinSmith0123/crackxls2003 or https://github.com/GavinSmith0123/crackxls2003/tags rather than what is below.)

Microsoft Excel 2003 spreadsheets which are encrypted are encrypted with 40 bits of encryption.

No free program exists to scan keys to decrypt such files. I have taken a step towards such a program existing with the following file:

http://pastebin.de/27599

This is mostly complete, but requires to be linked against the low-level function involved in the MD5 algorithm, operating on a single 64-byte block. The "md5-fast.S" file linked to from here works. There is surely the same function in the implementation of the OpenSSL MD5 functions, but it is not provided as a publicly callable function.

Compile with something like "cc -O5 -fomit-frame-pointer -march=native -mtune=native crackxls2003.c -o xls_crack -lssl md5-fast.S".

I haven't put much work into the interface. It should be invoked with something like './xls_crack "7c dc fa a2 cd d2 2c 6a c8 f0 9f 3b 8b 72 ee f8" "d0 b0 64 e3 92 b2 6e 12 d0 5c 4a fe bc 66 35 8f" 1d da 30 05 8f'. The first argument is the "verifer" field, the second is the "verifierHash" field. The remainder of the arguments tell you where to start checking the key space. (The example I've given has the correct encryption key (which came from the password "monkey") which you can use to check the program is working.) These two fields can be obtained using a program like "poledump" in conjunction with the file format reference.

It processes keys at the rate of about a month per key space on my computer. On an x86 machine, it may be possible to make it go faster using SSE instructions. I have heard you can make it go very fast using GPU memory, but I cannot test this as my computer has none.

Once the encryption key is obtained, the demonstration version of "guaexcel" can be used to decrypt the file.
Obvious improvements would be to extend it to a complete program at both ends, extracting the necessary data from the encrypted file and decrypting the file after the correct key is obtained.

Let me know if you successfully used this code.

Links:

Crack password - RC4 40 bit decryption of documents - second part
Excel RC4 Encryption Algorithm
MS Office 40-bit key searching on GPU

Saturday 21 April 2012

Data in Lisp

Lisp has always been something of a mystery to me. I've read much about it, but still sometimes feel I don't have a good grasp of what is going on. Almost it seems that you need to have written your own Lisp compiler to understand it. Here are a few points which I found enlightening.

Lisp is, simply put, not a functional language! Understanding of state and data is indispensable in understanding the execution of a Lisp program.

We have heard before that a binding associates a symbol with a value. But it is best, I think, to say that "a binding is a storage location". (I'm not sure if such a statement would enrage somebody with a different mental model of Lisp.) For example:

(let ((x 5)) ; create a storage cell and store 5 in it
(setq x 7) ; overwrite 5 with 7
(setq x (cons 1 2)) ;allocate a new storage area with two cells. Store 1 in the first cell and 2 in the second. Store in the aforesaid storage cell ("x") the memory address of the newly allocated area (and keep some note somewhere that it is a pointer to a cons cell, not an integer)
(setq x '(3 4 5))) ;store a pointer to a location which contains a list (3 4 5). (this list should not be modified.)

All in all, the storage which is allocated when this "let" expression is evaluated is as follows:
* A storage location for the symbol "x"
* Storage for cons cell

(The list (3 4 5) was assumed to exist already.)

Accessing the values of bindings


Only one of these can actually be referenced: the cons cell. You cannot access data structures such as those which hold the bindings of symbols. However, it is best to think of them both as storage, because they are both subject to garbage collection.

One thing you might try and do is export the value of a symbol. This is impossible.

To show a couple of work-arounds, here are two ways of implementing Jensen's device.

Example from Wikipedia in Algol 60:

begin
integer i;
real procedure sum (i, lo, hi, term);
value lo, hi;
integer i, lo, hi;
real term;
comment term is passed by-name, and so is i;
begin
real temp;
temp := 0;
for i := lo step 1 until hi do
temp := temp + term;
sum := temp
end;
comment note the correspondence between the mathematical notation and the call to sum;
print (sum (i, 1, 100, 1/i))
end


(I'm not sure what Jensen's device actually is but I think it's the function "sum".)

Let's think for a minute about what we would have to do to get something like this to work. The example works by passing a variable ("i"), and a function which operates on the variable to the function (written here simply as "1/i"). The function is tied specifically to the variable i, and no other.

This makes us think of closures. We'll create a new variable (or in Lisp terms, binding), and a function which uses this variable, and then pass both to a function.

Our first attempt:

(let ((i 0)) ; storage area
(flet ((term () (/ 1 i))) ; function referring to storage area
(sum i 1 100 #'term))) ; arguments are in same order as above example


Wait, no no no, we are evaluating "i" on the last line and passing 0 to the sum function. Maybe we could quote it as "(quote i)" - no, that won't work: then we're passing a symbol to the function, which is no use.

There's no way of doing what we want to do without starting again and using a very different approach.

For example, data structures such as lists can be passed around. So we could make "i" a list of length 1:

(let ((i (list 0)))
(flet
((invi () (/ 1 (first i)))
(sum i 1 100 #'inva)))


Now the sum function can get access to the value in the list.

Here is a second, final alternative using accessor functions:

(let ((i 0))
(flet
((get-i () i)
(set-i (x) (setq i x)))
(invi () (/ 1 i)))
(sum #'get-i #'set-i 1 100 #'invi)))


Now I'll move onto another subject.

Data is richer than atoms and lists


A program or function in Lisp belongs to the smallest collection of lists which satisfies the property that if we say that the lists are of type X, then the members of each list are all one of:
  • Symbols
  • Constants (such as numbers or strings)
  • Lists of type X

or in other words, is a list which you can create if the only data types you know about other than lists are symbols or constants (known as atoms).

Now during the execution of a Lisp program, data is stored of various types. It was a realization for me to notice that the options for the type of stored data are slightly richer than the types that occur in the data structure corresponding to the program code.

For example, in the above examples we pass closures such as "#'invi" as parameters to the function "sum". When "sum" is being executed, some of its parameters have values which are of the closure type, which is nowhere to be seen in code.

Sunday 15 April 2012

Notes on C default argument promotions and function prototypes

I spent a while researching this subject, so here are a few notes.

C has a thing called the "default argument promotions". If using modern standard C, the only time that these will matter are if you are passing parameters to a variadic function (e.g. printf). The parameters in the variadic section may have their type converted before they are passed to the function. For example, float parameters are converted to doubles, and char's are converted to int's. If you actually needed to pass, for example, a char instead of an int, the function would have to convert it back.

This is a wart and exists because of early design decisions whose consequences we got stuck with for the sake of backwards compatibility. This post is for historical interest and to see why the rules are the way they are.

Here are some examples. We refer to the following functions:


int a (c1)
char c1;
{
printf("%%c:%c\n%%x:%04x\n", c1, c1);
}

int b (c1)
int c1;
{
printf("%%c:%c\n%%x:%04x\n", c1, c1);
}

int a1 (char c1)
{
printf("%%c:%c\n%%x:%04x\n", c1, c1);
}

int b1 (int c1)
{
printf("%%c:%c\n%%x:%04x\n", c1, c1);
}




We have some functions defined here using the pre-ANSI style, and some with the ANSI style, so we shall see what is possible.

We try different ways of calling these functions.


int
main (void)
{
int (*fp)();
int (*fp1)(char);

char c = 's';


We define 'fp' as a function pointer without a prototype. Therefore, all parameters passed via the function pointer are subject to the default argument promotion rules.

The following are correct usage:



fp = a;
fp ('st'); /* correct */


This prints:

%c:t
%x:0074


(I've decided to mix things up with some multi-character character constants, which are result in undefined behaviour according to the standard. I am assuming that they can be used to define the individual bytes of an integer constant.) This is correct because even though 'a' takes a char as a parameter and we are passing it an int (character literals are of type int in C), 'a' is defined using the pre-ANSI style, so it should be called as if it were receiving an int, and it will be automatically converted to a char within the function. (I wanted to see if I could pass both characters ('s' and 't') on to printf, but in this case it only seems to pass on the 't'.)


fp = b;
fp('st'); /* correct */


This prints:


%c:t
%x:7374


Now for something slightly more questionable:


fp = b1;
fp('st'); /* correct - mixing styles works */


This prints:


%c:t
%x:7374


This is questionable because we are calling a function defined using the ANSI style with a pointer declared using the old style - but the function is expecting an int, and gets an int - so everything is fine.


fp1 = a1;
fp1(c); /* correct */


This prints:

%c:s
%x:0073


just as you would expect it to.

Now for some incorrect examples:


/*fp = a1;
fp ('st'); /* incorrect */


It's impossible to call 'a1' correctly through fp, because it is expecting a char as a parameter, but 'fp' cannot pass chars as parameters. 'st' starts as type int, and is passed as type int to a1.


/*fp = a1;
fp (c); /* incorrect */


Even though c is of type char, the default argument promotions convert it to an int.


/* fp1 = a;
fp1(c); /* incorrect but works */


Here we are doing the opposite: calling a old-style function with a new-style pointer. c is passed as a char, but fp1 is expecting an int. It does seem to work on my computer but it is still wrong.

Here are some more functions which we will try calling:


int receive_float (f)
float f;
{
printf ("%f\n", f);
}

int receive_double (f)
double f;
{
printf ("%f\n", f);
}

int new_receive_float (float f)
{
printf ("%f\n", f);
}

int new_receive_double (double f)
{
printf ("%f\n", f);
}


Here are some attempts at calling these functions:


receive_float (1.0f); /*pass as double*/
receive_double (2.0); /*pass as double - 1.0 is of type double*/


Both of these are fine. 1.0f starts as a float, is converted to a double by the argument promotions, and within the function 'receive_float', it is converted back to a float. (Then within 'receive_float', it is converted again to a double to be passed to 'printf'.) The second line works exactly as we would expect it to.

Here is another example:

fp = new_receive_double;
fp(3.0);
fp(4.0f); /* use argument promotions */


'fp' is an old-style function pointer, so all float parameters are converted to doubles, as they should be.

Here is an incorrect example:

/* fp = new_receive_float; /* calling fp correctly is impossible /
fp(3.0); /* incorrect /
fp(4.0f); /* incorrect */

'new_receive_float' expects a float parameter, but fp can only pass it doubles.

Finally, some examples with variadic functions:



#if 0

varargs.h doesn't work anymore so I can't test calling it

int old_variadic (va_alist)
va_dcl
{
int x, y;

va_list p;
va_start(p);

a = va_arg(p, int);
b = va_arg(p, int);

va_end(p);

printf("Args received: %d, %d\n", a, b);
}

#endif

int new_variadic (char *s, ...)
{
va_list p;
int a,b;

va_start (p, s);

a = va_arg(p, int);
b = va_arg(p, int);

va_end(p);

printf("Args received: %d, %d\n", a, b);
}



You will see I have commented out the first function. I wanted to test calling old-style variadic functions in various ways, but I can't compile any old-style functions on my computer any more.


new_variadic (0, 33, 44);


This prints:

Args received: 33, 44


as you'd expect.


int (*fpv)(char *, ...);

fpv = new_variadic; /* correct */
fpv (0, 33, 44);


works just the same.

Now we get on to the interesting bit:


fp = new_variadic; /* incorrect but works */
fp (0, 33, 44);


is technically incorrect, but works anyway.

Why is it incorrect? Because variadic functions are supposed to be prototyped, according to the C standard. The reason for this is that compilers are allowed to use different calling conventions for variadic functions.

Wait a minute here - if all variadic functions have to be defined and called using the new-style, then why is there a rule that variadic parameters undergo default argument promotion, when default argument promotion is a left-over of the old way of doing things?

The answer, I believe, is binary compatibility. We may have a variadic library function defined using the old method.

If we have 'printf', or some other variadic function in a compiled library, which written and compiled using the old system, then it is impossible to write a prototype properly for the function.

int printf(const char *, ...);
is incorrect, because this implies the library function 'printf' can be called using a different calling convention to non-variadic functions. However, before the introduction of function prototypes, such a difference couldn't and didn't exist, because the compiler had no way of telling what type of function it was when it was outputting the object code which was to call the function.

Hence, the correct way to declare the function is
int printf();
, with no ANSI-style prototype, which forces the parameters to undergo default promotion. But if if the same calling convention is used for all functions, then
int printf(const char *, ...);
will work as well, as long as the variadic parameters undergo default promotion, which they do. For the standard library at least, you expect to be able to prototype the functions properly, and this allows this to happen. This isn't a very good justification though, because you can just recompile the library, or not use variadic prototypes. But there is a better justification.

As well as old library binaries being compatible with new programs, you also want new versions of libraries to be compatible with old program binaries, because in some cases you won't have the source code for the latter. If an old binary calls a variadic function in your library, the default argument promotions will be applied to the arguments. Then if you write a new version of the function, using stdarg.h, it still has to be called by the calling code in the old binary. So the code in the new version must provide the same interface as the old version, accepting the same types of arguments. This is the interface to any new programs as well, which declared the function using a variadic prototype.

Suppose the old program had code which was compiled from the following:

short int a;
int b;

printf("%hd", a); /* interpret parameter as a short int (but it is passed as an int) */
printf("%d", b); /* interpret parameter as an int */


We could use the same code in a new program, and the 'printf' would have to provide the same interface to both programs, so the new program has to pass 'a' as an int as well.

I couldn't find much information online about this subject. Here are a few links which are slightly relevant:

http://lkml.indiana.edu/hypermail/linux/kernel/0606.2/0330.html

http://www.archivum.info/comp.lang.c/2009-07/01250/Re-about-argument-pushing-order.html

http://mail-index.netbsd.org/tech-kern/1995/07/31/0002.html">http://mail-index.netbsd.org/tech-kern/1995/07/31/0002.html

Thursday 5 April 2012

Further notes on Slackware 13.1 upgrade

I said in the previous post that I forgot to download the new kernels, which was incorrect. They are included in one of the packages.

Thunderbird would no longer open links in a web browser. I managed to follow the instructions here to fix the problem:

In Thunderbird go to "Edit --> Preferences --> Advanced" in the Thunderbird menus and click on the "Config Editor" button.

Search for the following three entries:
network.protocol-handler.warn-external.http
network.protocol-handler.warn-external.https
network.protocol-handler.warn-external.ftp
Set the value of each of these three entries to true (you can do this by double-clicking on each entry)


I have no idea why it is so obscure.

Another problem I had was that Ctrl-Alt-Del would now reboot instead of shutting down. I had executed the script in the upgrade instructions to use all the new configuration files. I fixed the problem by using the old configuration file:

bash #diff inittab inittab.bak
36c36
< ca::ctrlaltdel:/sbin/shutdown -t5 -r now
---
> ca::ctrlaltdel:/sbin/shutdown -t5 -h now
bash #mv inittab.bak inittab

Monday 2 April 2012

Notes on upgrading to Slackware 13.1

I decided to upgrade because the latest version of Google Chrome needed a newer libc and I needed to reinstall Google Chrome because I had a few problems with it. Slackware 13.37 is the latest version but it was said that skipping versions was not guaranteed to work.

First I downloaded all the new packages. I updated them in the order specified.

Then I realised I had a problem - I had not downloaded any of the new kernels! I managed to do this.

While downloading, I had a problem with wget. It seemed to be downloading extra files in parallel directories to the one I wanted! When I researched this problem later, I found that I should have used the "--no-parent" option. (Which reminds me of this Dilbert comic (25/4/1994).

I couldn't get to a login prompt when I rebooted, because the Internet connection scripts were failing in a typically idiotic manner. Rather than realizing they couldn't connect, they just kept on trying and blocked everything else from happening. I rebooted and started the kernel in single-user mode, and disabled them all ("chmod a-x rc.inet1" and so on). Now I could log in.

The biggest problem I had after that was getting wireless internet to work. I realized I needed to reinstall the ndiswrapper module for the new kernel, and I got a package from slackbuilds.org.

"lspci -k" showed me that the wrong driver was being used. I also had to blacklist some kernel modules by adding the following lines to /etc/modprobe.d/blacklist.conf:


blacklist wl
blacklist ssb
blacklist b43
blacklist b43legacy


I had those lines already, but a new blacklist.conf had been installed.

Connecting to a wireless router worked as before, with wpa_supplicant and dhcpcd.