Saturday, 30 January 2016

Editable garbage printed at bash prompt after printing binary file in an xterm

Sometimes I would have garbage printed at the bash prompt in xterm after printing a binary file, like

[screenshot: stray characters at the prompt]
I never got around to working out why, until now.

The reason is the presence of byte 9A in the file. This is the "DECID" control sequence (see the "XTerm Control Sequences" document), which causes the terminal to send a response string, as if it had been typed by the operator. It is equivalent to DA, "Send Device Attributes". The response string in this case is

[screenshot: the response string]
with some more bytes at the beginning (ESC ? or ESC >, I'm not sure which).
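
One way to see this for yourself is to send the byte by hand and read back what the terminal answers. Below is a minimal C sketch of my own (not from the original post); it assumes a POSIX system and an xterm whose settings allow the single byte 9A to be taken as a C1 control:

    #include <stdio.h>
    #include <termios.h>
    #include <unistd.h>

    int main(void)
    {
        struct termios old, raw;
        char buf[64];
        ssize_t n, i;

        /* Non-canonical mode, because the response is not newline-terminated. */
        tcgetattr(STDIN_FILENO, &old);
        raw = old;
        raw.c_lflag &= ~(ICANON | ECHO);
        raw.c_cc[VMIN] = 0;
        raw.c_cc[VTIME] = 5;                      /* wait up to 0.5 s for a reply */
        tcsetattr(STDIN_FILENO, TCSANOW, &raw);

        write(STDOUT_FILENO, "\x9a", 1);          /* the DECID byte */
        n = read(STDIN_FILENO, buf, sizeof buf);  /* the reply arrives as input */

        tcsetattr(STDIN_FILENO, TCSANOW, &old);

        printf("got %d bytes:", (int)n);
        for (i = 0; i < n; i++)
            printf(" %02x", (unsigned char)buf[i]);
        printf("\n");
        return 0;
    }

If nothing comes back, the terminal is probably treating 9A as part of a UTF-8 sequence rather than as a control; the two-byte form ESC Z is the classic equivalent of DECID.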

Friday, 8 January 2016

Meaning of lexical scope in Lisp

Lexical scope is described as follows:

Lexical scope. Here references to the established entity can occur only within certain program portions that are lexically (that is, textually) contained within the establishing construct

(Guy Steele et al (1990), Common Lisp the Language, 2nd edition, Chapter 3, Scope and Extent)

I think a nicer description is that lexical scope is early binding. It just so happens that all the language constructs (lambda, let) that can create a binding for a variable that is in effect when a function is defined also textually contain the region of the program where the variable is used.

I understood this from

Joel Moses (1970), The Function of FUNCTION in LISP
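
Since C itself is lexically scoped, the same point can be made with a small C sketch of my own (not from either reference). The reference to x inside f is bound early, to whichever binding textually contains f:

    #include <stdio.h>

    int x = 1;           /* this binding textually contains f below */

    void f(void)
    {
        printf("%d\n", x);   /* resolved to the file-scope x, once and for all */
    }

    void g(void)
    {
        int x = 2;       /* a new binding, but it does not textually contain f */
        (void)x;         /* silence the unused-variable warning */
        f();             /* prints 1; under dynamic scope it would print 2 */
    }

    int main(void)
    {
        g();
        return 0;
    }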

Saturday, 5 December 2015

Dealing with cyclical dependencies in C header files

Suppose source file A.c mostly deals with data structure A, which is defined in A.h. Likewise source file B.c mostly deals with data structure B, which is defined in B.h. A.h contains declarations of all the functions in A.c, and B.h contains declarations of all the functions in B.c.

Now we need to add a function to A.c which takes an argument of type B. To add a declaration of this function to A.h, we want a definition of type B to be in place in A.h. We do this by including B.h in A.h. This is intended to produce the effect that we can include A.h in another source file to declare the functions implemented in A.c, without requiring that source file to first include B.h.

Now suppose we add a function to B.c which takes an argument of type A. We do exactly the same in reverse.

What happens when a source file includes A.h now? A.h includes B.h at the start. B.h includes A.h at the start. We used include guards, so A.h is not included a second time. When we then reach a declaration of a function in B.h that has a parameter of type A, no definition of type A has been encountered yet, so there is an error.

This can be fixed by pulling the definitions of both A and B out into a separate header file, for example types.h, which both A.h and B.h include. Forward declarations of the types might work as well. (However, as far as I remember, forward declaring a typedef isn't always accepted by C compilers.) In general, mixing function declarations and type definitions in header files can be problematic.
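
For illustration, here is a sketch of the forward-declaration approach (the type and function names are mine). It assumes the cross-references are through pointers; a function taking B by value would still need the full definition, which is where the shared types.h wins:

    /* A.h */
    #ifndef A_H
    #define A_H

    struct B;                      /* forward declaration instead of #include "B.h" */

    struct A {
        int value;
    };

    void a_frob_b(struct B *b);    /* a pointer to an incomplete type is allowed */

    #endif

    /* B.h */
    #ifndef B_H
    #define B_H

    struct A;                      /* forward declaration instead of #include "A.h" */

    struct B {
        int value;
    };

    void b_frob_a(struct A *a);

    #endif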

Friday, 11 September 2015

Law of Interface Responsiveness

Changes in response time between 0 and 1 seconds don't matter, because they're too small to notice.

Changes in response time between 1 and 10 seconds matter a lot, because the user will get bored waiting for it to finish.

Changes in response time above 10 seconds again don't matter, because the user is unlikely to be sitting there waiting for it to finish; they will have gone off and done something else.

Thursday, 3 September 2015

Mental arithmetic tips

For a product like 8 * 26, imagine 26 turning into the pair 16 48 (that is, 8 * 2 and 8 * 6), and when you have that clear in your head, smash them together (160 + 48) to get 208.

To factorize a number less than 1000, you only need to test divisibility by primes up to 31, because a composite number must have a prime factor no bigger than its square root, and the square root of 1000 is less than 32. There aren't that many such primes: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 and 31.

To test for divisibility by 2, check if the last digit is divisible by 2. To check 3, check if the sum of the digits is divisible by 3. To check 5, check if the last digit is 5 or 0. To check 11, alternately add and subtract the digits, for example 374 = 11 * 34, and 3 - 7 + 4 = 0.

The others are not so easy. For some of them, think of subtracting a multiple of the prime, for example

301 = 280 + 21 = 7 * 40 + 7 * 3 = 7 * 43.

Test the lower primes first, because they are more likely to succeed.
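
The whole procedure is just trial division by that fixed list of primes. Here is a minimal C sketch of my own, using 374 from the divisibility example above:

    #include <stdio.h>

    int main(void)
    {
        const int primes[] = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31};
        int n = 374;                /* example: 374 = 2 * 11 * 17 */
        const char *sep = "";

        printf("%d = ", n);
        for (int i = 0; i < 11; i++) {
            while (n % primes[i] == 0) {
                printf("%s%d", sep, primes[i]);
                sep = " * ";
                n /= primes[i];
            }
        }
        if (n > 1)                  /* whatever is left over is itself prime */
            printf("%s%d", sep, n);
        printf("\n");
        return 0;
    }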

For the larger primes, you can memorize the composite numbers they are involved in:

23 * 23 = 529
23 * 29 = 667
23 * 31 = 713
29 * 29 = 841
29 * 31 = 899
31 * 31 = 961

Wednesday, 19 August 2015

TeX - the worst programming language ever

Here are some observations I've had on programming TeX, or as I've taken to calling it recently, "Knuth's hunk a' junk".

You can't tell how many arguments a function will take by looking at its definition, because even if it takes no arguments, it can call another function that does take arguments.

All lines need to be ended with comment characters, except the ones that don't. A stray line with no comment at the end can cause an extra blank line to appear in the output, or break completely.

You can give numbers as a series of digits, which is what you expect. This will work most of the time, but occasionally a macro call immediately following the number will be expanded sooner than you might think. This can cause problems, for example in \ifnum\number=0\noexpand\foo\fi. If \number is equal to 0, you would expect this to result in "\noexpand\foo", but the \noexpand disappears, and you only get "\foo".

Likewise if you did \ifnum\number=1\twentythree\fi, and \twentythree was "23", this doesn't check if \number is 1: no, it actually checks if \number is 123. Not kidding. The solution is to always follow a number with a space, i.e. "\ifnum\number=1 \twentythree\fi".

Processing is split into multiple stages, including "expansion" and "execution". If you expand without executing, you can mess up the syntax for the execution. The error messages you get when the execution happens don't tell you why it's messed up: it just throws the whole dog's breakfast in your face.

This can happen when you define a macro that defines another macro, and then you use the first macro in the definition of a third macro whose body is expanded at the time of definition. Hence a macro doesn't stand alone - you have to account for the context in which it is used.

TeX is a weak language, and sometimes you have even less of it. For example, suppose you are writing out to an auxiliary file, and you want to expand some control sequences, removing spaces following them. You think you're nearly a TeX wizard, and know about \futurelet, so you think, a-ha, I'll do a \futurelet at the end of this control sequence, check if the next character is a space, and if so, remove it.

This fails because \futurelet belongs, not to the expansion phase, but to the execution phase. (Learn that by heart.) What you have at this stage is string concatenation and string splitting. It's possible to remove a space even with these two. I'm not sure of the details, but it's something like this:

* Get the next character (splitting). Call it X.
* Append a space, followed by a marker character ("@") (concatenation). String now looks like "X @".
* Split the string at the first space. If X was not a space, this gives us X; if X was a space, it gives us an empty string. Also split the rest of the input at the marker character, and discard everything before it.

This doesn't work, however, because the first step doesn't work. You can't get the next character if it's a space, you get the first non-space instead.

And after I wrote that, I discovered the existence of \ignorespaces, so none of that should be necessary. This allows me to make another point, which is that it's hard to learn TeX, because you can't tell which control sequences are primitives and which are user-defined just by looking at them. This wouldn't be so bad, except there are hundreds of primitives, not even counting the control sequences that plain TeX adds. I went looking for a definition of \ignorespaces in the file I saw it in and couldn't find one.

Error messages bear no relationship to the actual problem that caused them. You have to learn by experience what error message means what: for example "missing control sequence inserted (\inaccessible)" actually means: you tried to give a character an active definition when the character was not active (at the time of definition).

When you don't recognize the problem, you have to stare at the log files for half-an-hour or so, avoiding conscious thought, before you understand it.

Your input goes through a stage of interpretation governed by the "cat codes" (category codes - they're called that because TeX is personified by a lion). You can tell TeX to make changes to the cat codes, but you also have to remember the cat codes from before the changes, in case something that was read in the past comes up again.

Your code is read from left to right, strictly. (Anyone who's learned about lambda calculus will see a similarity.) It's possible to expand the token after next with a construct like "\expandafter\next\token". But if you want to expand the token after that first, you have to do

\expandafter\expandafter\expandafter\next\expandafter\token\othertoken
To expand three tokens in advance and then two tokens in advance and then one in advance, it's:

\expandafter\expandafter\expandafter\expandafter\expandafter\expandafter\expandafter\next
\expandafter\expandafter\expandafter\token
\expandafter\othertoken\thirdtoken
Each time you need to expand a token coming from further down the line, the number of \expandafter's you need doubles. This comes up in practice: for example, to expand a token inside a macro definition you need to jump over at least "\def", the name of the macro being defined, and an open brace ("{").

I keep on thinking that it must get easier at some point.

Monday, 10 August 2015


A compiler is

Happiest when compiling

Another compiler

How many cross-compilers could a cross-compiler compile if a cross-compiler could compile cross-compilers?