A lexical error corresponds to a particular state of the input token. We have only finite number of possible tokens. Even assuming a token can be of infinite length, to parse it we need only finite state in the automata required. And there should be a one to one correspondence between the number of possible errors and the state in the deterministic FA used for this. i.e., the number of possible lexical errors should be finite and hence regular. (I assume the error message does not output the already seen symbols - because this is already known as regular).