Here is another way to look at it.
The low-order bits of the address (the page offset bits, 13 of them here) are not stored in the TLB, because they are the same in the virtual and the physical address.
Remaining bits (the virtual page number) = 40 - 13 = 27
So the total number of entries in a page table (imagine one in place of the TLB) = 2^27
So now, we want to map this Page Table to a TLB with 128 entries by the use of a 4-way set-associative mapping.
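For the purpose of mapping, treat the Page Table as main memory (MM) and the TLB as cache memory (CM), and let the block size be 2^x, so that x bits select a word within a block.
So the total number of address bits for main memory is 27+x.
The TLB has 128/4 = 32 sets, which needs 5 set-index bits. Mapping MM 4-way set-associatively to CM therefore gives (tag-set-word) as (22-5-x), since 27 - 5 = 22.
So the number of tag bits is 22.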
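
The arithmetic above can be checked with a small sketch. The function name `tlb_tag_bits` and its parameter names are made up for illustration; it only assumes that the number of sets is a power of two and that the tag is whatever is left of the virtual page number after removing the set-index bits.

```python
def tlb_tag_bits(virtual_address_bits, page_offset_bits, tlb_entries, associativity):
    """Return (vpn_bits, set_index_bits, tag_bits) for a set-associative TLB."""
    # The page offset is identical in virtual and physical addresses,
    # so only the virtual page number (VPN) takes part in the lookup.
    vpn_bits = virtual_address_bits - page_offset_bits

    # Entries are grouped into sets of `associativity` ways each.
    num_sets = tlb_entries // associativity
    if num_sets <= 0 or num_sets & (num_sets - 1) != 0:
        raise ValueError("number of sets must be a positive power of two")

    # log2 of a power of two, computed without floating point.
    set_index_bits = num_sets.bit_length() - 1

    # The remaining VPN bits form the tag.
    tag_bits = vpn_bits - set_index_bits
    return vpn_bits, set_index_bits, tag_bits


# Values from this problem: 40-bit virtual address, 13 offset bits,
# 128 TLB entries, 4-way set associative.
print(tlb_tag_bits(40, 13, 128, 4))  # (27, 5, 22)
```

Note that the block-offset bits x never enter the calculation: they appear in the main-memory address and in the word field alike, so they cancel out of the tag width.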