In simple words: "Blocks are units made up of words, so while transferring from source to destination you should not split a block and transfer only a part of it."
That is the same as saying that if the main memory (MM) block is bigger than the cache memory (CM) block, you should transfer the whole bigger block (the method of transfer may vary according to bus capacity), and it will then occupy several of the smaller cache blocks. ( <-- Answers your first question )
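A minimal sketch of that first case, assuming word addressing, aligned blocks, and an MM block size that is a whole multiple of the cache block size. The function name and the example sizes are hypothetical, chosen only for illustration:

```python
def cache_blocks_filled_by_mm_block(address, mm_block_words, cache_block_words):
    """Return the cache-sized block numbers covered by the MM block holding `address`.

    Assumes (for illustration only) that the MM block size is a whole
    multiple of the cache block size and that blocks are aligned.
    """
    if mm_block_words % cache_block_words != 0:
        raise ValueError("MM block size must be a multiple of the cache block size")
    mm_block = address // mm_block_words          # MM block containing the word
    ratio = mm_block_words // cache_block_words   # cache blocks per MM block
    first = mm_block * ratio                      # first cache-sized block it covers
    return list(range(first, first + ratio))


# Word 37 with 16-word MM blocks and 4-word cache blocks: the whole MM block
# is transferred, so it fills four consecutive cache-sized blocks.
print(cache_blocks_filled_by_mm_block(37, 16, 4))  # [8, 9, 10, 11]
```

Note that the block containing the missed word itself (37 // 4 = 9) is only one of the four that get filled.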
For the second question, just ask yourself, "Why do we make block transfers?" Because some instruction/data reference resulted in a miss. So that single reference is looked up in MM and the corresponding block has to be transferred. Strictly speaking, transferring only that 1 MM block into the cache would be sufficient, but it would leave the larger cache block partly empty, so we must move 2 MM blocks into the cache so that the entire cache block is fully occupied.
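Here is a rough sketch of that second case, again assuming word addressing, aligned blocks, and a cache block size that is a whole multiple of the MM block size. The function name and the sizes are made up for illustration:

```python
def mm_blocks_to_fetch(address, mm_block_words, cache_block_words):
    """Return the MM block numbers needed to fill the cache block holding `address`.

    Assumes (for illustration only) that the cache block size is a whole
    multiple of the MM block size and that blocks are aligned.
    """
    if cache_block_words % mm_block_words != 0:
        raise ValueError("cache block size must be a multiple of the MM block size")
    cache_block = address // cache_block_words    # cache-sized block containing the word
    ratio = cache_block_words // mm_block_words   # MM blocks per cache block
    first = cache_block * ratio                   # first MM block in that cache block
    return list(range(first, first + ratio))


# Word 37 with 4-word MM blocks and 8-word cache blocks: the miss is in
# MM block 9, but MM block 8 is fetched too so the cache block is full.
print(mm_blocks_to_fetch(37, 4, 8))  # [8, 9]
```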
P.S. Generally, while designing, we take the same block size at every level. But I have seen some problems where the higher level had a bigger block size than the lower ones (your case 1). I have not yet seen one where the higher-level block size is smaller than the lower-level block size, because, I think, that is generally not the case.
P.P.S. There is a very nice problem based on this concept. It was asked in GATE 2010 as a linked data type question. Check that out.