1. It refers only to the time taken to get the data from main memory. Generally, that is what access time means: the time taken to get data from X (X can be the cache, TLB, main memory, hard disk, etc.).
2. By default, most computers use a hierarchical memory system: cache, then main memory, then hard disk. None of the standard books mention a situation where simultaneous access is possible.
3. In case of a miss, the data is brought from memory into the cache, stored there, and then read from the cache. AFAIK, there's no buffer between memory and cache.
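
To make the three points concrete, here is a minimal sketch of how the average access time works out under hierarchical access. It assumes the usual textbook simplification that a miss costs one cache access plus one main memory access; the function name, parameter names, and the sample numbers are all made up for illustration.

```python
def effective_access_time(hit_ratio, cache_time, memory_time):
    """Average access time for a hierarchical cache + main memory system.

    Assumption: on a miss the cache is checked first, then the data is
    fetched from main memory, so a miss costs cache_time + memory_time.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")

    hit_cost = cache_time                 # data found in the cache
    miss_cost = cache_time + memory_time  # cache check, then main memory
    return hit_ratio * hit_cost + (1.0 - hit_ratio) * miss_cost


# Hypothetical numbers: 90% hit ratio, 10 ns cache, 100 ns main memory.
# 0.9 * 10 + 0.1 * (10 + 100) comes to roughly 20 ns.
print(effective_access_time(0.9, 10, 100))
```

With simultaneous access, the miss cost would be just `memory_time`, because the cache and memory would be probed in parallel. That single term is the only difference between the two models.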