Typically the L1 TLB access time will be less than the cache access time, so that the tag comparison in a set-associative, physically tagged cache can complete without added delay. A direct-mapped cache can delay the tag check by assuming a hit. (For an in-order processor, a miss with immediate use of the data must wait for the miss to be handled anyway, so there is no performance penalty. For an out-of-order processor, correcting such wrong speculation can have a noticeable performance impact. While an out-of-order processor is unlikely to use a direct-mapped cache, it may use way prediction, which can behave similarly.) A virtually tagged cache can (in theory) delay TLB access even further, since the TLB is only needed to verify permissions, not to determine a cache hit, and the handling of permission violations is generally somewhat expensive and rare.
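To make the overlap concrete, here is a minimal C sketch of the index/tag split in a virtually indexed, physically tagged cache; the 32 KiB, 8-way, 64-byte-line geometry is purely illustrative, chosen so that the set-index bits fall entirely within a 4 KiB page offset:

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative VIPT L1: 32 KiB, 8-way, 64-byte lines -> 64 sets. The six
   set-index bits (11:6) lie within the 4 KiB page offset, so the set can
   be selected from the virtual address while the TLB translates the upper
   bits in parallel; only the tag comparison needs the physical address. */
enum { LINE_BITS = 6, SET_BITS = 6 };

static unsigned set_index(uint64_t vaddr) {
    return (vaddr >> LINE_BITS) & ((1u << SET_BITS) - 1);
}

static uint64_t phys_tag(uint64_t paddr) {
    return paddr >> (LINE_BITS + SET_BITS);  /* checked after translation */
}

int main(void) {
    uint64_t va = 0x7f1234567abcull;
    printf("set %u, tag %llx\n", set_index(va),
           (unsigned long long)phys_tag(0x1abcdeull));
    return 0;
}
```

Because the index needs no translated bits, the TLB lookup only has to finish before the tag comparison, not before the cache array is read.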
This means that the L1 TLB access time will generally not be made public, since it will not influence software performance tuning.
The L2 TLB hit time would be equivalent to the L1 TLB miss penalty. This will vary depending on the specific implementation and may not be a single value. For example, if the TLB uses banking to support multiple accesses in a single cycle, bank conflicts can delay accesses; or if rehashing is used to support multiple page sizes, a page of the alternate size will take longer to find (in both cases delay can accumulate under high utilization).
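As a rough illustration of the rehashing case, the following sketch (all names and geometry hypothetical) probes a set-associative TLB twice, once per page size, so a hit on the alternate size always costs an extra probe:

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical L2 TLB supporting two page sizes by rehashing: probe first
   with the 4 KiB index, then re-probe with the 2 MiB index. The second
   probe adds latency to every hit on the alternate page size. */
#define TLB_SETS 128

struct tlb_entry { uint64_t vpn; uint64_t pfn; bool valid; bool huge; };
static struct tlb_entry tlb[TLB_SETS][4];        /* 4-way set associative */

static bool probe(uint64_t vpn, bool huge, uint64_t *pfn) {
    struct tlb_entry *set = tlb[vpn % TLB_SETS];
    for (int w = 0; w < 4; w++)
        if (set[w].valid && set[w].huge == huge && set[w].vpn == vpn) {
            *pfn = set[w].pfn;
            return true;
        }
    return false;
}

static bool tlb_lookup(uint64_t vaddr, uint64_t *pfn) {
    if (probe(vaddr >> 12, false, pfn)) return true;  /* 4 KiB hash   */
    return probe(vaddr >> 21, true, pfn);             /* 2 MiB rehash */
}
```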
The time required for an L2 TLB fill can vary greatly. ARM and x86 use hardware TLB fill, walking a multi-level page table. Depending on where page table data can be cached and whether those accesses hit, the latency of a TLB fill ranges between, at worst, a main memory access for each level of the page table and, at best, an access to the cache where the page table data is found for each level (plus some overhead).
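A minimal sketch of such a hardware walk, assuming an x86-64-like four-level layout (the `table_at` accessor is a stand-in that treats "physical" addresses as host pointers, as a simulator might), shows why the fill latency is the sum of one dependent load per level:

```c
#include <stdint.h>

#define PRESENT   0x1ull
#define ADDR_MASK 0x000ffffffffff000ull   /* next-level table address */

/* In this user-space sketch, "physical" addresses are host pointers. */
static uint64_t *table_at(uint64_t paddr) {
    return (uint64_t *)(uintptr_t)paddr;
}

/* Four dependent loads, one per level (PML4, PDPT, PD, PT); each load's
   latency depends on which cache level, if any, holds that entry. */
uint64_t walk(uint64_t root, uint64_t vaddr) {
    uint64_t table = root;
    for (int level = 3; level >= 0; level--) {
        unsigned idx = (vaddr >> (12 + 9 * level)) & 0x1ff;
        uint64_t entry = table_at(table)[idx];       /* dependent load */
        if (!(entry & PRESENT))
            return 0;                     /* fault: handled by the OS */
        table = entry & ADDR_MASK;
    }
    return table | (vaddr & 0xfff);       /* physical address */
}
```

Since each load's address depends on the previous load's result, the four accesses cannot be overlapped, which is what makes walk latency so sensitive to where the page table entries are cached.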
Complicating this further, more recent Intel x86 processors have paging-structure caches, which allow levels of the page table to be skipped. For example, if a page directory entry (an entry in a second-level page table that points to a page of page table entries) is found in this cache, then rather than starting from the base of the page table and performing four dependent look-ups, only a single look-up is required.
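Building on the walk sketch above, a hypothetical direct-mapped PDE cache might look like the following; the tag width, entry count, and names are illustrative, not Intel's actual implementation:

```c
#include <stdint.h>
#include <stdbool.h>

#define PRESENT   0x1ull
#define ADDR_MASK 0x000ffffffffff000ull

static uint64_t *table_at(uint64_t paddr) {
    return (uint64_t *)(uintptr_t)paddr;   /* as in the walk sketch above */
}

/* Walk starting at an arbitrary level (3 = PML4 ... 0 = page table). */
static uint64_t walk_from(uint64_t table, uint64_t vaddr, int top_level) {
    for (int level = top_level; level >= 0; level--) {
        uint64_t entry = table_at(table)[(vaddr >> (12 + 9 * level)) & 0x1ff];
        if (!(entry & PRESENT)) return 0;
        table = entry & ADDR_MASK;
    }
    return table | (vaddr & 0xfff);
}

/* Hypothetical direct-mapped PDE cache: the tag is VA bits 47:21, the
   region covered by one page directory entry; a hit skips three levels. */
#define PSC_ENTRIES 16
static struct { uint64_t tag; uint64_t pt_base; bool valid; }
    pde_cache[PSC_ENTRIES];

uint64_t translate(uint64_t root, uint64_t vaddr) {
    uint64_t tag = (vaddr >> 21) & ((1ull << 27) - 1);
    unsigned slot = tag % PSC_ENTRIES;
    if (pde_cache[slot].valid && pde_cache[slot].tag == tag)
        return walk_from(pde_cache[slot].pt_base, vaddr, 0); /* one load  */
    return walk_from(root, vaddr, 3);                        /* four loads */
}
```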
(It might be worth noting that using a page the size of the virtual address region covered by a level of the page table (e.g., 2 MiB or 1 GiB for x86-64) reduces the depth of the page table walk. Not only can such large pages reduce TLB pressure, they can also reduce the latency of a TLB miss.)
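For example, on Linux one way to request such a page is `mmap` with `MAP_HUGETLB`; this assumes the administrator has reserved hugetlb pages beforehand (e.g., via `/proc/sys/vm/nr_hugepages`):

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>

/* A 2 MiB page is mapped directly by a page directory entry, so a TLB
   miss on it walks one fewer level than a miss on a 4 KiB page. */
int main(void) {
    size_t len = 2 * 1024 * 1024;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap(MAP_HUGETLB)");   /* e.g., no huge pages reserved */
        return 1;
    }
    ((char *)p)[0] = 1;                /* touch the page */
    munmap(p, len);
    return 0;
}
```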
A page table miss is handled by the operating system. The page might still be in memory (e.g., if its write to swap has not yet completed), in which case the latency will be relatively small. (The actual latency depends on how the operating system implements this and on cache hit behavior; cache misses for both the code and the data are likely, since paging is an uncommon event.) If the page is no longer in memory, the latency of reading it from secondary storage (e.g., a disk drive) is added to the software latency of handling the invalid page table entry (i.e., the page table miss).
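Schematically (this is not any particular kernel's code, and the helper functions are hypothetical), the fault handler's decision looks like:

```c
#include <stdint.h>

struct page;                                    /* opaque page descriptor */
extern struct page *swap_cache_lookup(uint64_t swap_slot);  /* hypothetical */
extern struct page *read_from_storage(uint64_t swap_slot);  /* hypothetical, slow */
extern void map_page(uint64_t vaddr, struct page *pg);      /* hypothetical */

void handle_page_fault(uint64_t vaddr, uint64_t swap_slot) {
    /* "Minor" fault: the page never left memory (e.g., its write to swap
       is still pending), so only the page table needs fixing. */
    struct page *pg = swap_cache_lookup(swap_slot);
    if (pg == NULL)
        /* "Major" fault: the storage read dominates all other latency. */
        pg = read_from_storage(swap_slot);
    map_page(vaddr, pg);
}
```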