In my previous post we discussed why providing low latency is more difficult than providing scalability. Below is a scale showing the latency cost of various operations.
| Event | Latency | Scaled (if 1 cycle = 1 s) |
|---|---|---|
| 1 CPU cycle | 0.3 ns | 1 s |
| Level 1 cache access | 0.9 ns | 3 s |
| Level 2 cache access | 2.8 ns | 9 s |
| Level 3 cache access | 12.9 ns | 43 s |
| Main memory access (DRAM, from CPU) | 120 ns | 6 min |
| Solid-state disk I/O (flash memory) | 50-150 µs | 2-6 days |
| Rotational disk I/O | 1-10 ms | 1-12 months |
| Internet: San Francisco to New York | 40 ms | 4 years |
| Internet: San Francisco to UK | 81 ms | 8 years |
| Internet: San Francisco to Australia | 183 ms | 19 years |
| TCP packet retransmit | 1-3 s | 105-317 years |
| Unit | Abbreviation | Fraction of 1 sec |
|---|---|---|
| Millisecond | ms | 0.001 |
| Microsecond | µs | 0.000001 |
| Nanosecond | ns | 0.000000001 |
Data is from the book Systems Performance: Enterprise and the Cloud by Brendan Gregg.
The first table gives the cost of various operations, and the second table explains the units of time. The third column of the first table converts each cost into a time scale we intuitively understand: seconds, minutes, days, months, and years. The scaling assumes, hypothetically, that one CPU cycle costs 1 second, and then shows what every other operation would cost at that rate.
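To make the scaled column concrete, here is a minimal sketch in Java that derives the scaled values from the raw latencies. The class and method names are mine, not from the book; the only assumption is the scaling factor itself: 0.3 ns maps to 1 hypothetical second, so every real nanosecond becomes roughly 3.33 scaled seconds.

```java
// Sketch: derive the human-scale column from raw latencies.
// Assumption: the scale factor maps 0.3 ns (1 CPU cycle) to 1 s.
public class LatencyScale {
    static final double CPU_CYCLE_NS = 0.3;          // 1 CPU cycle in ns
    static final double SCALE = 1.0 / CPU_CYCLE_NS;  // hypothetical seconds per real ns

    static String scaled(double latencyNs) {
        double seconds = latencyNs * SCALE;
        if (seconds < 60) return String.format("%.0f s", seconds);
        if (seconds < 3600) return String.format("%.0f min", seconds / 60);
        double days = seconds / 86_400;
        if (days < 30) return String.format("%.0f days", days);
        double years = days / 365;
        if (years < 1) return String.format("%.0f months", days / 30);
        return String.format("%.0f years", years);
    }

    public static void main(String[] args) {
        System.out.println("L1 cache:    " + scaled(0.9));          // ~3 s
        System.out.println("Main memory: " + scaled(120));          // ~6-7 min
        System.out.println("SSD I/O:     " + scaled(100_000));      // ~4 days (100 µs)
        System.out.println("SF -> AU:    " + scaled(183_000_000));  // ~19 years
    }
}
```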
How can you use knowledge of the above latencies?
- It can help in understanding the lower bound of latency or response time for a given scenario.
  - For example, if the user is in Australia and the server is in San Francisco, then the time between request and response will span at least two network hops, i.e., more than 366 ms (183 * 2). This arithmetic is worked in the first sketch after this list.
- It can show when performance objectives cannot be met with a certain technology or operation.
  - For example, if you want responses in microseconds, then the critical path of code execution cannot include a disk read or write.
  - If your algorithm has to perform more than 10 random memory accesses to do some work, it will take at least 1.2 microseconds (120 ns * 10); this bound also appears in the first sketch below.
- Overall latency can be estimated by knowing the cost of the basic operations involved.
- It helps in finding the reasons for performance jitter; the second sketch after this list illustrates the reasoning.
  - For example, suppose response time jumped from microseconds to milliseconds. What could be the possible reasons? We know disk reads can take milliseconds, so a page fault is a plausible cause. GC in Java could be another, as GC pauses range from a few milliseconds to hundreds of milliseconds. But we should not suspect the network on a 1 Gbps LAN, as it will not cause millisecond delays, especially when utilization is low. Knowing the system latencies helps you narrow the search for the culprit.
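Here is the lower-bound arithmetic from the examples above as a minimal sketch. The constants come straight from the latency table; the class and method names are hypothetical.

```java
// Back-of-envelope lower bounds from the latency table above.
// Constants are from the table; the method names are illustrative.
public class LatencyBounds {
    static final double SF_TO_AU_MS = 183;     // one-way SF -> Australia
    static final double DRAM_ACCESS_NS = 120;  // one random main-memory access

    // A request/response needs at least one network hop each way.
    static double minRoundTripMs(double oneWayMs) {
        return 2 * oneWayMs;
    }

    // N dependent random memory accesses cost at least N * 120 ns.
    static double minMemoryCostUs(int randomAccesses) {
        return randomAccesses * DRAM_ACCESS_NS / 1000.0;
    }

    public static void main(String[] args) {
        System.out.printf("SF <-> AU round trip: >= %.0f ms%n", minRoundTripMs(SF_TO_AU_MS)); // 366 ms
        System.out.printf("10 random DRAM accesses: >= %.1f us%n", minMemoryCostUs(10));      // 1.2 us
    }
}
```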
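And a second sketch of the jitter-diagnosis reasoning: given an observed spike, compare it against the known latency ranges to shortlist suspects. The ranges and helper names here are illustrative, not a real diagnostic tool.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: narrow down suspects for a latency spike by comparing it to the
// latency ranges from the table above. Ranges and rules are illustrative.
public class JitterSuspects {
    static List<String> suspects(double spikeMs) {
        List<String> causes = new ArrayList<>();
        if (spikeMs >= 1 && spikeMs <= 10) {
            causes.add("page fault serviced by rotational disk (1-10 ms)");
        }
        if (spikeMs >= 1 && spikeMs <= 500) {
            causes.add("JVM GC pause (a few ms up to 100s of ms)");
        }
        if (spikeMs >= 1000) {
            causes.add("TCP packet retransmit (1-3 s)");
        }
        // A lightly utilized 1 Gbps LAN hop costs well under a millisecond,
        // so it cannot explain a millisecond-scale spike by itself.
        return causes;
    }

    public static void main(String[] args) {
        System.out.println(suspects(5.0));   // disk read or GC pause
        System.out.println(suspects(1500));  // TCP retransmit
    }
}
```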