Globus.org 40 User Tests
This page summarizes the work to date on exploratory transfer runs involving 40 users on cli.beta.globus.org. The initial test scenario consists of 40 users, each of whom executes a Globus.org scp command to transfer 100 2GB files. All users share the same source and destination endpoints; each user's source and destination files are unique.
Run #3
- The submission script for Run #3 can be viewed here; submission output is here
- The data for charts #3.1-3.3 (extracted with the Globus.org details command) are here; the R commands are here
- The data for charts #3.4 and #3.5 (extracted with the Globus.org events command) are here; the R commands are here
Chart #3.1 (Time)
- Each user's transfers appear along a single horizontal line because of submission timing and because transfer subtasks inherit their parent's request time
- The event data for this run includes a cluster of timeouts around 21:30, which may at least partly explain the gap in completion times:
Charts #3.2 and #3.3 (Attempts)
Each circle represents the attempt count for one or more files (due to plotting overlap):
Charts #3.4 and #3.5 (Events)
- These charts show event counts and distribution across time
- 7,090 abnormal end events were recorded:
  - 6,325 "an end-of-file was reached globus_xio: An end of file occurred"
  - 762 "Could not list '/gpfs/pads/projects/CI-CCR000047/childers/data/dest/sdata2' globus_xio: An end-of-file occurred"
  - 3 "no detail available"
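The breakdown above can be checked and turned into percentages with a few lines of Python. This is only a sketch: the counts are copied from the list above, while the short category keys are labels invented here for illustration and are not Globus.org identifiers.

```python
# Abnormal-end event counts for Run #3, copied from the list above.
# The dictionary keys are hypothetical labels chosen for this sketch.
run3_events = {
    "end_of_file": 6325,
    "could_not_list_dest": 762,
    "no_detail": 3,
}
REPORTED_TOTAL = 7090  # total stated in the report


def event_shares(counts):
    """Return each category's share of all events, in percent (one decimal)."""
    total = sum(counts.values())
    return {name: round(100.0 * n / total, 1) for name, n in counts.items()}


total = sum(run3_events.values())
shares = event_shares(run3_events)

print("categories sum to reported total:", total == REPORTED_TOTAL)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

Under these counts the end-of-file message accounts for roughly nine in ten abnormal ends, and the directory-listing failure for most of the remainder.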
Run #2
- The submission script for Run #2 can be viewed here; submission output is here
- The data for charts #2.1-2.3 (extracted with the Globus.org details command) are here; the R commands are here
- The data for charts #2.4 and #2.5 (extracted with the Globus.org events command) are here; the R commands are here
Chart #2.1 (Time)
- Each user's transfers appear along a single horizontal line because of submission timing and because transfer subtasks inherit their parent's request time
- The available data do not explain the ~30-second stall in request processing
- Throughput during Run #2 was better than during Run #1; the available data provide no explanation for the improvement
Charts #2.2 and #2.3 (Attempts)
Each circle represents the attempt count for one or more files (due to plotting overlap):
Charts #2.4 and #2.5 (Events)
- These charts show event counts and distribution across time
- 3,051 abnormal end events were recorded:
  - 2,958 "an end-of-file was reached globus_xio: An end of file occurred"
  - 91 "Could not list '/gpfs/pads/projects/CI-CCR000047/childers/data/dest/sdata2' globus_xio: An end-of-file occurred"
  - 2 "no detail available"
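For a rough comparison with Run #3, the per-category counts reported on this page can be put side by side. The sketch below uses only the numbers listed for Runs #2 and #3; the category keys are hypothetical labels, and the ratios say nothing about cause, since the page does not state whether the two runs were otherwise identical.

```python
# Abnormal-end event counts as reported on this page.
# Keys are hypothetical labels chosen for this sketch.
run2 = {"end_of_file": 2958, "could_not_list_dest": 91, "no_detail": 2}
run3 = {"end_of_file": 6325, "could_not_list_dest": 762, "no_detail": 3}


def growth(before, after):
    """Return after/before for every category present in `before`."""
    return {name: after[name] / before[name] for name in before}


ratios = growth(run2, run3)
total_ratio = sum(run3.values()) / sum(run2.values())

for name, ratio in ratios.items():
    print(f"{name}: x{ratio:.2f}")
print(f"all abnormal ends: x{total_ratio:.2f}")
```

On these figures the directory-listing failures grew far faster between the two runs than the end-of-file errors did, which may make them the more useful category to investigate first.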
Run #1
- The submission script for Run #1 can be viewed here; submission output is here
- The data for charts #1.1-1.3 (extracted with the Globus.org details command) are here; the R commands are here
- The data for charts #1.4 and #1.5 (extracted with the Globus.org events command) are here; the R commands are here
- Three code defects were uncovered during the run: [912] [909] [914]
Chart #1.1 (Time)
- Each user's transfers appear along a single horizontal line because of submission timing and because transfer subtasks inherit their parent's request time
- The two rightmost transfers were stalled for a time due to expired credentials; the two users received an email at the top of the hour informing them of the expiration
- The available data do not show why the transfer jobs of a few users (such as lccso35) finished well ahead of the others
- The throughput achieved (1.4 Gbps) is well under the target for the NASA demo; further investigation is needed to identify bottlenecks
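To put the 1.4 Gbps figure in context, the following sketch estimates how long the full scenario (40 users, 100 files each, 2GB per file) would take at that rate. It rests on two assumptions that the page does not confirm: that "2GB" means decimal gigabytes (10^9 bytes), and that 1.4 Gbps is the aggregate rate across all users, sustained for the whole run. The result is an estimate, not a measured run duration.

```python
# Scenario parameters from the test description on this page.
USERS = 40
FILES_PER_USER = 100
FILE_SIZE_BYTES = 2 * 10**9   # assumption: "2GB" = 2 * 10^9 bytes
AGGREGATE_GBPS = 1.4          # assumption: aggregate, sustained rate


def transfer_hours(total_bytes, gbps):
    """Hours needed to move total_bytes at a steady rate of gbps (10^9 bits/s)."""
    seconds = (total_bytes * 8) / (gbps * 1e9)
    return seconds / 3600.0


total_bytes = USERS * FILES_PER_USER * FILE_SIZE_BYTES
hours = transfer_hours(total_bytes, AGGREGATE_GBPS)
per_user_mbps = AGGREGATE_GBPS * 1000.0 / USERS  # even split across users

print(f"total volume: {total_bytes / 1e12:.1f} TB")
print(f"estimated duration at {AGGREGATE_GBPS} Gbps: {hours:.1f} h")
print(f"average per-user rate: {per_user_mbps:.0f} Mbps")
```

Under these assumptions the scenario moves 8 TB, which takes on the order of half a day at 1.4 Gbps and leaves each user with an average of about 35 Mbps.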
Charts #1.2 and #1.3 (Attempts)
Each circle represents the attempt count for one or more files (due to plotting overlap):
Charts #1.4 and #1.5 (Events)
Note that chart #1.5 shows the distribution of events across time. For example, most of the timeouts occurred early in the run: