Hold Summary

Refreshes every 5 minutes
Code Subcode Jobs Users Representative Hold Reason Updated
12 122 38548 5 Transfer output files failure at access point ap43 while receiving files from the execution point. Details: writing to file /home/mkossin/japan_moller-mock/output/isu_sample_22428.root: (errno 122) Disk quota exceeded 4 mins ago
13 256 22618 30 Transfer input files failure at execution point slot1_1@glidein_3183085_584553109@um-lm02 using protocol osdf. Details: failed to get namespace information for remote URL osdf:///nrp/cachetest/100gb: error while querying the director at https://osdf-director.osg-htc.org: Contact.Director Error: Error code 3001: 404: No caches can fulfill this request and no fallback origins with the 'DirectReads' capability found for this object. Request ID: 3bb14795-804c-4bcc-b80f-e2a5061d83f3 ( URL file = osdf://nrp/cachetest/100gb )| 4 mins ago
13 2816 21383 12 Transfer input files failure at the execution point using protocol osdf. Details: failed to get namespace information for remote URL osdf:///ospool/uw-shared/projects/ND_Savoie/orca_6_1_1-zip-multiwfn-2.sif: error while querying the director at https://osdf-director.osg-htc.org: Transfer.DirectorTimeout Error: Error code 6005: Get "https://osdf-director.osg-htc.org/ospool/uw-shared/projects/ND_Savoie/orca_6_1_1-zip-multiwfn-2.sif": dial tcp 128.105.69.14:443: i/o timeout ( URL file = osdf:///ospool/uw-shared/projects/ND_Savoie/orca_6_1_1-zip-multiwfn-2.sif )| 4 mins ago
1 0 12190 9 via condor_hold (by user ankur.pal) 4 mins ago
26 100 7196 12 The job restarted too many times 4 mins ago
36 -1 5022 2 Error from slot1_3@glidein_291320_282646156@node11.cluster: Starter failed to upload checkpoint 4 mins ago
3 0 4667 7 The job attribute OnExitHold expression '(ExitBySignal == true) || (ExitCode != 0)' evaluated to TRUE 4 mins ago
12 2 4088 67 Transfer output files failure at the execution point while sending files to access point ap40. Details: 1 total failures: first failure: reading from file /scratch.local/condor/execute/dir_2468363/glide_fg9dqS/execute/dir_541725/scratch/playground/event_0/EVENT_RESULTS_11/spvn_results_11.h5: (errno 2) No such file or directory 4 mins ago
13 2 2515 41 Transfer input files failure at access point ap40 while sending files to the execution point. Details: 1 total failures: first failure: reading from file /home/naohito.kadota/intro-2.1-words.txt: (errno 2) No such file or directory 4 mins ago
21 104 2386 16 Error from slot1_4@glidein_1273733_57799156@sdsc-57.t2.ucsd.edu: disk usage exceeded allocated disk 4 mins ago
34 102 1923 32 Error from slot1_23@e4078.chtc.wisc.edu: Docker job has gone over memory limit of 32768 Mb 4 mins ago
21 102 1849 22 Error from slot1_6@glidein_44993_876642354@CRUSH-OSG-C7-10-5-200-49: memory usage exceeded request_memory 4 mins ago
47 0 1847 30 The job exceeded allowed execute duration of 20:00:00 4 mins ago
21 0 897 52 Error from slot1_89@e2475.chtc.wisc.edu: Job failed to complete in 40 hrs 4 mins ago
26 0 585 4 Job in status 2 put on hold by SYSTEM_PERIODIC_HOLD due to memory usage 6292672. 4 mins ago
12 256 494 20 Transfer output files failure at the execution point using protocol osdf. Details: failed to get namespace information for remote URL osdf:///ospool/ap40/data/ankur.pal/Eta_Timelike/PBC/L=14/p=0.365/tau=7/mi_r1510634130.dat: error while querying the director at https://osdf-director.osg-htc.org: Contact.Director Error: Error code 3001: 404: No sources found for the requested path: no origins found for the requested namespace '/ospool/ap40/data/ankur.pal/Eta_Timelike/PBC/L=14/p=0.365/tau=7/mi_r1510634130.dat': Request ID: 891fd917-23a8-4013-b490-077a8b40a192 ( URL file = osdf:///ospool/ap40/data/ankur.pal/Eta_Timelike/PBC/L=14/p=0.365/tau=7/mi_r1510634130.dat )| 4 mins ago
9 22 405 2 Error from slot1_1@glidein_2070910_534119040@c218.mgmt.hellbender: StreamHandler: stdout: couldn't write to logs/job.3826922.129.out: Invalid argument (-1!=0) 4 mins ago
0 0 188 1 4 mins ago
45 -1000 182 1 Error from slot1_66@UNL-PATH-EP.osgvo-docker-pilot-ospool-bdc6cf4c6-vk5nd: Singularity test failed:FATAL: While checking container encryption: could not open image /cvmfs/singularity.opensciencegrid.org/.images/56/572d622ff6ab1374a682a9ba1272a4dba86f73fcf986d4e010d1023e4f8fb6: image format not recognized 4 mins ago
14 2 119 2 Cannot access initial working directory /home/osg07/bosco/sandbox/2e0c/2e0c30d5/chtc-path-facility-ce1.svc.osg-htc.org_9619_chtc-path-facility-ce1.svc.osg-htc.org_454885.0_1778007268: No such file or directory 4 mins ago
21 103 112 17 Error from slot1_148@e2605.chtc.wisc.edu: disk usage exceeded request_disk 4 mins ago
34 104 92 11 Error from backfill1_165@e4097.chtc.wisc.edu: Job has exceeded allocated disk (98.00 GB). Consider increasing the value of request_disk. 4 mins ago
6 0 55 1 Error from slot1_14@e2558.chtc.wisc.edu: Error running docker job: error while creating mount source path '/hdd': mkdir /hdd: file exists 4 mins ago
32 0 34 2 TransferInputSizeMB (5072) is greater than MAX_TRANSFER_INPUT_MB (5000) at submit time 4 mins ago
35 0 28 6 Error from slot1_55@e4076.chtc.wisc.edu: Cannot pull image registry.doit.wisc.edu/erwin.lares:1.0.10 Error response from daemon: Head "https://registry.doit.wisc.edu/v2/erwin.lares/manifests/1.0.10": denied: access forbidden 4 mins ago
12 2816 28 1 Transfer output files failure at the execution point while sending files to access point grid-submitter. Details: Pelican Client Error: failed upload to dtn-1.icecube.wisc.edu:8443: Contact.ConnectionReset Error: Error code 3005: the existing TCP connection was broken (potentially caused by server restart or NAT/firewall issue) (200ms since start) (Version: 7.23.0; Site: SU-ITS) ( URL file = pelican.monopoledetectorlam10run1335grid+osdf:///icecube/wipac/data/ana/BSM/IC86_SubRelativisticMonopoles/signal/Level0/WithLuminescence/Discrete/B0001L10/GRID/afterSLOP/Combined/det_mer_charge1_beta0.001_lambda10_nevents10000_1335.i3.zst )||FILETRANSFER:1:non-zero exit (11) from /var/lib/condor/execute/dir_34902/glide_LXew7j/main/condor/libexec/stash_plugin. |Error: Pelican Client Error: failed upload to dtn-1.icecube.wisc.edu:8443: request failed (HTTP status 403): Authorization Error: Error code 4000: server returned 403 Forbidden (0s since start) (Version: 7.23.0; Site: SU-ITS) ( URL file = pelican.monopoledetectorlam10run1335grid+osdf:///icecube/wipac/data/ana/BSM/IC86_SubRelativisticMonopoles/signal/Level0/WithLuminescence/Discrete/B0001L10/GRID/afterSLOP/Luminescence/det_lumi_charge1_beta0.001_lambda10_nevents10000_1335.i3.zst )||FILETRANSFER:1:non-zero exit (11) from /var/lib/condor/execute/dir_34902/glide_LXew7j/main/condor/libexec/stash_plugin. |Error: Pelican Client Error: failed upload to dtn-1.icecube.wisc.edu:8443: request failed (HTTP status 403): Authorization Error: Error code 4000: server returned 403 Forbidden (100ms since start) (Version: 7.23.0; Site: SU-ITS) ( URL file = pelican.monopoledetectorlam10run1335grid+osdf:///icecube/wipac/data/ana/BSM/IC86_SubRelativisticMonopoles/signal/Level0/WithLuminescence/Discrete/B0001L10/GRID/afterSLOP/ProtonDecay/det_decay_charge1_beta0.001_lambda10_nevents10000_1335.i3.zst )| 4 mins ago
26 102 27 5 Excessive CPU usage. Job used 6 CPUs, while request_cpus=1. Please verify that the code is configured to use a limited number of cpus/threads, and matches request_cpus. 4 mins ago
13 13 13 1 Transfer input files failure at access point wright-ap4000 while sending files to execution point backfill1_19@oconnor2000.chtc.wisc.edu. Details: 1 total failures: first failure: reading from file /home/ytao49/my-python.sif: (errno 13) Permission denied 187 hours ago
13 11 9 1 Transfer input files failure at the execution point while receiving files from access point ap40. Details: receiving file /scratch_space/glide_MJVLrJ/execute/dir_4468/full_orca_6_final_2.tar.gz: FILETRANSFER:1:FILETRANSFER: plugin for type osdf not found!|FILETRANSFER:110:No output from /scratch_space/glide_MJVLrJ/main/condor/libexec/stash_plugin -classad, ignoring 4 mins ago
20 0 6 1 Job missed deferred execution time 94 hours ago
19 7 5 2 Error from slot1_4@glidein_4048954_59564225@wn-d22-021.beowulf.cluster: PREPARE_JOB (prepare-hook) failed (reported status 007): Unable to download or build singularity image /cvmfs/singularity.opensciencegrid.org/jeffersonlab/gluex_almalinux_9:latest 4 mins ago
6 2 5 1 Error from slot1_18@e2614.chtc.wisc.edu: Failed to execute '/var/lib/condor/execute/slot1/dir_4095288/scratch/execution/script': (errno=2: 'No such file or directory') 4 mins ago
9 122 4 2 Error from slot1_1@glidein_1453844_593508470@rails15: StreamHandler: stdout: couldn't write to 3c68a939/8_14799300.out: Disk quota exceeded (-1!=104) 4 mins ago
9 0 3 1 Error from slot1_29@glidein_1998039_422628845@gpu01: unable to establish standard output stream 4 mins ago
15 0 2 2 submitted on hold at user's request 4 mins ago
13 28 2 1 Transfer input files failure at execution point backfill1_53@oconnor2006.chtc.wisc.edu while receiving files from access point ap2001. Details: writing to file /var/lib/condor/execute/slot1/dir_3923127/scratch/r_env.tar.gz: (errno 28) No space left on device 108 hours ago
6 8 2 2 Error from slot2_2@gpu2001.chtc.wisc.edu: Failed to execute '/var/lib/condor/execute/slot2/dir_860444/scratch/hw4.sh': (errno=8: 'Exec format error') 4 mins ago
16 0 2 1 Spooling input data files 89 hours ago
4 0 2 2 Job credentials are not available 4 mins ago
26 100001 2 1 Policy violation. Execution time limit exceeded, job was evicted and held to prevent rematching 4 mins ago
46 0 2 1 The job exceeded allowed job duration of 1:00:00 4 mins ago
6 13 2 1 Error from slot1_23@e2456.chtc.wisc.edu: Failed to execute '/var/lib/condor/execute/slot1/dir_673824/scratch/redo_hem1': (errno=13: 'Permission denied') 4 mins ago
12 11 1 1 Transfer output files failure at execution point slot1_3@glidein_27762_229309240@mem3 while sending files to access point grid-submitter. Details: sending file /wsu/tmp/glide_HRJjO2/execute/dir_13115/all_data_2021_8121_step2_use_all_unhit_doms.i3.bz2: FILETRANSFER:1:FILETRANSFER: plugin for type osdf not found!|FILETRANSFER:110:No output from /wsu/tmp/glide_HRJjO2/main/condor/libexec/stash_plugin -classad, ignoring 2 hours ago
13 512 1 1 Transfer input files failure at execution point slot1_1@GP-ARGO-mst-backfill.28b96da5c73d while receiving files from access point ap40. Details: file transfer plugin /usr/libexec/condor/stash_plugin exited (exit code 2), no valid classads in output file /pilot/osgvo-pilot-haQYfN/execute/dir_9077/.stash_plugin.out 21 hours ago
12 28 1 1 Transfer output files failure at access point submit06 while receiving files from the execution point. Details: writing to file /scratch/submit/cms/areimers/workdir_scetlib/Z_COM13_CT18Z_N3p0LL_NewNPs_Lattice_Lambda4Bugfix_FranksVals_NPScan_Fine/scetlib_outputs/inclusive_Z_COM13_CT18Z_N3+0LL_lattice_lambda4bugfix_franksvals_npscan_fine_pdf0_bins_2050_2255_var_060.pkl: (errno 28) No space left on device 127 hours ago
3 9001 1 1 Retrying after high memory usage 120 hours ago
26 1001 1 1 Policy violation. Memory limit exceeded: 9329 MB resident > 6500 MB requested. 25 hours ago
34 0 1 1 Error from slot2_1@vetsigian0000.chtc.wisc.edu: Job has gone over cgroup memory limit of 2048 megabytes. Last measured usage: 1916 megabytes. Consider resubmitting with a higher request_memory. 1 hour ago
19 0 1 1 Error from slot1@glidein_530417_589561095@c1806.swan.hcc.unl.edu: failed to execute PREPARE_JOB (/scratch/glide_9kzfid/client_group_main/prepare-hook) 4 mins ago
13 5 1 1 Transfer input files failure at the execution point while receiving files from access point submit06. Details: writing to file /scratch/gautschi/cmsgrid/glide_p8fIoo/execute/dir_627705/variations_lattice_allvars.conf: (errno 5) Input/output error 134 hours ago
13 21 1 1 Transfer input files failure at access point ap2002 while sending files to execution point slot2_4@gpu2005.chtc.wisc.edu. Details: 1 total failures: first failure: reading from file /home/yyi49/openpi/wandb/latest-run: (errno 21) Is a directory- Transfer of symlinks to directories is not supported. 11 hours ago
13 -1001 1 1 Transfer input files failure at the execution point while receiving files from access point grid-submitter. Details: receiving file /scratch/glide_xQnl23/execute/dir_275704/scratch/decay_charge1_beta0.0003_lambda0.001_nevents10000_4718.i3.zst: FILETRANSFER:1:FILETRANSFER: plugin for type osdf not found!|FILETRANSFER:110:No output from /scratch/glide_xQnl23/main/condor/libexec/stash_plugin -classad, ignoring 93 hours ago
12 -1001 1 1 Transfer output files failure at the execution point while sending files to access point grid-submitter. Details: 1 total failures: first failure: sending file /scratch/glide_Ol718E/execute/dir_4001962/scratch/all_data_2020_71_step2_use_all_unhit_doms.i3.bz2: FILETRANSFER:1:FILETRANSFER: plugin for type osdf not found!|FILETRANSFER:110:No output from /scratch/glide_Ol718E/main/condor/libexec/stash_plugin -classad, ignoring 64 hours ago
13 1 1 1 Transfer input files failure at execution point interactive3_1@gpulab2001.chtc.wisc.edu while receiving files from access point ap2002. Details: Error from interactive3_1@gpulab2001.chtc.wisc.edu: Failed to transfer files: Attempt to write to illegal sandbox path: .. Attempt to write to illegal sandbox path: ../basketball Attempt to write to illegal sandbox path: ../basketball/configs Attempt to write to illegal sandbox path: ../basketball/configs/v8Pose.yaml 268 hours ago
19 1 1 1 Error from slot1_8@glidein_3926248_485583295@node2045.palmetto.clemson.edu: PREPARE_JOB (prepare-hook) failed (reported status 001): Unable to download or build singularity image /cvmfs/oasis.opensciencegrid.org/jlab/halla/solid/soft/container/jeffersonlab_jlabce_tag1.5_digest:sha156:9b9a9ec8c793035d5bfe6651150b54ac198f5ad17dca490a8039c530d0301008_10110413_s3.9.5.sif 4 mins ago
14 13 1 1 Cannot access initial working directory /home/projects/GATech_Otte/TrinityDemonstrator/DataAnalysis/flasher_calibration: Permission denied 265 hours ago
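
The rows above follow the column order given in the header (Code, Subcode, Jobs, Users, Representative Hold Reason, Updated). Because the hold reason is free text containing spaces, a row cannot be split on whitespace alone. The following is a minimal sketch of one way to split a row back into fields, assuming that the four leading columns are always integers and that the Updated column always has the form "<number> <unit> ago". The names HoldRow, parse_row and jobs_by_code are hypothetical helpers introduced for illustration and are not part of the dashboard or of HTCondor.

```python
import re
from collections import Counter
from typing import Iterable, NamedTuple, Optional


class HoldRow(NamedTuple):
    """One row of the hold summary (hypothetical container type)."""
    code: int
    subcode: int
    jobs: int
    users: int
    reason: str
    updated: str


# Four integers (the subcode may be negative), an optional free-text reason,
# then a relative timestamp such as "4 mins ago" or "1 hour ago".
# Assumption: the units are limited to secs/mins/hours/days.
ROW_RE = re.compile(
    r"^(\d+)\s+(-?\d+)\s+(\d+)\s+(\d+)\s+"
    r"(?:(.*)\s+)?"
    r"(\d+\s+(?:sec|min|hour|day)s?\s+ago)$"
)


def parse_row(line: str) -> Optional[HoldRow]:
    """Return the parsed row, or None for lines that are not data rows."""
    m = ROW_RE.match(line.strip())
    if m is None:
        return None
    code, subcode, jobs, users, reason, updated = m.groups()
    # A row with no hold reason is kept, with an empty reason string.
    return HoldRow(int(code), int(subcode), int(jobs), int(users),
                   reason or "", updated)


def jobs_by_code(lines: Iterable[str]) -> Counter:
    """Sum the Jobs column per hold code, skipping non-data lines."""
    totals: Counter = Counter()
    for line in lines:
        row = parse_row(line)
        if row is not None:
            totals[row.code] += row.jobs
    return totals


if __name__ == "__main__":
    sample = [
        "Code Subcode Jobs Users Representative Hold Reason Updated",
        "26 100 7196 12 The job restarted too many times 4 mins ago",
        "0 0 188 1 4 mins ago",
        "3 9001 1 1 Retrying after high memory usage 120 hours ago",
    ]
    for text in sample:
        print(parse_row(text))
    print(jobs_by_code(sample))
```

The reason group is greedy, so a reason that itself ends in a duration (for example "Job failed to complete in 40 hrs") stays intact and only the final "<number> <unit> ago" is taken as the Updated value.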