INFN started running a Condor pool distributed throughout Italy and
communicating over the WAN in 1997, a time when Condor had no manpages (and was closed source)!
23 DEC Alphas
4 HP workstations
6 DECstations running Ultrix
5 Pentium PCs running Linux (bold!)
The pool has seen a grand total of three days of downtime (Central Manager
updates before the HA setup appeared, and an Italy-wide blackout) over the
past 17 years. Todd may be right: why shut Condor down?
Anyone with an interest in history can go back to the "10th anniversary"
material presented during the EU Condor Week in 2006.
Condor @ INFN (2)
We thought we had reason to export part of this experience into the
emerging interest in distributed computing in the wake of the LHC
data-taking.
This effort got in the way of other plans and was sidetracked, or flatly
(pardon the pun) steamrolled.
The "Condor way".
"Did you see that guy claiming you can solve any problem with his software?"
(overheard at CHEP 2004 in Interlaken)
In a less concise way, to substantiate this claim:
A "wonderful community" (TM) keen to tackle real problems
(many, many over the years).
A set of languages and components with rich semantic properties,
that allow composing new expressions if (when) the need arises.
Sometimes we have been pushing for new developments (we've been data-bound
since day 1), other times we have been pulled by the development of Condor.
A valuable, old-school looking glass at a time of multiple, competing,
semantically equivalent and syntactically incompatible developments that
have the lifetime of a meteor.
A wonderful community (TM)
(Miron Livny's closing slide for more than 10 years of Condor Weeks)
The Milan Tier-2 Centre
A modest 400-core, 1.5 PB site.
Running Condor (the HT is silent, remember) since 2008, first on half of the nodes,
then on all of them. The site was plagued by storage issues, but no uptime
was lost because of Condor.
Main issue, as mentioned yesterday: trying to blend in with the other sites and
mimic their multiple-'queue' setup.
We had to provide duct-tape scripts for job submission (CREAM and BLAH),
monitoring and accounting to whoever asked: for a while it was only "us" running
Condor at a WLCG site on the continent, and Santanu Das at Cambridge off the
continent.
The time may be ripe to limit the flux of this setup.
We can be reached at blah@mi.infn.it.