The IPv6 reality check
What we learned so far on the IPv6 distributed testbed
Francesco Prelz
INFN, sezione di Milano
(and members of the IPv6 group)
Summary
- Structure of the testbed.
- Enabled services (and why).
- File transfer tests.
- One step further: FTS.
- Trying to help: Uberftp.
- Conclusions
Structure of the testbed
- Participants, with 1 testbed node each:
CERN, DESY, FZU, GARR, INFN, KIT, USLHCNet.
- All installations have a uniform architecture (x86_64) and a
uniform OS (Scientific Linux 5) for full support of `WLCG' applications
and middleware.
- At least 1 Gb/s network interface (and corresponding bandwidth to the
border gateway) each.
- All running at least one GRIDFTP server, giving access only to the
ad-hoc ipv6.hepix.org VO.
- The GRIDFTP installation is done in a standalone fashion, from
packages available on EPEL, and with a trivial 'plug'
authorisation/user-mapping module that maps only one VO to
only one user.
- The aim is to prevent the interplay of IPv6 issues that
may appear at once in different components.
- Integrated tests make sense on existing instances of production
'node' types (e.g. CE nodes, SE nodes, etc.)
- All gory and boring details on the IPv6 testbed wiki page.
What does 'IPv6 tested' mean?
Deploying individual services in a testbed aims to answer
these questions:
- Does the service break or slow down when used with IPv4 on a dual-stack
host with IPv6 enabled?
- Will the service try to use (connect/bind to) an IPv6 address (AAAA
record) when one is available from DNS?
- Will the service prefer IPv6 addresses from DNS when they are
preferred at the host level? Does this need to be configured? How?
- Can the service be persuaded to fall back to IPv4 if needed?
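Most of these questions reduce to what the resolver hands back, and in which order. A minimal Python sketch (the host name and the GridFTP port 2811 are just placeholders) of how a dual-stack client sees A vs. AAAA records:

```python
import socket

def resolve(host, port=2811):
    """Return the (family, address) pairs a dual-stack client would try,
    in the order given by the host-level address selection policy
    (RFC 3484/6724) - the order a well-behaved service should honour."""
    results = []
    for family, _, _, _, sockaddr in socket.getaddrinfo(
            host, port, socket.AF_UNSPEC, socket.SOCK_STREAM):
        results.append(("IPv6" if family == socket.AF_INET6 else "IPv4",
                        sockaddr[0]))
    return results

# On a typical host, 'localhost' resolves to 127.0.0.1 and/or ::1.
for proto, addr in resolve("localhost"):
    print(proto, addr)
```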
Why GRIDFTP?
- Not just a transport-level issue:
FTP (yes, I mean Postel et al.'s RFC 959/765 FTP)
is possibly the earliest service that embedded the underlying IPv4
address structure in its PASV and PORT commands.
- These were "fixed" in RFC 2428 (Sep. 1998):
PORT 132,235,1,2,24,131 becomes EPRT |1|132.235.1.2|6275|,
allowing EPRT |2|2001:123::200C:417A|6275|.
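The difference between the two encodings can be shown with a short sketch that renders the same endpoint in the legacy PORT syntax and in the protocol-agnostic EPRT syntax of RFC 2428:

```python
import ipaddress

def port_cmd(ip, port):
    # RFC 959 PORT: IPv4 only; four address octets plus the port split
    # into high and low bytes, all as comma-separated decimals.
    return "PORT " + ",".join(ip.split(".") + [str(port // 256), str(port % 256)])

def eprt_cmd(ip, port):
    # RFC 2428 EPRT: |1| marks IPv4, |2| marks IPv6; the address keeps
    # its native textual form and the port is plain decimal.
    proto = 2 if ipaddress.ip_address(ip).version == 6 else 1
    return "EPRT |%d|%s|%d|" % (proto, ip, port)

print(port_cmd("132.235.1.2", 6275))          # PORT 132,235,1,2,24,131
print(eprt_cmd("132.235.1.2", 6275))          # EPRT |1|132.235.1.2|6275|
print(eprt_cmd("2001:123::200C:417A", 6275))  # EPRT |2|2001:123::200C:417A|6275|
```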
- Additionally, the GridFTP protocol includes commands for striped
transfers (SPAS, SPOR) that handle both address/port syntaxes in a
possibly undocumented fashion.
- This is the likely background of the decision to leave
IPv6 support in GRIDFTP disabled by default: an excellent
cue to try flipping that toggle and test.
- Deliberately last, but definitely not least, GRIDFTP is still the
workhorse for WLCG data transfers.
The "CMS" file transfer tests
- Reliability test (not a stress/performance test); preliminary results.
- A single 2000 MB file is transferred from an IPv6 VM at CERN to 4 systems.
- globus-url-copy and uberftp are used to confirm the file arrived, then delete it.
- Tests have been running continuously since February 22nd, 2012 (the
file size was 10x smaller earlier on).
Statistics since April 20th:
Site         | Total transfers | Failed transfers | Average transfer time | Transfer time range
DESY         | 390             | 13 (3.3 %)       | 66 s (~30 MB/s)       | 41 - 425 s
gridka (KIT) | 780             | 29 (3.7 %)       | 130 s (~15 MB/s)      | 110 - 439 s
INFN         | 1299            | 43 (3.3 %)       | 66 s (~30 MB/s)       | 34 - 549 s
uslhcnet     | 1299            | 28 (2.2 %)       | 81 s (~25 MB/s)       | 38 - 549 s
- The steady ~3% failure rate started occurring when a firewall at CERN
was moved from software to hardware (!). Only 2 transfers in total had
failed in earlier tests. Spooky: to be investigated.
- Can still conclude: no show-stoppers. CMS PhEDEx should work.
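As a sanity check, the quoted per-site rates follow directly from the 2000 MB file size and the measured average transfer times:

```python
# Throughput implied by a fixed 2000 MB payload and the average
# transfer times (in seconds) from the table above.
FILE_MB = 2000
avg_time_s = {"DESY": 66, "gridka (KIT)": 130, "INFN": 66, "uslhcnet": 81}

for site, t in avg_time_s.items():
    print("%-12s %5.1f MB/s" % (site, FILE_MB / t))
```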
One step further: FTS
- Dave K.'s idea: as we have a network of GRIDFTP servers, why don't we
overlay the File Transfer Service (FTS)?
- FTS is not the simplest beast in the
big-G zoo: we summarize what we found when we tried deploying
it in the testbed.
- As we try our best to test services in isolation on the testbed, we
needed to figure out a recipe for standalone installation and
configuration of FTS as well. This wasn't easy, and we had to base the
test on the gLite 3.2 repository, as many configuration scripts still
seemed not to match the EMI-1 installation at the time (early March
2012) when we set the test up.
- CGSI-gSOAP does not resolve IPv6 names up to version
`CGSI_gSOAP_2.7-1.3.3-1', still found on some production User
Interfaces. gSOAP has supported IPv6 on TCP since version 2.5 (2005)
and on UDP since version 2.7.2 (still 2005), but for a while it was
apparently compiled without the WITH_IPv6 flag.
FTS implies Oracle
- FTS only supports Oracle as a DB back-end, so, for the possible
benefit of other users of Oracle, we were immediately sucked into
having to evaluate the level of IPv6 support in Oracle.
- Giacomo Tenaglia (CERN) luckily came to our rescue (see next slide),
and the testbed FTS service does now communicate with its Oracle
back-end at CERN via IPv6.
- Oracle claims IPv6 compatibility since version
11g release 2. The FTS agents linking to both v10 and v11 Oracle
instantclient libraries appeared only with EMI-1
Update 13 on February 17, 2012, so
we were out of luck and had to rebuild them by hand.
- Any takers for extensive dual-stack tests of the Oracle
client/server communication via JDBC, sqlplus, libocci?
Oracle/IPv6 rollout issues
(material in this slide kindly provided by Giacomo Tenaglia, CERN)
- Oracle works over IPv6: basically, it is enough to set up a listener
on the IPv6 address; the rest of the database configuration is
unchanged.
- Caveat: it is not possible to have an IPv6 listener on the same port
used by an IPv4 listener. This seems to be because the IPv4 listener
binds to all addresses rather than just to the configured address. It
could probably be sorted out by playing with the listener configuration
file, but so far the recommendation from Oracle is to have two separate
ports.
- It is therefore mandatory to have an IPv6-only domain name to be used
in the listener configuration.
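A sketch of what such a configuration might look like; the host names, ports, and the exact listener.ora layout here are hypothetical and should be checked against the Oracle Net Services documentation:

```
# listener.ora sketch: two listeners on separate ports, as recommended.
# db-v4.example.org resolves to an A record only; db-v6.example.org is
# the IPv6-only name, resolving to an AAAA record only.
LISTENER =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = db-v4.example.org)(PORT = 1521))
    )
  )

LISTENER_V6 =
  (DESCRIPTION_LIST =
    (DESCRIPTION =
      (ADDRESS = (PROTOCOL = TCP)(HOST = db-v6.example.org)(PORT = 1522))
    )
  )
```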
Then: will the traffic actually use IPv6?
- So, we got to the point where we had functional FTS transfer agents
(and, in general, Tomcat/Axis servlets) that could be invoked on
dual-stack hosts and from dual-stack clients, but the `urlcopy' agent
would still use IPv4 for file transfers...
- This is because, exactly as in the globus-url-copy command and in the
GRIDFTP client library at large, IPv6 resolution in the Globus FTP
client needs to be explicitly enabled via the API (there is no runtime
option so far).
- This is still not done anywhere in the SVN head of the
transfer-url-copy FTS component. We patched it for the testbed, but the
issue would probably be better addressed upstream, at the GridFTP
library level.
- We opened GGUS ticket #80628 about this. A runtime option for enabling IPv6 was
promised by Globus "after the upcoming 5.2.1 release".
UberFTP
- UberFTP is a GridFTP client tool that is often used interactively
(easier command-line usage), e.g. in the CMS file transfer tests, or
integrated into middleware.
- No IPv6 support, either at the transport or at the 'extended'
FTP command level, in the last release (2.6, August 2010).
- The author mentioned that an IPv6 patch could be considered for
integration. Trying to be good citizens, we put an IPv6 patch together
and sent it to the author on April 6.
- We got a reply two days ago: he will try to integrate it in a future
release, and possibly open the Git repository to external contributors
afterwards.
- Interesting case: UberFTP always does a reverse DNS resolution of the
target host and uses the name it finds: there is no way to force it to
use IPv4 via DNS resolution for dual-stack hosts whose canonical name
resolves to both IPv4 and IPv6!
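The pattern can be reproduced in a few lines of Python (using 'localhost' as a stand-in for a dual-stack GridFTP server): once a client reverse-resolves the target and connects by the canonical name, the address family is decided again by the resolver, and the caller's original choice is lost:

```python
import socket

def canonical_then_resolve(target):
    """Mimic the behaviour described above: reverse-resolve the target
    and re-resolve the resulting *name*, so the final address family is
    picked by the resolver policy, not by the caller."""
    # The caller deliberately picks an IPv4 address...
    addr = socket.getaddrinfo(target, None, socket.AF_INET)[0][4][0]
    # ...the client then recovers the canonical name for that address...
    name = socket.getnameinfo((addr, 0), 0)[0]
    # ...and resolves that name again with AF_UNSPEC: on a dual-stack
    # host the AAAA record may now win, overriding the IPv4 choice.
    families = {res[0] for res in socket.getaddrinfo(name, None)}
    return name, families

print(canonical_then_resolve("localhost"))
```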
Conclusions
- From the FTS exercise proper:
- FTS services on IPv4 don't break on dual-stack hosts.
- Only what is used in production can be tested effectively (but we may be missing recent developments, and need to validate any conclusion on the development versions of affected components).
- Commercial dependencies do dictate their timelines and deployment options and may require large scale upgrades (and related testing).
- Functional IPv6 support in a software component does not imply that IPv6 transport is enabled by default. This is hard to capture in either a survey or by automated code-checking tools.
- The installation and configuration complexity of standalone components outweighs the extra trouble of pointing to individual components in an integrated service host.
- In general: there are a number of fine points and devilish details
behind any "IPv6 ready" statement, and a simple testbed exercise was
enough to encounter many of them.