Cellular systems have used a tunnel-based architecture ever since GPRS to provide IP connectivity to cellular terminals such as phones, modems, M2M/IoT devices and the like. The MS/UE establishes a PDP context between itself and the GGSN on the other end of the cellular network. The GGSN is then the first IP-level router, and the entire cellular network is abstracted away from the User-IP point of view.
This architecture didn't change with EGPRS, UMTS or HSxPA, and it even survived conceptually in LTE/4G.
While the concept of a PDP context / tunnel exists to de-couple the transport layer from the structure and type of the tunneled data, the primary user plane so far has been IPv4.
In Osmocom, we made sure that there are no impairments / assumptions about the contents of the tunnel, so OsmoPCU and OsmoSGSN do not care at all what bits and bytes are transmitted in the tunnel.
The only Osmocom component dealing with the type of tunnel and its payload structure is OpenGGSN. The GGSN must allocate the address/prefix assigned to each individual MS/UE, perform routing between the external IP network and the cellular network and hence is at the heart of this. Sadly, OpenGGSN was an abandoned project for many years until Osmocom adopted it, and it only implemented IPv4.
This is actually a big surprise to me. Many of the users of the Osmocom stack are from the IT security area. They use the Osmocom stack to test mobile phones for vulnerabilities, analyze mobile malware and the like. As any penetration tester should be interested in analyzing all of the attack surface exposed by a given device-under-test, I would have assumed that testing just on IPv4 would be insufficient, and that at some point over the past 9 years somebody would have come around and implemented the missing bits for IPv6 so they can test on IPv6, too.
In reality, it seems nobody has shared this line of thinking and invested a bit of time in growing the tools used. Or if they did, they didn't share the related code.
In June 2017, Gerrie Roos submitted a patch for OpenGGSN IPv6 support that raised hopes about soon being able to close that gap. However, on closer inspection it turned out that the code was written against a version of OpenGGSN more than 7 years old, and that it seems to primarily focus on IPv6 on the outer (transport) layer, rather than on the inner (user) layer.
OpenGGSN IPv6 PDP Context Support
So in July 2017, I started to work on IPv6 PDP support in OpenGGSN.
Initially I thought "How hard can it be?" It's not like IPv6 is new to me (I joined 6bone under 3ffe prefixes back in the 1990s and worked on IPv6 support in ip6tables ages ago). And aside from allocating/matching longer addresses, what kind of complexity does one expect?
After my initial attempt at an implementation, partially misled by the patch that was contributed against that 2010-or-older version of OpenGGSN, I'm surprised how wrong I was.
In IPv4 PDP contexts, the process of establishing a PDP context is simple:
Request establishment of a PDP context, set the type to IETF IPv4
Receive an allocated IPv4 End User Address
Optionally use IPCP (part of PPP) to request and receive DNS Server IP addresses
So I implemented the identical approach for IPv6. Maintain a pool of IPv6 addresses, allocate one, and use IPCP for DNS. And nothing worked. As it turns out:
IPv6 PDP contexts assign a /64 prefix, not a single address or a smaller prefix
The End User Address that's part of the Signalling plane of Layer 3 Session Management and GTP is not the actual address, but just serves to generate the interface identifier portion of a link-local IPv6 address
IPv6 stateless autoconfiguration is used with this link-local IPv6 address inside the User Plane, after the control plane signaling to establish the PDP context has completed. This means the GGSN needs to parse ICMPv6 router solicitations and generate ICMPv6 router advertisements.
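To make the first two points a bit more concrete, here is a minimal Python sketch (not OpenGGSN code, which is written in C). It assumes the interface identifier is simply the low 64 bits of the signalled End User Address. The function names and the documentation prefix 2001:db8::/56 are made up for illustration.

```python
import ipaddress

LINK_LOCAL_PREFIX = bytes.fromhex("fe80000000000000")  # fe80::/64


def allocate_prefixes(pool, count):
    """Return the first `count` /64 prefixes carved out of a larger pool."""
    subnets = pool.subnets(new_prefix=64)
    return [next(subnets) for _ in range(count)]


def iid_from_eua(eua):
    """Assumption: the interface identifier is the low 64 bits of the EUA."""
    return eua.packed[8:]


def link_local_from_iid(iid):
    """Combine fe80::/64 with a 64-bit interface identifier."""
    if len(iid) != 8:
        raise ValueError("interface identifier must be 8 bytes")
    return ipaddress.IPv6Address(LINK_LOCAL_PREFIX + iid)


if __name__ == "__main__":
    pool = ipaddress.IPv6Network("2001:db8::/56")
    prefix = allocate_prefixes(pool, 2)[1]
    eua = ipaddress.IPv6Address("2001:db8:0:1::1234")
    print(prefix, link_local_from_iid(iid_from_eua(eua)))
```

The point of the sketch is only that the pool hands out whole /64 networks rather than single addresses, and that the address seen on the signalling plane feeds the link-local address, not the global one.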
To make things worse, the stateless autoconfiguration is modified in some subtle ways to make it different from the normal SLAAC used on Ethernet and other media:
the timers / lifetimes are different
only one prefix is permitted
only a prefix length of 64 is permitted
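As a rough illustration of what generating such a router advertisement involves, the following Python sketch builds an ICMPv6 RA carrying exactly one /64 Prefix Information option, using the message layout from RFC 4861. The hop limit, flags and lifetime values are placeholders I picked for the example, not the values the 3GPP specs require, and the function names are hypothetical.

```python
import ipaddress
import struct

ICMPV6_ROUTER_ADVERT = 134
ND_OPT_PREFIX_INFO = 3
IPPROTO_ICMPV6 = 58


def checksum(data):
    """16-bit one's complement Internet checksum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_router_advert(src, dst, prefix, router_lft=9000,
                        valid_lft=0xFFFFFFFF, preferred_lft=0xFFFFFFFF):
    """Build an RA with a single /64 prefix (lifetimes are placeholders)."""
    if prefix.prefixlen != 64:
        raise ValueError("only a prefix length of 64 is permitted")
    # Prefix Information option: type, length (in 8-byte units), prefix
    # length, flags (0x40 = A flag only, illustrative), lifetimes, reserved.
    pio = struct.pack("!BBBBIII", ND_OPT_PREFIX_INFO, 4, 64, 0x40,
                      valid_lft, preferred_lft, 0)
    pio += prefix.network_address.packed
    # RA header: type, code, checksum (0 for now), cur hop limit, M/O flags,
    # router lifetime, reachable time, retrans timer.
    hdr = struct.pack("!BBHBBHII", ICMPV6_ROUTER_ADVERT, 0, 0, 64, 0,
                      router_lft, 0, 0)
    msg = hdr + pio
    # IPv6 pseudo-header: src, dst, upper-layer length, 3 zero bytes, next hdr.
    pseudo = src.packed + dst.packed + struct.pack("!I3xB", len(msg),
                                                   IPPROTO_ICMPV6)
    csum = checksum(pseudo + msg)
    return msg[:2] + struct.pack("!H", csum) + msg[4:]
```

A GGSN would send something like this from its own link-local address to the link-local address of the MS/UE in response to a router solicitation. The check for a /64 mirrors the restriction listed above.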
A few days later I implemented all of that, but it still didn't work. The problem was with DNS server addresses. In IPv4, the 3GPP protocols simply tunnel IPCP frames for this. This makes a lot of sense, as IPCP is designed for point-to-point interfaces, and this is exactly what a PDP context is.
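For reference, this is roughly what the DNS part of such an IPCP exchange looks like on the wire. The Python sketch below builds an IPCP Configure-Nak suggesting primary and secondary DNS servers, using the option types from RFC 1877. Whether a given GGSN answers with a Nak or an Ack is an implementation detail I am not claiming anything about here, and the function names are hypothetical.

```python
import ipaddress
import struct

IPCP_CONFIGURE_NAK = 3
IPCP_OPT_PRIMARY_DNS = 129    # RFC 1877
IPCP_OPT_SECONDARY_DNS = 131  # RFC 1877


def ipcp_dns_option(opt_type, addr):
    """Encode one DNS option: type, length (always 6), IPv4 address."""
    return struct.pack("!BB", opt_type, 6) + addr.packed


def build_ipcp_dns_nak(identifier, primary, secondary):
    """Build an IPCP Configure-Nak carrying the two DNS server options."""
    opts = (ipcp_dns_option(IPCP_OPT_PRIMARY_DNS, primary) +
            ipcp_dns_option(IPCP_OPT_SECONDARY_DNS, secondary))
    # Header: code, identifier, total length including the 4-byte header.
    return struct.pack("!BBH", IPCP_CONFIGURE_NAK, identifier,
                       4 + len(opts)) + opts


if __name__ == "__main__":
    pkt = build_ipcp_dns_nak(1, ipaddress.IPv4Address("192.0.2.53"),
                             ipaddress.IPv4Address("192.0.2.54"))
    print(pkt.hex())
```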
In IPv6, the corresponding IP6CP protocol does not have the capability to provision DNS server addresses to a PPP client. WTF? The IETF seriously requires implementations to do DHCPv6 over PPP, after establishing a point-to-point connection, only to get DNS server information?!? Some people suggested an IETF draft to change this, but the draft expired in 2011 and we're still stuck.
While 3GPP permits the use of DHCPv6 in some scenarios, support in phones/modems for it is not mandatory. Rather, the 3GPP has come up with their own mechanism on how to communicate DNS server IPv6 addresses during PDP context activation: The use of containers as part of the PCO Information Element used in L3-SM and GTP (see Section 10.5.6.3 of 3GPP TS 24.008). They by the way also specified the same mechanism for IPv4, so there are now two competing methods on how to provision IPv4 DNS server information: IPCP and the new method.
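A minimal Python sketch of the container approach follows. It encodes only the value part of the PCO (without the IE identifier and length), and the container IDs 0x0003 (DNS Server IPv6 Address) and 0x000D (DNS Server IPv4 Address) are quoted from my reading of TS 24.008, so please verify them against the spec before relying on them. The function names are made up.

```python
import ipaddress
import struct

PCO_CONFIG_PROTO_PPP = 0x80      # extension bit set, configuration protocol 0
PCO_CONTAINER_DNS_IPV6 = 0x0003  # assumed from TS 24.008, verify
PCO_CONTAINER_DNS_IPV4 = 0x000D  # assumed from TS 24.008, verify


def pco_container(container_id, payload):
    """Encode one container: 2-byte ID, 1-byte length, contents."""
    return struct.pack("!HB", container_id, len(payload)) + payload


def build_pco_dns(servers):
    """Encode DNS server addresses as PCO containers (value part only)."""
    out = bytes([PCO_CONFIG_PROTO_PPP])
    for server in servers:
        if server.version == 6:
            out += pco_container(PCO_CONTAINER_DNS_IPV6, server.packed)
        else:
            out += pco_container(PCO_CONTAINER_DNS_IPV4, server.packed)
    return out


if __name__ == "__main__":
    pco = build_pco_dns([ipaddress.ip_address("2001:db8::53"),
                         ipaddress.ip_address("192.0.2.53")])
    print(pco.hex())
```

Compared to parsing and generating IPCP frames, this is pleasantly simple: one ID/length/value triple per DNS server, carried in signalling messages that are exchanged during PDP context activation anyway.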
In any case, after some more hacking, OpenGGSN can now also provide DNS server information to the MS/UE. And once that was implemented, I had actual live user IPv6 data over a full Osmocom cellular stack!
We now have working IPv6 User IP in OpenGGSN. Together with the rest of the Osmocom stack you can operate a private GPRS, EGPRS, UMTS or HSPA network that provides end-to-end transparent, routed IPv6 connectivity to mobile devices.
All in all, it took much longer than needed, and the following questions remain in my mind:
why did the IETF not specify IP6CP capabilities to configure DNS servers?
why the complex two-stage address configuration with PDP EUA allocation for the link-local address first and then stateless autoconfiguration?
why don't we simply allocate the entire prefix via the End User Address information element on the signaling plane? For sure next to the 16-byte address we could have put one byte for prefix-length?
why do I see duplicate address detection flavour neighbour solicitations from Qualcomm based phones on what is a point-to-point link with exactly two devices: The UE and the GGSN?
why do I see source link-layer address options inside the ICMPv6 neighbor and router solicitations from mobile phones, when that option is specifically not to be used on point-to-point links?
why is the smallest prefix that can be allocated a /64? That's such a waste for a point-to-point link with a single device on the other end, and in times of billions of connected IoT devices it will just encourage the use of non-public IPv6 space (i.e. SNAT/MASQUERADING) while wasting large parts of the address space
Some of those choices would have made sense if one had made it fully compatible with normal IPv6 as used e.g. on Ethernet. But implementing ICMPv6 router and neighbor solicitation without getting any benefit, such as the ability to have multiple prefixes or prefixes of different lengths? I just don't understand why anyone ever thought that was a good idea.
You can find the code at http://git.osmocom.org/openggsn/log/?h=laforge/ipv6 and the related ticket at https://osmocom.org/issues/2418