First Solaris-based contract in four years
For more than four years, I did 100% Linux-based work. But apparently there
are still people interested in Solaris stuff, since I just got my first
Solaris-based contract in quite some time.
Spent an incredible amount of time getting Solaris 9 installed on my Ultra 5,
which was only running Linux before. I never understood how Sun could justify Solaris being so much slower than Linux on their own hardware ;)
[ |
permanent link ]
Proceedings of Developer Workshop 2004 online
I finally managed to finish the write-up and markup of the proceedings. They
are available in a number of formats at the documentation section of the netfilter home page.
In theory, there could still be lots of semantic markup added, but well, who cares...
[ /linux/netfilter |
permanent link ]
pkttables finally making some progress
I've found some time to work on pkttables again. Isn't that great news? If my
brain is not completely broken, I've now worked out an RCU-powered way to have
full table traversal with a completely lock-less reader path, while providing
atomicity at either the table or the chain level.
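
The general read-copy-update idea can be sketched in a few lines. This is only an illustration of the pattern, not the pkttables design: the `RuleTable` class, its method names and the default "ACCEPT" verdict are all made up for the example, and Python's garbage collector stands in for the grace-period handling that real kernel RCU needs.

```python
import threading


class RuleTable:
    """Toy rule table: lock-free readers, copy-and-swap writers."""

    def __init__(self, rules=()):
        # Readers only ever see a complete, immutable snapshot.
        self._rules = tuple(rules)
        # The lock serialises writers against each other; readers never take it.
        self._write_lock = threading.Lock()

    def traverse(self, packet):
        # Grab the current snapshot once, then walk it without any locking.
        rules = self._rules
        for match, verdict in rules:
            if match(packet):
                return verdict
        return "ACCEPT"  # hypothetical default policy

    def replace(self, update):
        # Copy the table, modify the private copy, then publish it with a
        # single reference assignment, so readers see the old table or the
        # new one, never a half-updated mix.
        with self._write_lock:
            new_rules = tuple(update(list(self._rules)))
            self._rules = new_rules


table = RuleTable([(lambda pkt: pkt == "bad", "DROP")])
table.replace(lambda rules: rules + [(lambda pkt: pkt == "odd", "LOG")])
print(table.traverse("bad"), table.traverse("odd"), table.traverse("fine"))
```

Swapping the whole tuple gives table-level atomicity; chain-level atomicity would need one such snapshot per chain.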
Also, I ripped the "struct nf_attr" and NFA_xx macros from the nfnetlink core,
since they get replaced by my vTLV (Versioned TLV) code.
With some luck I'll be able to continue my pkttables work next week.
[ /linux/netfilter |
permanent link ]
CLUSTERIP is in patch-o-matic-ng
About one year ago I did some work for SuSE
implementing load-balancer-less load-balancing clusters ;) This is achieved
by replying to ARP requests with a link-layer multicast address, so all nodes receive all packets. A hash over parts of the IP header then determines whether the packet is to be passed up the stack on a given node.
The result is called the iptables CLUSTERIP target, and I've now finally put it
in patch-o-matic-ng, since it was only available in my undocumented public CVS
tree so far.
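
A minimal sketch of the per-node decision, assuming the hash covers only the source address and using CRC32 as a stand-in hash function. The function names and the 1-based node numbering are invented for the illustration and are not taken from the CLUSTERIP code.

```python
import ipaddress
import zlib


def responsible_node(src_ip: str, total_nodes: int) -> int:
    """Map a source address to a node number in 1..total_nodes."""
    packed = ipaddress.ip_address(src_ip).packed
    # CRC32 is only a placeholder for the hash the real target uses.
    return zlib.crc32(packed) % total_nodes + 1


def accept_packet(src_ip: str, local_node: int, total_nodes: int) -> bool:
    """Every node sees every packet; only the responsible node keeps it."""
    return responsible_node(src_ip, total_nodes) == local_node


# Each node runs the same check with its own node number.
for node in (1, 2, 3):
    print(node, accept_packet("192.0.2.17", node, 3))
```

Because all nodes compute the same hash over the same header fields, exactly one of them accepts any given packet, without any coordination between the nodes.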
[ /linux/netfilter |
permanent link ]
Siemens is violating the Settlement
Siemens is offering the SE-505 firmware on their homepage without any reference
to the source code, the GPL, or the GPL text. This is in violation of the signed settlement agreement that I have concluded with them.
The lawyer is already informed, and we'll see what kind of legal options we now have in pushing Siemens [again *sigh*] for GPL compliance.
[ /linux/gpl-violations |
permanent link ]
Reworking the Linux neighbour cache
Since I've lately had some customer issues with regard to neighbour cache
overflows, I studied the current code quite a bit. From my point of view, it has a couple of shortcomings.
The general problem goes like this: What do we do if we're attached to, let's
say, a /16 (formerly 'Class B') network that has a theoretical limit of 65535
neighbours at layer 2, and somebody sends us a single packet for every one of
those neighbours? We now start to send ARP requests for all those neighbours,
and until those time out (1sec default), the incomplete entries flood our neighbour table.
The current Linux strategy is to configure a static limit (default: 1024), and as soon as we reach the limit, we start deleting old entries. 'Old' entries are those for real hosts to which we've recently had connectivity... We do not expire any of the incomplete neighbour entries in order to avoid ARP-floods.
So if you want to avoid that, you always have to set the gc_thresh3 value to at
least the theoretical number of total machines that could be directly reachable
at layer 2. While this is not a problem with /16, it suddenly becomes one with
/8, or with the extremely large IPv6 prefixes.
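
To put numbers on that, here is a small calculation of how many addresses a prefix covers. The /64 is just an example of a large IPv6 prefix, and the counts include the network and broadcast addresses, so they are slightly higher than the number of possible neighbours.

```python
def addresses_in_prefix(prefix_len: int, address_bits: int = 32) -> int:
    """Number of addresses covered by a prefix of the given length."""
    return 2 ** (address_bits - prefix_len)


for label, prefix_len, bits in [
    ("IPv4 /16", 16, 32),
    ("IPv4 /8", 8, 32),
    ("IPv6 /64", 64, 128),  # example prefix size, not from the text above
]:
    print(f"{label}: {addresses_in_prefix(prefix_len, bits)} addresses")
```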
The problem is made worse by the fact that the number of hash buckets is very
low (static number of 32), and the hash algorithm in use apparently has a bad
distribution. So either we increase the hash table, increase the number of
buckets and improve the hash algorithm, or we change the expiration scheme to
also drop incomplete entries. But the current situation is definitely not good.
So I picked up some old 2.4.x patches from Tim Gardner, ported them to 2.6.x
and brushed them up. The number of hash buckets is now a kernel boot
parameter (if not specified, the hash is dynamically sized, like the TCP
syn-queue, fragment queue or ip_conntrack hash). The hashing algorithm now
uses a Jenkins hash, just like other parts of the kernel do. The
patch is in testing on my machines at the moment, but I think I'll push it
soon.
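
For illustration, this is how hashing addresses into a configurable number of buckets might look. It uses Bob Jenkins' one-at-a-time hash, a simpler relative of the hash the kernel ships, so it is not the code from the patch; `bucket_for` and `bucket_histogram` are names made up for this sketch.

```python
import ipaddress
from collections import Counter


def jenkins_one_at_a_time(data: bytes) -> int:
    """Bob Jenkins' one-at-a-time hash, kept within 32 bits."""
    h = 0
    for byte in data:
        h = (h + byte) & 0xFFFFFFFF
        h = (h + (h << 10)) & 0xFFFFFFFF
        h ^= h >> 6
    h = (h + (h << 3)) & 0xFFFFFFFF
    h ^= h >> 11
    h = (h + (h << 15)) & 0xFFFFFFFF
    return h


def bucket_for(ip: str, num_buckets: int) -> int:
    """Pick the hash bucket for a neighbour's address."""
    return jenkins_one_at_a_time(ipaddress.ip_address(ip).packed) % num_buckets


def bucket_histogram(ips, num_buckets: int) -> Counter:
    """Count how many addresses land in each bucket."""
    return Counter(bucket_for(ip, num_buckets) for ip in ips)


# Sequential addresses are the typical case on a real subnet.
neighbours = [f"10.1.0.{host}" for host in range(1, 255)]
print(sorted(bucket_histogram(neighbours, 32).items()))
```

Printing the histogram for different bucket counts gives a quick feel for how evenly a hash function spreads neighbouring addresses.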
[ /linux |
permanent link ]
libiptc2 bugfix (upcoming iptables-1.3.0 prerelease)
Since the segfault bug in my recent re-implementation of libiptc has now been
fixed, I think we're about one week away from an iptables-1.3.0 prerelease for
public beta-testing.
[ /linux/netfilter |
permanent link ]
NAPIfied natsemi driver
I've now successfully NAPIfied the second NIC driver: natsemi.c... this was the
only remaining driver that I care about, since it is used in the PC Engines WRAP embedded systems that I use
as routers/bridges/wlan-gateways.
The result is that I can now get about 34kpps routed on an embedded 266MHz
Geode CPU under a full 148kpps 64-byte single-flow UDP flood on the input NIC.
[ /linux |
permanent link ]
Adding NAPI support to the sungem.c Ethernet driver
Yesterday I implemented NAPI support for the sungem.c driver. This was done
because I was annoyed by the fact that my notebook (Apple Powerbook with on-board
Gigabit Ethernet) could still be killed by a machine running pktgen and
flooding it with some 700 kpps.
After submitting the patch, David Miller pointed out that he had added NAPI
support for sungem.c to the bitkeeper tree about four days earlier :( So I spent a number of hours duplicating work that was already there... not that I didn't have other stuff to do.
Well, at least I learned a bit more about Linux NIC drivers..
I'm now facing the task of implementing NAPI for the natsemi.c driver, which is
used in the PC Engines boards that I've
been using recently as embedded Routers / Firewalls.
[ /linux |
permanent link ]
Working on the summary / proceedings of the 3rd netfilter developer workshop
Spent a couple of hours putting the notes of the 3rd netfilter developer workshop together in a single file, adding lots of Docbook-XML markup, ...
It's still far from being complete, but I have to finish this ASAP..
[ /linux/netfilter |
permanent link ]
Intel e1000 (82546) TX performance
After recent discussions with Robert Olsson at the netfilter workshop, I've
decided to investigate a bit further why the Intel e1000 gigabit MACs are
quite limited when it comes to TX performance and large numbers of pps.
My first assumption was that the in-kernel pktgen.c code might not keep the
transmitter busy at all times, resulting in only 760kpps (out of the
theoretical maximum of 1480kpps).
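
The theoretical maximum follows from Ethernet framing: on the wire, every 64-byte frame is accompanied by 8 bytes of preamble and a 12-byte inter-frame gap. A quick calculation (the function name is made up for this sketch):

```python
def max_packets_per_second(link_bps: int, frame_bytes: int = 64) -> float:
    """Upper bound on frames per second for a given Ethernet link speed."""
    # Per-frame wire overhead: 8 bytes preamble/SFD + 12 bytes inter-frame gap.
    overhead_bytes = 8 + 12
    return link_bps / ((frame_bytes + overhead_bytes) * 8)


print(round(max_packets_per_second(1_000_000_000)))  # 1488095 (Gigabit Ethernet)
print(round(max_packets_per_second(100_000_000)))    # 148810 (Fast Ethernet)
```

So 760kpps is only about half of what the wire can carry with minimum-sized frames.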
So I hacked the e1000 driver to hardcode a refill of the Tx queue with the same
skb over and over again. Using a 2048 Tx descriptor ring, I was able to keep the transmitter busy at all times (E1000_ICR_TXQE interrupts).
Unfortunately, I still didn't get more than the 760kpps in this setup (PCI-X,
66MHz, Dual-Opteron 1.4GHz, DDR-333 (PC-2700) RAM). So either we're seeing a limitation of the 82546 chip, or of the PCI-X bus / memory latency / whatever.
I'll try the same experiments on a different machine with PCI-X 100 / 133MHz in order to find out what exactly is causing this limit.
[ /linux |
permanent link ]
netfilter workshop / Linux Kongress 2004
I've not been able to write any articles for this log over the last few days,
since I've been busy with the third netfilter developer workshop and
Linux-Kongress 2004.
The netfilter workshop went really well, apparently the
[ /linux/netfilter |
permanent link ]
Started a new 2.6.x based mini router distribution
I'm in the process of deploying a couple of PC Engines WRAP.1C embedded x86 boards in my apartment. They make neat little playgrounds for Router/NAT/VPN/WLAN/... style appliances.
Unfortunately I didn't find any embedded Linux distribution project that was up
to my demands. Apparently they all use age-old kernels (2.4.17 or something
ancient like that). And they very rarely come with a decent automatic build
system that would allow you to rebuild it from scratch, adding your own
patches, ...
So what did I do? I started my own :(. Not that I'm proud of it, but it was
necessary. My home VLAN/firewall/PPPoE/NAT/VPN router is now running the
very first image of this new distribution I called 'gRouter'.
Its main features are kernel 2.6.8.1, uClibc-0.9.26, busybox-1.00rc3, pppd
with in-kernel PPPoE support, quagga, iptables-1.2.11, openvpn-1.6.0, and
dropbear for SSH. It all fits in about 8MB of CompactFlash.
The build process is semi-automatic: apart from a few glitches, the whole image
builds itself. I stole some of the build magic from the WISP-DIST project
(part of LEAF), although this is all quite simple scripting.
After some more cleanups and testing, I plan to release this distribution.
Please don't expect any support, or any configuration tools. It will be
available for Linux experts who can configure and set up their system from
scratch, and want to have the gadgets of the latest software releases.
On the todo list is cross-compilation support (well, since it is uClibc based, it already does cross-libc-compilation), madwifi support, and especially IPsec using the 2.6.x kernel implementation.
[ /linux |
permanent link ]
Getting the external VGA of my Apple Powerbook (TiBook IV) working
If you've attended one of my presentations during the last 12 months, you will
certainly have noticed the poor quality of the slides. Yes, the content and
the presentation are poor, too - but I'm mostly referring to the visual quality.
I've already spent at least a whole day in the past trying to get the
external VGA working with Debian/ppc, with little success so far. I really
don't care whether the external port mirrors the content of the display, or if
it runs in dual head mode.
Today, I spent some three more hours on trial-and-error with the radeon driver
of the dri-trunk XFree86. I tried CloneMode, Dual Head, with and without
FBMode, and just about any other parameter within XF86Config-4.
In the end it turned out that the man page was not up-to-date, and the
preferred way to get it running was the so-called MergedFB mode. This wasn't
as easy to configure as expected, and I still got lots of 'Signal 11'
segfault-style crashes.
The crashes seem to be totally unrelated to my graphics setup. In fact, it
crashes when eth0 is not configured yet, but works after the network device is
up. Now please somebody step up and explain...
[ /linux |
permanent link ]
Finishing preparations for upcoming netfilter developer workshop
I've spent a significant amount of time over the last couple of days with the
final preparations of the upcoming 3rd netfilter developer workshop. This is
the first one where I'm in charge of every tiny bit of the organization, and I
hope I got everything right.
The first attendees are scheduled to arrive tomorrow. They might even arrive
before me, since I'll be heading the 500km down south tomorrow.
[ /linux/netfilter |
permanent link ]