A day full of new hardware problems
As if it wasn't enough that our main build server had memory corruption
yesterday (in which case no RAID1 will help, because the buffer cache data is
already corrupt and gets written to both disks in that state).
Today, I had the pleasant experience of finding something like three more or
less independent severe hardware design bugs in the Neo1973. I know this is
really sad news for those of you who are eagerly waiting for their "Phase 0"
devices. But firstly, you have to understand how sad _we_ are about all this.
Even more so, specifically, how sad I am, personally... [now working 14 hours
straight on this issue].
I was not involved in the early hardware design of the Neo1973, and was hired
as a pure systems level software guy. Over the course of this project, I've
been involved more and more in hardware fixes / reviews / redesign. And it's
only now that I've had a more detailed look at the suspend/resume/wakeup
related bits. Given the previous series of hardware bugs I should probably
have been more cautious and thoroughly reviewed the whole design from the
beginning, but then again: it is complex and time-consuming, and I'm no hardware
engineer either, just joe random hacker.
The good news is that we are able to fix all this in the next version
(GTA01Bv4), and that there are likely-to-be-working hot-fixes for the
already-produced Phase-0 devices.
Originally USB DFU support for u-boot was supposed to be finished yesterday.
Now with those two days full of most serious hardware problems (server,
prototype), I wonder what's going to prevent me from working on DFU tomorrow.
[ /linux/openmoko |
permanent link ]
OpenMoko now runs 2.6.20
Despite the much-feared genirq and workqueue changes, it turned out to be way easier to
merge our patches into 2.6.20 than to review and back-port all the
relevant bug fixes from 2.6.20 down to our old 2.6.17.14-based system.
We probably wouldn't have been able to do this if Phase-0 hadn't been held back due to
Bluetooth hardware problems. So everything seems to have its positive side, too :)
Ben Dooks (S3C2410 Kernel Port Maintainer) has already picked some of our
patches and is merging them, which is good. He also fixed a s3c2410fb bug in
vanilla 2.6.20 which I discovered [and just worked around by porting s3c2410fb
from 2.6.17.14 into 2.6.20, lazy as I am].
Today, I spent much time on restructuring our u-boot patches (getting them
ready for submission) and actually submitting the first nine patches to the
u-boot-users mailing list.
I also spent some time on proprietary software, after a _long_ time. I'm
trying to get TI's GSM Modem Firmware updater ported from Windows to Linux.
Eventually I want to be able to re-flash the GSM firmware from the S3C2410
side, not involving any PC and especially not any proprietary operating system ;)
I would love to see the firmware updater going public, too - but given the
nature of the GSM business, that chance is close to zero. Which is ridiculous,
since it doesn't reveal anything important at all. The GSM Modem will verify
the cryptographic signature of the firmware image anyway, no matter what the
downloader does. But well, we have different problems to solve than to engage in
endless discussions anyway...
[ /linux/openmoko |
permanent link ]
The first peak of load on openmoko.org servers
For the last seven hours I've been trying to reorganize the openmoko.org server
setup for better efficiency / performance. It was really amazing. We
were not on slashdot, not on any of the major news sites, but we were already
having something like 40MBps aggregate outbound traffic peaks on our servers.
The two major performance bottlenecks were ViewCVS and Mediawiki. I quickly
installed memcached on one of our more idle boxes, and put two squid instances
on two separate machines in front of Mediawiki, which then seemed to do
mostly fine.
The ViewCVS apparently cannot be helped at all. What I found on the web is
that it's apparently just very inefficient code, and there's little one can do
without rewriting the code. I don't know whether that's still the current
situation, but when the next peak comes, I'll probably just disable ViewCVS to
save some CPU cycles.
In case you're interested: Our setup is currently running on four physical boxes,
running a total of ten OpenVZ instances.
One of the machines is dedicated to the GForge installation on projects.openmoko.org, another one
is a (at least intended to be) dedicated build host where we do our OE builds.
While this obviously has been quite enough for the last half year, we now have
different performance requirements. For Phase-0 this installation is probably
still quite sufficient, since this first-couple-of-days peak is bound to cease
at some point.
However, when Phase-1 starts (public availability of phones to developers by
means of direct order), we will definitely need a more sophisticated
infrastructure for our downloads.openmoko.org site, from
where we will make available the full source code that our OE builds need, plus
the full source and object code of every binary release we make. The idea is to
have a round-robin DNS setup of geographically distributed machines.
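To illustrate the round-robin idea, here is a minimal simulation in Python. It is not a DNS server, just a sketch of how a name server might rotate the order of the address records it hands out, so that successive clients try different mirrors first. The class name is hypothetical and the addresses come from the reserved documentation ranges; they are not real openmoko.org mirrors.

```python
from collections import deque


class RoundRobinZone:
    """Hypothetical stand-in for one DNS name backed by several mirrors."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def resolve(self):
        """Return all addresses, then rotate so the next query starts elsewhere."""
        answer = list(self._addresses)
        self._addresses.rotate(-1)  # move the first address to the end
        return answer


# Placeholder addresses (documentation ranges), one per imagined region.
mirrors = RoundRobinZone(["192.0.2.10", "198.51.100.20", "203.0.113.30"])

# Most clients simply use the first address in the answer.
first_choices = [mirrors.resolve()[0] for _ in range(6)]
print(first_choices)
```

Over six queries each of the three placeholder mirrors is picked first twice, which is the kind of coarse load spreading such a setup provides. Plain rotation knows nothing about geography or mirror health.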
So as of now, we're soliciting mirrors with large disk capacity. If you want
to help us by providing a mirror (expected capacity requirement for 2007:
something like 300GB) and bandwidth, please contact me at laforge@openmoko.org. We're
particularly interested in the US and Asia regions.
Apart from that, a couple of secondary DNS servers would also help improve
our availability. If you already have a bind installation somewhere and want
to become a secondary DNS for openmoko.{org,com,net}, please contact me, too.
Finally, the good news is: We're down to 2..4MBps for now. Until the news appears on
major news sites, I guess.
After having discovered that mailman doesn't use templates for its
listinfo_overview page (the one you get when calling /mailman/listinfo without
a list name), I quickly hard-coded our openmoko header into the python code.
If somebody feels motivated to add proper templating support to all mailman
pages (such as the 'options password prompt' and the aforementioned
listinfo_overview), you could help us out even with something like that.
Now I'll finally turn back to actual moko code. We have had frame buffer support in
u-boot for two days now (splash screens are important for marketing). I'll
now look into getting the CPU clock to 266MHz (we've had some issues with this)
and finally, after all, TS07.10 multiplex and gsmd infrastructure, for which I
consistently haven't found time until now.
[ /linux/openmoko |
permanent link ]
openmoko.org goes public
Today, we have added public access to
[ /linux/openmoko |
permanent link ]
OpenMoko / Neo1973 delayed, once again
As you can read in this announcement, there
will be another [slight] delay in the release schedule for OpenMoko and the Neo1973 phone.
I'm really sad and sorry about this, since the core team has been working
_very_ hard for the last couple of months to get this project somewhere.
However, a combination of Murphy's law, our high demands on quality at every
level, communication problems and a lack of FOSS-experienced developers
have made progress quite a bit slower than expected. I won't even tell you how
far we are behind the original internal schedule [It's an internal schedule, after all].
For somebody like me, who has primarily worked with and in the FOSS community,
even in his day-to-day professional career for the last decade, there have
been many cultural problems in this project.
I was originally hired to take care of the low-level aspects of the system,
i.e. boot loader+kernel porting, driver development, and last but not least the
GSM communication infrastructure, as well as general consulting with regard to
FOSS matters.
In the end (up to now) I have been doing tons more things. I've been doing
hardware related debugging, hot-fixing and consulting, providing lots of
support for our internal development team, doing all the system administration,
configuration and maintenance of our four physical and about 15 virtual
machines (wiki, lists, gforge, svn, build server, etc.). Today I even spent a
lot of time on web related issues [hey, I haven't done much web stuff since
HTML4 and CSS1 came out], since we have committed to go public with our web
sites at some point.
We've had to teach people how to use request tracker, bugzilla, subversion,
mailing lists, IRC. Those basic means of communication, natural for everyone
ever involved in a FOSS project, are all things that we had to bootstrap here.
Many of the things that are a complete given for me (and for all of us in the
core team: Sean, Werner, Mickey and myself) are not at all
known, valued and/or respected [yet] by the various people and entities we had to
relate to in this project.
To give you a short outline of some of the issues we have been fighting
in order to create our vision of a truly open device, by developers for developers:
For web designers, e.g., it's hard to cope with the demand that web pages
should fully scale, not have any fixed-pixel width graphics/style, not contain
flash, only make careful use of javascript, and should always pass XHTML / CSS
validation. We don't want 'hacks', but clean themes/styles/templates in the
native drop-in format that the respective applications (mediawiki, mailman,
bugzilla, gforge, etc.) support.
As for source code: for people who have always worked in closed-source environments,
and even with a concentration on embedded 'one time throw away' devices, it is
a cultural change that the source code has to be maintainable, that it has to follow
coding style rules, and that it has to be "pretty", since people will read it,
and it will make a bad impression if the code quality sucks.
It is not enough for our code to build once on some system. The applications need
to use autotools or similar mechanisms and be packaged for OE use, the package
description needs to be verified and debugged, and the resulting builds need to
be reproducible on everyone's system.
We do not use a vendor-supplied toolchain, but rather build our own. And yes,
that toolchain also needs to be built 100% from scratch, and that process
has to be solid and reproducible, just like the packaging.
For the hardware, we have to provide the interfaces that usually nobody wants
to give people access to (serial console, JTAG, ..). We have to make it easy
for people to update their firmware, rather than hard. We have to have
safeguards in hardware, since we can't prevent people from reconfiguring the
battery charger algorithm in software. The battery should still survive this,
no matter what happens.
Preferring hardware components with open documentation (versus "the
cheapest available that does the job") is a design criterion that almost nobody
in the industry will be used to.
Also, whenever we use hardware with specs under NDA (which we tried to avoid
wherever possible), we have to submit the source code to the vendor and
ask whether it is fine to release it.
All in all, this is a quite exciting thing. Making people think different, trying
to get the values of Free Software into their mindset, even if only for this project.
Most of the people (as in numbers of people) involved in the development so far
have not had any relation to FOSS before. They might have done Linux based
development, but only because somebody asked them to get something done, cheap
and quickly. But that's not what we're after.
But this process takes time, and a lot of strength. And it's hard to scale if you
only have three to four people who actually have a clue about all those things.
So we have to learn how to scale better. We have to involve
more people from the community, with the FOSS/hacker background, both as paid
developers and as excited volunteers. So far Sean, Werner, Mickey, and myself
have been powerful but 'lonesome' fighters in the corporate world with which we
had to interact.
I'd like to express my deepest thanks to FIC for funding this endeavor that
must appear like a strange little experiment to them. I'd like to thank them for all
the support we have received from the FIC hardware and software development
teams. I know we've always been very critical, and probably still seem
unsatisfied with many things. But it's nonetheless a unique chance to be able
to do this at all.
I've suffered a lot during the last seven months, and I've never worked as hard
in my entire life, spending at least 80 hours per week on this. Despite that,
if you look at the actual results, both in software and hardware (in the
upcoming days), you will probably see way more things that need to be done
than have been done. You will probably think: What, you haven't written
more code in that entire time frame? All I can do is ask you to read this mail
to get a grasp of what has been going on.
From now on, I hope we can lead this project more into the community. Make it
like any other FOSS project. Work will be way more fun. More creative people.
More cool hacks. More freedom.
Please consider this as the beginning, not the end. We're just bootstrapping
the world of as-open-as-possible mobile devices. There will be much cooler
devices, and there will be much cooler software. I'm honoured to be given the
chance to take a leading role in this.
[ /linux/openmoko |
permanent link ]
USB serial support in Neo1973 boot loader (u-boot)
For the last two days I have been adding support for USB serial (cdc_acm compatible) to
u-boot for the Neo1973 phone. This is definitely a nice way to give developers
access to the boot loader prompt, without having to have any special cables, the
debug board, hackers lunch box or the like.
Obviously, it's not the same as a real serial port. But given that you have a
working u-boot in flash, and given you don't want to do boot loader development
and rather concentrate on kernel/user space issues, then this definitely is a
nice option.
The basic cdc_acm patch came from the handhelds.org SX1
project, so I mainly had to add a s3c2410 USB device controller driver for
u-boot. Usually this is not that much of a challenge. The lack of
documentation by Samsung is compensated by the handhelds.org s3c2410_udc.c
kernel driver, of which by now I know every single line.
However, this one was really hard. I couldn't even get the control pipe to do
all the tasks needed to enumerate properly on the bus. And that after having
implemented the control pipe handling for a different device (OpenPCD) only
half a year ago, so I basically still knew in quite some detail what had to be done.
I read, and re-read, and re-read the code. Looked at and verified the assembler
output. At some point, I was convinced the logic of my code was correct. It
had to be some auxiliary issue. PLL configured wrongly, GPIO settings not right,
whatever. I still couldn't find it.
After two days (one of which had 16 hours straight) it turned out that the
problem wasn't actually the logic of the code, but pure timing. The
u-boot usbdcore.c EP0 implementation did a memcpy of 18 bytes in a
code-path that turned out to be extremely time critical. Once I simply stopped copying the
device descriptor (which was only done to patch in the correct EP0
packet size on the fly), everything immediately worked.
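The 18 bytes match the size of a standard USB device descriptor, in which the EP0 packet size (bMaxPacketSize0) is the single byte at offset 7. What follows is only a rough sketch of that layout, written in Python for brevity rather than the C that u-boot uses. The function names and the vendor/product IDs are made up for illustration and are not taken from usbdcore.c. It contrasts copying and patching the descriptor on every request with building it once, ahead of time, with the right value already in place.

```python
import struct

# Standard USB device descriptor: 14 little-endian fields, 18 bytes in total.
DEVICE_DESCRIPTOR = struct.Struct("<BBHBBBBHHHBBBB")
MAX_PACKET_SIZE0_OFFSET = 7  # byte offset of bMaxPacketSize0


def build_descriptor(max_packet_size0):
    """Pack a device descriptor; all IDs below are placeholders."""
    return DEVICE_DESCRIPTOR.pack(
        DEVICE_DESCRIPTOR.size,  # bLength
        0x01,                    # bDescriptorType: DEVICE
        0x0110,                  # bcdUSB: USB 1.1
        0x02,                    # bDeviceClass: communications (CDC)
        0x00,                    # bDeviceSubClass
        0x00,                    # bDeviceProtocol
        max_packet_size0,        # bMaxPacketSize0: EP0 packet size
        0x1234,                  # idVendor (placeholder)
        0x5678,                  # idProduct (placeholder)
        0x0100,                  # bcdDevice
        1, 2, 3,                 # iManufacturer, iProduct, iSerialNumber
        1,                       # bNumConfigurations
    )


def copy_and_patch(descriptor, ep0_size):
    """Per-request variant: copy all 18 bytes, then patch a single byte."""
    patched = bytearray(descriptor)
    patched[MAX_PACKET_SIZE0_OFFSET] = ep0_size
    return bytes(patched)


template = build_descriptor(8)                 # generic template
per_request = copy_and_patch(template, 16)     # work done inside the hot path
prepared = build_descriptor(16)                # work done once, up front

print(len(prepared), per_request == prepared)
```

Both variants produce the same bytes on the wire. The difference is only where the work happens, which is what matters in a time-critical path.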
What a relief. I can finally get back to work on GSM related stuff.
[ /linux/openmoko |
permanent link ]
Federal "Express" - One month to get a customer account
Since sending hardware to Werner
Almesberger in Argentina using DHL seems to be suboptimal, I decided to
give FedEx a try. So I went to their web-site, and tried to register for a
customer account / number.
What struck me first is that they require you to enter both a land-line AND
a mobile phone number. As if everyone had both these days. I know a lot of
people who have either only a land-line or only a mobile. And obviously there are people
like myself, who would never want FedEx to contact them via mobile at all.
Anyway. What I then got back was an automatic email (in German) indicating that
the respective employee is "Out of Office till 21st of February", and that
"e-mails to this address will not be processed during this time".
Whew, I thought. What kind of express. It takes only three weeks to get a
customer number. Maybe I should resort to UPS next. *sigh*
[ /misc |
permanent link ]