Another episode of the 4-weekly statistics and some additional information

  • Growth is steady. Over the last month we saw a 4% increase.
  • Crashes and ANRs (unresponsive screens) are again slashed. In the last 7 days (taken because most users are now on 1.37):
    • we have seen zero ANRs, cool!
    • we had 4 crashes, two were instances of the same problem.

The 30-day crash and ANR statistics now put us comfortably clear of the “bad behavior thresholds” Google has defined, and solidly into the second quartile of all applications. We strive for the third quartile, meaning fewer issues than the median of all applications in the Play Store. Though you might not actively notice “fewer crashes” as a user, it improves the perception a lot in the long run. It is also bloody hard work!

ANRs and crashes are measured as a percentage of affected daily sessions, where a daily session is defined as a day on which an individual user used the app at least once. For example: suppose on Monday 20 users use the app once or more, and on Tuesday 30, then there are 50 daily sessions in those two days. If one user has 2 crashes on Monday, it counts as one affected session. If no crashes occurred on Tuesday, the crash rate over those two days is 1 in 50, or 2%. Google defines “bad behavior” for crashes as more than 1.09% over the last 30 days. Given that we see roughly 100 reported daily sessions per day, only one crash per day is “allowed”, though the median of all apps is 0.32%.
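To make the counting rule concrete, here is a minimal Python sketch of the calculation described above. The function name and the data layout (pairs of day and user) are my own assumptions for illustration; this is not how Google computes or exposes the figures.

```python
def crash_rate(usage, crashes):
    """Percentage of daily sessions that saw at least one crash.

    usage:   iterable of (day, user) pairs, one per app use
    crashes: iterable of (day, user) pairs, one per crash
    """
    sessions = set(usage)                # a daily session = unique (day, user)
    affected = set(crashes) & sessions   # several crashes in one session count once
    if not sessions:
        return 0.0
    return 100.0 * len(affected) / len(sessions)


# The example from the text: 20 users on Monday, 30 on Tuesday,
# and one user crashing twice on Monday.
usage = [("Mon", f"user{i}") for i in range(20)]
usage += [("Tue", f"user{i}") for i in range(30)]
crashes = [("Mon", "user3"), ("Mon", "user3")]

rate = crash_rate(usage, crashes)
print(f"{rate:.2f}%")                    # 2.00%

BAD_BEHAVIOR_THRESHOLD = 1.09            # percent, the figure quoted in the text
print(rate > BAD_BEHAVIOR_THRESHOLD)     # True: this toy example would be flagged
```

Using sets does the deduplication: a user who opens the app five times on one day is still one session, and a user who crashes twice in that session is still one affected session.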

Now there are two important notes to address:

  • We only see total numbers. The reports are heavily anonymized and not generated at all if the total number of daily sessions for them is too low. This seems to be to avoid pinpointing users, which is a good thing. Though I admit I sometimes see errors where I think “how did the user manage to do that?”
  • Only users who allow sharing of usage and statistics provide this data, so we have no way of knowing how many daily sessions there actually are. While we understand the privacy impact, especially towards Google, we are very grateful to users who do have this option set to on, as it allows us to notice issues and act fast on them. To check or change: Settings > Google > Three dots menu > Usage & Diagnostics > On / Off

Four weeks after the first statistics post, a short update on the subjects touched.

  • We’ve rolled out 2 new versions since. Well technically 3, but one was in error and superseded within an hour;
  • Growth was the usual 4+% over the last month;
  • Massive (75+%) reduction in crashes and non-responsive states (ANRs).
  • I speculate April will see more of the same, as the version released just 3 days ago again tackles 9 of those issues. Our goal is to at least halve the crash and ANR rate again.
  • Three crashes after release are already fixed and committed for the next release, one of which is an issue that is very specific to Android 7.1 and one specific to Android 8.1.

We’re slowly heading into more esoteric territory where problems are less bug fixing and more avoiding platform issues on specific Android versions and/or on specific phones.

Please note that every single crash or ANR is anonymously reported through Google *). While crashes are annoying, do realize we investigate and almost always fix each and every one of them. The changelist is always available here.

*) if you didn’t disable that of course. Here’s Google’s wording on that verbatim:

You can view your app’s technical performance details collected from a subset of Android devices & OS versions, whose users have opted in to automatically share usage and diagnostics data. Learn more

CanZE has no “phone home” capabilities, but Google acquires quite a bit of data and I thought it would be fun to give you a bit of an insight at what we’re up against.

  • We are seeing a fairly consistent growth rate of 40% per year. The metric we use is “Installed on devices that have been online in the last 30 days”. As I write this in early March we’re seeing a total of roughly 4400 active installs.
  • Not surprisingly more than half of the installs are in Germany, France and the UK.
  • The top 6 devices are all Samsungs and a quick addition of all Samsung branded devices added up to over 1300.
  • Of the operators what was interesting to see is that about 20% have no operator listed. I interpreted that as devices having a second life without SIM card for basically CanZE only. I like that. Although those would be slow to update and probably miss out on the news bar.
  • Android versions: about 2700 are on Android 7 or higher, but believe it or not, 60 are on Android 4 and just over 300 on 5.

For health statistics we get quite detailed aggregated reports on crashes and hangs. The last couple of weeks you have seen quite a bunch of new releases, and that is because we really stepped up our efforts to root out as many as possible, as soon as we see them. To give some sort of idea, in the last 7 days, and filtering out devices that have not been updated for months, we’ve seen:

  • One non responsive screen
  • Eight different clusters of crashes, 6 of which were reported only once.
  • The other two were basically the same and accounted for 13 actual crashes. It is a silly bug in the Tyres screen.

As you can imagine it is that last problem we try to quickly focus on and it shouldn’t be a surprise that it is already fixed in the development branch and it will be fixed in the next release. And so are 3 of the single instance ones. For those interested, you can always check out what is in the pipeline here.

When we release, about 35% of the active devices are updated within a day, and 70% within a week. But also note that if we assume the three last releases to be “current”, about 18% of devices are not in that bracket after a week. A year after a release is superseded we still see about 2% of that release on active devices. And that is why we need to filter out some of the crashes.

Unfortunately we can’t see how much CanZE is actually used*) so it’s not easy to put those in perspective, but then again, fewer than 3 crashes per day on an installed base of 4000 is too much but not crazy.

*) No, the new news bar does not tell us that. It fetches the news bar from GitHub and we don’t have statistics about its usage.

While looking for some stretched screen, I came up with one of these new dash camera devices (1280×400 pixels). OK, the camera in itself doesn’t really interest me, because I only want to use it to make CanZE run on it, which is actually quite easy to do.

The only thing on that device that I do not like at all is that the USB connector seems to be used only for power, so I can’t use it as a development device out of the box but need to compile and transfer the app package in order to be able to install and run CanZE. 🙁

But at least the PlayStore is available out of the box and all underlying Android settings can be reached easily …

Micro introduction: A CANbus, which is used to connect all computers in the car, can only carry chunks of 8 bytes plus an ID, called frames. All data used to actually operate the car, such as switch positions, speed, and hundreds of other parameters, are stuffed into unique frames and sent out freely, almost always at fixed time intervals. This is why we call them free frames. There is zero standardization among car makers on the meaning of the ID and the bits inside. Commercial dongles are not really designed to pick these up and have a lot of trouble doing so reliably or at all. (hint: “Timeout on ATMA” anyone?)
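As an illustration of what “bits inside a frame” means in practice, here is a small Python sketch that pulls one field out of an 8-byte payload. Everything specific in it is hypothetical: the frame ID, the bit position, the field length and the scaling are made up for the example, and the bit numbering convention (bit 0 is the most significant bit of the first byte) is an assumption, not a statement about any particular car.

```python
def extract_bits(data: bytes, start_bit: int, length: int) -> int:
    """Return `length` bits starting at `start_bit`.

    Bit 0 is taken to be the most significant bit of the first byte
    (an assumed convention for this sketch).
    """
    if len(data) > 8:
        raise ValueError("a classic CAN frame carries at most 8 data bytes")
    total_bits = len(data) * 8
    if start_bit < 0 or length <= 0 or start_bit + length > total_bits:
        raise ValueError("field does not fit in the frame")
    value = int.from_bytes(data, "big")
    shift = total_bits - start_bit - length   # bits to the right of the field
    return (value >> shift) & ((1 << length) - 1)


# Hypothetical free frame: ID 0x123 with a 12-bit raw value in bits 8..19.
frame_id = 0x123
payload = bytes.fromhex("00 4B 50 00 00 00 00 00")

raw = extract_bits(payload, 8, 12)
print(raw)          # 1205
print(raw / 10)     # 120.5, using a made-up scale factor of 0.1
```

Without a table that maps each ID to its fields, their positions and their scaling, the payload is just 8 opaque bytes, which is exactly why the lack of standardization hurts.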

To actually diagnose the car, far larger “messages” are required. Even a VIN doesn’t fit in one frame, let alone e.g. the data for the voltage heatmap. For this purpose there is a protocol called ISO-TP which allows you to send longer messages. A software layer chops the message up into frames, adds some synchronization data and sends it on. Exactly the same happens for long answers. ISO-TP formatted messages are almost exclusively of the query-response type. One participant (say the dongle or the dealer’s diagnostic tool) requests something, and a computer on the bus answers. Even entire firmware up- and downloads are performed this way.
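The chopping up can be sketched in a few lines of Python. This follows the general ISO-TP (ISO 15765-2) framing with normal addressing: a single frame starts with 0x0N (N is the length), a first frame with 0x1L LL (a 12-bit length), and consecutive frames with 0x2N (a sequence number that wraps around). It is a simplified illustration, not the CanZE or dongle code: padding, flow control and timeouts are left out, and the 17-byte VIN-like string is a made-up placeholder.

```python
def isotp_segment(payload: bytes) -> list[bytes]:
    """Split a message into ISO-TP frame payloads (no padding, no flow control)."""
    n = len(payload)
    if n <= 7:
        return [bytes([n]) + payload]                  # single frame: 0x0N + data
    if n > 0xFFF:
        raise ValueError("classic ISO-TP length field is 12 bits")
    frames = [bytes([0x10 | (n >> 8), n & 0xFF]) + payload[:6]]   # first frame
    seq = 1
    for offset in range(6, n, 7):
        frames.append(bytes([0x20 | seq]) + payload[offset:offset + 7])
        seq = (seq + 1) & 0x0F                         # sequence wraps 15 -> 0
    return frames


def isotp_reassemble(frames: list[bytes]) -> bytes:
    """Rebuild the message from frames produced by isotp_segment."""
    first = frames[0]
    kind = first[0] >> 4
    if kind == 0:                                      # single frame
        return first[1:1 + (first[0] & 0x0F)]
    if kind != 1:
        raise ValueError("expected a single or first frame")
    total = ((first[0] & 0x0F) << 8) | first[1]
    data = bytearray(first[2:8])
    expected = 1
    for frame in frames[1:]:
        if frame[0] != (0x20 | expected):
            raise ValueError("consecutive frame out of sequence")
        data += frame[1:8]
        expected = (expected + 1) & 0x0F
    return bytes(data[:total])


vin = b"VF1EXAMPLE0000001"                             # placeholder, 17 bytes
frames = isotp_segment(vin)
for frame in frames:
    print(frame.hex(" "))
print(isotp_reassemble(frames) == vin)                 # True
```

A 17-byte message needs three frames: the first frame carries 6 data bytes, the next consecutive frame 7, and the last one the remaining 4.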

Why is this relevant? The ELM327 based dongles have basic support for ISO-TP. It’s a pain to set up, but it works and it’s actually what they are designed to do. The caveat is that it only works for receiving long messages. We never gave this a lot of thought as the queries we put to the car always fit in one single frame. Until the TPMS requirement came up. Writing the valve IDs requires sending a long message. We implemented “long message ISO-TP” from the get-go in the CanSee DIY dongle, so after tedious debugging, we knew setting TPMS worked, but now we had to tweak the driver for the ELM327 dongles to support long messages.

Luckily we were not the first. This nut was already cracked by Cedric Paille, the hero who made DDT4All. By carefully going through the logs of DDT4All, we could modify our ELM327 driver to now also send long messages. Thank you Cedric! The crazy thing is that if you use a commercial dongle, it does quite a bit of the ISO-TP hard work for receiving frames, but for sending quite a bit more is done by CanZE and suddenly it is timing-relevant. This is what actually held up a new CanZE release.
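For the curious, the timing relevance comes from the ISO-TP flow control frame. After the sender transmits the first frame of a long message, the receiver answers with a flow control frame stating whether to continue, how many frames may follow before the next flow control (the block size), and the minimum gap between frames (STmin). The Python sketch below decodes such a frame according to the general ISO 15765-2 rules. It is an illustration only: the function names are mine, and it says nothing about how the CanZE driver or DDT4All handle this on an ELM327.

```python
def parse_flow_control(frame: bytes):
    """Decode an ISO-TP flow control frame.

    Returns (status, block_size, separation_seconds).
    """
    if len(frame) < 3 or frame[0] >> 4 != 0x3:
        raise ValueError("not a flow control frame")
    status = {0: "continue", 1: "wait", 2: "overflow"}.get(frame[0] & 0x0F)
    if status is None:
        raise ValueError("reserved flow status")
    block_size = frame[1]            # 0 means: send all remaining frames
    st_min = frame[2]
    if st_min <= 0x7F:
        separation = st_min / 1000               # 0..127 milliseconds
    elif 0xF1 <= st_min <= 0xF9:
        separation = (st_min - 0xF0) / 10000     # 100..900 microseconds
    else:
        separation = 0.127           # reserved value: fall back to the longest gap
    return status, block_size, separation


def blocks(n_consecutive: int, block_size: int) -> list[int]:
    """How many consecutive frames go out between flow control frames."""
    if n_consecutive <= 0:
        return []
    if block_size == 0:
        return [n_consecutive]
    return [min(block_size, n_consecutive - i)
            for i in range(0, n_consecutive, block_size)]


print(parse_flow_control(bytes([0x30, 0x00, 0x0A])))   # ('continue', 0, 0.01)
print(blocks(5, 2))                                     # [2, 2, 1]
```

With a dongle that only forwards frames, the sender itself has to respect these gaps and wait for each flow control frame, which is where the timing sensitivity comes from.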

Teaser: we have more things up our sleeves regarding not just the CanSee DIY dongle but also the good old ELMs. But first: a few final tests and then release. As always, stay tuned.

You might want to have a peek at this “Frankenstein contraption” testing rig. Things have moved on since (more on that in a later post), but this is how projects like this start. Note that I laid it out neatly on my desk. Real life testing is often a lot messier, often smacking all of this, plus a power bank, plus my laptop, plus my phone in my car!

I do all CANbus connections using RJ45 connectors with a specific pin layout. On the breadboard is the ESP32 development board, powered, flashed and debugged through its USB cable. I use PlatformIO on top of Atom on Ubuntu, after having recently switched from the Arduino IDE. The dev board is slightly development unfriendly in that it leaves no pin row on one side. Luckily, that side only needs Vcc, so a small green wire pops out under the board.

On the left side is the 3.3 volt CANbus transceiver, powered from the dev board. From here, a short grey cable carries the CANbus. The T in the middle connects the bus components. On the right is a short wire to the SAE J1962 plug going to the car, and on the left is my good old GVRET device based on an Arduino Due and the accompanying SavvyCAN desktop software to spy on the bus and see what is going on. This proved invaluable again. At first the wrong bus resistors made the controller refuse to send packets. Later on it confirmed the car computers answered fine, but there was a bug in the firmware interpreting those answers.

The silver block is just a connector, but when this is not connected to the car, I can replace it with an identical block with a 60 ohm resistor over the bus terminals to create an independent CANbus-on-my-desk. Note that to test a CANbus device such as the ESP32, you not only need that bus termination resistor, but also at least one other device to set the acknowledge bit of each frame transmitted by your device under test.

We might have ironed out most issues with the DIY dongle. If there are people willing to build an ESP32 based dongle, especially if they own a Q90, R90, or R110 model, we would like to hear about it. Requirements would be:

  • Preferably drive one of the above models.
  • Willing and able to build a dongle, which requires ordering stuff through Aliexpress or ebay HK, basic soldering skills, and either some basic knowledge of using PlatformIO with Atom (or VSCode) with git, or the ability to upload binaries to an ESP32 development board from the command line.
  • Fool around with it, and be willing to cycle quickly through different updates of CanZE or the ESP32 code.

The KonnWEI dongles are, though the most stable in terms of what you get when you buy, not the best to run CanZE with, especially when fast performance is required. Think of the Driving, Braking and Consumption screens. For the technically inclined: this is because this type of dongle is designed to query, at its own pace, car computers using the ISO-TP protocol, where we had to misuse them to also intercept the raw, operational data. Most of that data can also be obtained through ISO-TP, but again, it would have been slow and we would have had to do a lot of reverse engineering to make that work.

And then of course there are a ton of dongles that don’t work at all as they have severely stripped functionality.

With the availability of cheap ESP32 micro-controllers that can do CAN, WiFi and Bluetooth, the time had come to finally build our own hardware. In the next posts, I will describe the hardware, the software and the testing. For now suffice to say that we have something working over Bluetooth, with an unmodified CanZE instance on Android, for under 20 euros of hardware, and it is blazing fast.

For those who want to follow in our footsteps and want to build their own dongle, let me start with a shopping list. Especially useful if you order on AliExpress!

  • ESP-32 development board. Maybe something like this.
  • CANbus transceiver board. Needs to be 3.3 volt, so for instance this.
  • Some sort of housing / SAE J1962 (“OBD2”) connector. My advice would be to buy the cheapest dongle you can get and gut it. You should be able to do that for under 3 euros.
  • A small 12 to 5 volt converter. While the ESP development board can take 12 volt, that is a maximum and I wouldn’t advise running it on the car’s 13.5 volt. Example.
  • Some veroboard, wire, and other generic craft and soldering stuff.

Stay tuned!

Fred Leudon disassembled his Q model BCB himself and here are a few of his pictures, now including the filter module!

Note that what I earlier identified as the flyback diode is actually a 63 uH coil. Also visible is the modest PCB of the rectifier module.

 

Here is the filter module, still closed. For orientation, note it is held upside down and the orange connector (normally connected to the loom going to the car’s “nose”), points to the front of the car when mounted. The four cables exit in the direction of the rectifier box; left side of the car. Also note that the N wire is substantially thinner than the three L wires (see below).

 

And here the filter is opened up, rotated 180 degrees when compared to the previous picture. The crimp on the center orange cable has its shrink wrap insulator removed. It was not properly crimped on the coil wires, which made this module fail. The coils seem to be used both as filters and as current sensors. Also note the black plastic box and a substantial PCB pair.

Now there is confusion about the N <> L3 relay. My stance, based on what I saw on Renault provided schematics and on hearing clicking sounds, is that the black box should house said relay. The original author swears there was no power capable relay in the entire module, and that ZOE uses two diodes to N. At the moment, both stances are incompatible and we have no way to verify one or the other.

The much thinner N cable suggests there is substance to my stance, or else the two diodes should be in this module. But that doesn’t jibe with me, since all the rectification is done in the other box, but I am biased and could be totally wrong of course.

More investigation is needed. Maybe I will get brave and open mine…

Thank you Fred and forumpro.fr user “Pixel”. Here is a link to the original source and discussion.

Just like in brain research, often a lot can be learned when things go wrong. A friend driving a ZOE was struggling for months with the weirdest problem. The car charged fine on public chargers, but not at home. However, that home charger did its job fine on several other ZOEs. The dealer was helpful but couldn’t find a thing; the charger supplier found nothing wrong.

Sequence of events was:

  1. cable plugged in and chargepoint light goes blue;
  2. the usual relay clicking noises from the car (the battery, and the 12 volt bus);
  3. the usual “CLOINK” of the contactor closing in the chargepoint;
  4. after 15 seconds and a bit of clicking in the car, contactors open, light goes green and everything stalls.

All this time, the dash shows “Ongoing checks”. No error, no red nose, but no charging.

After a few weeks of faffing around, trying here and there, including his fivari charger, he suspects it is single-phase charging that fails, while three-phase is OK (hint one). Everyone (yours truly included) says that is very unlikely. In private, he tells me he hears “electric sparking noises” from under the bonnet. Oh dear!

Finally, Renault NL is involved and I am graciously invited / allowed to join in. So I head over on a misty Friday morning to his house. Three ZOEs present! The CLIP tool is hooked up and indeed an error is presented (DTC064063), suggesting either chargepoint, cable or filter in the BCB (hint two). All are a bit miffed the dealer missed this.

Then we open the bonnets of two ZOEs and hook up the charger to each. Lo and behold, his ZOE made some soft, but scary noises the moment charging is supposed to start, just after the “CLOINK” (hint three). It’s not sparks, but it sure isn’t good, more like a rattle. The Renault tech pulls up the functional schematics and explains what might be wrong. To make a long story short: ZOE rectifies current from the 3 phases using a “three-phase full-wave rectifier”.

Note that the N (neutral) is nowhere to be seen. What ZOE does when you connect single phase (between L1 and Neutral) is that a relay connects the N wire in the feed line to L3, so now the juice is between L1 and L3, and since L2 is not connected to anything, all is fine. Obviously said relay is not energized when on three phases. It is located in the filter module (see this post). It was this specific relay, or its control circuit, that had failed. My friend did a “yessss!!!” as he finally had a diagnosis and confirmation that he was right about the single phase after all.

The car has been repaired and is right as rain again. I am hoping for some more info on the filter module; how it works and what went wrong.