There’s a number of decisions that can be made at this point. The Beaglebone has HDMI out. There are all sorts of LCD screens made for the device – some of which have touchscreen… There’s also a webserver option, but I feel that’d force me down the road of web design or iOS app development.
I did want to stay in the Embedded Linux world though. Just plugging in an HDMI as the output seems like a waste. I also wanted to at least start with stuff I had lying around. Fortunately, I had one of these displays lying around.
Perfect. Hardware found. If I want to display RPMs, or power, or cycle through a few different things, this 32-character display should suffice. I wonder if there’s a Linux driver for a 16×2 LCD driver out there…
Well that first result looks nice. I should be able to start with this. I wonder what it was created for…
Oh wow. I’m running the 3.8.13 kernel on the Beaglebone Black. So it seems like this driver is pretty much the perfect fit!
The same steps from the GPIO were necessary.
Setting up a device tree overlay to configure the pin mux.
Configuring any parts of the driver that might be necessary.
Compiling and inserting the kernel module.
Testing and implementation
Setting up the Device Tree Overlay
The driver has hard-coded pins compiled in.
There’s no feedback from the LCD, so it is probably safe to assume they’re all outputs from the processor. Plus if you look at the fact that each pin calls gpio_direction_output, you can put two and two together. Each of these six pins then need to be configured as outputs. But there’s a bit of info needed for the device tree to be configured.
This site contains the GPIO mapping to their relative offsets in the CPU’s GPIO module. Yes, the processor’s TRM also has this information, but I read those enough in non-leisure environments. I don’t care to poke around those in my free time if I can avoid it.
The first pin – the reset pin – is GPIO 67, and is labeled as P8_8. Basically this is pin 8 on the connector 8. I can verify with that table here that Pin 8 (the last row photographed here:
… does in fact match GPIO 67
So I decided to trust this info, and what I was specifically looking for is the offset of 0x894 I mentioned in the caption above.
Comparing that table against the device tree overlay from Derek Molloy’s blog, I figured that I needed to subtract 0x800 from that offset. I can’t say I’m positive as to why that’s the case, but it is. With all that in mind, and going over all the pins, I came to this overlay:
Alright. Hopefully this worked…
Configuring the driver
Well this is a short section. It turns out I didn’t really have to change anything. None of the pins used for this driver overlapped with ones I’m using. I could change the pin numbers to module parameters. Meh… no need at this point. On to the next part.
Compiling the driver
On to the next step. For GPIO tachometer driver I’d started with the source and a makefile that built on the target itself. So I’ll just copy this over to the target, run make, and end up with a .ko file.
So another decision to make. Copy the file to the target and write a new makefile to build locally, or step up to the plate and finally set up a cross-compile environment. I decided to do the latter.
I had to figure out what version of the kernel I was running. A simple “uname -r” told me I was running 3.8.3-bone86. I need to pull down these sources locally, and they can be found at https://github.com/beagleboard/linux. A simple checkout of the 3.8.13-bone86 tag should be what I need.
Next, I need to configure the kernel in the same way that the Beagleboard was configured. This can be pulled from the Beaglebone itself, and can be found at /boot/config-3.8.13-bone86. I can copy this file to the .config and boom, the kernel is configured to compile for this device.
Throughout all this, I’d had to install cross compile tools. Specifically the arm-linux-gnueabi- toolchain. Then the single command “make -j6 ARCH=arm CROSS_COMPILE=arm-linux-gnueabi-” will spin up a number of threads to compile the whole kernel.
Lastly, to compile the kernel module I found myself needing to modify the path in the Makefile:
With that Makefile, and the kernel checked out and compiled in the KDIR directory:
I was able to simply run make in the kernel module directory to build the .ko from my main machine:
Copy the .ko file over, and run an insmod and the KLCD device shows up in /dev.
Better yet, running echo “Colin Here” into the device actually works!
Tying it all together
So now there’s a driver for measuring the GPIO, and a driver to display the events. At this point the program is incredibly straightforward. Read the speed and display it.
Sticking with the cross-compile technique, I made a simple Makefile:
There’s a little bit to the program insofar as signal trapping, but the main loop only consists of the read and write:
Ok, so it isn’t a “write” per se, rather an ioctl. I think a write / seek would work, but the reference app that came with the driver uses ioctl all over the place. I followed suit here.
The last steps can be found all over the internet, so I’m not going to detail them. Adding the overlays to be persistent, inserting the kernel modules at boot, and starting the monitoring app. That’s all that is needed for phase 1!
As I mentioned in Exercise Bike Upgrades, step 2 is getting the data into a computer. I’m starting at a point where I’ve got a signal coming off the bike – a discrete 0 or 1 based on zero or infinite resistance on a wire. I need to process this signal and get it into a format that it can be displayed to a user. How this’ll be done depends slightly on what I’ll be using as my computer. I’d like to learn and try new things on this project, but I don’t want to bite off more than I can chew. There’s an end-user who is expecting a product, after all!
Choosing a computer
The whole purpose of this is to keep the budget reasonably low. This means limiting my “computer” selection to what I already own. Fortunately there’s a fair amount to choose from.
Option 1: Raspberry Pi.
I have a bunch of these lying around. They’re cheap, powerful, and typically run full displays to an HDMI output. Things I’ve done with RPis in the past have involved KODI clients, small webservers, and kiosk-like interaction. The RPi is a fine choice but I’m a little averse since I’d probably find myself just writing python scripts. Been there, done that. Also I’ve used the RPi a lot, I’d like to try something new. Plus there are a lot of super easy projects to drop into a Pi – why tie one up with this project when I don’t have to? So I opted out of this one.
Option 2: FPGA Dev Boards.
I’ve acquired a number of FPGA dev boards that run Linux in the past few years. A Snickerdoodle and some Zedboards of different types are cluttering my desk. I have found absolutely no use for them yet, and this might have been the perfect opportunity to do so.
I have a growing amount of indirect experience with FPGAs – enough to have pulled the basics. Depending on the frequency of the level changes, an FPGA might be necessary.
For example: A quickly changing setup might require dedicated hardware. A product I’d worked on a lifetime ago utilized the Sensoray 2620 encoder counter. There might have been jokes made about how “Sensoray” sounds like a weapon from a D-grade SciFi movie. Looking back, that project was pretty bad-ass, but I digress… That hardware can handle frequencies of 2.5 to 10 MHz! Nobody is pedaling that fast.
An FPGA can approach those speeds without additional hardware. It could react to the input and logic to return state change counts could be implemented. This would be a learning experience, and is not very portable. Essentially – it is probably overkill. I’ll keep it in my back pocket if it becomes practical.
Option 3: Arduino
I have a few Arduinos lying around. The programming doesn’t really interest me though. They’d work, but would have bottlenecks when getting data from the Arduino to somewhere else. For example, if a webserver is ever expected to be used, Arduino would not suffice.
Option 4: Particle
I didn’t consider this since I don’t have extras lying around, but a Photon would probably be a great option. It has the same “webserver upgrade” limitation the Arduino option would have, but would be able to work indirectly through particle’s services.
This would be a good option for a quick solution, but wouldn’t be transferrable to other environments.
Option 5: BeagleBone
The last main option, and the one I went with, is the BeagleBone. I have a couple lying around, haven’t used them much, and they run Linux.
It isn’t very different from the Raspberry Pi. It runs Linux. It is supported by the community. The Beaglebone does have twice as many IO pins as the RPi, though I don’t think I’m that resource-constrained at this point.
Primarily my choice of BeagleBone over RPi was to avoid the temptation to go to GUI + Python. I knew I wanted to write kernel drivers, and more often than not the Pi users write several abstraction layers above this.
So I decided to try to start the project with the Beaglebone. Hardware Selection done!
Recording the data
With computer selection done, the next step is to get this into the system. With regards to the sensor, there’s a signal that is usually 0, becomes 1 for a brief moment, then goes back to 0. If we can count the number of times the signal goes from 0 to 1, that is a direct indication of how quickly the flywheel is spinning. One way this can be done is through sampling.
In a sampling configuration, we’d write a program that checks the level of the pin at specific intervals. When the level of the pin changes, we’d make note of this. The number of transitions over the last period of time would indicate how quickly the wheel was spinning.
For example, if you’re sampling at 100 samples / second, and there were three instances where the signal went from 0 to 1, you’d know that the flywheel was spinning at 3 revolutions per second.
There’s a huge downside to this though. In the case above, we’re taking 100 samples per second. Of those 100 samples, only the three transitions from 0 to 1 matter. So we’re maybe running at 5% efficiency. Further, there is the possibility of missing a measurement, particularly if the duty cycle is not at 50%.
In other words, it might be efficient to poll a signal that goes “0000111100001111”. But it would not be efficient to poll a signal that goes “000000001000000001”. The probability of missing that short-lived “1” is high, and will compromise your data while wasting a lot of CPU time. So in this case: don’t do that.
Since polling doesn’t really make any sense, I looked to trying to implement interrupts in order to trigger an event. Instead of having to come back all the time, the CPU could just count the number of times the interrupt has happened. That would allow me to check, at any time, how quickly the flywheel had been spinning. Much to my delight, this turned out to be fairly easy to implement.
Implementing the input software:
With many thanks to this blog by Derek Molloy, I was off to the races in doing the two things needed to configure the system: Setting up the necessary pins for inputs, and actually writing the driver software.
By default, CPU pins have a defined mode. In this case I needed to choose a pin that would act as my input. Basically, since the example in Derek’s system had used GPIO 15, I followed suit.
There was one difference between Derek’s code (as well as my initial test setup) and my final implementation. The initial system I had put together had a GPI tied directly to a GPO. This way I could toggle the GPO a number of times, and verify that the GPI was seeing the changes. The final implementation had a connected circuit followed by a disconnected circuit. So for this I’d decided on an internal pull-up on the GPI, with a ground signal connected to the other side of the reed switch.
Changing this configuration in the device tree, I simply ran the file through the device tree compiler, and was able to apply the device tree overlay to the system. Derek’s blog has all the info for that.
Writing the driver:
Now I have a signal going into the pin. It seems to be clean – no debouncing needed at my speeds – so I’m on to making the driver. The way that I think the driver could be most useful is this:
A character driver interface. This way my application only needs to read from a file.
Limited ioctl control. I much prefer read / write when it can be done.
Reads would return the number of times the interrupt had happened over the last period of time.
Before a character device can show up, it needs to go through the module_init. This is basically where the driver tells the system “I am a driver of this type” and will verify any resources it needs.
First, setting up the input. When setting up the pin to be a GPIO, the system could either be input or output. And the kernel doesn’t even recognize it as such. So the driver will verify the GPIO selection is valid, claim it, set it as an input, and export it so that it shows up in the /sys/class/gpio directory.
The pin can then be tied to an interrupt and function, with gpio_to_irq and request_irq
Since the GPI is set up, the driver can be registered to a file. I got a little experimental (and subsequently lazy) at this point. I’d tried to make it possible to have multiple GPITACH devices, but I think I got some pointers wrong. Since I don’t have a use-case, I figured I’d leave it for later. Maybe I’ll extend the behavior at some time. Probably not.
Anyway, the code with some error-checking goodness is as such:
From this, a file will be created at /dev/gpiotach1.0. Before looking at the interface to this, the interrupt logic needed to be created.
Interrupt routine:
My desire was this: At any time, the application could read from the file and know how quickly it is going. I didn’t want to rely on the application managing any time. An example of this would be the IRQ would just increment a counter. Read the counter at different intervals and you can deduce the speed.
What I’d rather have is the driver keeps track of when each interrupt happens. In this case, the old entries would “expire” and new entries would be added in. The way I achieved this was to use an array (or rather, circular buffer) of ktime measurements.
Using the circular buffer from the Linux kernel, items can be added to the head with this routine:
For the add routine, I’d decided to add my expiration time to the measurement. Essentially I took “currentTime + 3 seconds” and added that to the head of the circular buffer.
Armed with this, I can purge expired times from the buffer:
So by purging old times and pushing new ones as they happen, I can just look at the circular buffer count. This count will be how many times the interrupt has happened in the last 3 seconds. Perfect!
Finalizing the character driver:
Lastly, the open, read, write, and close routines must be written to finish off the driver.
Open and close don’t really do much. In the case where we’d have multiple potential instances, this might manage some memory, mutexes, etc. This doesn’t really do that – at least not well.
Similarly, writing doesn’t make much sense to this. I suppose the write could act as some sort of configuration instead of ioctl. Right now it only clears a counter that is otherwise unused.
Reading is where the magic happens. Magic because of the simplicity. Essentially it just purges out any remaining expired times and returns the buffer count to the user. Throw in some mutex protection for the IRQ-accessed buffer and that’s it!
The full driver can be found here. I referenced the code from September 1, 2020. It might be expanded or cleaned up at some point.
Compiling the driver
This one was easy. I’d started using Derek’s suggestion from this part of his blog to compile the driver on the target. This might not be the fastest way to compile the driver, but it sure was easier to get set up. The next step in this project might include setting up a cross-compile environment.