OctoPrint Camera Carnival

This probably should be turned into a blog post at some point, but after a full week of working on this I'm too tired to format my findings into anything other than a nearly-unintelligible mess.

The Goal

  • One Raspberry Pi camera mounted on my Prusa MINI's X-axis stepper using this mount,
  • One Wyze Cam v3 mounted on the Y-axis extrusion using this mount,
  • The ability to monitor both cameras using OctoPrint, and one camera using Polymer,
  • A live YouTube stream of the Wyze Cam feed with the "close-up" Pi feed overlaid in the corner, and finally
  • The ability to monitor this overlay stream in OctoPrint.
  • Oh, and I wanted all of these things to occur on one or two Raspberry Pis, as I didn't want to leave my computer on all the time.

Colossal Failure

After days of messing around, here are a few assorted conclusions I arrived at:

  • The Raspberry Pi 4's CPU is not powerful enough to overlay two input streams (even without scaling!) on top of one another and maintain more than a single output stream. I was able to get the overlay working with just a stream to YouTube, but placing any more load on the Pi would cause the FPS and bitrate to absolutely crater. Weirdly, it would sometimes start out at full FPS and slowly decay down to almost nothing. My initial guess was thermal throttling (I didn't attach a fan), but running vcgencmd get_throttled never returned anything other than 0x0 (no throttling). The kind of command I was fighting with is sketched after this list.
  • MJPEG as a protocol isn't very popular. I couldn't get hardware-accelerated MJPEG encoding/decoding working on the Raspberry Pi, nor could I find much information about it.
  • I should have just bought a Wyze Cam v2. There's so much more documentation, discussion, and general hackery with it. The Wyze Cam v3 only had an RTSP version of its firmware released last month!
  • Proper HLS support can't come soon enough for OctoPrint. This thread has a lot of info, but I ran into weird HLS issues when using Safari, perhaps related to cross-origin restrictions.
  • I really need a proper homelab.
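
For reference, the kind of overlay pipeline I was fighting with looked roughly like this. Treat it as a sketch rather than my exact invocation: it assumes the Pi Cam shows up as /dev/video0 via the legacy V4L2 driver, and it reuses the Wyze RTSP URL and YouTube endpoint placeholders from stream.py below.

# Wyze RTSP as the base layer, Pi Cam scaled down into the bottom-right
# corner, plus a silent audio track (YouTube refuses streams without audio)
ffmpeg \
    -rtsp_transport tcp -i "rtsp://username:password@192.168.68.124/live" \
    -f v4l2 -i /dev/video0 \
    -f lavfi -i anullsrc=channel_layout=stereo:sample_rate=44100 \
    -filter_complex "[1:v]scale=480:-2[pip];[0:v][pip]overlay=main_w-overlay_w-10:main_h-overlay_h-10[out]" \
    -map "[out]" -map 2:a \
    -c:v h264_v4l2m2m -b:v 3M -c:a aac -ar 44100 \
    -f flv rtmp://x.rtmp.youtube.com/live2/token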

A less cool but usable solution

I eventually gave up on my original goals and settled on just having two individual camera streams (Pi Cam + Wyze) visible via OctoPrint, with the Wyze Cam being broadcast on YouTube. Here's the final setup I came up with:

  • Set up the Wyze Cam v3 as normal, then install the RTSP firmware. Make sure the camera has a static IP, then note down the RTSP URL and login info (a quick way to sanity-check the stream is sketched after this list).
  • On the Raspberry Pi running OctoPrint (a 3B in my case), plug in the Pi Cam and verify the local MJPEG server is active (it should be on /webcam/?action=stream). Install the Multicam plugin, and set up the Pi camera as the second camera.
  • On the second Raspberry Pi (a 4 2GB, again with a static IP), set up MotionEyeOS. (There was a bug with the latest stable release for the Pi 4.) Configure it with two cameras: the MJPEG stream from the OctoPrint server, and the RTSP stream coming from the Wyze Cam. Set up MotionEyeOS so it provides an MJPEG endpoint for the Wyze Cam (it magically does not take up that much CPU!).
  • Configure OctoPrint to use the MJPEG stream of the Wyze Cam as the first camera. If you use Safari, use this hack so you can actually switch between cams in the Control tab.
  • At this point, we can see both cams via OctoPrint in the browser, and we have not introduced any CPU-intensive loads on either the OctoPrint device or the MotionEyeOS device (someone PLEASE explain this black magic to me).
  • If you want out-of-home access to OctoPrint, check out OctoEverywhere. I couldn't get it to work with any non-device cameras, since OctoEverywhere can't tunnel your network's internal traffic (I may have just been too lazy to configure it properly, though). I've recently become a huge Tailscale fan, so I just installed it on my OctoPrint server. Now I can use my phone's (or iPad's, or laptop's; you can tell I love Tailscale) browser anywhere to access OctoPrint and view both cams. The Polymer app is also super neat but doesn't support multiple cameras; it's maintained by a single developer, though, so I'm not going to hold that against him.
  • For YouTube streaming, we want the stream to only be up while the printer is running. Otherwise I'd be broadcasting close-up shots of my ugly face frowning at the printer while removing prints or doing first-layer calibration. Luckily, OctoPrint has a neat HTTP API for querying its current status (among other things), so let's make two files:
  • stream.py, which polls the OctoPrint server every 60s or so and figures out if it's printing anything. The script starts the ffmpeg streaming command in the background when the printer starts printing, and kills the process whenever the printer stops. Make this script compatible with Python 2, as (spoiler) MotionEyeOS only has Python 2. I set up my ffmpeg command to inject an empty audio stream, since I don't want anyone to be able to hear what kind of music I listen to while waiting for my 10th Benchy to print.
  • userinit.sh, which simply starts stream.py in the background and redirects stdout / stderr wherever you want.
  • Since the Raspberry Pi running MotionEyeOS has a ton of free compute available (how, I ask again), we'll set up the streaming to run on this device. Copy the scripts into /data/output (which is mapped to the sdcard folder if you FTP into the device). Symlink (or move, I don't care) userinit.sh to /data/etc/userinit.sh so it runs on startup. Make sure userinit kicks off python stream.py in the background, so you don't block the boot process!
  • Everything should be working at this point. Whenever the printer starts printing, the stream should go live within a minute. When the print is cancelled or finishes, the stream should go dead.
  • I asked the ffmpeg gods here if I could have my beautiful overlay streamed instead of just the Wyze Cam, but once again it didn't work. The FPS would sometimes start high, but without fail, a minute into the stream it would barely output a frame per second. In the future, I expect to have a "transcoding" box running Plex or what have you, and one of its tasks will be streaming my overlay.
  • Bonus: You can actually run Tailscale on MotionEyeOS using their static binaries. I was quite surprised this works, but it allows me to view the camera feed in OctoPrint no matter where I am, without requiring the Tailscale daemon on the OctoPrint server to forward local routes pointing to the MJPEG stream. You can start tailscaled on boot (or from the Python script) and then connect; a rough sketch of the startup commands follows this list.
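
For the first step above, a quick way to verify the Wyze's RTSP stream is actually up (using the same URL placeholders as stream.py below) is to point ffprobe at it from any machine on the network; it should print the stream info (codec, resolution, and so on) rather than hanging:

ffprobe -rtsp_transport tcp "rtsp://username:password@192.168.68.124/live"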
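
As for the Tailscale bonus: here's a minimal sketch of the startup commands, assuming you've already copied the static arm binaries into /data/tailscale (that path is my choice; anywhere persistent works).

# The socket directory may not exist on MotionEyeOS
mkdir -p /var/run/tailscale

# Keep tailscaled's state on persistent storage so logins survive reboots
/data/tailscale/tailscaled \
    --state=/data/tailscale/tailscaled.state \
    --socket=/var/run/tailscale/tailscaled.sock &

# The first run prints a login URL; later boots reconnect automatically
/data/tailscale/tailscale --socket=/var/run/tailscale/tailscaled.sock up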

Below are the scripts I used (they may not be up to date, sorry! I keep them in a private git repo), along with some more miscellaneous findings.

stream.py

import urllib2, json, subprocess, time

# Determines whether or not the stream should be live.
# Returns true only if the printer is currently printing something.
def should_stream():
    try:
        # Query the OctoPrint API for the printer state
        request = urllib2.Request("http://192.168.68.101/api/printer")
        request.add_header("X-Api-Key", "<API Key>")
        response = urllib2.urlopen(request)
        data_raw = response.read()

        # Return true only if the "printing" flag is true
        data = json.loads(data_raw)
        flags = data["state"]["flags"]
        return flags["printing"]

    # OctoPrint is down, authentication failed, or the printer is not connected
    except Exception:
        return False


# ffmpeg process that we control based on printer state
ffmpeg_process = None

# Poll the printer state every 60 seconds,
# and decide whether to start or stop the stream if needed.
while True:
    now = time.strftime("%b %d %H:%M:%S", time.localtime())
    try:
        # Poll the printer once so both branches below agree on its state
        streaming = should_stream()

        # Start the stream if necessary
        if streaming and ffmpeg_process is None:
            print("[{}] Starting stream!".format(now))
            ffmpeg_process = subprocess.Popen(
                [
                    "ffmpeg",
                    "-nostdin",
                    "-hide_banner",
                    "-loglevel",
                    "error",
                    "-f",
                    "lavfi",
                    "-i",
                    "anullsrc=channel_layout=stereo:sample_rate=44100",
                    "-fflags",
                    "nobuffer",
                    "-fflags",
                    "+genpts",
                    "-flags",
                    "low_delay",
                    "-strict",
                    "experimental",
                    "-use_wallclock_as_timestamps",
                    "1",
                    "-thread_queue_size",
                    "4096",
                    "-rtsp_transport",
                    "tcp",
                    "-i",
                    "rtsp://username:password@192.168.68.124/live",
                    "-map",
                    "1:v",
                    "-map",
                    "0:a",
                    "-c:v",
                    "copy",
                    "-r",
                    "15",
                    "-c:a",
                    "aac",
                    "-ar",
                    "44100",
                    "-f",
                    "flv",
                    "-drop_pkts_on_overflow",
                    "1",
                    "-attempt_recovery",
                    "1",
                    "-recovery_wait_time",
                    "1",
                    "rtmp://x.rtmp.youtube.com/live2/token",
                ],
            )
        # Kill the stream if necessary
        elif not streaming and ffmpeg_process is not None:
            print("[{}] Killing stream!".format(now))
            ffmpeg_process.terminate()
            # Reap the child so it doesn't linger as a zombie
            ffmpeg_process.wait()
            ffmpeg_process = None
        # Otherwise, leave the stream alone
        else:
            print(
                "[{}] Leaving stream ({}) alone!".format(
                    now, "dead" if ffmpeg_process is None else "alive"
                )
            )
    # On failure, do nothing and just wait for the next iteration of the loop
    except Exception as e:
        print("[{}] Error: {}".format(now, e))

    # Sleep for a minute before we poll the printer again
    time.sleep(60)
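
If you want to poke at the API by hand before letting the script loose, the query stream.py makes boils down to this curl call (same IP and API key as above); the flag the script checks lives at state.flags.printing in the JSON response:

curl -s -H "X-Api-Key: <API Key>" http://192.168.68.101/api/printer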

userinit.sh

# Redirect output wherever you like; /data/output persists across reboots
python /data/output/stream.py > /data/output/stream.log 2>&1 &
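
To make it run at boot (as described above), copy both files into /data/output and symlink userinit.sh into place:

ln -s /data/output/userinit.sh /data/etc/userinit.sh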

Yes, I know these scripts are far from the pinnacle of software engineering. I promise I know how to code. I've barely slept the past few days getting this to work, so I won't be losing any more sleep even knowing I do the ol'

except Exception as e:
    # Do literally nothing
    pass

trick :)

Miscellaneous findings

  • ffmpeg is insanely powerful. After learning the filter syntax, I feel like I can do a lot of neat things with it now.
  • Hardware acceleration, especially on the Raspberry Pi, is super neat. Encoding H.264 using h264_v4l2m2m or h264_omx barely stresses the CPU at all (a minimal example follows this list). Luckily for my use case I could just use -c:v copy, since the Wyze stream was already H.264 and YouTube accepts it with no problem.
  • YouTube will not stream anything without audio. So glad I figured this one out early, I would have been banging my head against a wall for weeks otherwise.
  • Here are a few tools or repos I came across that were useful, cool, or both: restreamer, rtsp-simple-server, rtsp2mjpg, ustreamer, v4l2loopback, mac-local-rtmp-server. I tried a bunch of different approaches to get this working acceptably, and some of these were either potential solutions or just useful testing harnesses at some point.
  • There are a bunch of weird edge cases with video handling - it's a lot more complicated than it looks. A few examples: v4l2loopback devices can't be treated as regular UVC cameras even though they appear to be (software like ustreamer can't stream them). The new Raspberry Pi camera stack, libcamera, is neat, but I couldn't figure out how to just do ffmpeg -i /dev/video0 with it. On top of video encoding, there's a whole bunch of different "pixel formats" such as YUYV and YUV420p (or something). Stream timing is a whole thing too: ffmpeg kept complaining about the RTSP stream from the Wyze missing timestamps (or something), and I had to do some ffmpeg voodoo magic to align the two streams (Pi Cam local device + Wyze RTSP) when I was doing the overlay. Getting them in sync looked dope though.
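
As a minimal example of the hardware encoding mentioned above (again assuming the Pi Cam appears as /dev/video0 via the legacy stack), the following encodes the camera feed and throws the result away, which makes it easy to watch CPU usage while it runs:

# h264_v4l2m2m does the H.264 encoding on the Pi's hardware block;
# the null muxer just discards the output
ffmpeg -f v4l2 -framerate 30 -video_size 1280x720 -i /dev/video0 \
    -c:v h264_v4l2m2m -b:v 2M -f null -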

That's it. I'm tired now. Hopefully this will help someone (including and most likely me) down the road.