I'm afraid I can't answer your question directly, but maybe some information is better than none. It's true that you generally won't find general-purpose multi-threading on micro-controller platforms, except perhaps as computer science demonstrations or in very specific projects. When writing for a micro-controller you are often near the limits of RAM or some other resource anyway, so the overhead of a general-purpose multi-threading library isn't welcome.

Instead, you can fabricate a similar capability by writing one function per "thread" and making each of those functions multi-task cooperatively. That is, when you call into the function, it quickly checks whether it's time to do some piece of work and returns right away if not, so the main loop can move on to the next "thread". But it all has to be cooperative: never call any "delay" function. If you need to "wait", take the current timestamp, add the "wait" time to it, and store the result in a static variable as the wakeup time. On each later call, the function compares the current time against that wakeup time and does its work only once that time has arrived.
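To make the pattern concrete, here is a minimal sketch of two cooperative "threads", not a definitive implementation. The task names, the counters, the 500 ms and 2000 ms intervals, and the `isDue` helper are all made up for illustration. The `fake_now_ms` clock is an assumption too: it stands in for Arduino's `millis()` so the code can also be built and exercised off-board.

```cpp
#include <stdint.h>

// ASSUMPTION: stand-in clock so the sketch also builds off-board.
// On a real Arduino, remove these two definitions and use the core's millis().
uint32_t fake_now_ms = 0;
uint32_t millis() { return fake_now_ms; }

// Hypothetical counters standing in for real work (toggle an LED, read a sensor).
int blink_count = 0;
int sensor_count = 0;

// True once `now` has reached `wake_at`. Comparing the signed difference keeps
// the check valid when the 32-bit millisecond counter wraps around.
bool isDue(uint32_t now, uint32_t wake_at) {
    return (int32_t)(now - wake_at) >= 0;
}

// "Thread" 1: does its work every 500 ms, otherwise returns immediately.
void blinkTask() {
    static uint32_t wake_at = 0;       // wakeup time, remembered between calls
    uint32_t now = millis();
    if (!isDue(now, wake_at)) return;  // not time yet: exit quickly
    ++blink_count;                     // the "important thing" goes here
    wake_at = now + 500;               // "wait" 500 ms without blocking
}

// "Thread" 2: same shape, different period.
void sensorTask() {
    static uint32_t wake_at = 0;
    uint32_t now = millis();
    if (!isDue(now, wake_at)) return;
    ++sensor_count;
    wake_at = now + 2000;
}

// The main loop just visits every "thread" in turn, as fast as it can.
// (A real Arduino sketch would also define setup().)
void loop() {
    blinkTask();
    sensorTask();
}
```

Off-board, you can drive it by setting `fake_now_ms` and calling `loop()` repeatedly. Stepping the clock from 0 to 2000 in 100 ms steps runs the blink task five times and the sensor task twice. Neither task ever blocks, so a slow period in one never holds up the other.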
Another thing to note is that we often add "shields" to our projects that carry processors much more capable than the main micro-controller itself! I've worked with the Rogue Robotics audio shield (traditional Arduino), and it has quite a lot of capabilities. In those cases you are handing off a complex task to a "co-processor", but you generally retain authority over what happens there. Think of all the intelligence that has to go into the kinds of Ethernet and WiFi shields where the entire TCP/IP stack is baked into the hardware. Those are a lot more capable than the Atmel chip on most Arduini!