Building a Video SDK with Marc Schwieterman

Leo Dion (host): Welcome to
another episode of Empower Apps.

I'm your host, Leo Dion.

Today I'm joined by Mark Schwieterman.

Mark, thank you so much
for coming on the show.

Mark Schwieterman (guest): Thank you.

Leo Dion (host): Yeah.

Before we get into Daily and what Daily does, I'll let you go ahead and introduce yourself.

Mark Schwieterman (guest): Sure.

So I've been in tech for a while, primarily did backend stuff in the chemical information industry for quite a while.

Grew up on Apple computers,
refocused on that around iOS two

or three or something like that.

And that's been my primary
focus in recent years.

I've been in a few different places and been at Daily for about six months now.

I just really love working on video stuff and communication technology that helps people connect, as we all saw during recent events.

Leo Dion (host): Yes.

I can imagine.

So you're working right now at Daily. Do you want to explain what Daily is exactly and what they do?

Mark Schwieterman (guest): Sure.

So Daily basically does pretty much anything related to video. We're real-time video infrastructure for developers. We offer SDKs both for the browser, in the form of daily-js, and then we have another Daily library which we use for the mobile SDKs: iOS, Android, React Native, Flutter.

We also are just about to
release a Python library that

can be used for AI workflows.

We have global mesh infrastructure, so you don't have to deal with the server side of things for your apps. You can just get started and start using things immediately.

In addition to that, we have low latency video streaming options where you can actually composite different things and stream those to an audience, both video and voice calls.

And we also satisfy a variety of different compliance requirements, such as SOC 2 and HIPAA, for example for telehealth companies and things of that nature.

So if you have a problem,
we have a solution.

Leo Dion (host): Yeah.

And that's, it's a
growing thing right now.

Now, is it mostly for meetings or conference calls, essentially?

Mark Schwieterman (guest): So you
can really build whatever you want.

Our systems are built on WebRTC,
and we have a product also called

Prebuilt which is something that'll
give you like a call experience,

where you can have meetings with other people and create rooms, breakout sessions, and stuff like that. But really we give you the tools to build pretty much anything that you want related to video and streaming.

Leo Dion (host): Okay.

I'm asking if you do anything with live streaming, like Twitch or YouTube or something like that? Or is it mostly focused on meetings between two or more people?

Mark Schwieterman (guest):
We can support that too.

That is not my personal area of
expertise, but I believe there's

like RTMP and similar protocols
and we're compatible with those.

So you could use our services to do a similar broadcast to those. And I'm not positive, but I believe we can also forward things on through other services too, but I would actually have to double check on that.

Leo Dion (host): Yeah no, no problem.

No problem.

Cool.

Today we wanted to talk about... you're the head of the iOS SDK, or do you work on all the SDKs?

Mark Schwieterman (guest): So we
have like a client team and there's

two of us, myself and another
one of my team members who are

primarily focused on the iOS SDK.

But we do jump around like
we have a core layer that

we need to work on at times.

And so we just jump in wherever needed.

Leo Dion (host): Okay.

Okay.

So let's talk specifically
about the iOS SDK.

I looked at it, and I gotta say the documentation is solid, really good stuff out there. We'll get into that too, but what makes it unique, developing an SDK for iOS specifically in this video space?

Mark Schwieterman (guest): I would say one thing that makes Daily and our SDKs unique is the team that we have here. We have a variety of people that are very deep in different areas.

And so one of the features that we just
released in the SDK is adaptive HEVC.

And so typically with WebRTC by default
it uses VP8 and software encoding which

will actually run in your app process.

However, with our adaptive HEVC feature, it will actually use the hardware encoders on the device. And that's also what lets us do H.265, which will provide significantly better video quality with lower bandwidth, and there are different characteristics that make it more optimal for delivery and things of that nature.

So we have access to those lower level hardware features that you wouldn't find with more off-the-shelf solutions.

And also, we basically like helping our customers solve their problems.

So if there's something that's like
possible and a customer is interested in

doing it, like we can usually find a way
to make that work for them and we'll add

features and such in support of that.

Leo Dion (host): So I'm curious about the whole encoding space, as far as what you need to do, because my understanding was HEVC is only Apple, or do all platforms support HEVC, or how does that work? And then when you're talking about hardware encoding, I assume you mean the SoC, the A19 or whatever the heck chip we're on, I forgot. A17 is the new one, right? Those have encoders built into the chip itself, correct?

Mark Schwieterman (guest): Yes.

Correct.

Yeah.

So this is very complicated and
there's lots of details, so I'm going

to make sure that I get everything
aligned right here, but so basically

with codecs, typically you have open source versions of things and then closed source ones, similar to how with MP3s there were open source and closed source versions provided by different companies.

So in the video space, H.264 and H.265, those are not open source, but then there are corresponding open versions of those. And I'd probably get that wrong; I'm almost certain that VP8 is the open source counterpart of H.264, but that may not actually be the right one.

Leo Dion (host): Hehehehe.

Okay.

Mark Schwieterman (guest): As far as interoperability goes with WebRTC, browsers are required to support basically VP8 and H.264. So they have to support both of those. So as long as you support those, you have compatibility.

And then H.265 is not actually supported by most browsers that I am aware of. And so basically when you stream video, you can have multiple layers, so you can actually have different encodings all in the same stream, more or less.

Leo Dion (host): Oh no,
I used the wrong word.

Mark Schwieterman (guest): I think
container is more of the file formats.

I'm not actually, yeah, I'm not
sure what the exact term would be.

But as one example, you might publish three different quality levels. So if you think of an iOS app, like little thumbnails, you just need low resolution because you don't need 720 by 1280, right? Which your phone can't actually publish if you...

Leo Dion (host): Yeah,
yeah, I know what you mean.

Yeah, let's just say hypothetically I've gone through and run ffmpeg or youtube-dl, and it'll tell you all the different streams you can get access to. So yeah, I think that's what you're getting at: you can get the WebM or you can get the H.264 or whatever it is.

Mark Schwieterman (guest): Exactly. So we include basically two layers, one that's H.264 for compatibility, so browsers and devices that can't actually handle the H.265 can then use that. And then for H.265 and clients that support it, it will then use that. And so that allows us to have higher quality video, basically, with less resource consumption.

And there are some other features. There's this concept of temporal scalability. So imagine, I believe it's every eight frames: you can drop frames to decrease the frame rate, which then also reduces the bandwidth that's needed.
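
A rough sketch of what multiple send encodings with temporal layers can look like, assuming the WebRTC framework's Objective-C API (RTCRtpEncodingParameters, RTCRtpTransceiverInit) imported into Swift. The layer names and bitrates are illustrative only, not Daily's actual configuration:

```swift
import WebRTC

// Two layers on one video track: a scaled-down layer for thumbnails and a
// full-quality layer, with temporal scalability enabled on the high layer.
let low = RTCRtpEncodingParameters()
low.rid = "low"
low.scaleResolutionDownBy = NSNumber(value: 4.0)   // quarter resolution for thumbnails
low.maxBitrateBps = NSNumber(value: 150_000)

let high = RTCRtpEncodingParameters()
high.rid = "high"
high.maxBitrateBps = NSNumber(value: 1_200_000)
high.numTemporalLayers = NSNumber(value: 2)        // receivers can drop frames to halve the rate

let transceiverInit = RTCRtpTransceiverInit()
transceiverInit.direction = .sendOnly
transceiverInit.sendEncodings = [low, high]
// peerConnection.addTransceiver(with: localVideoTrack, init: transceiverInit)
```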

Leo Dion (host): Yeah,

Mark Schwieterman (guest): So broad strokes, H.264, H.265. As far as the encoding, by default that'll happen in your app. So if you were to profile it in Instruments or something, you would see CPU usage and stuff like that going on in your app process. And then if you use the hardware encoders, which are actually on Apple's SoCs, that'll actually go out and use the hardware instead of your app process.

And if you profile those, you'll see increased activity in mediaserverd, which is the little process on iOS that manages that. So you're actually having your video frames go over to that process and be encoded before they actually then get streamed out.
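
A minimal VideoToolbox sketch of the hardware path being described: requesting an HEVC compression session, which on supported devices is hardware backed, so the encoding work shows up in mediaserverd rather than in your app process. This is just the plain Apple API, not Daily's internal encoder integration:

```swift
import VideoToolbox

// Create an H.265/HEVC compression session. On devices with a hardware HEVC
// encoder, VideoToolbox hands frames off to the media server process to encode.
var session: VTCompressionSession?
let status = VTCompressionSessionCreate(
    allocator: kCFAllocatorDefault,
    width: 1280,
    height: 720,
    codecType: kCMVideoCodecType_HEVC,
    encoderSpecification: nil,        // let the system choose; hardware is preferred when available
    imageBufferAttributes: nil,
    compressedDataAllocator: nil,
    outputCallback: nil,              // encoded frames come back via a callback or output handler
    refcon: nil,
    compressionSessionOut: &session)

if status == noErr, let session = session {
    // Real-time mode is what you want for a live call rather than an offline export.
    VTSessionSetProperty(session, key: kVTCompressionPropertyKey_RealTime, value: kCFBooleanTrue)
}
```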

Leo Dion (host): Nice.

Nice.

So was there anything new that came out with the Apple event recently and the new hardware?

Mark Schwieterman (guest): Yeah.

I haven't had time to dig into this in detail. But I have seen basically the new iPhone 15 Pros and, what is it, the A17 Pro? I think the way they're offset always trips me out. But that supposedly actually has an AV1 decoder. And so that's like the successor after H.265. Also, H.265 and HEVC are basically the same thing.

Leo Dion (host): Okay.

Okay.

Mark Schwieterman (guest): People use them interchangeably.

But so eventually we will have hardware decoding support. Now that's not something we can use on the encoding side yet, but it is interesting to see these things being added. And if in the future there are encoders, then that's something else that we could potentially start leveraging to have even more options.

Leo Dion (host): Yeah.

Was there anything else you want
to talk about when it comes to

video encoding and decoding?

Or should we talk about the SDK stuff?

Mark Schwieterman (guest):
I think probably SDK.

Leo Dion (host): Alright,
let's get into it.

You guys deliver an XCFramework, correct? First of all, let's explain what an XCFramework is, for those who don't know, as a mode of delivering a Swift package, so to speak.

Mark Schwieterman (guest): Sure.

So XCFrameworks came along around when SPM did, or at least after SPM became more in use. You can package up binary assets, an XCFramework being one of those, and it lets you combine the platform specific Mach-O slices for different platforms all into one bundle. So you can basically distribute a framework that you could then use on iOS, macOS, or whatever platforms you need to support, and you'll have one slice for each supported platform in that overall framework bundle. So it's kind of like a framework of frameworks.

Leo Dion (host): And
it's a binary, right?

Because Swift packages, people think
it's just the source code that you

import, but you're not doing that.

You're doing an actual binary.

What are the advantages of that?

Mark Schwieterman (guest):
Let's see here.

I think for one, you don't have to build it, so it doesn't add to your compilation time. But for us it's basically a necessity, just because we're dependent upon WebRTC, which actually uses an entirely different build system.

Leo Dion (host): Ah, okay.

Mark Schwieterman (guest): So we couldn't create a completely pure Swift SPM package, just because of some of the dependencies that we have to use to build the overall framework.
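
For readers who haven't shipped one, here is a minimal sketch of how a prebuilt XCFramework is typically exposed through Swift Package Manager as a binary target. The package name, URL, and checksum are placeholders, not Daily's actual distribution:

```swift
// swift-tools-version:5.7
// Package.swift for a hypothetical SDK shipped as a prebuilt XCFramework.
import PackageDescription

let package = Package(
    name: "ExampleVideoSDK",
    platforms: [.iOS(.v14)],
    products: [
        .library(name: "ExampleVideoSDK", targets: ["ExampleVideoSDK"])
    ],
    targets: [
        // A binary target points at a zipped .xcframework instead of source,
        // so consumers never compile the SDK (or its WebRTC dependency) themselves.
        .binaryTarget(
            name: "ExampleVideoSDK",
            url: "https://example.com/ExampleVideoSDK.xcframework.zip",
            // Placeholder; generate the real value with `swift package compute-checksum`.
            checksum: "0000000000000000000000000000000000000000000000000000000000000000"
        )
    ]
)
```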

Leo Dion (host): Okay,
that makes total sense.

What was the new kind of library that you can build? I'm losing it now. It's not dynamic, and it's not static, but it's the one in between...

Mark Schwieterman (guest): Oh, the new ones. Yeah. Oh, I forget what it's called too. But yeah, I know what you're talking about. Yeah, so basically it can be either one, more or less.

Leo Dion (host): Have you looked into that at all? Or WebRTC, is that like C, or what is it exactly?

Mark Schwieterman (guest): C and.

Leo Dion (host): Okay.

Mark Schwieterman (guest): WebRTC is developed and maintained by Google, and the bulk of that is written in C++ for kind of the core layers of it, but then there are also platform specific SDKs. So for iOS, there's additionally an Objective-C layer

Leo Dion (host): Okay.

Mark Schwieterman (guest): on top of that.

Leo Dion (host): Okay.

Okay.

Yeah.

What's it called?

It's called mergeable libraries.

Mark Schwieterman (guest):
Mergeable libraries.

Yes.

Yeah.

I'm so

Leo Dion (host): go ahead.

Mark Schwieterman (guest): go ahead.

Leo Dion (host): I was just going to say, I know they added support for C++ this year.

Mark Schwieterman (guest):
I've seen that.

I've not had a chance to dig into
that yet, but that's something

that I'm very interested in.

Just because we work with WebRTC and you can use a lot of the classes, but sometimes you need to customize their behavior, and you potentially still need to be able to call C++ code. And there are some cases where we've started building our own sort of replacements for some of those standard types.

Leo Dion (host): Oh, okay.

Mark Schwieterman (guest): We haven't needed that yet, but I can see that being useful.

And then you have the benefit that the code you're working on is all Swift, so you're not having to jump over into other languages quite as often. So...

Leo Dion (host): I had to do

Mark Schwieterman (guest): ...it's very appealing to me.

Leo Dion (host): ...a while ago. I even had a blog post about this; it was like how to have Swift talk to C++, and I ended up having to create an Objective-C interface in between, and it was a mess. So yeah, that's a...

Mark Schwieterman (guest): That's my life. We basically have a core layer that's Objective-C++, and some of our core libraries are actually Rust and we have to call that, which is a whole other thing we could probably talk about for a while. But...

Leo Dion (host): So let's talk
a little bit about the CI setup.

So you guys, I assume, I don't know where your code is stored, but how are you setting it up so that your XCFramework is built and always tested and things like that?

Mark Schwieterman (guest): Sure.

We use GitHub and GitHub Actions, and currently we have some co-located runners, basically, and are actually in the process of setting up some images just to make the maintenance of that easier.

And actually, brief tangent, but
I noticed that you have something

named Bushel that seems like
it's potentially in this space.

We've been looking at Tartlet,
but I'm just curious is Bushel

intended to be a CI solution or

Leo Dion (host): No, it's not.

Tartlet is definitely
more of the CI solution.

I definitely want to look at, I've,
yeah, I've gone down that rabbit hole.

But no, Bushel is more to run on
device to test your app on device.

More so than a CI solution.

For now.

For now.

Mark Schwieterman (guest): So as you probably know, there are license restrictions and stuff, so you're only allowed to run like two virtual machines per macOS license or something of that nature. And at least so far from what I've seen, with the hosted runners you can run like two VMs, which works pretty nicely. So our build system, effectively, we trigger through GitHub Actions. But then we also have to build all of our frameworks, and we need to be able to build all of the languages. So it gets quite complicated.

We have some stuff that runs for
Android on other hosted runners.

And then for iOS stuff,
we run the core build.

And so basically we build the core library, which is both WebRTC and mediasoup, C++ and C languages.

Then we have the daily core library, which is built in Rust. That's the things that we add on top of WebRTC, basically, to provide additional features.

And then obviously some of these customizations, like the adaptive HEVC that we were talking about earlier, for that we actually had to modify WebRTC itself, so we have our own custom build of that.

And then to actually call into Rust code from Swift directly, you can't. So we use FFI and bindgen, basically. So we generate C bindings to our Rust code and then call that from Swift. Yeah, so there's the...
Leo Dion (host): Yeah, what,
I'm not familiar with FFI.

You want to explain a
little bit how that works?

Mark Schwieterman (guest): So
it's foreign function interface.

If you've ever used a dynamic language
like Ruby, you've probably seen this

where like you install certain gems
and it'll start compiling things.

And so that's the idea.

And so basically it works, it's similar in concept to how Objective-C message sending works.

Leo Dion (host): A little bit.

Yeah.

Mark Schwieterman (guest): So you
have the self pointer and then like

a selector which kind of defines
the method that you're going to call

and then some number of arguments.

And so the way this works is you basically generate C bindings into, in this case, Rust code, and then to invoke that, you have a pointer that you pass in, which is the thing that you're actually going to invoke the method on, and then you additionally pass along some arguments in some other format. And so we have a serialization format, basically, where we have a pointer, and then we have a payload that more or less has all the stuff needed to make the actual method invocation.

So in CI, all that has to be built from scratch and then cleaned, and then at the end of it packaged up into a framework. So it can get pretty complicated.
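
To make the pattern concrete, here is a small sketch of what calling into Rust through generated C bindings can look like from Swift. The function and type names are hypothetical stand-ins (the real daily-core bindings differ); the point is the opaque "self" pointer plus a serialized payload describing the call:

```swift
import Foundation

// Hypothetical C declarations, as a generated header (e.g. from cbindgen)
// might expose them to Swift through a module map:
//
//   typedef struct CoreClient CoreClient;          // opaque handle to a Rust object
//   CoreClient *core_client_new(void);
//   void core_client_invoke(CoreClient *client, const uint8_t *payload, size_t len);
//   void core_client_free(CoreClient *client);

struct JoinRequest: Encodable {
    let method = "join"     // which Rust-side method to invoke
    let url: String         // its arguments
}

func join(callURL: String) throws {
    let client = core_client_new()           // the pointer you pass back in on every call
    defer { core_client_free(client) }

    // Serialize the method name and arguments into a payload the Rust side decodes.
    let payload = try JSONEncoder().encode(JoinRequest(url: callURL))
    payload.withUnsafeBytes { raw in
        let bytes = raw.bindMemory(to: UInt8.self)
        core_client_invoke(client, bytes.baseAddress, bytes.count)
    }
}
```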

Leo Dion (host): Yeah.

Yeah.

Yeah.

Yeah.

Okay, let's talk a little bit
more about the SDK and the iOS

SDK as far as what it provides.

So I deep dived into it.

It's really great.

Like you've got UIKit, everything's UIKit and easy to plug in. Where are things as far as SwiftUI is concerned?

Mark Schwieterman (guest): So we do support SwiftUI. We have a demo app on GitHub, an iOS starter kit, and that is written in SwiftUI. There aren't really native SwiftUI views there. So have you had to use UIViewRepresentable or UIViewControllerRepresentable and all this?

Leo Dion (host): Yeah.

Yep.

Mark Schwieterman (guest): So that's basically what we do currently: we package up some of the WebRTC related views inside of a SwiftUI view and then expose it that way.
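
A minimal sketch of that wrapping technique, with a plain UIView standing in for the SDK's UIKit video view (the real type name differs):

```swift
import SwiftUI
import UIKit

// Stand-in for an SDK-provided UIKit view that renders a video track.
final class VideoRenderingView: UIView {}

// Bridge the UIKit view into SwiftUI with UIViewRepresentable.
struct VideoView: UIViewRepresentable {
    let participantID: String   // hypothetical: whatever the SDK needs to bind a track

    func makeUIView(context: Context) -> VideoRenderingView {
        let view = VideoRenderingView()
        view.backgroundColor = .black
        // Here the SDK's video track for `participantID` would be attached.
        return view
    }

    func updateUIView(_ uiView: VideoRenderingView, context: Context) {
        // Re-bind the track if the participant changes.
    }
}
```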

Leo Dion (host): What are some of the concerns folks should have if they're going to start plugging in this SDK? What are some things they should know off the bat that they need to be aware of?

Mark Schwieterman (guest): For the
most part, things do just work.

It is a little bit different.

I would say the number one thing with SwiftUI, and I'm sure you're familiar with this, is you need to be conscious of changing state in a way that can cause the views to have to be re-rendered.

Leo Dion (host): Okay.

Mark Schwieterman (guest): So what we recommend: basically our SDK is very robust and we provide a lot of functionality to do different things, and it's often helpful to have your own domain layer, so you can transform things between the fuller model and yours. At a high level we have the concept of a call that you would then join or leave, and then you have participants in the call, and that's me, you, whomever. You don't want to be redrawing your screen if someone who's not even going to talk unmutes something or mutes it again, because you're not actually going to see that. And you want to make sure that you have things structured in a way that is going to be effective for you, and that also gives you a seam to simplify things. And then, have you used the SwiftUI Preview Canvas much?

Leo Dion (host): Yes.

Yep.

Mark Schwieterman (guest): Okay,
that is probably hands down

my favorite feature, period.

Leo Dion (host): Yeah

Mark Schwieterman (guest): Have
you been able to use SwiftUI

much in your own work personally?

Leo Dion (host): Yeah, I do
everything in SwiftUI right now.

Mark Schwieterman (guest): Okay.

Some of the companies I've worked at, we've needed to support UIKit and legacy apps, and haven't really been able to use SwiftUI there much. I've gotten much deeper into it in probably the past year and a half or so. And the productivity benefits of the preview canvas, different devices, not having to build and run different screens and re-navigate, that is just amazing.

So that's why I mentioned creating your own domain. Using our SDK with SwiftUI, it will just work. But there are huge benefits both to thinking through the performance implications of how you model your own domain, and additionally to structuring things so that you can very easily create structs for use in the preview canvas, just to do your iterative UI development there.
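
A minimal sketch of that idea: a tiny struct in your own domain carrying only what the view needs, so the preview canvas renders without a live call. The type names are made up for illustration:

```swift
import SwiftUI

// A small domain type holding just what the row needs to draw.
struct ParticipantTile: Identifiable {
    let id: UUID
    let name: String
    let isMuted: Bool
}

struct ParticipantRow: View {
    let tile: ParticipantTile

    var body: some View {
        HStack {
            Text(tile.name)
            Spacer()
            Image(systemName: tile.isMuted ? "mic.slash.fill" : "mic.fill")
        }
        .padding(.horizontal)
    }
}

// The preview uses stub data, so no SDK, camera, or network is involved.
struct ParticipantRow_Previews: PreviewProvider {
    static var previews: some View {
        ParticipantRow(tile: ParticipantTile(id: UUID(), name: "Leo", isMuted: true))
    }
}
```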

Leo Dion (host): Yep,
that makes total sense.

Yeah, it reminds me we just did an
episode with Brandon Williams from

Point Free and he was talking about how

Mark Schwieterman (guest): Ah, love TCA.

Leo Dion (host): He keeps stuff separate so that way you don't

have to worry about the video working
when you run the SwiftUI preview

or a simulator, for instance, and
that kind of sounds like what you're

talking about with the domain stuff.

It's exactly the same concept where
you try to keep stuff separated out.

Mark Schwieterman (guest): In fact, I am a huge fan of TCA, wanted to use it at a previous company, but just couldn't due to backward support. I actually built something with Daily's frameworks and TCA as one of the first things that I did, just as a personal fun project. The current starter kit that we have doesn't use TCA, because I didn't want to use any third party dependencies, just to make it more accessible: you already know the stuff you need to know for Apple development, basically. But I would very much like to port that to TCA. It does a great job of solving all of these sorts of problems that SwiftUI doesn't really have an out of the box solution to, basically.

Leo Dion (host): right, yeah, exactly.

You obviously, if you're in UIKit, you're gonna do a lot of delegates and things like that. How's that gone with transitioning to using more publishers or async sequences or Combine or any of that stuff?

Mark Schwieterman (guest): Yes.

Mostly good.

Although we did have a learning experience. So we support, I believe, back to iOS 13. Or it might be 14; I think we may have raised it to 14. But we also want to make sure that customers in any type of business can use our framework, and so depending on the type of company, they may need to still support UIKit. So we do have a delegate, and for UIKit apps, that's probably a good way to go in some cases. Additionally, we also have published properties for some of the state. For example, call state is a published property currently. And for participants, we have a big participant struct and that gets published. That can work well if you're doing an MVVM-style UIKit app, or in SwiftUI.

But the one problem with that is that you can't really create a protocol for something that has a published property on it. So if you're wanting to provide... I personally would love to provide a first party test double that people can use, just so they don't have to write their own. With published properties, you can't do that. One change that we plan on making soon is basically switching from published properties to explicit publishers.
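
A minimal sketch of the constraint being described, with hypothetical names: a @Published property can't be a protocol requirement, while an explicit publisher can, which is what makes a first party test double practical:

```swift
import Combine

enum CallState { case idle, joining, joined, left }

// This does NOT compile: property wrappers such as @Published
// cannot be attached to protocol requirements.
// protocol CallClientProtocol {
//     @Published var callState: CallState { get }
// }

// An explicit publisher can be a protocol requirement...
protocol CallClientProtocol {
    var callStatePublisher: AnyPublisher<CallState, Never> { get }
}

// ...which makes a stand-in for tests trivial to write.
final class StubCallClient: CallClientProtocol {
    private let subject = CurrentValueSubject<CallState, Never>(.idle)
    var callStatePublisher: AnyPublisher<CallState, Never> {
        subject.eraseToAnyPublisher()
    }
    func simulate(_ state: CallState) { subject.send(state) }
}
```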

Leo Dion (host): Oh, interesting.

Okay.

Okay.

Yeah, what I've ended up doing is just putting a protocol on the thing inside the observed object, and then having the observed object just be whatever class it's gonna be, with the protocol inside it. And then the protocol provides whatever you want to provide to the published properties that way.

But, yeah.

Mark Schwieterman (guest): Oh, like the properties of the object or value that would be published.

Leo Dion (host): Yeah, and then have the protocol be a different implementation for the simulator or whatever. But I've heard that problem a lot from a lot of people, where they're like, I want to use a published property and... well, we'll get into the new stuff, but yeah.
Mark Schwieterman (guest): And that is really interesting, because that's all going to be changing with iOS 17. And I personally... do you think Combine is here to stay? Because it seems like some of it's going into Swift packages, but I don't know.

Leo Dion (host): Or do you mean like Observation?

Mark Schwieterman (guest): So Combine itself, it seems like it was introduced initially in support of SwiftUI.

Leo Dion (host): right,

Mark Schwieterman (guest): But then the async/await Swift concurrency stuff seems to largely be superseding it. But there's, I believe it's the Swift Async Algorithms package that you can use, and similar stuff to what you would have with Combine publishers is being added to that framework.

Leo Dion (host): So I'll be honest, I haven't deep dived into the Async Algorithms stuff much. I think the Observation stuff has really removed a lot of the Combine code that I've been doing. In Bushel, for instance, I've talked about this: I have one piece of Combine right now in Bushel, and that's because it was something that I wrote a year ago that I'm just too lazy to migrate over, and it works great. I think it's more that the Observation stuff has really made it not necessary. There's things I miss about it for sure, and I like the functional programming aspect, and I think that's what Async Algorithms provides.

But yeah, is it here to stay? Probably not. But it's not going to go away anytime soon, I would imagine.

Mark Schwieterman (guest): I'm very conflicted, because it took me a while to warm up to Combine. It's just a very different model, and I feel like as soon as I started to get my head around it, Swift concurrency came around. Have you done much with async sequences and...

Leo Dion (host): Yeah, I have not done anything with async sequences. That's my blind spot: async sequences and how that replaces Combine. I'm much more comfortable with async/await when it comes to just the simple, oh, this is an async call to a network and you get back something. But the async sequence stuff, I'll be honest, I had to do something with StoreKit recently and it was like, okay, I guess this is a for loop? But yeah, it's weird.

Mark Schwieterman (guest): Yeah, and just the sequencing, and there's stuff where you basically have to use a task group if you don't want things to serialize, and there's just little things that at least aren't always immediately intuitive. I think it's really interesting for SDKs though, because how do you serve a bunch of customers that have a variety of different needs?

Leo Dion (host): Yeah, you're screwed, honestly, in a lot of ways. Because it's like, oh, if you have to serve somebody on iOS 14, and I still have some apps like that, it's yeah, good luck trying to do any of that stuff.

Mark Schwieterman (guest): Where I've settled is, we have, and we'll continue to have, a delegate. I think we're going to replace the published properties with publishers, and that way people that need them can use them, and then probably add in async streams at some point, but you can also get those off of publishers. So they're interchangeable, but not exactly.

Leo Dion (host): Yeah, you gotta serve everybody in every space. I was gonna ask, how does Observation look to you as an alternative to published properties? Cause I've gone all in on Observation with...

Mark Schwieterman (guest): Actually, yeah, so I have not used Observation a ton. I am very excited about it for apps, because what I've seen, at least with SwiftUI, is it's very non-intuitive that changing something non-visible could cause your entire view tree to re-render. And so I think that's gonna basically just solve a lot of that. But I haven't used it enough. Can you actually use Observable to provide a protocol, or an interface that you want someone else to use? Have you tried anything like that yet?

Leo Dion (host): No, I haven't. I'm so stuck on using it the way observed objects work that I haven't put my observed objects, or my, oh gosh, I knew I was going to say that, my Observation stuff in a protocol. I still do it the old way, where I have a protocol inside the observed object. No, I haven't. You probably could? I don't know. I would think it would break the macro, but I could totally be wrong on that. I know I've had...

Mark Schwieterman (guest): The whole macro thing, that just seems...

Leo Dion (host): ...issues with SwiftData and trying to use protocols in that space. Macros are interesting, I love them, but you gotta be careful, because there's stuff you can do with macros that'll break them, and you don't even know until you deep dive that, oh, this modifies the code this way, that's why it doesn't work. So yeah.

It'd be interesting to see
if you can go that route.

Mark Schwieterman (guest): That expansion feature is just magical to me. Because property wrappers are just opaque, and we don't necessarily know how these things work, and just being able to see the actual code...

Leo Dion (host): Yeah.

Yeah.

Yeah.

I think the biggest advantage of macros is just that now we can see how some of that stuff behind the scenes works, and we have access to it. Like you said, property wrappers or Codable or any of that stuff that was always in the compiler, where we couldn't understand what was going on. Now we can do that ourselves, which is awesome.

Mark Schwieterman (guest): So I don't know that we'll be adopting Observation, at least in the API. We'll probably do that publisher change that I mentioned first, but my plan is basically to update our starter kit app to...
Leo Dion (host): Starter kit.

That's what I was going to say.

Yeah.

Yeah.

Yeah.

Cause

Mark Schwieterman (guest): yeah.

Leo Dion (host): I don't know how to do this with Observation. Yeah. Okay.

Mark Schwieterman (guest): Exactly.

Yeah, I was hoping to do a
beta branch, but did not have

time to actually do that.

But now that iOS 17 is about to drop, it's about time.

So I'll probably be doing that soon.

And then I always like to try
something first and then take

what works and pull it up.

And so probably just use it directly
in the app and then figure out okay,

can we pull this back into the SDK
itself in a way that's helpful?

Or maybe the app layer is the appropriate place to use that, right?

Leo Dion (host): Were there any other APIs that were introduced this year that you're like, this is going to be awesome for us at Daily?

Mark Schwieterman (guest): Yes.

Yes.

Some of which we can't actually use
yet, which is currently frustrating,

but I think I found something.

One thing that's really cool is, have
you used continuity camera at all?

Leo Dion (host): Not yet.

No

I know what it is.

It's like where you can use the
iPhone as a camera on a Mac or a TV.

Yeah.

Mark Schwieterman (guest): Exactly, yes, yeah. There's actually a third party app named Camo or something which sort of provided a way to do this.

Leo Dion (host): Yeah.

Mark Schwieterman (guest): And Continuity Camera is kind of Apple's first party version of that. They're now bringing it to tvOS, which for video conferencing stuff is really cool, because you could just sit back and have a conference on your Apple TV and then use your phone as the camera for that. And that totally works, something we can maybe come back to. Although currently it's a bit more difficult to actually get our framework to work on tvOS due to the way it's built and some of the Rust interactions.

Leo Dion (host): Oh, gotcha.

Okay.

Mark Schwieterman (guest): but
I think we have an approach

that will work with that.

So that is going to be really
cool once we get that working.

There's also the addition of, I think it's called UVC. It's like USB peripheral device support for iPad. This is something else I want to explore, but in some cases I think it's helpful to be able to have a video conference on an iPad and then potentially have a camera that is not your iPad, similar to how you can do Continuity Camera, but a USB version of that.

There's new improved voice processing.

So for any of these things where,
you're worried about background

noise and stuff, that's just
going to automatically get better.

And then just as a fun thing, iOS 17 is adding gesture based reactions, where it'll just recognize gestures that you do on screen and then do a visual effect.

Have you played with that at all?

Leo Dion (host): No, but I was just thinking about the Apple event and how we can do the double tap, like all these little things that we've gotten from, what do you call it, the Vision Pro. Like Apple's been working on this stuff on the side and they're just like, here you go, here's a little something for the iPhone you guys can use. Yeah.

Mark Schwieterman (guest): So, funny thing on that: the watch double tap, I saw that and I was in awe, and I looked at that and immediately, in my head, I could see the Vision Pro thing when they showed it off, like the pinch gestures, right? It turns out that was an existing accessibility gesture on watchOS already.

Leo Dion (host): I just
released the podcast episode.

So we just talked about that.

Yeah.

Mark Schwieterman (guest): Okay, so you got into that on the episode. Okay, I will check that out then for the full backstory. I thought that was fascinating, because I was like, oh, this is cool.

Leo Dion (host): In the episode it was like, did they do all this with AssistiveTouch? Yeah. No, totally. So... oh, I was gonna ask...

Mark Schwieterman (guest): So good.

Leo Dion (host): Okay, they mentioned this, they had a video about it. I don't know if you saw this, and I don't know if you do anything with it, but did you see the rotating stand thing that they added for the iPhone?

Mark Schwieterman (guest): Oh, nightstand or stand something,

Leo Dion (host): They had a whole WWDC video about this. Okay, I'm going to look it up, sorry. But there's a new API, so you can plug into a USB stand and...

Mark Schwieterman (guest): Oh yes.

Yeah, I did actually see that.

And it would let you, like remote
control it or something like that?

Leo Dion (host): Yeah.

What was that called?

Mark Schwieterman (guest): I know the video, I think. Yeah, so I've used video conferencing stuff before; I worked at one company remotely and they would have cameras in rooms and you could control them. So if you're trying to talk to someone, you could zoom in or zoom out. And yeah, whatever that is, I believe they're now adding support for that. So I'm imagining a little iPhone-driven robot or something that can roam around and that you can control. That would just be something super fun.

Leo Dion (host): I'm trying to look up which one it was. Oh, here it is: DockKit, integrate with motorized iPhone stands.

Mark Schwieterman (guest): That's it.

Leo Dion (host): Yeah.

Yeah.

Yeah.

So yeah, I thought that was interesting.

I don't think we've seen any hardware for it yet, but yeah.

Mark Schwieterman (guest): None that I saw. And yeah, I don't know, I guess I'd like to see if I could get my hands on one of those devices just to play with it. Because, if you think about it, a lot of this AI image recognition stuff is getting really good, and in some cases you don't even need to use third party things for that. So with iOS, you know how the iPad has Center Stage?

Leo Dion (host): Yeah,

Mark Schwieterman (guest): And on the Apple Studio Display as well. Basically you can actually get the region of interest back from that, so it'll give you the rectangle of where faces are, basically. Not so long ago you couldn't do that yourself; you had to write your own code. But just being able to build an app or something that uses that, and then potentially use DockKit or something to be like, oh, we'll reorient towards a person or something interesting that we're trying to keep an eye on.

Leo Dion (host): Yeah, exactly.

Okay, let's step back, we'll
stop talking about the fun

stuff and we'll talk about the

Mark Schwieterman (guest): All right.

Leo Dion (host): So a lot of this, I assume you have to run a lot of this through AVFoundation. How was that experience? What were some curveballs that you got that were just like, wow, this is not what I thought it was gonna be? I've done a little bit with AVFoundation, but not a lot. I want to, because it's super cool. But yeah, give me the lowdown on what that experience is like.

Mark Schwieterman (guest): So AVFoundation is very capable. I have played with it in the past. I once actually made a podcast app for basically dropping markers, like audio markers where you might want to edit.

Leo Dion (host): chapters.

Oh, okay.

Yeah.

Yeah.

Mark Schwieterman (guest): Yeah.

And you can do all this stuff. I guess that was more of a Core Audio feature, but I would consider that part of that umbrella. But you can do virtually anything with it, from building apps that actually process things offline to doing things live.

There's a lot going on there. Basically, with WebRTC specifically, a lot of that is handled for you, but you still have to interact with it, which is strange. So the two main classes that you use are AVAudioSession, which is a shared global singleton, and then other apps can also make changes to that. So that is one thing that's tricky.

So when you want to do things like choosing your audio device, you can use the system controls, I think it's called MPVolumeView, as one example; you don't necessarily have to write code to do that. But in some cases you actually do want to. And so as an example, if you want to override to always use the speaker instead of an attached device or something, then you have to get in there and change that session.

So we have some code that provides some management of that, but in some cases someone building an app might want to manage that directly too, and you still have to deal with that. And you get notifications on a route change, which means any input or output device has changed, or if the session is interrupted; imagine you get a phone call or something, the kind of things where normally the audio might be ducked or something like that, just to be able to hear it. And so dealing with that is one point of complexity.
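
For reference, a minimal sketch of the AVAudioSession pieces mentioned here: overriding output to the speaker and observing route changes. This is the plain Apple API, separate from whatever management the SDK already does on your behalf:

```swift
import AVFoundation

// Force audio out of the built-in speaker instead of an attached device.
func routeToSpeaker() throws {
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.playAndRecord, mode: .videoChat, options: [.allowBluetooth])
    try session.overrideOutputAudioPort(.speaker)
    try session.setActive(true)
}

// Observe route changes (headphones plugged in, Bluetooth device connected, and so on).
let routeObserver = NotificationCenter.default.addObserver(
    forName: AVAudioSession.routeChangeNotification,
    object: nil,
    queue: .main
) { note in
    if let value = note.userInfo?[AVAudioSessionRouteChangeReasonKey] as? UInt,
       let reason = AVAudioSession.RouteChangeReason(rawValue: value) {
        print("Audio route changed, reason:", reason.rawValue)
    }
}
```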

And then on the video side of things, which is probably what I should have started with, you have AVCaptureDevice, and that's your interface into all of the cameras that you have. And WebRTC itself supports that directly, so you can choose between the cameras that you want to use. So as one example, you could flip between the camera facing you, the user, or the camera on the other side facing the environment.

And you need to switch between those. And working towards adding support for Continuity Camera on tvOS, that's also where you need to use those APIs slightly differently, because you have to deal with reconnection and...

Leo Dion (host): Ah

Mark Schwieterman (guest):
disappearing on you.
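
A minimal sketch of choosing between the front (user-facing) and back (environment-facing) cameras with AVCaptureDevice, the interface referred to above; the SDK wraps this kind of switching for you:

```swift
import AVFoundation

// Find a camera for the requested position: .front faces the user,
// .back faces the environment.
func camera(at position: AVCaptureDevice.Position) -> AVCaptureDevice? {
    AVCaptureDevice.default(.builtInWideAngleCamera, for: .video, position: position)
}

// Build a capture session around whichever camera was selected.
func makeCaptureSession(using device: AVCaptureDevice) throws -> AVCaptureSession {
    let session = AVCaptureSession()
    let input = try AVCaptureDeviceInput(device: device)
    if session.canAddInput(input) {
        session.addInput(input)
    }
    return session
}

// Flipping cameras means swapping the front input for the back one (or vice versa).
// let front = camera(at: .front)
// let back  = camera(at: .back)
```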

Leo Dion (host): okay.

Mark Schwieterman (guest): The biggest curveball gotcha that I got, which was actually a learning moment for me, has to do with video rotation, basically. So you can just use WebRTC out of the box and it handles device rotation and all that. Question: most of the apps you've worked on, have they supported device rotation?

Leo Dion (host): Probably not, honestly.

Mark Schwieterman (guest): Okay.

Okay.

On iPad in particular, that's something
that you need to deal with just because

Leo Dion (host): Yeah iPad, but
that was like over 10 years ago.

So yeah,

Mark Schwieterman (guest): Yeah, just through coincidence of the things that I've worked on, most of them have supported rotation, so I'm just used to that. And anyway, so this is the gotcha: with WebRTC, all of that works fine for local video if you use the video views that come with it and everything rotates. But if you have an app that doesn't rotate, there's actually an issue, because it has to rotate the video that it's sending so that it's still the right direction on the receiver's screen. But on the local device, if the device doesn't rotate with it, it will rotate it anyways, and then it ends up being offset. So to work around that, there's AVCaptureVideoPreviewLayer, and that one you can actually use to preview the way that you would expect and then let everything else rotate. And this is actually something that I'll be integrating into our SDK shortly to handle that better.
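
For reference, a minimal sketch of the workaround just described: a local self-view backed by AVCaptureVideoPreviewLayer, which draws the capture session's output in the orientation you expect regardless of how the outgoing frames are rotated:

```swift
import AVFoundation
import UIKit

// A view whose backing layer is an AVCaptureVideoPreviewLayer,
// used purely for the local camera preview.
final class LocalPreviewView: UIView {
    override class var layerClass: AnyClass { AVCaptureVideoPreviewLayer.self }

    var previewLayer: AVCaptureVideoPreviewLayer {
        layer as! AVCaptureVideoPreviewLayer
    }

    func attach(session: AVCaptureSession) {
        previewLayer.session = session
        previewLayer.videoGravity = .resizeAspectFill   // fill the view, cropping as needed
    }
}
```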

Leo Dion (host): Cool.

As far as networking stuff, like actually sending the video out, how does that work?

Mark Schwieterman (guest): We
do not have to write code to

do that really much at all.

WebRTC pretty much

Leo Dion (host): Does that work?

Okay.

Okay.

Mark Schwieterman (guest): Yeah, you basically just give it the device, and the video frames get passed to WebRTC, and then based on the encodings that you have configured, it'll just be streamed out. And then on the receiving side, more or less of things will be used, but frames can be dropped. WebRTC itself is very resilient, so it can handle packet loss. And then there's a server component called an SFU that also participates in these interactions.

Leo Dion (host): All right,
let's jump into future plans.

What do you have as far as support for other platforms? Obviously not watchOS, sorry. I guess unless you're going to show video and kill the battery, I don't think you're going to support watchOS anytime soon. But we talked about tvOS; did you want to expand on that at all?

Mark Schwieterman (guest): Yeah.

So this is where things
can get interesting.

So WebRTC being C++, you can build it for lots of platforms, and Apple's platforms specifically. However, the platforms have different behaviors, and so WebRTC out of the box has its own conditional compilation for macOS and UIKit.

Leo Dion (host): Okay.

Mark Schwieterman (guest): So as you probably know, the coordinate spaces are flipped, as one example; there's a variety of things that are just a little bit different.

Similar to that, you potentially need to support other platforms too, and with Rust in the mix, we also need to have platform support there. And so, the way that Rust works is you build on a certain platform, and then you can also build for certain targets, but they have to be supported by Rust itself, and they have different tiers of support.

The first tier is it just works, but it doesn't support the standard library, which quite often you need if you're doing anything non-trivial. Tier two is then the standard library should work, but it's not officially supported.

So the current state of things there is basically that Rust does now have Tier 2 support for tvOS, and so it is possible to build everything with the standard toolchain and have that work. However, it's not actually shipped that way, so you have to set up your own compiler, ha, more or less a nightly build, to get it working.

visionOS, which I am super excited about, is... it's not yet supported by Rust. However, have you heard of marzipanify? I think that was maybe Steve Troughton-Smith's project, or there's a few of these.

Leo Dion (host): Marzipan, the codename for Catalyst a hundred years ago. Yeah.

Mark Schwieterman (guest): Yeah. As an example, you have an arm64 library that you would run on a device, and previously you couldn't run that on a Mac, until Macs were arm64 too. But then you could, but you still couldn't, because it wasn't actually supported. But there is actually a way to go in and rewrite the Mach-O slices to basically claim that it supports a platform that it doesn't. And I've been trying to get that working, but I think that might work, because for Rust itself the support is really quite similar, and so there's not something functionally different on a lot of these different chips on different device types. It's just more how things are identified.

Leo Dion (host): Have you
applied for a developer lab?

Mark Schwieterman (guest): No,

Leo Dion (host): Okay.

I just thought I'd

Mark Schwieterman (guest): I really want to, but I feel like I need an app for that and have not... you mean for visionOS?

Leo Dion (host): yeah.

Mark Schwieterman (guest): Did you?

Leo Dion (host): Yeah.

No, no way.

No way.

No, I'm not flying to California.

Mark Schwieterman (guest): No.

Leo Dion (host): Sorry.

Mark Schwieterman (guest): I
think there's one in New York.

Leo Dion (host): I hope so.

It'd be nice.

Yeah.

Yeah.

Mark Schwieterman (guest): Yeah, in an ideal world I want to try to get this rewriting-the-slices approach working for tvOS and visionOS.

As soon as I do that, I'm going to try
to port our app and then I don't know, I

guess it's maybe too late at this point.

But I would definitely like to take
advantage of that if I can find a

good scenario in which it makes sense.

Leo Dion (host): Anything else you want to mention regarding Daily or future plans before we close out?

Mark Schwieterman (guest): Just, if you're interested in building apps, check us out. We have tons of resources online, lots of educational stuff about WebRTC generally. And we're very interested in helping customers build whatever they want to build.

Oh, also, I guess we're just about to release this Daily Python SDK, which is really cool. You can just drop this into the various kinds of Jupyter notebooks and things of that nature that use Python. You just import it and you have working video stuff that you can run right in those things immediately, and you can use it to process video and audio to do transcription and a variety of other things.

So really cool and worth
checking out, I think.

Leo Dion (host): Yeah.

If you're looking for really good documentation, definitely take a look at it. You guys at Daily have done really impressive stuff. Great starter kit that gets you going.

So definitely take a look at that.

Mark, thank you so much
for coming on the show.

I really appreciate it.

Mark Schwieterman (guest): All right.

Thanks for your time.

And I had a lot of fun Leo.

Thank you.

Leo Dion (host): Where can people find you and Daily online?

Mark Schwieterman (guest): Daily is daily.co, and for me, I guess if you can spell Schwieterman, just search for Marc Schwieterman and I'm around. I'm marcisme on GitHub, M A R C I S M E.

Leo Dion (host): Yep.

And we'll put links to that in the show notes, in case people don't know how to spell Schwieterman. People can find me on X, I hate that, at leogdion. I'm on Mastodon at leogdion at c.im. My company is BrightDigit; take a look there. There's some new articles coming out about humane code and CI and all sorts of stuff.

So thank you so much for
joining me for this episode.

If you're watching this on YouTube, please like and subscribe.

If you are listening to this on a podcast player, I would love a review. If there's something you want to talk about, or there's something you want me to find a guest to talk about, or maybe me to talk about, let me know.

I'd love to hear back.

Thank you again.

And I look forward to
talking to you again.

Bye everyone.

