Attending:
Eric Burger, eburger@snowshore.com
Robert Patzer, rpatzer1@email.mot.com
Gonzalo Camarillo, gonzalo.camarillo@ericsson.com
Henning Schulzrinne, hgs@cs.columbia.edu
Arnoud van Wijk, arnoud.van.wijk@eln.ericsson.se
Dean Willis, dean.willis@softarmor.com
Transcript:
NEW SPEAKER: We only
have an hour.
NEW SPEAKER: I have
another meeting at six so we have to get moving.
NEW SPEAKER: I'm not
sure I'm leading this, but, Dean, did you want to?
NEW SPEAKER: Please
go ahead.
NEW SPEAKER: You have
the proposals in the area.
NEW SPEAKER: So we
now have, as you well.
NEW SPEAKER: Okay.
So we have a requirements document that is, I believe, now with the IESG, or where
is it,
NEW SPEAKER: RFC Editor.
NEW SPEAKER: I went
through it again today, and I believe the next step, among other organizational
issues which you would like to discuss, is to ask what the concrete technical
problems are, now that we have generic requirements, that we might actually
be able to do something about.
I have been able, by
going through the document, to identify two such requirements, and there may
be others which I missed. Two generic requirements in there.
And several which fall in that category. One is the transcoding
requirement. Namely, the ability to insert a third-party service
which allows you to transcode various media into each other, primarily
for deaf and hard of hearing needs; this would be
text to speech and speech to text. And for blind or visually
impaired users that might be, like busy talk or something like that.
A system to transcode into
sign language.
So, they're both transcoding
services. They are to be able to transcode, for example,
into a classical TTY type of service, I imagine; I'm not sure if this is
mentioned in the draft.
The second one, which I
think is actually somewhat easier to deal with, but is a generic problem
that we just discussed in a different vein today, is the notion of a profile
for a user, which makes it easy for a user to move to a different device
and still have, for example, the ability to call for a transcoding service
automatically, without having to enter lots of magic things whenever
they walk up to a telephone.
I believe we can dispense
with that one relatively quickly, in the sense that it is just a special
case of the generic issue of profile mobility, and so this may be addressed
in one of the mechanisms that we talked about, profile mobility, for example.
I believe that the bind proposal I made today may actually help solve the
problem. But obviously, that did not find unanimous support, so there may
be other solutions.
So the hard one, I believe,
that we need to at least look at in more detail, is the transcoding issue.
Are there any others I missed in going through the draft? From your collective
recollection, are there other technical requirements, like some privacy requirements
and imaging requirements? To me, they struck me as being special cases
of things that we already should have in any event. Really more like
BCP-type good operational practice, rather than anything I could extract
a requirement for possible SIP-related work out of.
NEW SPEAKER: Only
one, user preferences; is that included with the profile?
NEW SPEAKER: That
to me, is at least roughly equivalent to user preference, yes.
NEW SPEAKER: Okay.
NEW SPEAKER: Okay.
Any other items which might be, I got, I think this was sent to the list,
to one of the SIP lists as well, from 3GPP; they also had transcoding
as their primary requirement. Who sent that? What?
NEW SPEAKER: Who sent
it, Drage?
NEW SPEAKER: Keith.
NEW SPEAKER: Keith
Drage. Yes.
NEW SPEAKER: Yes.
Okay. So, is that, I mean, should we try to look at this problem in
more detail? Do we want to stop basically and say, okay, we've identified
possible things, how do we go forward, what do you want to do?
NEW SPEAKER: Well,
there's one kind of development going on that worries me, and also other
people in the deaf community. The FCC has stated that, in America,
transcoding must happen. They are using streaming text. In fact,
almost all D T MS, like, TTY sort of, and the problem is that they require
this as the only solution, using voice channels or simulated voice channels,
and it would stop audio service, of course. Transcoding to TTY is
the only possibility. They don't have a sign language interpreter coming
in between, or even like what she's doing here, that video conference, that
is kind of closed caption, subtitles. But these stopped almost.
NEW SPEAKER: Okay.
I'm sorry if this came across wrong. I should have noted others;
TTY to TTY, just one, is probably the worst example. Okay.
So I just happened to put that up by random hand movement.
So, this could be, it could
be text to and from speech, could be
NEW SPEAKER: Sign
interpreter.
NEW SPEAKER: Yes.
Let me just put sign language, without calling out any number of other
things. But my goal at least here, and I think this would
be the SIP approach, is that we do not want to explicitly enumerate all the
translations which are possible, because there may well be others,
for people with different disabilities, that may need other translations.
So, the point here is what
I believe I've abstracted this into: a protocol context. I want to be
able to transition from an initial contact at the signaling level, from
A to B, where some subset of the data path could be multimedia, audio
or video or both, and, where there's a need for it, to take one or more
of the media streams and reroute them through a translator
in one or both directions.
I believe that is actually
plausibly solvable, and we need to identify whether it is plausibly solved
by the third-party call control mechanisms that we have. Plausible
candidate mechanisms for this would be like 3pcc, call transfer.
All right. We could argue this is effectively transferring
the call to something, and so on. That would be one other model.
What I think is somewhat
different about this model, that it might be in a 3pcc type of context, is
the notion that you have a combination of translated media, with
a transcoding, and of straight-through media which you have no intention
at all of passing through the transcoder, because it knows nothing, has no
idea what to do with it.
Let's say, again, in the
case of text-to-speech type of translations, presumably you would have, let's
say, speech on this side, text on this side, and vice versa. And the
video part is completely unaffected by that.
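(Editor's sketch, not part of the discussion: what A's rewired session description might look like once the audio has been pointed at a transcoder T while the video runs directly to B. The hostnames are invented; the mechanism assumed here is SDP's per-media c= connection line.)

```
v=0
o=alice 2890844526 2890844526 IN IP4 a.example.com
s=-
t=0 0
m=audio 49170 RTP/AVP 0
c=IN IP4 t.example.com
m=video 51372 RTP/AVP 31
c=IN IP4 b.example.com
```

Here the audio goes to T for translation, while the video m-line keeps its direct path, untouched by the transcoder.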
NEW SPEAKER: In fact,
we implemented a prototype of everything you are saying.
NEW SPEAKER: That
is doable.
NEW SPEAKER: Yes.
NEW SPEAKER: So the
question then becomes, if that is doable --
NEW SPEAKER: It was
working.
NEW SPEAKER: It would
work. Because you just have to reset the IP address of that channel,
the speech channel, to T, and you leave the others alone, basically. So did
you actually do the work?
NEW SPEAKER: In fact,
I have a call flow schema for inserting several transcoders in the path.
I'm going to put my computer up anyway. So, we even did it with many
forks as well, so we did all kinds of flows.
NEW SPEAKER: Including
the --
NEW SPEAKER: Yes,
including one is directly and the other is -- they simply, what they are
doing is fetching an IP address from T.
NEW SPEAKER: And is
T, are you basically, since you start with a dual call, they don't
know they need that ahead of time. Because they could both have
similar capabilities without the need of a transcoder. You still
need the ability to tell T what's going on. So are you doing that?
How is the signaling? How is T involved in the signaling?
NEW SPEAKER: The thing
is, typically, if you're doing it back to back, it's very easy.
But when you want A to invoke it, typically you will send
an INVITE or whatever, and it will fail because you don't have matching capabilities,
and then you just re-INVITE.
NEW SPEAKER: We, I
think, the question is how does T know what to do?
NEW SPEAKER: Because
you are actually, what you do from A is fetching, really this point.
NEW SPEAKER: Go slow,
please. Don't talk too fast, please.
NEW SPEAKER: You need
this point and this point. And with third-party call
control, with the SDPs, you have both, and you can shuffle them and let these
guys know this point and this point.
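(Editor's sketch, not from the meeting: the SDP "shuffling" a 3pcc controller might do, pointing one media stream at a transcoder while leaving the others alone. The SDP handling is deliberately naive, and the addresses are invented; a real controller would carry these descriptions in INVITEs and re-INVITEs, as discussed.)

```python
# Sketch of per-stream rerouting: rewrite the c= line of one media
# section so that stream flows through a transcoder T, leaving every
# other stream untouched.  All addresses are illustrative assumptions.

def reroute_stream(sdp: str, media: str, new_addr: str) -> str:
    """Point the c= line of the named media section at new_addr."""
    out, in_section = [], False
    for line in sdp.splitlines():
        if line.startswith("m="):
            # A new media section starts; check whether it is the one we want.
            in_section = line[2:].split()[0] == media
        if in_section and line.startswith("c="):
            line = "c=IN IP4 " + new_addr
        out.append(line)
    return "\n".join(out)

offer = "\n".join([
    "v=0",
    "m=audio 49170 RTP/AVP 0",
    "c=IN IP4 a.example.com",
    "m=video 51372 RTP/AVP 31",
    "c=IN IP4 a.example.com",
])

# Audio now flows through the transcoder T; video stays direct.
rewired = reroute_stream(offer, "audio", "t.example.com")
```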
NEW SPEAKER: Because
we also did this. And we actually had two ways of doing it. One
is that T looks like a conference bridge.
NEW SPEAKER: Yes.
NEW SPEAKER: And the
capabilities of A are speech, and the capabilities of B are text, and the conference
bridge magically knows how to transcode between the two. The other
thing we did, though, was explicit transcoding, where, again, typically
a 3pcc: basically T gets an INVITE, and in the INVITE, and this
will shock you, we use the left-hand side, which basically says, this is a
transcoding call, and parameters come after. So we can do both.
The benefit of that is, you can have asymmetric transcodings, like
you might have someone who wants to get text, but they can still speak.
So you don't need the text back to speech. They're unidirectional
legs.
Even with the flow for the
mixing thing. I believe the mixing thing is really more general.
But Jonathan commented, if it's really text to speech, or whatever, we can
do it with like two media lines.
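(Editor's sketch of the "two media lines" idea for an asymmetric leg: a user who reads text but speaks their own replies. It assumes the RTP text payload format t140; ports and payload types are illustrative, and directions are marked with the standard sendonly/recvonly attributes.)

```
m=audio 49170 RTP/AVP 0
a=sendonly
m=text 49172 RTP/AVP 98
a=rtpmap:98 t140/1000
a=recvonly
```

Only the incoming direction passes through the transcoder; there is no text-back-to-speech leg.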
NEW SPEAKER: Which
actually, the latest NetAnn document has got that call flow in it.
The asymmetric call flow.
NEW SPEAKER: And the
thing is that you have less signaling; with the conference thing, you need two INVITEs.
NEW SPEAKER: Right.
NEW SPEAKER: Okay.
NEW SPEAKER: Keep
in mind that sometimes some deaf people, like me, if somebody is on a video conference
call, I also like to hear; I still hear something. I hear audio. So
keep in mind, it's not transcoding where the voice goes in and only text comes out
between A and B; the voice is also there, and voice is bidirectional.
NEW SPEAKER: This
is done with our draft. The FID draft.
NEW SPEAKER: So,
NEW SPEAKER: Basically
you will tell A, or in this case B, to send the video, or whatever, to
both places at the same time.
NEW SPEAKER: Okay.
NEW SPEAKER: To T
and A. So you will get the video and the voice directly, and then the speech
through T.
NEW SPEAKER: And then
there's the reality: most devices won't do two streams. That's more
business for us, though.
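(Editor's sketch of the FID idea mentioned above: the same media sent to two destinations at once, using the SDP grouping framework with FID semantics. The addresses are invented, and whether a given terminal can actually emit two copies is exactly the caveat raised here.)

```
a=group:FID 1 2
m=audio 30000 RTP/AVP 0
c=IN IP4 b.example.com
a=mid:1
m=audio 30002 RTP/AVP 0
c=IN IP4 t.example.com
a=mid:2
```

B gets the voice directly, while T receives a copy to translate.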
NEW SPEAKER: If I
may summarize, because we don't necessarily have to go through the call
flows now: unless somebody brings up a requirement which we cannot meet,
what we need to do is create a draft, which is probably going to
be BCP style, which specifically gives the call flows that go
through the three or four permutations of this, in the sense of double flows,
symmetric, asymmetric, this type of thing, and enumerates what we need to
do. And possibly also give advice, in the FID type of scenario, that
for these types of forked media, support of the ability to send two
copies simplifies life, because then you don't need a media server; or we show
a conference bridge scenario, where the conference bridge can do this.
That would, I believe, be very helpful, because that may actually end up being
part of a requirements specification for devices, that you can actually do
that, and we can test that. And then people know what they
should be able to do. We send them call flows that they can test
against. Does that sound reasonable?
NEW SPEAKER: That's
good. One thing, you know, actually, there at the beginning, we talked
about the kinds of transcoding. I'd like to make an assertion and
see if it sticks or falls off the wall: that human interpretation,
so where you call an operator that does the TTY, or relay, or text
to speech or speech to text, or does the signing, that that would actually
be two calls.
NEW SPEAKER: Okay.
NEW SPEAKER: That
they are literally a back-to-back user agent. And that's an assertion.
I could see --
NEW SPEAKER: Could
you --
NEW SPEAKER: I can
see good and bad things.
NEW SPEAKER: Could
you do it like that? That is, T is actually a person?
NEW SPEAKER: Yes,
the difference, and the reason why I was thinking: at first I was thinking
T could be a person. But then you quite often have sort of private
conversations here. Maybe even before the call is set up.
NEW SPEAKER: Okay.
NEW SPEAKER: The down
side is, when we do this, how should we say: if we
use third-party call control, there's information about this dialog that's
preserved, whereas if it's a call from here and a call to there,
it's external. The correlation of the legs.
NEW SPEAKER: Yes.
NEW SPEAKER: So that's
why I'm throwing it in and seeing if it sticks, because that might make it fall
off the wall, or it might be too hard, because this person can do all sorts
of bad things that are not in the SIP spec. And I think that might
simplify things.
NEW SPEAKER: The part
I like about 3pcc is that actually there's no signaling between T and
B. So everything is transparent to the end point. So I guess
NEW SPEAKER: Yes.
I would like to have, even with a back-to-back user agent, you can basically
replicate the model that we use today.
If I know I need assistance
in conversing with my destination, I first call an operator and tell that
operator, in some mechanism, typically by talking to the person, please dial
this number, and the operator stays on the call. That model we can
clearly support.
NEW SPEAKER: That
always works.
NEW SPEAKER: Right.
So maybe what we should do, since, from what I can tell so far,
the effort that we are engaging in is actually more
of an almost educational effort, rather than necessarily a protocol design
effort. And in that environment, it is certainly useful, because it
may be designed for people who are not, of course, experts; we may need to
draft requirements at a more non-technical level, in the sense of FCC-type
requirements.
So, having the operator
model in that call flow spec, even if it is trivial to us, may well be
useful.
NEW SPEAKER: Yes,
make it complete.
NEW SPEAKER: Make
it complete basically.
NEW SPEAKER: Now,
maybe one interesting feature of it, using SIP, is: A calls B.
And B is hearing, and there are two ways that the relay or transcoding can be
activated: by A or by B. So you have to keep in mind that
that also happens, to prevent A and B both activating it. That kind of
scenario.
NEW SPEAKER: From
B --
NEW SPEAKER: It should
be automatic. I don't want the lady to call the operator first; I
just want to place a phone call directly to B, and somehow T will get
involved, depending on user preferences; it will say this T is in there.
NEW SPEAKER: From
B it's very easy, because you have the call coming and you can do all this
kind of stuff.
NEW SPEAKER: Yes.
How, if we were --
NEW SPEAKER: But from
A.
NEW SPEAKER: From
A is the difficult case. But when you are the callee, it's very easy.
NEW SPEAKER: Yes.
NEW SPEAKER: But of
course we can address that in the draft as well.
NEW SPEAKER: So we
need to basically show that. Because you would probably have
to treat it almost more like a conference-type bridge dial-out, such that
you first call T, and then you would ask T to call, but that's again more the
agent part.
NEW SPEAKER: Yes.
NEW SPEAKER: That
doesn't --
NEW SPEAKER: I don't
like that. But actually, you know, netann-02, I think, has got that
flow. Not the T calling out, because, you know, we never called out.
But I think we have both.
NEW SPEAKER: If you
want to have a look at this flow, because it shows how A can speak
to B actually. And it can put a number of
intermediaries in the middle. So this is basically the whole flow, and
we cannot go into detail. And the red lines indicate that, actually,
you can make everything, and then the INVITE goes through B. So if B
wants to do stuff, this is actually done.
NEW SPEAKER: Okay.
NEW SPEAKER: So I
will distribute this call flow. I have it all down already.
NEW SPEAKER: Okay.
So, again, it sounds like there may still be detail, and we need to make sure
we cover all the permutations that you identified in particular, but it
sounds like we are pretty close to covering all the cases.
The question
now is: one proposal, in order to make reasonably quick progress,
would be to simply have a small document which is somewhat smaller than Alan's
call flow document, and separate from it in particular, because,
as came up during one discussion, somebody mentioned the call flow document
may stay a living document essentially forever, and that's not what we want
here. We want closure, and to get it out.
And in particular, because
I want to avoid getting external requests of the form "you shall do this";
then we can just say, yes, we solved this problem, here's the
solution, don't try to impose a solution which may not be necessary.
NEW SPEAKER: We can
regulate ourselves.
NEW SPEAKER: Yes.
We can at least show them that, you know, we don't have to
add a special header or method, or whatever the knee-jerk reaction might well be.
So, Ken, since you already
have the most difficult part, and the two of you already have the most difficult
part, would it be possible that the two of you could simply, I mean, there's
not much text to it. There's really only to enumerate the cases that
we have identified: a 3pcc model, a back-to-back user agent
model, a dual stream model, a conferencing model, an A-initiated
model and a B-initiated model. Those seem to be roughly the ones we talked
about.
NEW SPEAKER: And do
that quickly. And I see no particular, looking towards at least one
corner, the chair corner here. That actually seems like
a SIPPING document.
NEW SPEAKER: Yes.
NEW SPEAKER: Not,
because there's no
NEW SPEAKER: It's
just usage.
NEW SPEAKER: And I
see no particular
NEW SPEAKER: We have
NEW SPEAKER: Process
problem.
NEW SPEAKER: We have
all the tools.
NEW SPEAKER: We have
the tools. There's no P-headers involved. So,
NEW SPEAKER: In fact,
it's informational.
NEW SPEAKER: It's
strictly informational. Okay.
NEW SPEAKER: Okay.
Will you send me what you have.
NEW SPEAKER: I can
elaborate on the draft and then we can bounce it back and forth.
NEW SPEAKER: Were
you taking notes?
NEW SPEAKER: Can we
save those notes?
NEW SPEAKER: Would
it be acceptable to have it, like, in two weeks?
NEW SPEAKER: I'm on
vacation next week.
NEW SPEAKER: I'm in
Japan, actually, so I'm going to Finland the 29th of July. So I will
not be able to do it.
NEW SPEAKER: Let's
make it three weeks. You'll be gone for a week, and you'll take a week
to do the stuff and give it to me, and I'll put it all together.
NEW SPEAKER: My hope
would be, we can basically, since again this should not be hard, in the
sense it should not be long.
NEW SPEAKER: Right.
And we have most of the text already.
NEW SPEAKER: You have
the text; it just needs a framework, you need glue to make it read as opposed to
just being call flows.
All we need at that point
is, I mean, I would hope that, before the next
IETF, we can have this be --
NEW SPEAKER: Oh, yes.
NEW SPEAKER: By far.
NEW SPEAKER: Have
this be in last call. I mean, nothing goes instantaneously.
NEW SPEAKER: P-headers.
NEW SPEAKER: Make
it P-headers, it will be faster.
(Several people talking.)
NEW SPEAKER: We'll
have an RFC number this afternoon.
NEW SPEAKER: Yes,
but that sounds like a plan to me.
NEW SPEAKER: Okay.
So, the second question, the user preferences: there are
two small sub-problems to that. The first sub-problem is the generic
one, which I don't think we want to solve here, which is the access to a profile,
or user preference, by a user, so that the device sets up the outgoing SDP.
What that means, presumably, technically, is that if I get my profile here,
it would identify that I prefer a transcoder if I'm talking to a hearing
person. So, there would be some kind of mechanism which will automatically
set my SDP and set my signaling to do the right thing, because I don't want
to crank all this manually.
So, the question then becomes,
the hard part, which I don't think we can actually solve right now, is that
this imposes more requirements on a user profile. So that would be
kind of the Dan Petrie corner of things. And probably even somewhat
different from that. Because it is not just a list of
parameters, in some sense, unless you just simply say, need-transcoder
equals one. Presumably you want to say more than that. Because,
for example, you might want to specify in that user profile, that I prefer,
which was one of the requirements, that I prefer the transcoding service
located at, I mean, texttospeech.org or something like that. And
so that would be part of the profile. So I think, at this
point, it may be the case, since the profile mechanism
is evolving, hopefully with a little bit more speed than it had in the last
year or two, that we have two parts: namely, we need to make sure
that the profile itself allows the ability to indicate more than simple
numeric parameters, but some notion of prefab SDP or whatever else is needed
for doing that.
Is there anything
more we can do at this point? Can we identify in more concrete
detail what you would need? What you would want to be able
to specify in the user preferences, beyond just the address or name of this
device?
NEW SPEAKER: The name
and which kind of media.
NEW SPEAKER: Which
kind of media you want. Okay. So that sounds to me like an e-mail
message to Dan, to make sure that they're aware of that particular
need, so that it gets reflected in the requirements document for the framework.
NEW SPEAKER: Yes,
it's a general requirement, really, being able to invoke an application server
for every call.
NEW SPEAKER: Exactly.
NEW SPEAKER: Yes.
NEW SPEAKER: So we'll
do that. Okay.
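(Editor's sketch of the kind of entries such a profile might carry. Every key name below is invented; the fragment only illustrates that the profile needs more than simple numeric parameters, e.g. a service address and the media to be translated.)

```
# Hypothetical user-profile fragment -- all names are illustrative.
transcoding.invoke   = automatic
transcoding.service  = sip:relay@texttospeech.example.org
transcoding.media    = speech-to-text, text-to-speech
```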
NEW SPEAKER: I'm trying
to follow everything, but I was thinking a lot of things in my mind.
We don't have to just follow the FCC regulation; the FCC limits us.
First look at how we can do it, and see what services are also
possible. We don't have to name them now, but the infrastructure will be
there. That's a good thing.
Another thing is, there
might be terminals that don't have all the capabilities, because one of you
mentioned that some terminals are not able to handle multiple streams.
And sometimes there's one stream: video and audio and subtitles,
but they make it one stream, video and audio and subtitles in one.
So it turns out to be one media stream. So we have to keep that in
the back of our mind, that there's kind of an in-between device which helps
with communication on the last leg. Let's say
that A doesn't have this, and there's some node, and then there's
a deaf user; this node has a lot more capability,
and it might be part of the wireless network, or, from there, a third party,
or whatever; that stays open for now. But you have multiple terminals for
a while. A mobile terminal is not likely to have many media streams.
So you might simplify it. But I'd like to have that also included in
the models.
NEW SPEAKER: The question
I have -- sorry. Go ahead.
NEW SPEAKER: One thing
is, maybe I'm reading too much into it. But it might be nice if the
terminal knows which stream is actually the relay service stream and which is
the normal call stream.
NEW SPEAKER: This
is the call flow I have, actually. Because, in fact, you have, I mean,
it's a more difficult call flow when the INVITE that A sends
doesn't have SDP. That's why we were working on that. Because
it presents a couple of complications, but it's possible this way as well.
NEW SPEAKER: So, is
this a SIP user agent?
NEW SPEAKER: Yes.
This is typically a wireless terminal, because you don't want to do third-party
call control over the air, and this is an application server that does
things on your behalf.
NEW SPEAKER: So this
is a back to back user agent? Or is this --
NEW SPEAKER: It's
in the
(Several people talking.)
NEW SPEAKER: In fact,
no. That's why I had some arrows in red, because it's possible to
act as a proxy for the INVITE to B. But of course, to do that, you are a
back-to-back user agent; you don't have any way to avoid it. But it's actually possible
to just proxy the request to B, which is the good thing about this flow.
NEW SPEAKER: Okay.
So, just,
NEW SPEAKER: Yes.
NEW SPEAKER: And identification
of the media streams: which is the relay service, or transcoding service, and which is
just the normal call from B. Maybe something in that area, or not; I was thinking
about it. It might be that the terminal wants to know if it's a normal
call stream.
NEW SPEAKER: Okay.
That would
NEW SPEAKER: The actual
invite, yes.
NEW SPEAKER: Yes.
NEW SPEAKER: The requirement,
or a need, that was identified here: how we would want to identify, if B gets
the request, if we do the A-initiated model, that this is
actually going through a transcoder. In the sense that I don't
know if this is a legal requirement or just a convenience requirement.
NEW SPEAKER: Actually,
it would become an IAB requirement. I'm thinking about this.
Because, in fact, it's the same operation.
NEW SPEAKER: Yes,
I would. In general, I would want to know that the transcoding is
happening, because that may introduce, I don't want a text stream to look the
same as if you are typing it, because there may be inaccuracy due to speech-to-text
technology. I'd rather not take everything literally, like
you said. Because
NEW SPEAKER: Two text
streams, for example. It also happens that I'm typing on a white
board or something, and it's also text, and it's the same protocol.
Which one is which? You're just hammering on the keyboard, for example. Or
two voice streams, because another person is involved, or a deaf
person trying to talk while they're writing text, and the transcoding
service puts the text also into voice. We have two voice streams to
B, for example. But they need something to see it. Because in
SIP, the other thing, an audio stream is an audio stream, a video stream is a video stream.
And we need to deal with it in another way.
NEW SPEAKER: So, the
question is, it appears to me that would be dealt with by an indication
on the media in the session description protocol, because it's a property
of each media stream, not of the call.
NEW SPEAKER: Right.
It's going to be media stream specific.
NEW SPEAKER: Pardon
me for a minute.
(Several people talking.)
NEW SPEAKER: This
is the translation, or this is the subtitled version of it. It can be just,
because it's for humans.
NEW SPEAKER: Yes.
Is
NEW SPEAKER: Is it
any different than receiving two audio streams?
NEW SPEAKER: No.
With information like we are --
NEW SPEAKER: What
you need, you want to actually be able to identify it in some unified manner,
so that you can recognize it; because, since not all devices have the ability
to enter strings of text, it may well be sensible to just simply have
a defined flag which says, this stream has been subject to translation.
And so I can put a little asterisk next to it, or do whatever; and in
addition to that, I can say, the translation was performed by, courtesy of,
whoever.
NEW SPEAKER: So something
for machine consumption.
NEW SPEAKER: Yes.
NEW SPEAKER: It's
just an a= attribute, say.
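(Editor's sketch of what such a flag might look like as an SDP attribute. The attribute names and values are purely hypothetical; nothing like this is registered anywhere.)

```
m=text 49172 RTP/AVP 98
a=rtpmap:98 t140/1000
a=translated:speech-to-text
a=translator:sip:relay@example.com
```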
NEW SPEAKER: I can
make an even clearer picture. We have a video conference call, and assume
I know sign language. The sign language relay or transcoding service
has the same video stream, with the little woman signing. How does
a device know which should be on the big screen and that the lady should be on the small screen
in the corner? Because it's using the same protocol, the same kind of video stream.
You need two separate streams, or in some cases it might be a node in between
that you paste together, but how do you know this is Gonzalo?
NEW SPEAKER: Because
it's MIME type video slash MPEG-4 dash sign-language.
NEW SPEAKER: No.
But you can do that. If you get just one single stream, it's not a
problem. But when you get two, we have this problem anyway.
NEW SPEAKER: Actually,
the nice thing is, if it's not audio, it actually seems to be easier, to
separate it.
NEW SPEAKER: Yes.
But, in general, this is probably a generalizable concern; we talked
about this a bit in the conferencing discussion, which is the layout problem.
Namely, how much of a hint do you want to give the recipient about
how they should lay it out? Because, for example,
in conferencing, you might want to align spatially: the audio
comes from the left, or the speaker is shown on the left, type of deal.
Here, in order to do that,
it's certainly always useful to know what the content is. And again,
having the ability to indicate it. The question then becomes, is there
a generalizable content label, which basically says what type of content
is part of that stream? So far, I think, in SDP you can
identify language. The problem is, once you go down that road,
you end up at MPEG-7, which exactly does that. Because it can exactly
identify what each stream is, down to who the actor is, or presumably here,
who is doing the transcoding, the name of a person or anything else.
That's probably not what we want.
NEW SPEAKER: And actually,
that's one question, and I don't know the answer: is that information an attribute
of the stream, meaning SDP, or is it an attribute of this leg of the session,
meaning SIP?
NEW SPEAKER: I thought
it would be an attribute of a stream, because you could have a composite,
a composition stream, you know, and it can serve multiple purposes. So there
can certainly be multiple. As soon as there can be multiple, you don't
want to have to do the mapping from the session down
to the individual stream.
NEW SPEAKER: Right.
Actually, now I'm imagining the INVITE coming in with all the SDP, and that's
just it. It's all in the SDP. It's going to be associated with the
stream.
NEW SPEAKER: So
NEW SPEAKER: So
it's an a= line.
NEW SPEAKER: Yes.
It's an a= line. The only question is, do we specifically do something
here, or do we basically say, we define a purpose or a source or
whatever field, with values like sign language, or caption, closed captioning if it's
transmitted as a video signal, otherwise it's just text. But if other
indications are conveyed, is this rich enough that it's worthwhile giving
it generality?
NEW SPEAKER: Well,
now, I don't know whether it's popping up or popping down. Looking
at the requirements from the IAB: there is the requirement that
you must let basically both parties know that there's a transcoding device.
Now, I think --
NEW SPEAKER: There
was only one party.
NEW SPEAKER: At least
one party. At least one party had to know.
NEW SPEAKER: Yes.
NEW SPEAKER: Will
we run afoul of that? In that the requirements here said that, I guess we're
not. Because it said that, you know, maybe A does not want B to
know that they have the transcoding services. Well, as long as A knows
they are being transcoded, that's okay.
NEW SPEAKER: They
are invoking the service specifically. There's also a difference
here: it's basically the party which is generating the
media that needs to know that it's going through a transcoding service.
Because they need to know that it gets translated along the way and may not
arrive in the same condition --
NEW SPEAKER: Well,
that's the point. So, you're hearing impaired. I call you.
My stream is going to be transcoded. But you don't want me to know
that you're hearing impaired.
NEW SPEAKER: Okay.
That's true. That is different from the other requirement, where the network is, in some sense, the adversary that both sides at least want to know about. Here, the network comes to the aid of one of the participants, who does not necessarily want to let the other one know.
NEW SPEAKER: So we
meet the requirement, because you know, it's one of the parties that knows.
NEW SPEAKER: Yes,
and I think it might be helpful in the draft, to simply at least refer to
those requirements and say that, before we get into lots of heat,
NEW SPEAKER: Right.
NEW SPEAKER: We thought
about this, and we may not have addressed it completely, but at least we
thought about it.
NEW SPEAKER: Right.
NEW SPEAKER: Yes.
But actually, I still believe the point you made in the previous session was right. They were so concerned about intermediaries, but you said, if I want the intermediary to be there, like a SIP proxy, I don't see the problem. But I guess they didn't quite get it.
NEW SPEAKER: Okay.
So do we have, we have identified a very small action item.
NEW SPEAKER: The only
thing with the a= attribute is maybe we need a two-line MMUSIC draft for this.
NEW SPEAKER: Exactly.
NEW SPEAKER: Well,
the question again is, how generalizable is it? And it might well be worth simply putting it out there to see if it is more than what we have identified here. If not, that's fine too. Probably it's just simply a namespace registration.
NEW SPEAKER: Yes.
NEW SPEAKER: And a
single line, which is identifiable. And that's, it's, I mean,
NEW SPEAKER: MMUSIC,
I guess, like in 2050 it will get standardized.
NEW SPEAKER: That's
it.
NEW SPEAKER: It's
an SDP attribute.
NEW SPEAKER: That's
quite a far-out estimate.
NEW SPEAKER: MMUSIC
is not exactly the fastest in the world.
NEW SPEAKER: Again,
I'll let you have the stamp.
NEW SPEAKER: We may
still get it, even in this category, as soon as the FCC gets around to talking to 3GPP.
NEW SPEAKER: This
is just a reference point. We didn't get a correspondence note from Keith that says make sure you account for 3GPP GTT translation, because that's going to be mandatory.
NEW SPEAKER: What's
GTT?
NEW SPEAKER: Global
Text Telephony. That's what they call it. GTT.
NEW SPEAKER: That's
the same as TTY
NEW SPEAKER: It's
not quite. The GTT work, which is pretty much led at Ericsson, accounts for doing TTY with RTP.
NEW SPEAKER: Okay.
But it is similar in spirit, so to speak.
NEW SPEAKER: But is
there any document we have to read, or try to comply with?
NEW SPEAKER: Because
that would be great to have that.
NEW SPEAKER: As a
reference.
NEW SPEAKER: I'll
probably get a reference here.
NEW SPEAKER: But let's
be careful about the Global Text Telephony part, because they are just focused on using the audio channel, or streaming text over RTP, and they might be limited in this way. I did read some of that work, and one of those people was actually involved in making that protocol.
NEW SPEAKER: People
from Ericsson.
NEW SPEAKER: Yes.
But there's also another function in 3GPP, from the multimedia side. I don't have all the information, but they might also want to get involved in text telephony, more from the multimedia side, with more capabilities coming on, more than we have. But GTT is difficult, because they are just for the FCC.
NEW SPEAKER: As far
as that goes, clearly we are all in agreement that whatever transcoders we name are going to be only examples. So basically the only role of calling out what T might be is simply to give people the notion that we're not just thinking of one specific type, and that the list is not going to be exhaustive. The only real need we may have to identify here is to make sure, for something like GTT, what it is like: namely, whether it has an appropriate registration as a media type in SDP, so that we can actually negotiate it. Because otherwise none of this is --
NEW SPEAKER: But that's
it. Anything that you can have SDP for we can translate.
NEW SPEAKER: The only
general issue we need to settle is, for every media type that T handles here, do these types have appropriate media type registrations? Because they may not. So maybe one thing, since you're more familiar with all these translation techniques, now that we know about it, is to simply check what they actually do. Is it a TCP stream, just ASCII over TCP? Or text over RTP? Or is it --
NEW SPEAKER: GTT
stuff is --
NEW SPEAKER: I can
say it is audio tones over RTP. That is what Global Text Telephony will do. That will require the endpoint to know how to deal with the tones. So instead of beep beep beep, that is in fact what Global Text Telephony will do.
NEW SPEAKER: So it
is actually modem tones, effectively?
NEW SPEAKER: But they're
bizarre tones.
NEW SPEAKER: It's
like DTMF, effectively, so every character has its own tone.
NEW SPEAKER: No, it's
like an ugly 5-bit code.
NEW SPEAKER: It's
one of the
NEW SPEAKER: (I don't know what he said.)
NEW SPEAKER: It's
old technology.
NEW SPEAKER: Yes,
and there's also a variation, which is real-time text over RTP.
NEW SPEAKER: Yes,
I'm familiar with that
NEW SPEAKER: The reference
is 23.226, if you're interested, of the 3GPP specifications.
NEW SPEAKER: Okay.
So one action item for somebody in this group, in general, is to make sure that all of these have registrations. I don't think we should do this ourselves; generally, it's best if the domain experts do this. So we make sure that we contact, if we can, the people who do GTT, and the people who do ViZE talk, so that they register at least a name. And if it's an RTP definition or payload format, make sure that we can do more than just talk about it, and actually negotiate it. Do you know the busy talk people?
NEW SPEAKER: I know
one person, but, he's very slow to respond to e-mail. But Gonzalo also
knows him.
NEW SPEAKER: Who?
NEW SPEAKER: I might
know him, but I'm not sure.
NEW SPEAKER: No.
NEW SPEAKER: I can
try to find out. There are more people in GTT; I can try to find out.
NEW SPEAKER: Okay.
NEW SPEAKER: Well,
we do what we can. There's not much more; if we can't do it, we can't do it.
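[Editor's note: the registration concern above, that a media type only becomes negotiable once both sides can name it in SDP, can be sketched in a few lines of Python. The "t140" encoding name and the a=rtpmap syntax follow the text-over-RTP payload format mentioned in the discussion; the helper function and the sample offer and answer are illustrative assumptions, not part of any specification.]

```python
# Sketch: an offerer and answerer can only converge on a media format
# if both sides use the same registered encoding name in their
# a=rtpmap lines. This is why unregistered formats can't be negotiated.

def rtpmap_encodings(sdp: str) -> set[str]:
    """Collect the encoding names from every a=rtpmap line in an SDP body."""
    names = set()
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("a=rtpmap:"):
            # Syntax: a=rtpmap:<payload type> <encoding name>/<clock rate>
            mapping = line.split(" ", 1)[1]
            names.add(mapping.split("/")[0].lower())
    return names

offer = """\
m=text 11000 RTP/AVP 98
a=rtpmap:98 t140/1000
"""

answer = """\
m=text 12000 RTP/AVP 100
a=rtpmap:100 t140/1000
"""

# Both sides named the same registered encoding, so they can converge.
common = rtpmap_encodings(offer) & rtpmap_encodings(answer)
print(common)  # {'t140'}
```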
NEW SPEAKER: Okay.
Are there any other action items? I guess, if it takes writing a ten-line SDP draft, then between one or two of us we can crank one out quickly.
NEW SPEAKER: Yes.
NEW SPEAKER: Okay.
So I believe we now have assigned the action items. We're running out of time here, so unless there are other items, other technical action items, at this point I would propose that we report back to the group, summarizing what we have done, hopefully within roughly, let's say, a month.
NEW SPEAKER: Yes.
A month.
NEW SPEAKER: Roughly
a month, and we'll have the two drafts out. And we have the action item, which I'll take on, cc'ing the group on it, to offer Dan the chance to incorporate the user preferences.
NEW SPEAKER: Yes.
NEW SPEAKER: Good.
NEW SPEAKER: Okay.
NEW SPEAKER: Happy,
Dean?
NEW SPEAKER: Yes.
NEW SPEAKER: Whatever.
NEW SPEAKER: Dean
is happy, we are happy.
NEW SPEAKER: That's
right.
NEW SPEAKER: And so,
could you e-mail it? If you send it to me, I can put it on the web site, with the minutes of the meeting.
NEW SPEAKER: Those
are the most accurate minutes we'll ever have.