Notes, Planning for Work to Address Requirements for Hearing Impaired

Transcribed by Renee Cohen, edited by Dean Willis


IETF meeting for hearing impairment, July 17, 2002, at 5:30 p.m.

Attending:

Eric Burger, eburger@snowshore.com
Robert Patzer, rpatzer1@email.mot.com
Gonzalo Camarillo, gonzalo.camarillo@ericsson.com
Henning Schulzrinne, hgs@cs.columbia.edu
Arnoud van Wijk, arnoud.van.wijk@eln.ericsson.se
Dean Willis, dean.willis@softarmor.com

 Transcript:

         NEW SPEAKER:  We only have an hour.   
         NEW SPEAKER:  I have another meeting at six so we have to get moving.   
         NEW SPEAKER:  I'm not sure I'm leading this, but since, Dean, did you want to.   
         NEW SPEAKER:  Please go ahead.   
         NEW SPEAKER:  You have the proposals in the area.   
         NEW SPEAKER:  So we now have, as you well.   
         NEW SPEAKER:  Okay.  So we have a requirements document that is, I believe, now with the IESG, or where is it,   
         NEW SPEAKER:  The RFC Editor.   
         NEW SPEAKER:  I went through it again today, and I believe the next step, among other organizational issues which you would like to discuss, is to identify the concrete technical problems, now that we have generic requirements, that we might actually be able to do something about.
         I have been able, by going through the document, to identify two such requirements, and there may be others which I missed.  Two generic requirements in there, and several which fall in that category.  One is a transcoding requirement: namely, the ability to insert a third party service which allows you to transcode various media into each other, primarily for deaf and hard of hearing needs.  This would be text to speech and speech to text; and, for blind or visually impaired users, it might be something else along those lines.
         A system to transcode into sign language.
         So, they're both transcoding services.  They are to be able to transcode media, for example, into a classical TTY type of service, I imagine; I'm not sure if this is mentioned in the draft.
         The second one, which I think is actually somewhat easier to deal with, but is a generic problem that we just discussed in a different vein today, is the notion of a profile for a user, which makes it easy for a user to move to a different device and still have, for example, the ability to call on a transcoding service automatically, without having to enter lots of magic things whenever they walk up to a telephone.
         I believe we can dispense with that one relatively quickly, in the sense that it is just a special case of the generic issue of profile mobility, and so it may be addressed by one of the mechanisms that we talked about for profile mobility.  For example, I believe that the bind proposal I made today may actually help solve the problem, but obviously that did not find unanimous support, so there may be other solutions.
         So the hard one, I believe, that we need to at least look at in more detail, is the transcoding issue.  Are there any others I missed from going through the draft?  From your collective recollection, are there other technical requirements, like some privacy requirements and imaging requirements?  To me, they struck me as being special cases of things that we already should have in any event; really more like BCP-type good operational practice.  But is there anything else I could extract a requirement for possible SIP-related work out of?
          
         NEW SPEAKER:  Only one, usually, user preferences, is that included with the profile.   
         NEW SPEAKER:  That to me, is at least roughly equivalent to user preference, yes.   
         NEW SPEAKER:  Okay.   
         NEW SPEAKER:  Okay.  Any other items which might be, I got, I think this was sent to the list, to one of the SIP lists as well, from 3GPP; they also had transcoding as their primary requirement.  Who sent that?   
         NEW SPEAKER:  Who sent it, Drage?   
         NEW SPEAKER:  Keith.   
         NEW SPEAKER:  Keith Drage.  Yes.   
         NEW SPEAKER:  Yes.  Okay.  So, is that, I mean, should we try to look at this problem in more detail?  Do we want to stop basically and say, okay, we've identified possible things, how do we go forward, what do you want to do?   
         NEW SPEAKER:  Well, there's one kind of development going on that worries me, and also other people in the deaf community.  The FCC has stated that, in America, the transcoding must happen.  They are using streaming text; in fact, almost all of it is, like, TTY sort of, and the problem is that they require this as the only solution, using voice channels or simulated voice channels, and it would stop audio service, of course.  Transcoding into TTY is the only possibility.  They don't have a sign language interpreter coming in between, or even, like what she's doing here, that video conference, a kind of closed captioning, subtitles.  But those stopped, almost.   
         NEW SPEAKER:  Okay.  I'm sorry if this came across wrong.  I should have noted others; TTY to TTY, just the one, is probably the worst example.  Okay.  So I just happened to put that up by random hand movement.
         So, this could be, it could be text to and from speech, could be  
         NEW SPEAKER:  Sign interpreter.   
         NEW SPEAKER:  Yes.  Let me just put sign language, without calling it anything, and any number of other things.  But my goal at least here, and I think this would be the SIP approach, is that we do not want to explicitly enumerate all the translations which are possible, because there may well be others, for people with different disabilities, that may need other translations.
         So, the point here is what I believe I've abstracted this into: a protocol context.  I want to be able to transition from an initial contact at the signaling level, from A to B, where some subset of the data path could be multimedia, audio or video or both, and, where there's a need for it, to take one or more of the media streams and reroute them through a translator, in one or both directions.
         I believe that is actually plausibly solvable, and we need to identify whether it is plausibly solved by the third party call control mechanisms that we have.  Plausible candidate mechanisms for this would be things like 3pcc and call transfer.  All right, we could argue this is effectively transferring the call to something, and so on.  That would be one other model.
         What I think is somewhat different about this model, as it might be in a 3pcc type of context, is the notion that you have a combination of translated media, which goes through the transcoder, and of straight-through media which you have no intention at all of passing through the transcoder, because it has no idea what to do with it.
         Let's say, again, in the case of text to speech type translations, presumably you would have, let's say, speech on this side and text on this side, and vice versa.  And the video part is completely unaffected by that.   
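         ( The stream rewrite being described here can be sketched roughly as follows.  This is a minimal illustration only: the helper function and all addresses are invented for the example, not taken from any SIP implementation; the per-media "c=" line override is standard SDP. )

```python
# Sketch of the SDP rewrite a 3pcc controller might perform to insert a
# transcoder T into one media stream (audio) while leaving others (video)
# untouched. Addresses and ports are illustrative.

def reroute_stream(sdp: str, media_type: str, transcoder_addr: str,
                   transcoder_port: int) -> str:
    """Point one m= line (and give it its own c= line) at the transcoder."""
    out = []
    for line in sdp.strip().splitlines():
        if line.startswith(f"m={media_type} "):
            parts = line.split()
            parts[1] = str(transcoder_port)            # m=<type> <port> ...
            out.append(" ".join(parts))
            # A media-level c= line overrides the session-level one.
            out.append(f"c=IN IP4 {transcoder_addr}")
        else:
            out.append(line)
    return "\n".join(out)

original = """v=0
o=alice 1 1 IN IP4 192.0.2.1
s=call
c=IN IP4 192.0.2.1
t=0 0
m=audio 49170 RTP/AVP 0
m=video 51372 RTP/AVP 31"""

# Audio now flows through T; video still goes directly between A and B.
rewritten = reroute_stream(original, "audio", "198.51.100.7", 6000)
```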
         NEW SPEAKER:  In fact, we implemented a prototype of everything you are saying.   
         NEW SPEAKER:  That is doable.   
         NEW SPEAKER:  Yes.   
         NEW SPEAKER:  So the question then becomes, if that is doable  
         NEW SPEAKER:  It was working.   
         NEW SPEAKER:  It would work.  Because you just have to reset the IP address of that channel, the speech channel, to T, and you leave the others alone, basically.  So did you actually do the work?   
         NEW SPEAKER:  In fact, I have a call flow schema for inserting several transcoders in the path.  I'm going to put my computer up anyway.  So, we even did it with many forks as well, so we did all kinds of flows.   
         NEW SPEAKER:  Including the --  
         NEW SPEAKER:  Yes, including one is directly and the other is -- they simply, what they are doing is fetching an IP address from T.   
         NEW SPEAKER:  And is T, are you basically, since you start with a dual call, they don't know they need that ahead of time, because they could both have similar capabilities without the need of a transcoder.  You still need the ability to tell T what's going on.  So are you doing that?  How is the signaling?  How is T involved in the signaling?   
         NEW SPEAKER:  The thing is, typically, if you're doing it back to back, it's very easy.  But when you want A to invoke it, typically you will send an INVITE or whatever, and it will fail because you don't have matching capabilities, and then you just re-INVITE.   
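         ( A rough sketch of that fallback logic.  In real SIP the failure would surface as an error response to the INVITE, followed by a re-INVITE through the transcoder; the capability sets and driver function below are stand-ins, not a real SIP stack. )

```python
# Try a direct capability match first; if it fails, route the media
# through the transcoder T instead.

def negotiate(offer: set, answer: set):
    """Return the common media capabilities, or None if there are none."""
    common = offer & answer
    return common or None

def call_with_fallback(a_caps: set, b_caps: set, t_caps: set):
    """Try a direct call; fall back to inserting the transcoder T."""
    direct = negotiate(a_caps, b_caps)
    if direct:
        return ("direct", direct)
    # No match: "you just re-INVITE", this time with T in the media path.
    return ("via-transcoder", (negotiate(a_caps, t_caps),
                               negotiate(t_caps, b_caps)))

# A text-only terminal calling an audio-only terminal needs the transcoder.
result = call_with_fallback({"text"}, {"audio"}, {"text", "audio"})
```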
         NEW SPEAKER:  We, I think, the question is how does T know what to do?   
         NEW SPEAKER:  Because you are actually, what you do from A is fetching, really this point.   
         NEW SPEAKER:  Go slow, please.  Don't talk too fast, please.   
         NEW SPEAKER:  You need this point and this point.  And actually, with third party call control, with the SDPs, you have both, and you can shuffle them and let these guys know this point and this point.   
         NEW SPEAKER:  Because we also did this.  And we actually had two ways of doing it.  One is that T looks like a conference bridge.   
         NEW SPEAKER:  Yes.   
         NEW SPEAKER:  And the capabilities of A are speech, and the capabilities of B are text, and the conference bridge magically knows how to transcode between the two.  The other thing we did, though, was explicit transcoding, where, again typically with 3pcc, T gets an INVITE, and in the INVITE, and this will shock you, the left-hand side basically says, this is a transcoding call, and parameters come after.  So we can do both.  The benefit of that is, you can have asymmetric transcodings; like, you might have someone who wants to get text, but they can still speak, so you don't need the text back to speech.  They're unidirectional legs.
         Even with the flow for the mixing thing.  I believe the mixing thing is really more general.  But Jonathan commented, if it's really text to speech, or whatever, we can do it with, like, two media lines.   
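         ( The asymmetric case with two media lines might look like the SDP below.  Illustrative sketch only: the addresses and payload numbers are made up; the sendonly/recvonly direction attributes and the t140 text format are standard SDP usage. )

```python
# Asymmetric transcoding offer: the user sends speech normally but
# receives text, so only one direction involves the transcoder.

def asymmetric_offer(user_addr: str) -> str:
    return "\n".join([
        "v=0",
        f"o=user 2 2 IN IP4 {user_addr}",
        "s=asymmetric transcoding",
        f"c=IN IP4 {user_addr}",
        "t=0 0",
        "m=audio 49170 RTP/AVP 0",   # speech out, straight to the peer
        "a=sendonly",
        "m=text 49172 RTP/AVP 98",   # text in, produced by the transcoder
        "a=rtpmap:98 t140/1000",
        "a=recvonly",
    ])

offer = asymmetric_offer("192.0.2.1")
```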
         NEW SPEAKER:  Which actually, the latest netann document has got that call flow in it.  The asymmetric call flow.   
         NEW SPEAKER:  And the thing is that you have less signaling; with the conference thing, you need two INVITEs.   
         NEW SPEAKER:  Right.   
         NEW SPEAKER:  Okay.   
         NEW SPEAKER:  Keep in mind that sometimes some deaf people, like me, if somebody is on a video conference call, I also like to hear; I still hear something.  I hear audio.  So keep in mind, it's not just transcoding where the voice goes in and only text comes out between A and B; the voice is also there, bidirectional.   
         NEW SPEAKER:  This is done with our draft.  The FID one.   
         NEW SPEAKER:  So,  
         NEW SPEAKER:  Basically you will say, A to, or in this case, B, to send the video, or whatever, to both places at the same time.   
         NEW SPEAKER:  Okay.   
         NEW SPEAKER:  To T and A.  So you will get the video, the voice directly and then the speech through T.   
         NEW SPEAKER:  And then the reality is, most devices won't do two streams.  That's more business for us, though.   
         NEW SPEAKER:  If I may summarize, because we don't necessarily have to go through the call flows now: unless somebody brings up a requirement which we cannot meet, what we need to do is create a draft, which is probably going to be BCP style, which basically gives specifically the call flows that go through the three or four permutations of this, in the sense of double flows, symmetric, asymmetric, this type of thing, and enumerates what we need to do.  And possibly also give advice, in the FID type of scenario, that for these types of forked media, support of the ability to send two copies simplifies life, because then you don't need a media server; or we show a conference bridge scenario, where the conference bridge can do this.  That would, I believe, be very helpful, because that may actually end up being part of a requirements specification for devices, so that you can actually do that, and we can test that.  And then people know what they should be able to do; we send them call flows that they can test against.  Does that sound reasonable?   
         NEW SPEAKER:  That's good.  One thing, you know, actually, at the beginning we talked about the kinds of transcoding.  I'd like to make an assertion and see if it sticks or falls off the wall: that human interpretation, so where you call an operator that does the TTY, or relay, or text to speech or speech to text, or does the signing, that that would actually be two calls.   
         NEW SPEAKER:  Okay.   
         NEW SPEAKER:  That they are literally a back to back user agent.  And that's an assertion.  I could see --  
         NEW SPEAKER:  Could you --  
         NEW SPEAKER:  I can see good and bad things.   
         NEW SPEAKER:  Could you do it like that?  This is, T is actually a person?   
         NEW SPEAKER:  Yes, the difference, and the reason why I was thinking.  At first I was thinking T could be a person.  But then you quite often have sort of private conversations here.  Maybe even before the call is set up.   
         NEW SPEAKER:  Okay.   
         NEW SPEAKER:  The down side is, when we do this, how should we say, you know, if we use third party call control, there's information about this dialogue that's preserved; whereas if it's a call from here and a call to there, it's external: the correlation of the legs.   
         NEW SPEAKER:  Yes.   
         NEW SPEAKER:  So that's why I'm throwing in and seeing if it sticks, because that might make it fall off the wall, or that it's too hard, because this person can do all sorts of bad things that are not in the SIP spec.  And I think that might simplify things.   
         NEW SPEAKER:  The part I like about 3pcc is that there's actually no signaling between T and B.  So everything is transparent to the end point.  So I guess  
         NEW SPEAKER:  Yes.  I would like to have, even with a back to back user agent, you can basically replicate the model that we use today.
         If I know I need assistance in conversing with my destination, I first call an operator and tell that operator, by some mechanism, typically by talking to the person, please dial this number, and the operator stays on the call.  That model clearly we can support.   
         NEW SPEAKER:  That always works.   
         NEW SPEAKER:  Right.  So maybe what we should do, since I see, from what I can tell so far, that this effort we are engaging in is actually more of an almost educational effort, rather than necessarily a protocol design effort.  And in that environment it is certainly useful, because it may be read by people who are not, of course, protocol experts; we may need to draft requirements at a more non-technical level, in the sense of FCC-type requirements.
         So, having the operator model in that call flow spec, even if it is trivial to us, may well be useful.   
         NEW SPEAKER:  Yes, make it complete.   
         NEW SPEAKER:  Make it complete basically.   
         NEW SPEAKER:  Now, maybe one interesting feature of using SIP is: A calls B, and B is hearing, and there are two ways that relay or transcoding can be activated, by A or by B.  So you have to keep in mind that that also happens, to prevent A and B both activating it.  That kind of scenario.   
         NEW SPEAKER:  From B cc  
         NEW SPEAKER:  It should be automatic.  I don't want the lady to call the operator first; I just want to place a phone call directly to B, and somehow T will get involved.  Depending on user preferences, it will say this T is in there.   
         NEW SPEAKER:  From B it's very easy, because you have the call coming and you can do all this kind of stuff.   
         NEW SPEAKER:  Yes.  How, if we were --  
         NEW SPEAKER:  But from A.   
         NEW SPEAKER:  From A, is the difficult case.  But when you are the callee, it's very easy.   
         NEW SPEAKER:  Yes.   
         NEW SPEAKER:  But of course we can address that in the draft as well.   
         NEW SPEAKER:  So we need to basically show that.  Because you would probably have to treat it almost more like a conference-type bridge dial-out, such that you first call T, and then you would ask T to call, but that's again more the agent part.   
         NEW SPEAKER:  Yes.   
         NEW SPEAKER:  That doesn't --  
         NEW SPEAKER:  I don't like that.  But actually, you know, netann-02, I think, has got that flow.  Not the T calling out, because, you know, we never called out.  But I think we have both  
         NEW SPEAKER:  If you want to have a look at this flow, because it shows how A can speak to B, actually, and it can put a number of intermediaries in the middle.  So this is basically the whole flow, and we cannot go into detail.  And the red lines indicate that, actually, you can make everything, and then the INVITE goes through B.  So if B wants to do stuff, this is actually done.   
         NEW SPEAKER:  Okay.   
         NEW SPEAKER:  So I will distribute this call flow.  I have it all down already.   
         NEW SPEAKER:  Okay.  So, again, it sounds like there may still be detail and we need to make sure we cover all the permutations that you identified, in particular, but it sounds like we are pretty close to covering all the cases.
         The question now is: one proposal, in order to make reasonably quick progress, would be to simply have a small document, somewhat smaller than Alan's call flow document, and separate from it in particular, because, as came up during one discussion, somebody mentioned the call flow document may stay a living document essentially forever, and that's not what we want here.  We want closure, and to get it out.
         And in particular, because I want to avoid getting external requests of the "you shall do this" kind; then we can just say, yes, we solved this problem, here's the solution, don't try to impose a solution which may not be necessary.   
         NEW SPEAKER:  We can regulate ourselves.   
         NEW SPEAKER:  Yes.  We can at least show them that, you know, we don't have to maybe add a special header or method, or whatever the knee-jerk reaction might well be.
         So, Ken, since you already have the most difficult part, the two of you already have the most difficult part, would it be possible that the two of you could simply, I mean, there's not much text to it, enumerate the cases that we have identified: a 3pcc model, a back to back user agent model, a dual stream model, a conferencing model, an A-initiated model and a B-initiated model.  Those seem to be roughly the ones we talked about, yes.   
         NEW SPEAKER:  And do that quickly.  And I see no particular problem, looking towards at least one corner, the chair corner here.  That is a SIPPING document.   
         NEW SPEAKER:  Yes.   
         NEW SPEAKER:  Not, because there's no  
         NEW SPEAKER:  It's just usage.   
         NEW SPEAKER:  And I see no particular  
         NEW SPEAKER:  We have  
         NEW SPEAKER:  Process problem.   
         NEW SPEAKER:  We have all the tools.   
         NEW SPEAKER:  We have the tools.  There are no P-headers involved.  So,  
         NEW SPEAKER:  In fact, it's informational.   
         NEW SPEAKER:  It's strictly informational.  Okay.   
         NEW SPEAKER:  Okay.  Will you send me what you have.   
         NEW SPEAKER:  I can elaborate on the draft and then we can bounce it back and forth.   
         NEW SPEAKER:  Were you taking notes.   
         NEW SPEAKER:  Can we save those notes.   
         NEW SPEAKER:  Would it be acceptable to have it, like, in two weeks?   
         NEW SPEAKER:  I'm on vacation next week.   
         NEW SPEAKER:  I'm in Japan, actually, so I'm going to Finland the 29th of July.  So I will not be able to do it.   
         NEW SPEAKER:  Let's make it three weeks.  You'll be gone for a week, and you'll take a week to do the stuff and give it to me, and I'll put it all together.   
         NEW SPEAKER:  My hope would be, we can basically, since again this should not be hard, in the sense it should not be long.   
         NEW SPEAKER:  Right.  And we have most of the text already.   
         NEW SPEAKER:  You have the text; it just needs a framework, you need glue to make it read as opposed to just being call flows.
         All we need at that point is, I mean, I would hope that we can basically, before the next IETF, have this be --  
         NEW SPEAKER:  Oh, yes.   
         NEW SPEAKER:  By far.   
         NEW SPEAKER:  Have this be in last call.  I mean, nothing goes instantaneously.   
         NEW SPEAKER:  P-headers.   
         NEW SPEAKER:  Make it P-headers; it will be faster.   
         (Several people talking.)  
         NEW SPEAKER:  We'll have an RFC number this afternoon.   
         NEW SPEAKER:  Yes, but that sounds like a plan to me.   
         NEW SPEAKER:  Okay.  So, the second question, the user preferences: there are two small sub-problems to that.  The first sub-problem is the generic one, which I don't think we want to solve here, which is access to a profile, or user preferences, by a user, from whatever device they are using.  What that means, presumably, technically, is that if I get my profile here, it would identify that I prefer a transcoder if I'm talking to a hearing person.  So there would be some kind of mechanism which will automatically set my SDP and set my signaling to do the right thing, because I don't want to crank all this manually.
         So, the question then becomes, the hard part, which I don't think we can actually solve right now, is that this imposes more requirements on a user profile.  So that would be kind of the Dan Petrie corner of things, and probably even somewhat different than that, because it is not just a list of parameters, in some sense, unless you simply say "need transcoder equals one".  Presumably you want to say more than that, because, for example, you might want to specify in that user profile, and this was one of the requirements, that I prefer the transcoding service located at, I mean, text to speech dot org or something like that.  And so that would be part of the profile.  So I think, at this point, since the profile mechanism is evolving, hopefully with a little bit more speed than it had in the last year or two, we have two parts: namely, we need to make sure that the profile itself allows the ability to indicate more than simple numeric parameters, but some notion of prefab SDP or whatever else is needed for doing that.
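         ( A guess at what such profile entries might look like, beyond simple numeric parameters.  The keys, the decision rule, and the service URI are all invented for illustration; no actual profile-framework schema is implied. )

```python
# Hypothetical user-profile fragment naming a preferred transcoding
# service and the translations wanted, plus a simple invocation check.

user_profile = {
    "display_name": "example user",
    "transcoding": {
        "preferred_service": "sip:relay@transcoder.example.com",  # invented URI
        "translate": [
            {"from": "audio", "to": "text"},   # speech-to-text toward the user
            {"from": "text", "to": "audio"},   # text-to-speech toward the peer
        ],
    },
}

def wants_transcoder(profile: dict, peer_media: set) -> bool:
    """Insert the preferred service only when the peer can't handle text."""
    if not profile.get("transcoding"):
        return False
    return "text" not in peer_media
```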
         That's the, I mean, is there anything more we can do at this point?  Can we identify in more concrete detail what you would need, what you would want to be able to specify in the user preferences, beyond just the address or name of this device?   
         NEW SPEAKER:  The name and which kind of media.   
         NEW SPEAKER:  Which kind of media you want.  Okay.  So that sounds to me like an e-mail message to Dan.  To make sure that they're aware of that particular need, that gets reflected in the requirements document for the framework.   
         NEW SPEAKER:  Yes, it's a general requirement, really, being able to invoke an application server for every call.   
         NEW SPEAKER:  Exactly.   
         NEW SPEAKER:  Yes.   
         NEW SPEAKER:  So we'll do that.  Okay.   
         NEW SPEAKER:  I'm trying to follow everything, but I was thinking a lot of things in my mind.  We shouldn't just follow the FCC regulation; the FCC limits us.  First look at how we can do it, and see what services are also possible.  We don't have to name them now, but the infrastructure will be there.  That's a good thing.
         Another thing is, there might be terminals that don't have all the capabilities, because one of you mentioned that some terminals are not able to have multiple streams.  And sometimes there's one stream: video and audio and subtitles, but they make it one stream, video and audio and subtitles in one.  So it turns out to be one media stream.  So we have to keep in the back of our mind that there's a kind of in-between device which helps for communication, the last leg.  Let's say that A doesn't have this, and there's some node and then there's a deaf user.  This node has a lot more capability, and it might be part of the network, or a third party, or whatever; it stays open for now.  But you have multiple terminals, for a while; a mobile terminal is not likely to have many media streams.  So you might simplify it.  But I'd like to have that also included in the models.   
         NEW SPEAKER:  The question I have -- sorry.  Go ahead.   
         NEW SPEAKER:  One thing is, maybe I'm looking too much into it, but it might be nice if the terminal knows which stream is actually the relay service stream and which is the normal call stream.   
         NEW SPEAKER:  This is the call flow I have, actually.  Because, in fact, it's a more difficult call flow when the INVITE that A sends doesn't have SDP.  That's why we were working on that.  It presents a couple of complications, but it's possible this way as well.   
         NEW SPEAKER:  So, is this a SIP user agent?   
         NEW SPEAKER:  Yes.  This is typically a wireless terminal, because you don't want to do third party call control over the air and this is an application server that does things on your behalf.   
         NEW SPEAKER:  So this is a back to back user agent?  Or is this --  
         NEW SPEAKER:  It's in the  
         (Several people talking.)  
         NEW SPEAKER:  In fact, no.  That's why I had some arrows in red, because it's possible to act as a proxy for the INVITE to B.  But of course toward B, you are a back to back user agent; you don't have any way to avoid it.  But it's actually possible to just proxy the request to B, which is the good thing about this flow.   
         NEW SPEAKER:  Okay.  So, just,  
         NEW SPEAKER:  Yes.   
         NEW SPEAKER:  And identification of the media stream: which is the relay service, or transcoding service, and which is just the normal call from B.  Maybe something in that area, or not; I was thinking about it.  It might be that the terminal wants to know if it's a normal call stream.   
         NEW SPEAKER:  Okay.  That would  
         NEW SPEAKER:  The actual invite, yes.   
         NEW SPEAKER:  Yes.   
         NEW SPEAKER:  A requirement, or a need, was identified here: how we would want to indicate, if B gets the request, if we do the A-initiated model, that this is actually going through a transcoder.  In the sense that I don't know if this is a legal requirement or just a convenience requirement.   
         NEW SPEAKER:  Actually, it would become an IAB requirement.  I'm thinking about this.  Because, in fact, it's the same operation.   
         NEW SPEAKER:  Yes, I would.  In general, I would want to know that the transcoding is happening, because that may introduce, I don't want the text stream to look the same as if you were typing it, because there may be inaccuracy from the speech-to-text technology.  I'd rather not take everything literally, like you said.  Because  
         NEW SPEAKER:  Two text streams, for example.  It also happens that I'm typing on a whiteboard or something, and that's also text, and it's the same protocol.  Which one is which?  You're just hammering on the keyboard, for example.  Or two voice streams, because there's another person involved, or a deaf person trying to talk while writing text, and the transcoding service puts the text also into voice; we have two voice streams to B, for example.  But they need something to see it.  Because in SIP, the other thing, an audio stream is an audio stream, a video stream is a video stream.  And we need to deal with it in another way.   
         NEW SPEAKER:  So, the question is, it appears to me like this would be dealt with with an indication in the media description in the session description protocol, because it's a property of each media stream, not of the call.   
         NEW SPEAKER:  Right.  It's going to be media stream specific.   
         NEW SPEAKER:  Pardon me for a minute.   
         (Several people talking.)  
         NEW SPEAKER:  This is the translation, or this is a subtitled version of it.  It can be just that, because it's for humans.   
         NEW SPEAKER:  Yes.  Is  
         NEW SPEAKER:  Is it any different than receiving two audio streams?   
         NEW SPEAKER:  No.  With information like we are --  
         NEW SPEAKER:  What you need, you want to actually be able to identify it in some unified manner, so that you can recognize it.  Because, since not all devices have the ability to render strings of text, it may well be sensible to just simply have a defined flag which says, this stream has been subject to translation, so that I can put a little asterisk next to it, or do whatever.  And in addition to that, I can say, the translation was performed by, courtesy of, whoever.   
         NEW SPEAKER:  So something for the machine consumption.   
         NEW SPEAKER:  Yes.   
         NEW SPEAKER:  It's just an a= attribute, say.   
         NEW SPEAKER:  I can make an even clearer picture.  We have a video conference call, and assume I know sign language.  The sign language relay or transcoding service has the same kind of video stream, with the little woman signing.  How does a device know that the caller goes on the big screen and the lady should be on the small screen in the corner?  Because it's using the same protocol, the same kind of video stream.  You need two separate streams, or in some cases it might be a node in between that pastes them together, but how do you know this is Gonzalo?  
         NEW SPEAKER:  Because it's MIME type video slash MPEG-4 dash sign language.   
         NEW SPEAKER:  No.  But you can do that.  If you get just one single stream, it's not a problem.  But when you get two, we have this problem anyway.   
         NEW SPEAKER:  Actually, the nice thing is, if it's not audio, it actually seems to be easier, to separate it.   
         NEW SPEAKER:  Yes.  But, in general, this is probably a generalizable concern; we talked about this a bit in the conferencing discussion, which is the layout problem: namely, how much of a hint do you want to give the recipient about how they should lay it out.  Because, for example, in conferencing you might want to align things spatially; audio comes from the left for a speaker shown on the left, that type of deal.
         Here, in order to do that, it's certainly always useful to know what the content is.  And again, having the ability to indicate it.  The question then becomes, is there a generalizable content label which basically says what type of content is part of that stream.  So far, I think, in SDP you can identify language.  The problem is, once you go down that road, you end up at MPEG-7, which does exactly that, because it can exactly identify what each stream is, down to who the actor is, or presumably here, who is doing the transcoding, the name of a person or anything else.  That's probably not what we want.
          
         NEW SPEAKER:  And actually, that's one question I don't know the answer to: is that information an attribute of the stream, meaning SDP, or is it an attribute of this leg of the session, meaning SIP?   
         NEW SPEAKER:  I thought it would be an attribute of a stream, because you could have a composite stream, and it can serve multiple purposes.  So there can certainly be multiple.  And as soon as there can be multiple, you don't want to have to do the mapping from the session down to the stream.  
         NEW SPEAKER:  Right.  Actually, now I'm imagining the invite coming in with all the SDP, and that's just it: it's all in the SDP.  It's going to be associated with the stream.   
         NEW SPEAKER:  So  
         NEW SPEAKER:  So it's an a= line.   
         NEW SPEAKER:  Yes.  It's an a= line.  The only question is, do we specifically do something here, or do we basically define a purpose or source or whatever field, where the value is sign language, or captioning (closed captioning if it's transmitted as a video signal, otherwise it's just text)?  But if other indications are conveyed as well, is this rich enough that it's worthwhile giving it generality?   
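         [The a= line being discussed might look like the following sketch.  The attribute name "purpose" and its values are illustrative assumptions only; no such attribute had been registered at the time of this discussion.]

```python
# Sketch of a hypothetical SDP "purpose" attribute labeling each stream.
# The attribute name and values are assumptions for illustration.

def media_section(media, port, fmt, purpose=None):
    """Build one SDP m-section, optionally tagged with a purpose label."""
    lines = [f"m={media} {port} RTP/AVP {fmt}"]
    if purpose is not None:
        lines.append(f"a=purpose:{purpose}")
    return "\r\n".join(lines) + "\r\n"

def purpose_of(section):
    """Recover the purpose label from an m-section, if any."""
    for line in section.splitlines():
        if line.startswith("a=purpose:"):
            return line[len("a=purpose:"):]
    return None

# Two video streams in one session: the speaker, and the signer
# produced by a transcoding service.
main = media_section("video", 49170, 96, purpose="main")
signer = media_section("video", 49172, 96, purpose="sign-language")
```

         With a label like this, a device could place the "main" stream on the big screen and the "sign-language" stream in the corner, without guessing from the media type alone.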
         NEW SPEAKER:  Well, now, I don't know whether it's popping up or popping down.  Looking at the requirements from the IAB: there is the requirement that you must let basically both parties know that there's a transcoding device.  Now, I think --  
         NEW SPEAKER:  There was only one party.   
         NEW SPEAKER:  At least one party.  At least one party had to know.   
         NEW SPEAKER:  Yes.   
         NEW SPEAKER:  Will we run afoul of that?  Given what the requirements here said, I guess we're not.  Because it said that, you know, maybe A does not want B to know that they have the transcoding services.  Well, as long as A knows they are being transcoded, that's okay.   
         NEW SPEAKER:  They are invoking the service specifically.  There's also a difference here: the party which is generating the media needs to know that it's going through a transcoding service, because they need to know that it gets translated along the way and may not arrive in the same condition --  
         NEW SPEAKER:  Well, that's the point.  So, you're hearing impaired.  I call you.  My stream is going to be transcoded.  But you don't want me to know that you're hearing impaired.   
         NEW SPEAKER:  Okay.  That's true.  That is different from the other requirement, where for both sides the network is in some sense the adversary, which you at least want to know about.  Here the network comes to the aid of one of the participants, who does not necessarily want to let the other one know.   
         NEW SPEAKER:  So we meet the requirement, because you know, it's one of the parties that knows.   
         NEW SPEAKER:  Yes, and I think it might be helpful in the draft to simply at least refer to those requirements and say, before we get into lots of heat,  
         NEW SPEAKER:  Right.   
         NEW SPEAKER:  We thought about this, and we may not have addressed it completely, but at least we thought about it.   
         NEW SPEAKER:  Right.   
         NEW SPEAKER:  Yes.  But actually, I still believe that the point you made in the previous session was right.  They were so concerned about intermediaries, but you said, if I want the intermediary to be there, like a SIP proxy, I don't see the problem.  But I guess they didn't quite get it.   
         NEW SPEAKER:  Okay.  So we have identified a very small action item.   
         NEW SPEAKER:  The only thing with the a= attribute is maybe we need a two-line MMUSIC draft for this.   
         NEW SPEAKER:  Exactly.   
         NEW SPEAKER:  Well, the question again is, how generalizable is it?  And it might well be worth simply putting it out there and seeing if it is more than what we have identified here.  If not, that's fine too.  Probably it's just simply a namespace registration.   
         NEW SPEAKER:  Yes.   
         NEW SPEAKER:  And a single line, which is identifiable.  And that's, it's a, I mean,  
         NEW SPEAKER:  MMUSIC, I guess, by like 2050 it will get standardized.   
         NEW SPEAKER:  That's it.   
         NEW SPEAKER:  It's an SDP a= attribute.   
         NEW SPEAKER:  That's quite a far estimation.   
         NEW SPEAKER:  MMUSIC is not exactly the fastest in the world.   
         NEW SPEAKER:  Again, I'll let you have the stamp.   
         NEW SPEAKER:  We may still get it, even in this century, as soon as the FCC gets around to talking to 3GPP.   
         NEW SPEAKER:  This is just a reference point.  Didn't we get a correspondence note from Keith that says make sure you account for 3GPP GTT translation, because that's going to be mandatory?   
         NEW SPEAKER:  What's GTT?  
         NEW SPEAKER:  Global Text Telephony.  That's what they call it.  GTT, Global Text Telephony.   
         NEW SPEAKER:  That's the same as TTY?  
         NEW SPEAKER:  Not quite.  The GTT work, which Ericsson pretty much led, accounts for doing TTY with RTP.   
         NEW SPEAKER:  Okay.  But it is similar in spirit so to say.   
         NEW SPEAKER:  But is there any document we have to read?  Or to try to comply with?   
         NEW SPEAKER:  Because that would be great to have that.   
         NEW SPEAKER:  As a reference.   
         NEW SPEAKER:  I'll probably get a reference here.   
         NEW SPEAKER:  But let's be careful about the global text telephony part, because they are just focused on using the audio channel or streaming text over RTP, and they might be limited in this way.  I did read some of that work, and one of those people was actually involved in making that protocol.   
         NEW SPEAKER:  People from Ericsson.   
         NEW SPEAKER:  Yes.  But there's also another function in 3GPP on the multimedia side.  But I don't have all the information.  And they might also want to get involved in Internet telephony, but more from the multimedia side.  More capabilities coming on, more than we have.  But GTT is difficult, because they are just for the FCC.   
         NEW SPEAKER:  As far as, I mean, clearly we are all in agreement that whatever transcoders there are, they are going to be only examples.  So the only role of calling out what they might be is simply to give people the notion that we're not just thinking of one specific type, and that the list is not going to be exhaustive.  I see no need.  The only real need we may have here is to make sure that if GTT is one, we identify what it is like: namely, whether it has an appropriate registration as a media type in SDP, so that we can actually negotiate it.  Because otherwise none of this is --  
         NEW SPEAKER:  But that's it.  Anything that you can have SDP for, we can translate.   
         NEW SPEAKER:  The only general issue that we need to know, for every media type that it handles here, is: do these types have appropriate media type registrations?  Because they may not.  So maybe one thing, since you're more familiar with all these translation techniques, now that we know about it, is to simply check what they actually do.  Is it just ASCII text over a TCP stream?  Or is it text over RTP?  Or is it --  
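         [The two possibilities raised here could be described in SDP roughly as follows.  The payload name "t140" follows ITU-T T.140 real-time text, and the exact tokens are assumptions pending the actual media type registrations being discussed.]

```python
# Two illustrative ways a text stream might appear in an SDP offer,
# depending on transport. Tokens ("t140", the dynamic payload type 98,
# the 1000 Hz clock) are assumptions modeled on T.140-style real-time
# text, not confirmed registrations.

def text_over_tcp(port):
    """Plain text carried over a TCP stream."""
    return f"m=text {port} TCP t140\r\n"

def text_over_rtp(port, pt=98, clock=1000):
    """Text carried over RTP, bound to a dynamic payload type."""
    return (f"m=text {port} RTP/AVP {pt}\r\n"
            f"a=rtpmap:{pt} t140/{clock}\r\n")
```

         The point of the action item is that without a registered name to put in the m= line or rtpmap, neither form can be negotiated at all.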
         NEW SPEAKER:  The GTT stuff is --  
         NEW SPEAKER:  I can say it is audio tones over RTP.  That is what global text telephony will do.  That will require the endpoint to know how to deal with the tones.  So it's beep beep beep; that is, in fact, what global text telephony will do.   
         NEW SPEAKER:  So it is actually modem tones, effectively?   
         NEW SPEAKER:  But they're bizarre tones.   
         NEW SPEAKER:  It's like DTMF, effectively, so every character has its own tone.   
         NEW SPEAKER:  No, it's like a 5-bit ugly thing.   
         NEW SPEAKER:  It's one of the  
         NEW SPEAKER:  (unintelligible)  
         NEW SPEAKER:  It's old technology.   
         NEW SPEAKER:  Yes, and there's also a variation, which is real-time text over RTP.   
         NEW SPEAKER:  Yes, I'm familiar with that.  
         NEW SPEAKER:  The reference is 23.226, if you're interested, of the 3GPP specifications.   
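         [The "real-time text over RTP" variation mentioned above can be sketched as a packet layout.  The 12-byte fixed header is the standard RTP header; payload type 98 is a dynamic value that would be bound via a=rtpmap, and carrying UTF-8 text in the payload follows the spirit of T.140-over-RTP, as an assumption rather than a confirmed format.]

```python
import struct

# Minimal sketch of an RTP packet carrying real-time text as UTF-8.
# Fields follow the standard 12-byte fixed RTP header; the payload
# type (98) is an assumed dynamic value.

def rtp_text_packet(seq, timestamp, ssrc, text, pt=98, marker=0):
    v_p_x_cc = 2 << 6                  # version 2, no padding/extension/CSRC
    m_pt = (marker << 7) | pt          # marker bit plus payload type
    header = struct.pack("!BBHII", v_p_x_cc, m_pt, seq, timestamp, ssrc)
    return header + text.encode("utf-8")

pkt = rtp_text_packet(seq=1, timestamp=0, ssrc=0x1234, text="hello")
```

         This is one character group per packet; contrast it with the GTT approach just described, where the characters survive as audio tones inside an ordinary voice stream.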
         NEW SPEAKER:  Okay.  So one action item for somebody in this group is to make sure that all of these have registrations.  I don't think we should do this ourselves; generally it's best if the domain experts do it.  So we make sure that we contact, if we can, the people who do GTT and the people who do ViZe-talk.  They should register at least a name.  And if there's an RTP definition or payload format, make sure that we can do more than just talk about it, but actually negotiate it.  Do you know the ViZe-talk people?   
         NEW SPEAKER:  I know one person, but he's very slow to respond to e-mail.  But Gonzalo also knows him.   
         NEW SPEAKER:  Who?   
         NEW SPEAKER:  I might know him, but I'm not sure.   
         NEW SPEAKER:  No.   
         NEW SPEAKER:  I can try to find out.  There are more people in GTT; I can try to find out.   
         NEW SPEAKER:  Okay.   
         NEW SPEAKER:  Well, we do what we can.  There's not much; if we can't do it, we can't do it.   
         NEW SPEAKER:  Okay.  Are there any other action items?  I guess, if it takes two people to write a ten-line SDP draft, between one or two of us we can crank one out quickly.   
         NEW SPEAKER:  Yes.   
         NEW SPEAKER:  Okay.  So I believe we now have assignments.  We're running out of time here, so unless there are other technical action items, at this point I would propose that we report back to the group, summarizing what we have done.  And hopefully within, let's say, roughly a month.   
         NEW SPEAKER:  Yes.  A month.   
         NEW SPEAKER:  Roughly a month, we'll have the two drafts out.  And we have the action item, which I'll take on (I'll cc the group on it), to offer Dan to incorporate user preferences.   
         NEW SPEAKER:  Yes.   
         NEW SPEAKER:  Good.   
         NEW SPEAKER:  Okay.   
         NEW SPEAKER:  Happy, Dean?   
         NEW SPEAKER:  Yes.   
         NEW SPEAKER:  Whatever.   
         NEW SPEAKER:  Dean is happy, we are happy.   
         NEW SPEAKER:  That's right.   
         NEW SPEAKER:  And so, could you e-mail it?  If you send it to me, I can put it on the web site, for the minutes of the meeting.   
         NEW SPEAKER:  Those are the most accurate minutes we'll ever have.  


updated 17 Jul 2002 23:41 -0500