Issue65

Title: Re: [Speechsc] Speaker Verification (Section 11) review comments->additional comments
Priority: bug
Status: resolved
Superseder:
Nosy List: dburke, dburnett, eburger, oran, sarvi
Assigned To: dburnett
Topics:

Created on 2006-03-24.19:44:55 by dburnett, last changed 2006-03-24.20:51:37 by dburnett.

Messages
msg125 Author: dburnett Date: 2006-03-24.20:51:37
65-4 superseded by 66.
65-6 superseded by 67.
65-8 superseded by 68.
65-9 superseded by 69.
65-11 superseded by 70.
65-14 superseded by 71.
msg118 Author: dburnett Date: 2006-03-24.20:01:25
1. Updated in draft.
2. All START-SESSION and VERIFY-FROM-BUFFER examples corrected.
3. In section 11 clarified that simultaneous recognition and verification is 
established by allocating the resources in the same SIP dialog.
5. I agree with Dave.  I have replaced <num-frames> with <utterance-length> in 
all examples.  Removed <num-frames> from the schema.
7. Removed "et-phoned-home" as an option for <device>.
10. Removed <extensions> from all examples.
12. Corrected conditions in 11.4.3.
13. As suggested, modified voiceprint-identifier BNF to permit unlimited 
VCHARs after the period.
15. ** No changes made **
16. Renamed all "voice-print" and "voice print" occurrences to "voiceprint".
msg115 Author: dburnett Date: 2006-03-24.19:44:55
Hi Sarvi,
 
Thanks for the feedback. One or two additional comments / clarifications below.
 
Dave
----- Original Message ----- 
From: Shanmugham, Saravanan 
To: Dave Burke ; speechsc@ietf.org 
Sent: Friday, March 17, 2006 1:21 AM
Subject: RE: [Speechsc] Speaker Verification (Section 11) review comments->additional comments

--------------------------------------------------------------------------------
From: speechsc-bounces@ietf.org [mailto:speechsc-bounces@ietf.org] On Behalf 
Of Dave Burke
Sent: Sunday, January 15, 2006 10:11 AM
To: Dave Burke; speechsc@ietf.org
Subject: Re: [Speechsc] Speaker Verification (Section 11) review comments ->additional comments


Six more comments on Section 11:
 
11. Missing state machine diagram
[Sarvi>>] Can be added 
 
12. In section 11.4.3, it says "...voiceprint identifier headers of the VERIFY 
method". However, Voiceprint-Identifier is placed in START-SESSION not VERIFY
[Sarvi>>] It should be START-SESSION 
 
13. The BNF is restricting the Voiceprint-Identifier to have only 3 characters 
after the period. None of the examples follow this. Why the restriction in 
length? Suggest:
 
 voiceprint-identifier  =  "Voiceprint-Identifier" ":"
                                   1*VCHAR "." 1*VCHAR
                                   *[";" 1*VCHAR "." 1*VCHAR] CRLF
[Sarvi>>] I don't see the restriction above. The above BNF should match 
AAAAAA.BBBBBB as well.  Am I looking at this wrong?
 
DB> The BNF above is the fix I'm suggesting. The one in the spec on page 129 
uses 3VCHAR in place of 1*VCHAR and would only match AAAAAA.BBB
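
The difference between the two forms can be checked with a small sketch. 
This is only an illustration, assuming the spec's rule differs from the 
suggested one solely in using 3VCHAR after the period, as described above. 
The regular expressions and the helper names (build, matches, STRICT, RELAXED) 
are hypothetical translations of the BNF, not text from the draft.

```python
import re

# VCHAR is visible ASCII (%x21-7E in ABNF); note that it includes "." and ";".
VCHAR = r"[\x21-\x7E]"


def build(suffix):
    """Translate the header rule into a regex, given the part after the period."""
    pair = rf"{VCHAR}+\.{suffix}"
    return re.compile(rf"Voiceprint-Identifier:{pair}(?:;{pair})*\r\n")


# Assumed form of the rule as currently written: exactly 3 VCHARs after the period.
STRICT = build(rf"{VCHAR}{{3}}")
# Suggested form: one or more VCHARs after the period.
RELAXED = build(rf"{VCHAR}+")


def matches(pattern, line):
    """Return True if the whole header line matches the pattern."""
    return pattern.fullmatch(line) is not None


line = "Voiceprint-Identifier:AAAAAA.BBBBBB\r\n"
print(matches(STRICT, line), matches(RELAXED, line))
```

With the 3VCHAR form, AAAAAA.BBBBBB is rejected and AAAAAA.BBB is accepted; 
with the 1*VCHAR form both are accepted.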
 
 
 
14. What kind of values does <decision> take when (a) training has been 
performed or (b) for multi-verification (can more than one voice-print 
be "accepted"?)
[Sarvi>>] For training, I don't think any of the <voiceprint> elements should 
contain a <decision> element. For a multi-verification result, the value 
could be rejected, accepted, or undecided. I believe there should be only one 
<voice-print> with a decision element of the above possible values. 
 
DB> Sorry - my query is only relevant to training. Perhaps 11.5.4 should then 
have a sentence clarifying that <decision> is not present for training results?
 
15. Minor inconsistency: Why does <verification-score> range from -1.0 to 1.0 
whereas confidence (for ASR) ranges from 0 to 1.0? Why not align the 
<verification-score> range with the confidence range?
[Sarvi>>] I don't remember this clearly, but I believe there was an earlier 
discussion on the meaning of this verification score and how it should be 
interpreted, and the decision was to go with a negative-to-positive range.
 
DB> Fine - I suppose partly it's because the value is not to be treated as a 
formal probability.
 
16. Editorial: Use voice-print or voiceprint but be consistent.
[Sarvi>>] Ok, I'd go with voiceprint; it seems the more common occurrence. 
----- Original Message ----- 
From: Dave Burke 
To: speechsc@ietf.org 
Sent: Saturday, January 14, 2006 2:26 PM
Subject: [Speechsc] Speaker Verification (Section 11) review comments


Hello,
 
Had cause to review Section 11 of MRCPv2-09. Needs editorial attention - 
please see below:
 
Dave
 
1. Typos
 
Respository-URI -> Repository-URI
Voiceprint-Identity -> Voiceprint-Identifier
[Sarvi>>] ok 
 
2. Error in examples
 
According to the spec:
 
The value of the Verification-Mode header MUST be one of either "train" 
or "verify".
 
... yet none of the examples include said header (and one erroneously places 
it in the VERIFY-FROM-BUFFER message - it is only meant to be present in the 
START-SESSION message).
[Sarvi>>] correct. 
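
A check along these lines could catch both problems in the examples. It is a 
minimal sketch: the function name and the plain dict of headers are 
hypothetical, and treating a missing header in START-SESSION as an error is an 
assumption drawn from the comment above rather than from the spec text.

```python
VALID_MODES = {"train", "verify"}


def check_verification_mode(method, headers):
    """Return a list of problems with the Verification-Mode header.

    `method` is the MRCP method name and `headers` is a plain dict of header
    name to value (a hypothetical representation chosen for this sketch).
    """
    problems = []
    mode = headers.get("Verification-Mode")
    if method == "START-SESSION":
        if mode is None:
            # Assumption: the header is expected in every START-SESSION.
            problems.append("START-SESSION is missing Verification-Mode")
        elif mode not in VALID_MODES:
            problems.append(f"invalid Verification-Mode: {mode!r}")
    elif mode is not None:
        # The header is only meant to appear in START-SESSION.
        problems.append(f"Verification-Mode not expected in {method}")
    return problems


print(check_verification_mode("VERIFY-FROM-BUFFER", {"Verification-Mode": "verify"}))
```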
 
3. Not well defined how to specify shared resources:
 
The current text for sharing sessions between a co-resident recogniser or 
recorder and a speaker verification engine is restrictive and not accurately 
specified. The key point is that the related resources are related because 
they were allocated within the same SIP dialog and not that they were 
allocated within the same (INVITE) message transaction.
 
Suggest changing:
 
   It is possible for a speaker verification resource to share the same
   session with a recognizer resource or to operate in independently.
   In order to share the same session, the SDP/SIP INVITE message for
   the verification resource MUST also include the recognizer resource
   request
 
to:
 
   It is possible for a speaker verification resource to share the same
   session with a recognizer resource or to operate independently.
   In order to share the same session, the verification and recognizer
   resources must be allocated from within the same SIP dialog.
[Sarvi>>] I believe this was the intent. The idea was that we may want to 
start with just a Recorder/Recognizer and then add/drop the verification 
engine as needed, through a re-INVITE.
This clarification will be made.  
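
The distinction can be modelled in a few lines. This is a sketch only: the 
AllocatedResource type, its field names, and the "recognizer"/"verifier" 
labels are hypothetical and are not identifiers from the draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllocatedResource:
    kind: str        # hypothetical label, e.g. "recognizer" or "verifier"
    dialog_id: str   # hypothetical identifier for the SIP dialog
    invite_seq: int  # hypothetical: 1 = initial INVITE, 2+ = later re-INVITEs


def share_session(a, b):
    """Resources share a session when allocated within the same SIP dialog,
    regardless of which INVITE or re-INVITE allocated each of them."""
    return a.dialog_id == b.dialog_id


recognizer = AllocatedResource("recognizer", dialog_id="d1", invite_seq=1)
verifier = AllocatedResource("verifier", dialog_id="d1", invite_seq=2)
print(share_session(recognizer, verifier))
```

Here the verifier is added by a later re-INVITE yet still shares the session, 
which the original wording (same INVITE message) would have ruled out.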
 
4. <result-type> not defined anywhere in the spec. Doesn't appear in schema. 
Probably not necessary.
[Sarvi>>] I think the schema needs to be fixed. I believe the verification 
result carries this information to differentiate a training result from a 
verification result. Though the client should already know that, I think it 
might help to make the distinction within the XML. 
 
5. <num-frames> not defined anywhere in the spec.
[Sarvi>>] Will add a definition for this. Will send out proposed text for 
this to make sure there is no objection. 
 
DB> Actually, maybe <utterance-length> is the element intended to contain this 
information in which case we just need to replace <num-frames> in the examples 
with <utterance-length>?
 
6. Not clear for some elements if they're required or optional (section 11.5.x)
[Sarvi>>] Will clarify 
 
7. Define values in section 11.5.6. Presumably "et-phoned-home" is in context 
only if we publish on 04/01/xx?
[Sarvi>>] Do not understand. Could you please explain? 
 
DB> It would be good if each of the <device> value types had a brief 
explanation of its meaning. It seems like "et-phoned-home" might be a 
joke referring to a certain Spielberg movie about an extraterrestrial ("ET 
phone home")?
 
8. Examples missing the xmlns in NLSML in VERIFICATION-COMPLETE message 
bodies. Actually, shouldn't the  http://www.ietf.org/xml/ns/mrcpv2 namespace 
apply to all NLSML documents throughout the specification not just those 
associated with verification?
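
A quick way to audit the examples for this would be something like the 
following. It is a rough sketch, assuming the root element of each NLSML body 
is <result>; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

MRCPV2_NS = "http://www.ietf.org/xml/ns/mrcpv2"


def has_mrcpv2_namespace(xml_text):
    """Return True if the root <result> element is in the MRCPv2 namespace."""
    root = ET.fromstring(xml_text)
    # ElementTree spells a namespaced tag as "{namespace-uri}localname".
    return root.tag == f"{{{MRCPV2_NS}}}result"


print(has_mrcpv2_namespace('<result xmlns="http://www.ietf.org/xml/ns/mrcpv2"/>'))
print(has_mrcpv2_namespace("<result/>"))
```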
 
9. What does the grammar attribute on <result> mean in the context of 
verification?
[Sarvi>>] I believe this could contain the grammar URI that was matched with 
the RECOGNIZE command. But it probably wouldn't make much sense in many cases 
where there may not be an associated RECOGNIZE operation.
I think we should say that this attribute should be ignored for 
verification results.  
 
10. Many examples include <extensions> in their NLSML. Presumably this needs 
to be deleted (since the element is neither defined nor specified)?
[Sarvi>>] Yes.
 
Thx,
Sarvi


--------------------------------------------------------------------------------


_______________________________________________
Speechsc mailing list
Speechsc@ietf.org
https://www1.ietf.org/mailman/listinfo/speechsc



History
Date                 User      Action  Args
2006-03-24 20:51:37  dburnett  set     status: in-progress -> resolved
                                       messages: + msg125
2006-03-24 20:01:25  dburnett  set     nosy: + oran, eburger
                                       messages: + msg118
2006-03-24 19:44:56  dburnett  create