author     Volker Hilsheimer <volker.hilsheimer@qt.io>  2022-04-09 13:27:39 +0200
committer  Volker Hilsheimer <volker.hilsheimer@qt.io>  2023-02-18 13:47:13 +0100
commit     ea5c48e518789c3387ed9c9d21978eda122e9782 (patch)
tree       1962292820c408f60a0d4a047dc818c671f3ab25 /examples/speech/quickspeech/main.qml
parent     d90b30934beb053f5380b66e9bf089e15efa4b51 (diff)
Emit information about speech progress
This is useful information in a UI that wants to visualize the progress
by highlighting the words and sentences as they get read. For this to
work, we ideally emit data through a signal for each word that allows
an application to relate the progress information to the text that was
previously passed into QTextToSpeech::say, i.e. the index and length
of the word within that text.
Implement this for all engines where we can, and add a test that
verifies that we get correct information:
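The meaning of the (start, length) pair described above can be illustrated with a small stand-alone sketch (plain C++, not the actual Qt API or any backend code): a hypothetical tokenizer that produces, for each word in the text passed to say(), the slice an engine would report, so that an application can select or highlight exactly that word.

```cpp
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: compute the (start, length) pairs an engine
// might report for each word in the text passed to say().
// Each pair indexes into the original text, so the receiver can
// highlight the word with e.g. select(start, start + length).
std::vector<std::pair<std::size_t, std::size_t>> wordBoundaries(const std::string &text)
{
    std::vector<std::pair<std::size_t, std::size_t>> result;
    std::size_t i = 0;
    while (i < text.size()) {
        // Skip whitespace between words.
        while (i < text.size() && std::isspace(static_cast<unsigned char>(text[i])))
            ++i;
        const std::size_t start = i;
        // Consume one word.
        while (i < text.size() && !std::isspace(static_cast<unsigned char>(text[i])))
            ++i;
        if (i > start)
            result.push_back({start, i - start});
    }
    return result;
}
```

For the example text "Hello, world!", this yields (0, 6) and (7, 6), matching how the QML example below uses `input.select(start, start + length)`.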
On the macOS and Darwin backends, the delegate gets called for each
word about to be spoken, with the index and length of the content
relative to the text. We don't get access to more detailed
information, such as the length of the stream in seconds or samples,
or the current playback state.
Android provides an equivalent listener callback that tells us which
slice of the text is about to be spoken.
In the WinRT backend, we can ask the speech synthesizer to generate
track data for the generated audio, which gives us the start time of
each sentence and word. Since we play the PCM data ourselves, we don't
get called with progress updates, but we can use the track information
to run a timer that iterates over the boundaries with each tick. This
risks getting out of sync with the actual playback, but we can try to
compensate for that.
We can use a similar strategy with flite, where the symbol tree
provides start times for each token, so we can use a timer and follow
the progress through the input text for each token.
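The timer-driven strategy described for the WinRT and flite backends boils down to a lookup: given a list of word boundaries with start times, find the boundary currently being spoken at a given elapsed time. A minimal sketch of that lookup (hypothetical names, not the actual backend code):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical word boundary as the WinRT/flite strategies would track it:
// a start time in the audio stream, plus the slice of the input text.
struct WordBoundary {
    double startTimeMs;  // when this word starts in the generated audio
    std::size_t start;   // index into the text passed to say()
    std::size_t length;  // length of the word within that text
};

// Returns the index of the boundary being spoken at elapsedMs, or -1
// before the first word starts. Boundaries must be sorted by start time.
// A timer tick would call this with the elapsed playback time and emit
// progress when the result changes.
int currentBoundary(const std::vector<WordBoundary> &boundaries, double elapsedMs)
{
    // First boundary whose start time is strictly after elapsedMs...
    auto it = std::upper_bound(boundaries.begin(), boundaries.end(), elapsedMs,
                               [](double t, const WordBoundary &b) {
                                   return t < b.startTimeMs;
                               });
    if (it == boundaries.begin())
        return -1; // playback hasn't reached the first word yet
    // ...so the boundary before it is the one currently being spoken.
    return static_cast<int>(std::distance(boundaries.begin(), it)) - 1;
}
```

Because the timer runs independently of the audio device, the elapsed time fed into such a lookup can drift from actual playback, which is the synchronization risk the commit message mentions.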
On speechd we don't have reliable access to anything; it theoretically
supports reporting of embedded <mark> tags when the input is SSML. So
for now, speechd cannot support this functionality.
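For reference, the <mark> mechanism mentioned above is defined by the SSML specification: a mark element is a named, zero-width marker, and a synthesizer that supports it reports when playback passes each marker. A hypothetical input (marker names invented for illustration) might look like:

```xml
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <mark name="w0"/>Hello, <mark name="w1"/>world!
</speak>
```

An engine reporting mark events for "w0" and "w1" would let a client map progress back to positions in the original text, but generating and tracking such markup is beyond what the speechd backend currently does.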
Add highlighting of the spoken word to the Qt Quick example.
Change-Id: I36ff208b2f0112c9eb261864515ba20c4bf55f25
Reviewed-by: Axel Spoerl <axel.spoerl@qt.io>
Reviewed-by: Qt CI Bot <qt_ci_bot@qt-project.org>
Diffstat (limited to 'examples/speech/quickspeech/main.qml')
-rw-r--r--  examples/speech/quickspeech/main.qml  7
1 file changed, 7 insertions, 0 deletions
```diff
diff --git a/examples/speech/quickspeech/main.qml b/examples/speech/quickspeech/main.qml
index 2ad7895..7bdcaf7 100644
--- a/examples/speech/quickspeech/main.qml
+++ b/examples/speech/quickspeech/main.qml
@@ -39,6 +39,12 @@ ApplicationWindow {
             }
         }
         //! [stateChanged]
+
+        //! [sayingWord]
+        onSayingWord: (start, length)=> {
+            input.select(start, start + length)
+        }
+        //! [sayingWord]
     }

     ColumnLayout {
@@ -52,6 +58,7 @@ ApplicationWindow {
             text: qsTr("Hello, world!")
             Layout.fillWidth: true
             Layout.minimumHeight: implicitHeight
+            font.pointSize: 24
         }
         //! [say0]
         RowLayout {
```