220 Commits

Author SHA1 Message Date
Vanka0051
42a73394e9 Merge pull request #537 from pandalee99/perf/gumbel_softmax_sampler
feat(sampler): enhance with greedy sampling mode
2025-11-07 15:32:31 +08:00
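The log does not show the sampler's code. As a rough illustration only, the Gumbel-max trick picks a token by adding Gumbel noise to temperature-scaled logits and taking the argmax, while a greedy mode skips the noise and returns the highest logit directly. The function name `gumbel_sample` and all of its parameters are hypothetical and are not taken from the repository.

```python
import math
import random


def gumbel_sample(logits, temperature=1.0, greedy=False, rng=None):
    """Return the index of the chosen token (hypothetical helper, not the project's API)."""
    if greedy or temperature <= 0:
        # Greedy mode: no noise, just the index of the largest logit.
        return max(range(len(logits)), key=lambda i: logits[i])

    rng = rng or random.Random()
    best_index, best_score = 0, -math.inf
    for i, logit in enumerate(logits):
        # Clamp the uniform draw away from 0 and 1 so both logarithms are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        score = logit / temperature + gumbel_noise
        if score > best_score:
            best_index, best_score = i, score
    return best_index
```

Taking the argmax of the noisy scores is equivalent to sampling from the softmax of `logits / temperature`, which is why the same code path can serve both a stochastic mode and a greedy mode.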
Vanka0051
a963ca3d0b Merge branch 'main' into perf/gumbel_softmax_sampler 2025-11-07 15:32:22 +08:00
Vanka0051
44b5ffcb5d Merge pull request #536 from storyicon/support_batch
feat(accel): add batch support
2025-11-07 15:29:53 +08:00
PAN
a3b884ff6f feat: gumbel_softmax_sampler
Signed-off-by: PAN <1162953505@qq.com>
2025-11-06 13:12:11 +08:00
storyicon
3c360273da feat: add batch support
Signed-off-by: storyicon <storyicon@foxmail.com>
2025-11-06 05:00:28 +00:00
Vanka0051
1d5d079aaa Merge pull request #517 from storyicon/gpt2_accel
feat: achieve inference acceleration for the gpt2 stage (3.79×)
2025-10-30 16:14:46 +08:00
Vanka0051
5d67f6271b Merge branch 'main' into gpt2_accel 2025-10-30 16:14:34 +08:00
Vanka0051
e42480ced8 Merge pull request #516 from storyicon/s2mel_accel
feat: achieve inference acceleration for the s2mel stage (1.61×)
2025-10-30 15:55:25 +08:00
storyicon
c1ef4148af feat: achieve inference acceleration for the gpt2 stage
Signed-off-by: storyicon <storyicon@foxmail.com>
2025-10-24 08:15:00 +00:00
storyicon
31e7e855e2 feat: optimize s2mel stage
Signed-off-by: storyicon <storyicon@foxmail.com>
2025-10-24 07:30:20 +00:00
wangyining02
bde7d0bdf0 doc: update chat groups 2025-10-10 14:01:36 +08:00
nanaoto
db5b39bb6a Merge pull request #461 from Yttrin/main
fix: Empty generator -> IndexError problem on non-streaming infer()
2025-10-02 01:09:44 +08:00
Yt Zhong
750d9d9d15 fix: Empty generator -> IndexError problem on non-streaming infer() 2025-10-01 03:25:28 +08:00
Yt Zhong
b0c6ab8a93 Simple streaming return implementation, lower latency for the first sound. (#417)
* Add stream_return switch to get wavs from yield

* Add more_segment_before arg for more segmenting.

more_segment_before is an int: for token_index < more_segment_before, more segmenting will be applied.
0: no effect; 80 is recommended for better first-wav-latency

* Uncomment silence insertion

* fix: rename quick streaming tokens argument

* fix: Add a wrapper for the yield function, so that it does not return a generator under normal conditions.
2025-09-30 14:05:39 +08:00
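The wrapper pattern described in this commit can be sketched as follows. This is a minimal illustration, assuming an inner generator and an outer function that unwraps it for non-streaming callers; the names `_infer_generator` and `infer` and the string stand-ins for audio are hypothetical, and only the `stream_return` switch comes from the commit message. The guard on an empty result list shows one way to avoid the `IndexError` that a bare `results[-1]` would raise on an empty generator.

```python
def _infer_generator(segments, stream_return):
    """Hypothetical inner generator: one chunk per segment when streaming,
    otherwise a single joined result at the end."""
    chunks = []
    for segment in segments:
        wav = f"wav({segment})"  # stand-in for synthesized audio
        if stream_return:
            yield wav
        else:
            chunks.append(wav)
    if not stream_return and chunks:
        yield "+".join(chunks)


def infer(segments, stream_return=False):
    """Return a generator only when streaming was requested."""
    gen = _infer_generator(segments, stream_return)
    if stream_return:
        return gen
    results = list(gen)
    # An empty generator gives an empty list, so check before indexing.
    return results[-1] if results else None
```

With this shape, `infer(["a", "b"])` returns a plain value, while `infer(["a", "b"], stream_return=True)` returns a generator the caller can iterate for lower first-chunk latency.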
nanaoto
2ca41d738f Merge pull request #441 from coezbek/patch-3
Feat: Warn if input text contains UNK tokens
2025-09-29 16:15:48 +08:00
Christopher Özbek
34be9bfb14 feat: Warn if input text contains UNK tokens
Added warnings for unknown tokens in input text.
2025-09-27 09:08:18 +02:00
root
84a5ef97b8 update. 2025-09-23 17:52:52 +08:00
nanaoto
c7602c1f59 Merge pull request #397 from Arcitec/indextts2-arc
IndexTTS2 Documentation Update
2025-09-23 15:58:55 +08:00
Arcitec
5471d8256f docs: Install HuggingFace CLI with high-speed download feature
The Xet storage method uses de-duplication and chunked downloads to speed up transfers in some situations:

https://pypi.org/project/hf-xet/

But most importantly, installing the Xet support gets rid of some annoying HuggingFace CLI messages about missing the feature.
2025-09-19 21:39:52 +02:00
Arcitec
ae5653986c chore: Note why package build isolation was disabled for DeepSpeed 2025-09-19 02:35:27 +02:00
Arcitec
cc9c6b6cfe docs: Clarify that UV handles Python and the environment creation
- Some users have been confused and were manually creating and activating Python venvs, which is not good since it can lead to the wrong Python version or dependency conflicts.

- Therefore, we add more detailed guidance to explain that `uv` manages the whole environment, the Python version, all dependencies and automatic environment activation.

- A few users were also confused about where `uv tool` installs binaries, but instead of explaining that in depth, we now add a link to the documentation page which explains how it works, and also instruct users to carefully read the `uv tool` output since it tells them how to add the installation to the system's path.
2025-09-18 20:28:11 +02:00
kj863257rc
64cb31a6c3 Update infer_v2.py: solve the problem of persistent cache buildup (#382)
* Update infer_v2.py

clear old cache

* Update infer_v2.py: solve the problem of persistent cache buildup

clear old cache
2025-09-18 13:59:45 +08:00
nanaoto
9e391a920a Merge pull request #354 from Arcitec/indextts2-arc
IndexTTS2 Maintenance Patches
2025-09-18 13:58:23 +08:00
Arcitec
c24d73ea44 chore: Small dependency updates 2025-09-17 21:55:44 +02:00
Arcitec
ec368de932 fix(webui): Experimental checkbox bugfixes and add visual warning label
- We can't use the original "Show experimental features" checkbox implementation, because it *deeply* breaks Gradio.

- Gradio's `gr.Examples()` API binds itself to the original state of the user interface. Gradio crashes and causes various bugs if we try to change the available UI controls later.

- Instead, we must use `gr.Dataset()` which acts like a custom input/output control and doesn't directly bind itself to the target control. We must also provide a secret, hidden "all mode choices" component so that it knows the names of all "control modes" that are possible in examples.

- We now also have a very visible warning label in the user interface, to clearly mark the experimental features.

- Bugs fixed:

* The code was unable to toggle the visibility of Experimental demos in the Examples list. It was not possible with Examples (since it's a wrapper around Dataset, but Examples contains its own internal state/copy of all data). Instead, we use a Dataset and manipulate its list directly.

* Gradio crashes with a `gradio.exceptions.Error` exception if you try to load an example that tries to use an experimental feature if we have removed its UI element. This is because Examples binds to the original user interface and *remembers* the list of choices, and it *cannot* dynamically select something that did not exist when the `gr.Examples()` was initially created. This problem is fixed by switching to `gr.Dataset()`.

* Furthermore, Gradio's `gr.Examples()` handler actually remembers and caches the list of UI options. So every time we load an example, it rewrites the "Emotion Control Mode" selection menu to only show the options that were available when the Examples table was created. This means that even if we keep the "Show experimental features" checkbox, Gradio itself will erase the experimental mode from the Control Mode selection menu every time the user loads an example. There are no callbacks or "update" functions to allow us to override this automatic Gradio behavior. But by switching to `gr.Dataset()`, we completely avoid this deep binding.

* The "Show experimental features" checkbox is no longer tied to a column in the examples-table, to avoid fighting between Gradio's example table trying to set the mode, and the experimental checkbox being toggled and also trying to set the mode.

* Lastly, the "Show experimental features" checkbox now remembers and restores the user's current mode selection when toggling the checkbox, instead of constantly resetting to the default mode ("same as voice reference"), to make the UI more convenient for users.
2025-09-17 21:54:48 +02:00
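The last fix in this commit (keeping the user's mode selection when the checkbox is toggled) can be illustrated without any Gradio code. The sketch below is an assumption about the logic, not the WebUI's actual implementation: the function name `toggle_experimental` and every mode name except "Same as voice reference" are hypothetical.

```python
DEFAULT_MODE = "Same as voice reference"
# The following mode names are hypothetical placeholders.
BASE_MODES = [DEFAULT_MODE, "Emotion reference audio", "Emotion vectors"]
EXPERIMENTAL_MODE = "Experimental mode (hypothetical name)"


def toggle_experimental(show_experimental, current_mode):
    """Return (choices, selected) after the checkbox changes."""
    choices = list(BASE_MODES)
    if show_experimental:
        choices.append(EXPERIMENTAL_MODE)
    # Keep the user's selection whenever it is still offered;
    # fall back to the default only if it has been hidden.
    selected = current_mode if current_mode in choices else DEFAULT_MODE
    return choices, selected
```

In a Gradio app, the returned pair would typically be fed into an update of the mode selector, so the menu choices and the selected value change together.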
Arcitec
c5f9a31127 fix(webui): Make the Emotion Control Weight slider visible again
- The emotion weight is always applied in every mode except "Same as voice reference", so we must make the slider visible so that users can control the value. Otherwise it would silently apply the last-set value without the user knowing, which is very confusing.

- Furthermore, having the slider even on the Emotion Vectors page is *very* useful, because it allows users to rapidly change the total strength of the current emotion vectors without having to manually/carefully move every individual emotion slider.
2025-09-17 19:56:07 +02:00
Arcitec
e185fa1ce7 fix(webui): Make the Advanced Settings visible by default again
- The Advanced Settings contains some very advanced features which users shouldn't tweak, but it also contains important insight into segmentation generations, and the "max tokens per generation segment" feature which users must tweak if they have low VRAM.

- Therefore it's very important that users notice the "Advanced Settings" section so that they can read the VRAM help text and reduce the segment length if they have VRAM issues. So let's make the advanced category visible by default again until a better solution is determined.
2025-09-17 19:56:07 +02:00
Arcitec
c266910cc6 refactor(webui): Remove repeated code in Examples loader 2025-09-17 19:56:07 +02:00
Arcitec
8aa8064a53 feat: Add reusable Emotion Vector normalization helper
- The WebUI was secretly squashing all emotion vectors and re-scaling them. It's a good idea for user friendliness, but it makes it harder to learn what values will work in Python when using the WebUI for testing.

- Instead, let's move the normalization code into IndexTTS2 as a helper function which is used by Gradio and can be used from other people's code too.

- The emotion bias (which reduces the influence of certain emotions) has also been converted into an optional feature, which can be turned off if such biasing isn't wanted. And all biasing values have been re-scaled to use 1.0 as the reference, to avoid scaling relative to 0.8 (which previously meant that it applied double scaling).
2025-09-17 19:56:07 +02:00
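A normalization helper of the kind this commit describes might look like the sketch below. It is an assumption-laden illustration: the function name `normalize_emotion_vector`, the `bias` values, and the `max_total` cap are all hypothetical, since the log does not give the real numbers. The only details taken from the commit are that the bias is optional and that bias factors are expressed relative to 1.0.

```python
def normalize_emotion_vector(vector, apply_bias=True, bias=None, max_total=1.0):
    """Optionally bias each emotion, then squash the vector if its sum is too large.

    bias and max_total are hypothetical parameters for illustration only.
    """
    if bias is None:
        bias = [1.0] * len(vector)  # 1.0 means "no reduction" for that emotion
    if len(bias) != len(vector):
        raise ValueError("bias and vector must have the same length")

    if apply_bias:
        scaled = [v * b for v, b in zip(vector, bias)]
    else:
        scaled = list(vector)

    total = sum(scaled)
    if total > max_total:
        # Rescale proportionally so the components sum to max_total.
        factor = max_total / total
        scaled = [v * factor for v in scaled]
    return scaled
```

Keeping this in one shared function means a UI and a script can call the same code, so the slider values tested interactively give the same result when reused in Python.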
Arcitec
1520d0689b fix(webui): New default emo_alpha recommendation instead of scaling
- Silently scaling the value internally is confusing for users. They may be tuning their settings via the Web UI before putting the same values into their Python code, and would then get a different result since the Web UI "lies" about the slider values.

- Instead, let's remove the silent scaling, and just change the default weight to a better recommendation.
2025-09-17 19:56:07 +02:00
Arcitec
ef097101b7 fix(webui): Add support for Gradio 5.45.0 and higher
- We were using ".select" to detect when tabs are changed, but Gradio has modified behavior in 5.45.0 to only trigger from user clicks. They now require that we use ".change" to detect tab changes from code. This fix makes the Examples work when loading on new Gradio versions.
2025-09-17 19:56:07 +02:00
index-tts
cb5c98011f Merge pull request #378 from index-tts/tts2dev
update Contributors
2025-09-17 11:39:05 +08:00
shujingchen
d50340aa5b update Contributors 2025-09-17 11:37:20 +08:00
index-tts
12ee39996f Merge pull request #375 from index-tts/tts2dev
update Contributors
2025-09-16 20:22:52 +08:00
shujingchen
a37d808923 update Contributors 2025-09-16 20:20:50 +08:00
index-tts
02c1e5a234 Merge pull request #374 from index-tts/tts2dev
Update contributors
2025-09-16 19:45:47 +08:00
shujingchen
901a5a4111 update Contributors 2025-09-16 19:43:32 +08:00
shujingchen
1361244010 update Contributors 2025-09-16 19:38:33 +08:00
shujingchen
c2482142d6 Merge remote-tracking branch 'origin/main' into tts2dev 2025-09-16 19:28:59 +08:00
shujingchen
3e416dc598 update Contributors 2025-09-16 19:28:09 +08:00
index-tts
70aa801b25 Merge pull request #372 from index-tts/tts2dev
update readme
2025-09-16 15:55:13 +08:00
shujingchen
58f8a9d2b1 Merge remote-tracking branch 'origin/main' into tts2dev 2025-09-16 15:53:38 +08:00
shujingchen
e3595faec1 add Contributors in Bilibili 2025-09-16 15:51:46 +08:00
shujingchen
ef86774658 update Official Statement 2025-09-16 14:21:02 +08:00
shujingchen
de949be82a update Official Statement 2025-09-16 14:18:49 +08:00
index-tts
45d8d13f0b Merge pull request #368 from index-tts/tts2dev
Include usage notes for Pinyin
2025-09-16 13:22:22 +08:00
shujingchen
961dcc23f4 add pinyin.vocab 2025-09-16 13:18:55 +08:00
shujingchen
be4af061f1 update 2025-09-16 13:13:21 +08:00
shujingchen
10c1fcd3ad add tips: pinyin usage 2025-09-16 13:10:40 +08:00
shujingchen
7b4f0880d9 update modelscope demo page link 2025-09-16 11:31:15 +08:00