Observing the VOICEVOX API

· 3 min read

VOICEVOX is described as consisting of three parts: an editor, an engine, and a core.

Reference: Overall Structure

It seems the editor is the application, the engine is an HTTP server, and the core is a module that performs speech synthesis processing.

This suggests that the editor calls the engine through a REST API (hereafter simply "the API").

This article looks at what those API calls contain.

Wireshark was used to capture the API traffic.

Communication on Startup

Here are the results after filtering with http and tcp.port == 50021.

The following information appears to be read on startup:

  • Version information /version
  • Engine manifest information /engine_manifest
  • Speaker information /speakers (character list like Zundamon)
  • Singer information /singers (same as above)

After obtaining speaker/singer information, more detailed information for each character is retrieved (e.g., /speaker_info?speaker_uuid=xxx, /singer_info?speaker_uuid=xxx).
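The startup requests can also be reproduced outside the editor. The following is a minimal Python sketch, assuming the engine is listening on localhost:50021 as in the capture. The helper names `startup_urls` and `fetch_json` are made up for illustration and are not part of VOICEVOX.

```python
import json
import urllib.request

# Assumption: the engine listens on the port seen in the capture.
BASE_URL = "http://localhost:50021"

# Endpoints observed at startup in the Wireshark capture.
STARTUP_PATHS = ["/version", "/engine_manifest", "/speakers", "/singers"]


def startup_urls(base_url=BASE_URL):
    """Return the full URLs the editor was seen requesting on startup."""
    return [base_url + path for path in STARTUP_PATHS]


def fetch_json(url):
    """GET a URL and decode the JSON response (needs a running engine)."""
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the engine running, `fetch_json(startup_urls()[0])` should return the version information.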

Communication during Speech Synthesis Request

Now, I sent a speech synthesis request with Zundamon and peeked at the API.

It seems that audio is acquired in the following flow:

  1. Accent information via /accent_phrases
  2. Speech synthesis of Zundamon's voice via /synthesis?speaker=3

The request body sent in (2.) is similar to the response from (1.).

Therefore, the flow appears to be: get accents in (1.), then synthesize speech from those accents in (2.).

Actually Calling the API

I used the httpie tool to call the API.

  1. Get Speaker Information

It was found that Zundamon (Normal) has an ID of 3.
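Looking up the ID by hand works, but it can also be done in code. The sketch below assumes the /speakers response is a list of speakers, each with a `name` and a `styles` list of `name`/`id` pairs. The sample data is hand-written for illustration, not captured output, and `find_style_id` is a hypothetical helper.

```python
def find_style_id(speakers, speaker_name, style_name):
    """Return the style ID for a speaker/style pair, or None if absent."""
    for speaker in speakers:
        if speaker.get("name") != speaker_name:
            continue
        for style in speaker.get("styles", []):
            if style.get("name") == style_name:
                return style.get("id")
    return None


# Trimmed, hand-written sample in the assumed shape of a /speakers response.
sample_speakers = [
    {"name": "ずんだもん", "styles": [{"name": "ノーマル", "id": 3}]},
]

print(find_style_id(sample_speakers, "ずんだもん", "ノーマル"))  # 3
```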

  2. Get Accent Information

I tried to get accent information for ずんだもんなのだ (Zundamon nanoda). (Unlike speaker information, this is retrieved with a POST request.)
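The same request can be sketched in Python. This assumes that the text and the speaker ID are passed as query parameters and that the POST body is empty, which is my understanding of the engine API and was not confirmed in the capture. `accent_phrases_url` and `fetch_accent_phrases` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:50021"  # assumption: default local engine


def accent_phrases_url(text, speaker, base_url=BASE_URL):
    """Build the /accent_phrases URL with text and speaker in the query."""
    query = urllib.parse.urlencode({"text": text, "speaker": speaker})
    return f"{base_url}/accent_phrases?{query}"


def fetch_accent_phrases(text, speaker):
    """POST with an empty body and decode the JSON reply (needs the engine)."""
    req = urllib.request.Request(
        accent_phrases_url(text, speaker), data=b"", method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

`urlencode` percent-encodes the Japanese text as UTF-8, so the URL stays valid ASCII.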

  3. Speech Synthesis

Create a request body like the following:

{
"accent_phrases": <data obtained from /accent_phrases>,
"speedScale": 1,
"pitchScale": 0,
"intonationScale": 1,
"volumeScale": 1,
"prePhonemeLength": 0.1,
"postPhonemeLength": 0.1,
"outputSamplingRate": 24000,
"outputStereo": false,
"kana": ""
}
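As a rough Python counterpart, the body can be assembled from the accent phrases and posted to /synthesis. This is only a sketch under the assumptions already made in this article (engine on localhost:50021, speaker 3). `build_synthesis_body` and `synthesize` are hypothetical helpers, and the fixed parameter values are copied from the body above.

```python
import json
import urllib.request


def build_synthesis_body(accent_phrases):
    """Wrap accent phrases with the fixed parameters from the body above."""
    return {
        "accent_phrases": accent_phrases,
        "speedScale": 1,
        "pitchScale": 0,
        "intonationScale": 1,
        "volumeScale": 1,
        "prePhonemeLength": 0.1,
        "postPhonemeLength": 0.1,
        "outputSamplingRate": 24000,
        "outputStereo": False,
        "kana": "",
    }


def synthesize(accent_phrases, speaker=3, out_path="output.wav",
               base_url="http://localhost:50021"):
    """POST the body to /synthesis and save the WAV bytes (needs the engine)."""
    data = json.dumps(build_synthesis_body(accent_phrases)).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/synthesis?speaker={speaker}",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as f:
        f.write(resp.read())
```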

Because the response is a binary WAV file, which is awkward to handle with httpie, I sent the request from PowerShell instead.

# Define URL and JSON data
$url = 'http://localhost:50021/synthesis?speaker=3'
$jsonBody = @"
{
"accent_phrases": [
{
"moras": [
{
"text": "ズ",
"consonant": "z",
"consonant_length": 0.12722788751125336,
"vowel": "u",
"vowel_length": 0.11318323761224747,
"pitch": 5.773037910461426
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0.09306197613477707,
"pitch": 6.108947277069092
},
{
"text": "ダ",
"consonant": "d",
"consonant_length": 0.04249810427427292,
"vowel": "a",
"vowel_length": 0.09372275322675705,
"pitch": 6.09743070602417
},
{
"text": "モ",
"consonant": "m",
"consonant_length": 0.07012023776769638,
"vowel": "o",
"vowel_length": 0.1172478124499321,
"pitch": 5.932623386383057
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0.06496299058198929,
"pitch": 5.745952129364014
},
{
"text": "ナ",
"consonant": "n",
"consonant_length": 0.038462959229946136,
"vowel": "a",
"vowel_length": 0.08576127141714096,
"pitch": 5.5794854164123535
}
],
"accent": 1,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ノ",
"consonant": "n",
"consonant_length": 0.05504273623228073,
"vowel": "o",
"vowel_length": 0.0903041884303093,
"pitch": 5.551316261291504
},
{
"text": "ダ",
"consonant": "d",
"consonant_length": 0.05024997144937515,
"vowel": "a",
"vowel_length": 0.20450790226459503,
"pitch": 5.633930206298828
}
],
"accent": 2,
"pause_mora": null,
"is_interrogative": false
}
],
"speedScale": 1,
"pitchScale": 0,
"intonationScale": 1,
"volumeScale": 1,
"prePhonemeLength": 0.1,
"postPhonemeLength": 0.1,
"outputSamplingRate": 24000,
"outputStereo": false,
"kana": ""
}
"@

# Create HTTP headers
$headers = @{
'Content-Type' = 'application/json'
}

# Send POST request and save the response body to a file
# (with -OutFile, nothing is returned, so no variable is assigned)
Invoke-WebRequest -Uri $url -Method Post -Headers $headers -Body $jsonBody -OutFile "output.wav"

# Open and play
start output.wav

VOICEVOX: Zundamon

That's all!

A simple solution to the problem of converting 0.248 (16) to a decimal fraction (Fundamental Information Technology Engineer Examination)

· One min read

Points

Hexadecimal problems use binary.

Steps

  1. Convert the hexadecimal number to binary. 0.248(16) = 0.0010 0100 1000(2)

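  2. Remove the radix point by multiplying and dividing by 2^9 = 0010 0000 0000(2), since the last 1 sits in the 9th binary place. 0.0010 0100 1000(2) = 0.0010 0100 1000(2) × 0010 0000 0000(2) / 0010 0000 0000(2) = 0100 1001(2) / 0010 0000 0000(2)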

  3. Convert the binary number to decimal. 0100 1001(2) / 0010 0000 0000(2) = (64 + 8 + 1) / 512 = 73 / 512
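The result can be double-checked with exact rational arithmetic. This small Python sketch is only a verification aid and not part of the exam technique. `hex_fraction_to_fraction` is a name chosen here for illustration.

```python
from fractions import Fraction


def hex_fraction_to_fraction(digits):
    """Convert the hex digits after the radix point into an exact Fraction."""
    return Fraction(int(digits, 16), 16 ** len(digits))


value = hex_fraction_to_fraction("248")
print(value)         # 73/512
print(float(value))  # 0.142578125
```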

Common Matplotlib (Pyplot) Code Examples

· 5 min read

I'll introduce commonly used code examples with graphs.

Creating Graphs

First, import the required libraries.

import matplotlib.pyplot as plt
import numpy as np

Single Graph

fig, ax = plt.subplots()

2 x 3 Graphs

fig, axs = plt.subplots(2, 3)

Plotting Graphs

Plotting a parabola

x = np.linspace(-1, 1, 201)
y = x ** 2

fig, ax = plt.subplots()
ax.plot(x, y)

Plotting a parabola with points

fig, ax = plt.subplots()

x = np.linspace(-1, 1, 21)
y = x ** 2

ax.plot(x, y, 'o')

Setting color to orange

fig, ax = plt.subplots()

x = np.linspace(-1, 1, 21)
y = x ** 2

ax.plot(x, y, color="tab:orange")

Standard colors are as follows:

Color   String
Blue    tab:blue
Orange  tab:orange
Green   tab:green
Red     tab:red
Purple  tab:purple
Brown   tab:brown
Pink    tab:pink
Gray    tab:gray
Olive   tab:olive
Cyan    tab:cyan

Setting line width to 4

fig, ax = plt.subplots()

x = np.linspace(-1, 1, 21)
y = x ** 2

ax.plot(x, y, lw=4)

Setting Titles

Setting title to "Title"

fig, ax = plt.subplots()
ax.set_title("Title")

Setting Axis Labels

Setting X-Axis Labels

Setting x-axis label to "Time (s)"

fig, ax = plt.subplots()
ax.set_xlabel("Time (s)")

Setting Y-Axis Labels

Setting y-axis label to "Distance (m)"

fig, ax = plt.subplots()
ax.set_ylabel("Distance (m)")

Setting Graph Top and Bottom

Setting top to 100

fig, ax = plt.subplots()
ax.set_ylim(top=100)

Setting bottom to -100

fig, ax = plt.subplots()
ax.set_ylim(bottom=-100)

Setting top to 100 and bottom to -100

fig, ax = plt.subplots()
ax.set_ylim([-100, 100])

Setting Graph Left and Right

Setting left to -100

fig, ax = plt.subplots()
ax.set_xlim(left=-100)

Setting right to 100

fig, ax = plt.subplots()
ax.set_xlim(right=100)

Setting left to -100 and right to 100

fig, ax = plt.subplots()
ax.set_xlim([-100, 100])

Displaying Grid

fig, ax = plt.subplots()
ax.grid()

Displaying grid vertically only

fig, ax = plt.subplots()
ax.grid(axis="x")

Displaying grid horizontally only

fig, ax = plt.subplots()
ax.grid(axis="y")

Setting Ticks

Setting x-axis ticks

fig, ax = plt.subplots()
xticks = range(6)
ax.set_xticks(xticks)

Setting x-axis ticks and tick labels

fig, ax = plt.subplots()
xticks = range(6)
ax.set_xticks(xticks, [f"{xtick}m" for xtick in xticks])

Setting y-axis ticks

fig, ax = plt.subplots()
yticks = [i * 20 for i in range(6)]
ax.set_yticks(yticks)

Setting y-axis ticks and tick labels

fig, ax = plt.subplots()
yticks = [i * 20 for i in range(6)]
ax.set_yticks(yticks, [f"{ytick}%" for ytick in yticks])

Removing Ticks

Removing x-axis ticks

fig, ax = plt.subplots()
ax.tick_params(bottom=False)

Removing x-axis tick labels

fig, ax = plt.subplots()
ax.tick_params(labelbottom=False)

Removing y-axis ticks

fig, ax = plt.subplots()
ax.tick_params(left=False)

Removing y-axis tick labels

fig, ax = plt.subplots()
ax.tick_params(labelleft=False)

Setting Tick Colors

Setting x-axis tick color to red

fig, ax = plt.subplots()
ax.tick_params(axis="x", color="tab:red")

Setting x-axis tick label color to red

fig, ax = plt.subplots()
ax.tick_params(axis="x", labelcolor="tab:red")

Setting y-axis tick color to red

fig, ax = plt.subplots()
ax.tick_params(axis="y", color="tab:red")

Setting y-axis tick label color to red

fig, ax = plt.subplots()
ax.tick_params(axis="y", labelcolor="tab:red")

Adjusting Graph Spacing

Setting vertical spacing to 0.2 and horizontal spacing to 0.3

fig, axs = plt.subplots(3, 3)
fig.subplots_adjust(hspace=0.2, wspace=0.3)

Setting spacing to automatic

fig, axs = plt.subplots(3, 3)
fig.tight_layout()

Saving Images

Saving as PNG

fig, ax = plt.subplots()
plt.savefig("graph.png")

Saving as SVG

fig, ax = plt.subplots()
plt.savefig("graph.svg")

Saving as PDF

fig, ax = plt.subplots()
plt.savefig("graph.pdf")

Saving at 300 dpi

fig, ax = plt.subplots()
plt.savefig("graph300.png", dpi=300)

Summary of frequently used dotnet commands

· One min read

Command         Function
dotnet new      Create a new project
dotnet add      Add a package
dotnet remove   Remove a package
dotnet publish  Publish an app to a directory
dotnet run      Run the project
dotnet sln      Manage solution files

dotnet new

Example

dotnet new wpf

dotnet add

Example

dotnet add package Microsoft.Web.WebView2

dotnet remove

Example

dotnet remove package Microsoft.Web.WebView2

Linux Network Management Commands (nmcli and nmtui)

· 2 min read

nmcli

nmcli connection: Display all connections

pi@raspberrypi:~ $ nmcli connection
NAME                UUID                                  TYPE      DEVICE
preconfigured       1b29633c-51a7-42a8-8357-a23ddbb791b9  wifi      wlan0
lo                  37334688-5c87-47fc-87d3-8c4e31934dd2  loopback  lo
Wired connection 1  0df9157e-b1a9-3026-9bd5-f05234e1cf4b  ethernet  --

nmcli device: Display devices and their states

pi@raspberrypi:~ $ nmcli device
DEVICE         TYPE      STATE                   CONNECTION
wlan0          wifi      connected               preconfigured
lo             loopback  connected (externally)  lo
p2p-dev-wlan0  wifi-p2p  disconnected            --
eth0           ethernet  unavailable             --

nmcli connection show ...: Display properties

Run nmcli connection show <profile name> to display its properties.

pi@raspberrypi:~ $ nmcli connection show <profile name>
connection.id: <profile name>
connection.uuid: 1b29633c-51a7-42a8-8357-a23ddbb791b9
connection.stable-id: --
connection.type: 802-11-wireless
connection.interface-name: --
connection.autoconnect: yes
connection.autoconnect-priority: 0
connection.autoconnect-retries: -1 (default)
connection.multi-connect: 0 (default)
connection.auth-retries: -1
connection.timestamp: 1710955164
connection.read-only: no
connection.permissions: --
connection.zone: --
connection.master: --
connection.slave-type: --
connection.autoconnect-slaves: -1 (default)
connection.secondaries: --
connection.gateway-ping-timeout: 0
connection.metered: unknown
connection.lldp: default
connection.mdns: -1 (default)
connection.llmnr: -1 (default)
connection.dns-over-tls: -1 (default)

Check IP address

pi@raspberrypi:~ $ nmcli connection show <profile name> | grep ipv4.addresses
ipv4.addresses: 192.168.10.113/24

Set IP address

In the example below, the IP address is set to 192.168.10.113 and the prefix length to 24.

sudo nmcli connection modify <profile name> ipv4.addresses 192.168.10.113/24
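To see what the address and prefix length in this example imply, Python's standard `ipaddress` module can expand the notation. This is just an illustration of the 192.168.10.113/24 value and does not touch NetworkManager.

```python
import ipaddress

# The same address/prefix pair passed to nmcli in the example.
iface = ipaddress.ip_interface("192.168.10.113/24")

print(iface.ip)                     # 192.168.10.113
print(iface.network)                # 192.168.10.0/24
print(iface.netmask)                # 255.255.255.0
print(iface.network.num_addresses)  # 256
```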

Check DNS server

pi@raspberrypi:~ $ nmcli connection show <profile name> | grep ipv4.dns:
ipv4.dns: 192.168.10.1

Set DNS server

In the example below, the DNS server is set to 192.168.10.1.

sudo nmcli connection modify <profile name> ipv4.dns 192.168.10.1

Disconnect a connection

sudo nmcli connection down <profile name>

Connect a connection

sudo nmcli connection up <profile name>

nmtui: Configure network connections with TUI

sudo nmtui

How to Switch the Linux Prompt to Japanese

· One min read

This post shows how to switch the Linux prompt to Japanese.

1. Install Japanese Locale

First, if the Japanese locale is not installed yet, install it with the following commands.

sudo apt update
sudo apt install language-pack-ja

2. Set the Locale

Set the Japanese locale with the following command.

sudo update-locale LANG=ja_JP.UTF-8

3. Restart the System

Finally, restart the system. This will reflect the new locale settings.

sudo reboot

After these steps, the Linux prompt is displayed in Japanese.

How to change the Linux prompt to Japanese

· One min read

This post explains how to localize the prompt on Raspberry Pi.

1. Setting the locale

Set the Japanese locale by running the following command:

sudo dpkg-reconfigure locales

# [*] ja_JP.UTF-8 UTF-8

  1. Press the space key to check ja_JP.UTF-8 UTF-8 as shown above, then choose OK.
  2. Select ja_JP.UTF-8 as the default locale, then choose OK.

2. Reboot

Reboot the system.

sudo reboot

That's all.

How to install pyenv and Python on Ubuntu (including WSL2)

· One min read

Install Dependencies

Reference: Home · pyenv/pyenv Wiki

sudo apt update
sudo apt install build-essential libssl-dev zlib1g-dev \
libbz2-dev libreadline-dev libsqlite3-dev curl \
libncursesw5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev

Install pyenv

Reference: pyenv/pyenv-installer: This tool is used to install pyenv and friends. (https://github.com/pyenv/pyenv-installer?tab=readme-ov-file)

curl https://pyenv.run | bash

Add initialization script to ~/.bashrc

# Open ~/.bashrc
code ~/.bashrc

Add the following:

export PYENV_ROOT="$HOME/.pyenv"
[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"

eval "$(pyenv virtualenv-init -)"

Install Python

Display a list of installable versions

pyenv install -l

Install Python

Install Python 3.12.2.

pyenv install 3.12.2

Set the Python version

Set the default version to Python 3.12.2.

pyenv global 3.12.2

python -V # Python 3.12.2

How to install .NET on Ubuntu (including WSL2)

· One min read

Reference: Install .NET on Linux without using a package manager - .NET | Microsoft Learn

Download the install script

wget https://dot.net/v1/dotnet-install.sh -O dotnet-install.sh

Make the install script executable

chmod +x ./dotnet-install.sh

Install the .NET SDK

./dotnet-install.sh

To install the latest version

./dotnet-install.sh --version latest

Add to path

Open $HOME/.bashrc and add the following:

export DOTNET_ROOT=$HOME/.dotnet
export PATH=$PATH:$DOTNET_ROOT:$DOTNET_ROOT/tools